17.2.202. MPI_Ialltoallv

MPI_Alltoallv, MPI_Ialltoallv, MPI_Alltoallv_init - All processes send different amounts of data to, and receive different amounts of data from, all processes

17.2.202.1. SYNTAX

17.2.202.1.1. C Syntax

#include <mpi.h>

int MPI_Alltoallv(const void *sendbuf, const int sendcounts[],
     const int sdispls[], MPI_Datatype sendtype,
     void *recvbuf, const int recvcounts[],
     const int rdispls[], MPI_Datatype recvtype, MPI_Comm comm)

int MPI_Ialltoallv(const void *sendbuf, const int sendcounts[],
     const int sdispls[], MPI_Datatype sendtype,
     void *recvbuf, const int recvcounts[],
     const int rdispls[], MPI_Datatype recvtype, MPI_Comm comm,
     MPI_Request *request)

int MPI_Alltoallv_init(const void *sendbuf, const int sendcounts[],
     const int sdispls[], MPI_Datatype sendtype,
     void *recvbuf, const int recvcounts[],
     const int rdispls[], MPI_Datatype recvtype, MPI_Comm comm,
     MPI_Info info, MPI_Request *request)

17.2.202.1.2. Fortran Syntax

USE MPI
! or the older form: INCLUDE 'mpif.h'
MPI_ALLTOALLV(SENDBUF, SENDCOUNTS, SDISPLS, SENDTYPE,
     RECVBUF, RECVCOUNTS, RDISPLS, RECVTYPE, COMM, IERROR)

     <type>  SENDBUF(*), RECVBUF(*)
     INTEGER SENDCOUNTS(*), SDISPLS(*), SENDTYPE
     INTEGER RECVCOUNTS(*), RDISPLS(*), RECVTYPE
     INTEGER COMM, IERROR

MPI_IALLTOALLV(SENDBUF, SENDCOUNTS, SDISPLS, SENDTYPE,
     RECVBUF, RECVCOUNTS, RDISPLS, RECVTYPE, COMM, REQUEST, IERROR)

     <type>  SENDBUF(*), RECVBUF(*)
     INTEGER SENDCOUNTS(*), SDISPLS(*), SENDTYPE
     INTEGER RECVCOUNTS(*), RDISPLS(*), RECVTYPE
     INTEGER COMM, REQUEST, IERROR

MPI_ALLTOALLV_INIT(SENDBUF, SENDCOUNTS, SDISPLS, SENDTYPE,
     RECVBUF, RECVCOUNTS, RDISPLS, RECVTYPE, COMM, INFO, REQUEST, IERROR)

     <type>  SENDBUF(*), RECVBUF(*)
     INTEGER SENDCOUNTS(*), SDISPLS(*), SENDTYPE
     INTEGER RECVCOUNTS(*), RDISPLS(*), RECVTYPE
     INTEGER COMM, INFO, REQUEST, IERROR

17.2.202.1.3. Fortran 2008 Syntax

USE mpi_f08
MPI_Alltoallv(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts,
             rdispls, recvtype, comm, ierror)

     TYPE(*), DIMENSION(..), INTENT(IN) :: sendbuf
     TYPE(*), DIMENSION(..) :: recvbuf
     INTEGER, INTENT(IN) :: sendcounts(*), sdispls(*), recvcounts(*), &
                            rdispls(*)
     TYPE(MPI_Datatype), INTENT(IN) :: sendtype, recvtype
     TYPE(MPI_Comm), INTENT(IN) :: comm
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

MPI_Ialltoallv(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts,
             rdispls, recvtype, comm, request, ierror)

     TYPE(*), DIMENSION(..), INTENT(IN), ASYNCHRONOUS :: sendbuf
     TYPE(*), DIMENSION(..), ASYNCHRONOUS :: recvbuf
     INTEGER, INTENT(IN), ASYNCHRONOUS :: sendcounts(*), sdispls(*), &
                                          recvcounts(*), rdispls(*)
     TYPE(MPI_Datatype), INTENT(IN) :: sendtype, recvtype
     TYPE(MPI_Comm), INTENT(IN) :: comm
     TYPE(MPI_Request), INTENT(OUT) :: request
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

MPI_Alltoallv_init(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts,
                     rdispls, recvtype, comm, info, request, ierror)

     TYPE(*), DIMENSION(..), INTENT(IN), ASYNCHRONOUS :: sendbuf
     TYPE(*), DIMENSION(..), ASYNCHRONOUS :: recvbuf
     INTEGER, INTENT(IN), ASYNCHRONOUS :: sendcounts(*), sdispls(*), &
                                          recvcounts(*), rdispls(*)
     TYPE(MPI_Datatype), INTENT(IN) :: sendtype, recvtype
     TYPE(MPI_Comm), INTENT(IN) :: comm
     TYPE(MPI_Info), INTENT(IN) :: info
     TYPE(MPI_Request), INTENT(OUT) :: request
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

17.2.202.2. INPUT PARAMETERS

  • sendbuf: Starting address of send buffer.

  • sendcounts: Integer array, where entry i specifies the number of elements to send to rank i.

  • sdispls: Integer array, where entry i specifies the displacement (offset from sendbuf, in units of sendtype) from which to send data to rank i.

  • sendtype: Datatype of send buffer elements.

  • recvcounts: Integer array, where entry j specifies the number of elements to receive from rank j.

  • rdispls: Integer array, where entry j specifies the displacement (offset from recvbuf, in units of recvtype) to which data from rank j should be written.

  • recvtype: Datatype of receive buffer elements.

  • comm: Communicator over which data is to be exchanged.

  • info: Info (handle, persistent only).

17.2.202.3. OUTPUT PARAMETERS

  • recvbuf: Address of receive buffer.

  • request: Request (handle, non-blocking and persistent only).

  • ierror: Fortran only: Error status.

17.2.202.4. DESCRIPTION

MPI_Alltoallv is a generalized collective operation in which all processes send data to and receive data from all other processes. It adds flexibility to MPI_Alltoall by allowing the user to specify the data to send and receive vector-style (via a displacement and an element count per process). The operation of this routine can be thought of as follows, where each process performs 2n independent point-to-point communications (n being the number of processes in communicator comm), including communication with itself.

MPI_Comm_size(comm, &n);
for (i = 0; i < n; i++)
    MPI_Send((char *) sendbuf + sdispls[i] * extent(sendtype),
        sendcounts[i], sendtype, i, ..., comm);
for (i = 0; i < n; i++)
    MPI_Recv((char *) recvbuf + rdispls[i] * extent(recvtype),
        recvcounts[i], recvtype, i, ..., comm);

Process j sends the k-th block of its local sendbuf to process k, which places the data in the j-th block of its local recvbuf.
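The block placement rule above can be traced without MPI by simulating all ranks inside one process. The sketch below is illustrative only: simulate_alltoallv is a hypothetical name, elements are assumed to be ints, and the per-rank count and displacement arrays are assumed to be stored flattened, with row r belonging to rank r.

```c
#include <stddef.h>
#include <string.h>

/* Simulate the data movement of an all-to-all vector exchange among n
 * "ranks" in a single process (no MPI calls are made).
 *   sendbuf[j]            send buffer of rank j
 *   sendcounts[j * n + k] number of ints rank j sends to rank k
 *   sdispls[j * n + k]    offset (in ints) of that block in sendbuf[j]
 *   recvbuf[k]            receive buffer of rank k
 *   rdispls[k * n + j]    offset (in ints) where rank k stores rank j's block */
void simulate_alltoallv(int n,
                        const int *const sendbuf[],
                        const int sendcounts[], const int sdispls[],
                        int *const recvbuf[], const int rdispls[])
{
    for (int j = 0; j < n; j++)        /* sending rank */
        for (int k = 0; k < n; k++)    /* receiving rank */
            memcpy(recvbuf[k] + rdispls[k * n + j],
                   sendbuf[j] + sdispls[j * n + k],
                   (size_t)sendcounts[j * n + k] * sizeof(int));
}
```

With two ranks, if rank 0 holds {10, 11, 12} and sends one element to itself and two to rank 1, while rank 1 holds {20, 21, 22, 23} and sends three elements to rank 0 and one to itself, rank 0 ends up with {10, 20, 21, 22} and rank 1 with {11, 12, 23}.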

When a pair of processes exchanges data, each may pass different element count and datatype arguments so long as the sender specifies the same amount of data to send (in bytes) as the receiver expects to receive.

Note that process i may send a different amount of data to process j than it receives from process j. Also, a process may send entirely different amounts of data to different processes in the communicator.

WHEN COMMUNICATOR IS AN INTER-COMMUNICATOR

When the communicator is an inter-communicator, the exchange occurs in two phases. Data is sent by all the members of the first group and received by all the members of the second group. Then data is sent by all the members of the second group and received by all the members of the first. The operation exhibits a symmetric, full-duplex behavior.

Unlike rooted collectives such as MPI_Gatherv, MPI_Alltoallv has no root argument, so no process passes MPI_ROOT or MPI_PROC_NULL. Every process in both groups makes the same call, and the counts and displacements arrays have one entry per process in the remote group.

When the communicator is an intra-communicator, these groups are the same, and the operation occurs in a single phase.

17.2.202.5. USE OF IN-PLACE OPTION

When the communicator is an intra-communicator, you can perform an all-to-all operation in-place (the output buffer is used as the input buffer). Use the constant MPI_IN_PLACE as the value of sendbuf. In this case, sendcounts, sdispls, and sendtype are ignored, and recvcounts, rdispls, and recvtype describe both the data sent and the data received. The input data of each process is assumed to be in the area where that process would receive its own contribution to the receive buffer.

17.2.202.6. NOTES

The specification of counts and displacements should not cause any location to be written more than once.
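A debugging check for this rule might look like the sketch below. It is not part of MPI: blocks_overlap is a hypothetical helper, and it assumes a datatype whose extent equals its size (such as MPI_INT), so that element ranges correspond directly to memory ranges.

```c
#include <stdbool.h>

/* Return true if any two non-empty blocks, each covering the element range
 * [displs[i], displs[i] + counts[i]), share at least one element.
 * Quadratic in n, which keeps the logic easy to follow. */
bool blocks_overlap(int n, const int counts[], const int displs[])
{
    for (int i = 0; i < n; i++) {
        if (counts[i] <= 0)
            continue;
        for (int j = i + 1; j < n; j++) {
            if (counts[j] <= 0)
                continue;
            if (displs[i] < displs[j] + counts[j] &&
                displs[j] < displs[i] + counts[i])
                return true;
        }
    }
    return false;
}
```

Running such a check on recvcounts and rdispls before the call catches layouts in which two peers would write to the same location of recvbuf.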

All arguments on all processes are significant. The comm argument, in particular, must describe the same communicator on all processes.

The offsets of sdispls and rdispls are measured in units of sendtype and recvtype, respectively. Compare this to MPI_Alltoallw, where these offsets are measured in bytes.
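The difference in units can be made concrete with a small conversion sketch. Both helper names are hypothetical, and the extent parameter stands in for the value that MPI_Type_get_extent would report for the datatype; no MPI call is made here.

```c
#include <stddef.h>

/* Convert element displacements (MPI_Alltoallv style) to byte displacements
 * (MPI_Alltoallw style) for a datatype with the given extent in bytes. */
void displs_to_bytes(int n, const int displs[], ptrdiff_t extent,
                     int byte_displs[])
{
    for (int i = 0; i < n; i++)
        byte_displs[i] = (int)(displs[i] * extent);
}

/* Address of the block that starts displ elements into buf, mirroring the
 * "buf + displ * extent" arithmetic MPI_Alltoallv applies to each block. */
void *block_address(void *buf, int displ, ptrdiff_t extent)
{
    return (char *)buf + displ * extent;
}
```

For example, element displacements {0, 2, 5} with an 8-byte extent correspond to byte displacements {0, 16, 40}.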

17.2.202.7. ERRORS

Almost all MPI routines return an error value; C routines as the return result of the function and Fortran routines in the last argument.

Before the error value is returned, the current MPI error handler associated with the communication object (e.g., communicator, window, file) is called. If no communication object is associated with the MPI call, then the call is considered attached to MPI_COMM_SELF and will call the associated MPI error handler. When MPI_COMM_SELF is not initialized (i.e., before MPI_Init/MPI_Init_thread, after MPI_Finalize, or when using the Sessions Model exclusively) the error raises the initial error handler. The initial error handler can be changed by calling MPI_Comm_set_errhandler on MPI_COMM_SELF when using the World model, or the mpi_initial_errhandler CLI argument to mpiexec or info key to MPI_Comm_spawn/MPI_Comm_spawn_multiple. If no other appropriate error handler has been set, then the MPI_ERRORS_RETURN error handler is called for MPI I/O functions and the MPI_ERRORS_ABORT error handler is called for all other MPI functions.

Open MPI includes three predefined error handlers that can be used:

  • MPI_ERRORS_ARE_FATAL Causes the program to abort all connected MPI processes.

  • MPI_ERRORS_ABORT An error handler that can be invoked on a communicator, window, file, or session. When called on a communicator, it acts as if MPI_Abort was called on that communicator. If called on a window or file, acts as if MPI_Abort was called on a communicator containing the group of processes in the corresponding window or file. If called on a session, aborts only the local process.

  • MPI_ERRORS_RETURN Returns an error code to the application.

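MPI applications can also implement their own error handlers by calling:

  • MPI_Comm_create_errhandler, then MPI_Comm_set_errhandler

  • MPI_File_create_errhandler, then MPI_File_set_errhandler

  • MPI_Session_create_errhandler, then MPI_Session_set_errhandler or at MPI_Session_init

  • MPI_Win_create_errhandler, then MPI_Win_set_errhandler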

Note that MPI does not guarantee that an MPI program can continue past an error.

See the MPI man page for a full list of MPI error codes.

See the Error Handling section of the MPI-3.1 standard for more information.