17.2.472. MPIX_Comm_ishrink

MPIX_Comm_shrink, MPIX_Comm_ishrink - Create a new communicator that includes all processes from the parent communicator that have not failed.

This is part of the User-Level Fault Mitigation (ULFM) extension.

17.2.472.1. SYNTAX

17.2.472.1.1. C Syntax

#include <mpi.h>
#include <mpi-ext.h>

int MPIX_Comm_shrink(MPI_Comm comm, MPI_Comm *newcomm)

int MPIX_Comm_ishrink(MPI_Comm comm, MPI_Comm *newcomm, MPI_Request *request)

17.2.472.1.2. Fortran Syntax

USE MPI
USE MPI_EXT
! or the older form: INCLUDE 'mpif.h'

MPIX_COMM_SHRINK(COMM, NEWCOMM, IERROR)
     INTEGER COMM, NEWCOMM, IERROR

MPIX_COMM_ISHRINK(COMM, NEWCOMM, REQUEST, IERROR)
     INTEGER COMM, NEWCOMM, REQUEST, IERROR

17.2.472.1.3. Fortran 2008 Syntax

USE mpi_f08
USE mpi_ext_f08

MPIX_Comm_shrink(comm, newcomm, ierror)
     TYPE(MPI_Comm), INTENT(IN) :: comm
     TYPE(MPI_Comm), INTENT(OUT) :: newcomm
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

MPIX_Comm_ishrink(comm, newcomm, request, ierror)
     TYPE(MPI_Comm), INTENT(IN) :: comm
     TYPE(MPI_Comm), INTENT(OUT), ASYNCHRONOUS :: newcomm
     TYPE(MPI_Request), INTENT(OUT) :: request
     INTEGER, OPTIONAL, INTENT(OUT) :: ierror

17.2.472.2. INPUT PARAMETERS

  • comm: Communicator (handle).

17.2.472.3. OUTPUT PARAMETERS

  • newcomm: Communicator (handle).

  • request: Request (handle, non-blocking only).

  • ierror: Fortran only: Error status (integer).

17.2.472.4. DESCRIPTION

This collective operation creates a new intra- or intercommunicator newcomm from the intra- or intercommunicator comm, respectively, by excluding the group of failed MPI processes as agreed upon during the operation.

The groups of newcomm must include every MPI process that returns from MPIX_Comm_shrink, and it must exclude every MPI process whose failure caused an operation on comm to raise an MPI error of class MPIX_ERR_PROC_FAILED or MPIX_ERR_PROC_FAILED_PENDING at a member of the groups of newcomm, before that member initiated the shrink operation.

In other words, this procedure is semantically equivalent to an MPI_Comm_split operation that would succeed despite failures, where members of the groups of newcomm participate with the same color and a key equal to their rank in comm.

MPIX_Comm_ishrink is the non-blocking variant of MPIX_Comm_shrink. Note that, as with MPI_Comm_idup, it is erroneous to use newcomm before request has completed.

17.2.472.5. WHEN THE COMMUNICATOR IS REVOKED OR CONTAINS FAILED PROCESSES

This function never raises an error of classes MPIX_ERR_REVOKED or MPIX_ERR_PROC_FAILED. The defined semantics of MPIX_Comm_shrink and MPIX_Comm_ishrink are maintained when comm is revoked, or when the group of comm contains failed MPI processes. In particular, MPIX_Comm_shrink and MPIX_Comm_ishrink are collective operations, even when comm is revoked.

The implementation will strive to detect all failures during the shrink operation, but in certain circumstances, the group of newcomm may still contain failed MPI processes, whose failure will be detected in subsequent MPI operations on newcomm.

17.2.472.6. ERRORS

Almost all MPI routines return an error value; C routines as the return result of the function and Fortran routines in the last argument.

Before the error value is returned, the current MPI error handler associated with the communication object (e.g., communicator, window, file) is called. If no communication object is associated with the MPI call, then the call is considered attached to MPI_COMM_SELF and will call the associated MPI error handler. When MPI_COMM_SELF is not initialized (i.e., before MPI_Init/MPI_Init_thread, after MPI_Finalize, or when using the Sessions Model exclusively) the error raises the initial error handler. The initial error handler can be changed by calling MPI_Comm_set_errhandler on MPI_COMM_SELF when using the World model, or the mpi_initial_errhandler CLI argument to mpiexec or info key to MPI_Comm_spawn/MPI_Comm_spawn_multiple. If no other appropriate error handler has been set, then the MPI_ERRORS_RETURN error handler is called for MPI I/O functions and the MPI_ERRORS_ABORT error handler is called for all other MPI functions.

Open MPI includes three predefined error handlers that can be used:

  • MPI_ERRORS_ARE_FATAL Causes the program to abort all connected MPI processes.

  • MPI_ERRORS_ABORT An error handler that can be invoked on a communicator, window, file, or session. When called on a communicator, it acts as if MPI_Abort was called on that communicator. If called on a window or file, acts as if MPI_Abort was called on a communicator containing the group of processes in the corresponding window or file. If called on a session, aborts only the local process.

  • MPI_ERRORS_RETURN Returns an error code to the application.

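MPI applications can also implement their own error handlers by calling:

  • MPI_Comm_create_errhandler then MPI_Comm_set_errhandler

  • MPI_File_create_errhandler then MPI_File_set_errhandler

  • MPI_Session_create_errhandler then MPI_Session_set_errhandler or at MPI_Session_init

  • MPI_Win_create_errhandler then MPI_Win_set_errhandler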

Note that MPI does not guarantee that an MPI program can continue past an error.

See the MPI man page for a full list of MPI error codes.

See the Error Handling section of the MPI-3.1 standard for more information.