12.8. Internal frameworks

The Modular Component Architecture (MCA) is the backbone of Open MPI – most services and functionality are implemented through MCA components.

12.8.1. MPI layer frameworks

Here is a list of all the component frameworks in the MPI layer of Open MPI:

  • bml: BTL management layer

  • coll: MPI collective algorithms

  • fbtl: point-to-point file byte transfer layer: abstraction for individual read and write operations for MPI I/O

  • fcoll: collective file system functions for MPI I/O

  • fs: file system functions for MPI I/O

  • hook: Generic hooks into Open MPI

  • io: MPI I/O

  • mtl: Matching transport layer, used for MPI point-to-point messages on some types of networks

  • op: Back end computations for intrinsic MPI_Op operators

  • osc: MPI one-sided communications

  • pml: MPI point-to-point management layer

  • sharedfp: shared file pointer operations for MPI I/O

  • topo: MPI topology routines

  • vprotocol: Protocols for the “v” PML
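
To check which components of a given framework are present in a particular installation, the output of ompi_info can be filtered by framework name. For example, the following (illustrative) command lists the available coll components; the exact output formatting may vary between Open MPI versions:

shell$ ompi_info | grep "MCA coll"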

12.8.2. OpenSHMEM component frameworks

  • atomic: OpenSHMEM atomic operations

  • memheap: OpenSHMEM memory allocators that support the PGAS memory model

  • scoll: OpenSHMEM collective operations

  • spml: OpenSHMEM “pml-like” layer: supports one-sided, point-to-point operations

  • sshmem: OpenSHMEM shared memory backing facility

12.8.3. Miscellaneous frameworks

  • allocator: Memory allocator

  • backtrace: Debugging call stack backtrace support

  • btl: Point-to-point Byte Transfer Layer

  • dl: Dynamic loading library interface

  • hwloc: Hardware locality (hwloc) versioning support

  • if: OS IP interface support

  • installdirs: Installation directory relocation services

  • memchecker: Run-time memory checking

  • memcpy: Memory copy support

  • memory: Memory management hooks

  • mpool: Memory pooling

  • patcher: Symbol patcher hooks

  • pmix: Process management interface (exascale)

  • rcache: Memory registration cache

  • reachable: Network reachability determination

  • shmem: Shared memory support (NOT related to OpenSHMEM)

  • smsc: Shared memory single-copy support

  • threads: OS and userspace thread support

  • timer: High-resolution timers

12.8.4. Framework notes

Each framework typically has one or more components that are used at run-time. For example, the btl framework is used by the MPI layer to send bytes across different types of underlying networks. The tcp btl, for example, sends messages across TCP-based networks; the ucx pml sends messages across InfiniBand-based networks.

Each component typically has some tunable parameters that can be changed at run-time. Use the ompi_info command to check a component to see what its tunable parameters are. For example:

shell$ ompi_info --param btl tcp

shows some of the parameters (and default values) for the tcp btl component (use --level to show all the parameters; see below).
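
The component name can also be given as all to survey parameters more broadly; for example (shown as a sketch; the set of components reported depends on how Open MPI was built):

shell$ ompi_info --param btl all
shell$ ompi_info --param all all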

Note that ompi_info only shows a small number of a component’s MCA parameters by default. Each MCA parameter has a “level” value from 1 to 9, corresponding to the MPI-3 MPI_T tool interface levels. In Open MPI, we have interpreted these nine levels as three groups of three:

  1. End user / basic

  2. End user / detailed

  3. End user / all

  4. Application tuner / basic

  5. Application tuner / detailed

  6. Application tuner / all

  7. MPI/OpenSHMEM developer / basic

  8. MPI/OpenSHMEM developer / detailed

  9. MPI/OpenSHMEM developer / all

Here’s how the three groups are defined:

  1. End user: Generally, these are parameters that are required for correctness, meaning that someone may need to set these just to get their MPI/OpenSHMEM application to run correctly.

  2. Application tuner: Generally, these are parameters that can be used to tweak MPI application performance.

  3. MPI/OpenSHMEM developer: Parameters that either don’t fit in the other two, or are specifically intended for debugging / development of Open MPI itself.

Each group is broken down into three classifications:

  1. Basic: For parameters that everyone in this category will want to see.

  2. Detailed: Parameters that are useful, but you probably won’t need to change them often.

  3. All: All other parameters – probably including some fairly esoteric parameters.

To see all available parameters for a given component, specify that ompi_info should use level 9:

shell$ ompi_info --param btl tcp --level 9
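
For comparison, the same query at level 1 shows only the “End user / basic” parameters from the list above:

shell$ ompi_info --param btl tcp --level 1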

Error

TODO The following content seems redundant with the FAQ. Additionally, information about how to set MCA params should be prominently documented somewhere that is easy for users to find – not buried here in the developer’s section.

These values can be overridden at run-time in several ways. The following locations are examined (in order) for new parameter values:

  1. PREFIX/etc/openmpi-mca-params.conf: This file is intended to set any system-wide default MCA parameter values – it will apply, by default, to all users who use this Open MPI installation. The default file that is installed contains many comments explaining its format; a brief sketch of that format is shown after this list.

  2. $HOME/.openmpi/mca-params.conf: If this file exists, it should be in the same format as PREFIX/etc/openmpi-mca-params.conf. It is intended to provide per-user default parameter values.

  3. environment variables of the form OMPI_MCA_<name> set equal to a VALUE:

    Where <name> is the name of the parameter. For example, set the variable named OMPI_MCA_btl_tcp_frag_size to the value 65536 (Bourne-style shells):

    shell$ OMPI_MCA_btl_tcp_frag_size=65536
    shell$ export OMPI_MCA_btl_tcp_frag_size
    

    Error

    TODO Do we need content here about PMIx and PRTE env vars?

  4. the mpirun/oshrun command line: --mca <name> <value>

    Where <name> is the name of the parameter. For example:

    shell$ mpirun --mca btl_tcp_frag_size 65536 -n 2 hello_world_mpi
    

    Error

    TODO Do we need content here about PMIx and PRTE MCA vars and corresponding command line switches?
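
As referenced in item 1 above, the parameter files in items 1 and 2 use the same simple format: one name = value assignment per line, with # introducing a comment. A minimal sketch of a per-user file follows; the parameter values shown are only hypothetical examples:

shell$ cat $HOME/.openmpi/mca-params.conf
# Hypothetical per-user defaults; the installed PREFIX/etc/openmpi-mca-params.conf
# contains a fully commented description of this format
btl = tcp,sm,self
btl_tcp_frag_size = 65536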

These locations are checked in order. For example, a parameter value passed on the mpirun command line will override an environment variable; an environment variable will override the system-wide defaults.
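
For example, in the following (illustrative) commands, the value given on the mpirun command line (131072, an arbitrary example value) takes precedence over the value set in the environment:

shell$ export OMPI_MCA_btl_tcp_frag_size=65536
shell$ mpirun --mca btl_tcp_frag_size 131072 -n 2 hello_world_mpi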

Each component typically activates itself when relevant. For example, the usNIC component will detect that usNIC devices are present and will automatically be used for MPI communications. The Slurm component will automatically detect when running inside a Slurm job and activate itself. And so on.

Components can be manually activated or deactivated if necessary, of course. The most common components that are manually activated, deactivated, or tuned are the btl components – components that are used for MPI point-to-point communications on many types of networks.

For example, to specify that only the tcp and self (process loopback) components should be used for MPI communications, list them in a comma-delimited value for the btl MCA parameter:

shell$ mpirun --mca btl tcp,self hello_world_mpi

To add shared memory support, add sm to the comma-delimited list (list order does not matter):

shell$ mpirun --mca btl tcp,sm,self hello_world_mpi

Note

There used to be a vader btl component for shared memory support; it was renamed to sm in Open MPI v5.0.0, but the alias vader still works as well.
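
For example, on Open MPI v5.0.x the following command should therefore be equivalent to the tcp,sm,self example above:

shell$ mpirun --mca btl tcp,vader,self hello_world_mpi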

To deactivate a specific component, prepend the comma-delimited list with a ^ to negate it:

shell$ mpirun --mca btl ^tcp hello_world_mpi

The above command will use any btl component other than the tcp component.
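
The ^ negates the entire comma-delimited list, so several components can be excluded at once; as a sketch, the following command excludes both the tcp and sm components (inclusive and exclusive (^) names cannot be mixed in the same list):

shell$ mpirun --mca btl ^tcp,sm hello_world_mpi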