10.3. Shared Memory
10.3.1. What is the sm BTL?
sm BTL is a low-latency, high-bandwidth mechanism for
transferring data between two processes via shared memory. This BTL
can only be used between processes executing on the same node.
Between Open MPI version 1.8.0 and 4.1.x, the shared memory
BTL was named
vader. As of Open MPI version 5.0.0, the
BTL has been renamed
In Open MPI version 5.0.x, the name
vader is simply
an alias for the
sm BTL. Similarly, all
vader_-prefixed MCA parameters are automatically
aliased to their corresponding
This alias mechanism is a legacy transition device, and will likely disappear in a future release of Open MPI.
10.3.2. How do I specify use of sm for MPI messages?
Typically, it is unnecessary to do so; OMPI will use the best BTL available for each communication.
Nevertheless, you may use the MCA parameter
btl. You should also
self BTL for communications between a process and
itself. Furthermore, if not all processes in your job will run on the
same, single node, then you also need to specify a BTL for internode
communications. For example:
shell$ mpirun --mca btl self,sm,tcp -n 16 ./a.out
10.3.3. How can I tune these parameters to improve performance?
Mostly, the default values of the MCA parameters have already been chosen to give good performance. To improve performance further is a little bit of an art. Sometimes, it’s a matter of trading off performance for memory.
btl_sm_eager_limit: If message data plus header information fits within this limit, the message is sent “eagerly” — that is, a sender attempts to write its entire message to shared buffers without waiting for a receiver to be ready. Above this size, a sender will only write the first part of a message, then wait for the receiver to acknowledge its readiness before continuing. Eager sends can improve performance by decoupling senders from receivers.
btl_sm_max_send_size: Large messages are sent in fragments of this size. Larger segments can lead to greater efficiencies, though they could perhaps also inhibit pipelining between sender and receiver.
btl_sm_free_list_num: This is the initial number of fragments on each (eager and max) free list. The free lists can grow in response to resource congestion, but you can increase this parameter to pre-reserve space for more fragments.
Where is the shared memory mapped on the filesystem?
TODO Is this correct?
The file will be in the OMPI session directory, which is typically
The file itself will have the name
shared_mem_pool.HOSTNAME. For example, the full path could be
TODO The filename above will certainly be wrong.
To place the session directory in a non-default location, use the MCA parameter
TODO The MCA param name above is definitely wrong.