Using UCX
UCX is a communication library which is primarily used and recommended on Mellanox InfiniBand systems. The library can be used with both the Intel MPI and Open MPI distributions of Simcenter STAR-CCM+ on Linux systems.
UCX Library Selection on x86_64
Simcenter STAR-CCM+ comes with distributions of UCX versions 1.8.0 and 1.14.0. Both are provided to avoid performance regressions with newer UCX versions on older hardware. When using UCX, a suitable version is automatically selected by Simcenter STAR-CCM+, and by default is used with both Open MPI and Intel MPI on Linux. Which version is automatically selected depends on the available network hardware:
- On InfiniBand systems not supporting the Dynamically Connected (DC) transport (that is, predating ConnectX-4), UCX 1.8.0 is selected.
- On all other systems, UCX 1.14.0 is selected.
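The selection rule above can be written as a small shell sketch. The function name `select_ucx_version` and its two yes/no arguments are hypothetical and only mirror the two bullet points; the sketch does not reproduce how Simcenter STAR-CCM+ actually probes the hardware.

```bash
# Hypothetical helper (not part of Simcenter STAR-CCM+).
# Usage: select_ucx_version <is_infiniband: yes|no> <supports_dc: yes|no>
select_ucx_version() {
  if [ "$1" = "yes" ] && [ "$2" = "no" ]; then
    echo "1.8.0"    # InfiniBand without the DC transport (predating ConnectX-4)
  else
    echo "1.14.0"   # all other systems
  fi
}
```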
To find out which version of UCX is being used by Simcenter STAR-CCM+, include the option -fabricverbose in the starccm+ command and examine the output.
If you want to use a local installation of UCX 1.8.0 or newer, observe the following requirements:
- The local library location must be in the system-wide library path or be supplied to Simcenter STAR-CCM+ using the -ldlibpath flag.
- The binary location must be added to the system-wide $PATH environment variable so that Simcenter STAR-CCM+ can detect the correct UCX version.
- To avoid using the bundled UCX distribution, you must pass the expert option -xsystemucx to Simcenter STAR-CCM+.
For example, on a system where the "legacy" UCX distribution (that is, 1.8.0) would be selected automatically, but you want to use the "modern" UCX distribution (that is, 1.14.0) instead, you can use the following:
PATH=$([INSTALLDIR]/star/bin/map_ucx -distrib modern -binpath):$PATH [INSTALLDIR]/star/bin/starccm+ -xsystemucx -ldlibpath $([INSTALLDIR]/star/bin/map_ucx -distrib modern -libpath) <...>
Each UCX version is distributed in different flavors. Each flavor refers to a specific version of Mellanox OFED (MLNX_OFED, MOFED) against which it is configured and built. At runtime, the version of MOFED installed on the system is detected and the most suitable UCX flavor is selected. The most suitable UCX flavor is the one built against the closest MOFED version that is not newer than the system-installed MOFED version. For example, on a system with MOFED 4.8 installed, the MOFED-4.0 flavor of UCX 1.8.0 is selected.
The list of flavors for UCX 1.8.0 is as follows:
- MOFED-1.5
- MOFED-4.0
- MOFED-4.9
- MOFED-5.0
- MOFED-5.1
The list of flavors for UCX 1.14.0 is as follows:
- MOFED-4.6
- MOFED-4.9
- MOFED-5.0
- MOFED-5.1
All flavors MOFED-4.9 and higher are built against a MOFED distribution with upstream libraries enabled (as opposed to legacy MLNX libraries), which is the default MOFED installation mode for these versions. If MOFED has been installed on the system in a non-default configuration, the system UCX library might have to be used (see above).
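The flavor selection rule described above (the closest MOFED version that is not newer than the installed one) can be illustrated with a short shell sketch. The function `select_flavor` is hypothetical and is not the mechanism Simcenter STAR-CCM+ uses internally; it assumes GNU `sort -V` for version comparison. The text does not say what happens when every flavor is newer than the installed MOFED, so in that case the sketch simply prints nothing and returns a non-zero status.

```bash
# Hypothetical helper (not part of Simcenter STAR-CCM+).
# Usage: select_flavor <installed_mofed_version> <flavor_version>...
select_flavor() {
  local installed=$1 best="" v
  shift
  for v in "$@"; do
    # v is "not newer" than installed if v sorts first (or is equal).
    if [ "$(printf '%s\n' "$v" "$installed" | sort -V | head -n1)" = "$v" ]; then
      # Among the candidates that qualify, keep the highest version.
      if [ -z "$best" ] || [ "$(printf '%s\n' "$best" "$v" | sort -V | tail -n1)" = "$v" ]; then
        best=$v
      fi
    fi
  done
  [ -n "$best" ] && printf 'MOFED-%s\n' "$best"
}
```

With the UCX 1.8.0 flavor list, `select_flavor 4.8 1.5 4.0 4.9 5.0 5.1` prints MOFED-4.0, which matches the example in the text. With the UCX 1.14.0 flavor list, `select_flavor 4.8 4.6 4.9 5.0 5.1` prints MOFED-4.6.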
UCX Library Selection on aarch64
On aarch64, the same principles apply as on x86_64 (see above). However, there are the following exceptions:
- Only UCX 1.14.0 is distributed. It is available in the following flavors:
  - MOFED-4.6
  - MOFED-4.9
  - MOFED-5.0
  - MOFED-5.1
- A local installation used through -xsystemucx must be version 1.10.0 or later.
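The minimum versions for a local UCX installation (1.8.0 on x86_64, 1.10.0 on aarch64) can be checked with a sketch like the following. Both function names are hypothetical, the comparison relies on GNU `sort -V`, and obtaining the version string of the local installation is left to the reader.

```bash
# Hypothetical helpers (not part of Simcenter STAR-CCM+).

# version_at_least <have> <min>: succeeds if <have> is at least <min>.
version_at_least() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# system_ucx_ok <arch> <ucx_version>: applies the per-architecture minimum.
system_ucx_ok() {
  case "$1" in
    x86_64)  version_at_least "$2" 1.8.0 ;;
    aarch64) version_at_least "$2" 1.10.0 ;;
    *)       return 2 ;;   # architecture not covered by this section
  esac
}
```

For example, `system_ucx_ok aarch64 1.8.0` fails, while `system_ucx_ok x86_64 1.8.0` succeeds.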
UCX Transport Selection
UCX supports different transports depending on the hardware and software at hand. When using Intel MPI, the UCX transport selection is managed by Intel MPI. When using Open MPI and a UCX version older than 1.14.0, the following applies:
- If the UCX fabric is used on systems with Mellanox InfiniBand hardware (which is the default if no other fabric is specified) with 256 processes or fewer, the UD transport is chosen explicitly, overriding the default behavior of UCX.
- Under the same circumstances but using more than 256 processes, no restriction to the UD transport is applied, that is, all UCX transports are allowed.
- This behavior should be kept in mind when observing differences in elapsed time or memory consumption at the 256-process threshold.
- Any Simcenter STAR-CCM+ settings applied to UCX transports can be overridden by setting the UCX_TLS environment variable, for example UCX_TLS=all to allow all available UCX transports.
- For more information, please consult the UCX documentation (search for network capabilities under "Frequently Asked Questions (FAQ)").
When using Open MPI and a UCX version 1.14.0 or newer, the following applies:
- No changes to the UCX transport selection are made, meaning UCX will automatically select the transport except when UCX_TLS has been specified by the user.
- This approach optimizes performance, but it might lead to increased memory consumption compared to forcing the UD transport (see above).
- If memory consumption must be kept low (potentially at the cost of performance), consider forcing the UD transport using UCX_TLS=self,sm,ud.
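The Open MPI transport rules from this section can be summarized in a shell sketch. The function `ucx_transport_policy` and the labels it prints are hypothetical; they describe the policy only and are not values that Simcenter STAR-CCM+ sets. The sketch assumes the UCX fabric on Mellanox InfiniBand hardware and uses GNU `sort -V` to compare versions.

```bash
# Hypothetical helper (not part of Simcenter STAR-CCM+).
# Usage: ucx_transport_policy <ucx_version> <process_count>
# Reads UCX_TLS from the environment.
ucx_transport_policy() {
  local ucx_version=$1 nprocs=$2
  # A user-supplied UCX_TLS overrides any other selection.
  if [ -n "${UCX_TLS:-}" ]; then
    echo "user-defined: $UCX_TLS"
    return
  fi
  # The version is 1.14.0 or newer if 1.14.0 sorts first (or is equal).
  if [ "$(printf '%s\n' 1.14.0 "$ucx_version" | sort -V | head -n1)" = "1.14.0" ]; then
    echo "automatic"   # UCX selects the transports itself
  elif [ "$nprocs" -le 256 ]; then
    echo "ud"          # UD transport chosen explicitly
  else
    echo "all"         # no restriction, all UCX transports allowed
  fi
}
```

For example, `ucx_transport_policy 1.8.0 256` prints ud and `ucx_transport_policy 1.8.0 257` prints all, which is the 256-process threshold noted above.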
Further UCX Tuning
With UCX 1.14.0, PCI relaxed ordering is disabled on InfiniBand systems by setting UCX_IB_PCI_RELAXED_ORDERING=n. This setting is made to avoid performance regressions. You can override this environment variable setting.
Known Issues with UCX 1.8.0
UCX version 1.8.0 has a bug that may cause data corruption when the TCP transport is used in conjunction with the shared memory transport. In Simcenter STAR-CCM+ with default settings, UCX is only used in parallel runs on Mellanox InfiniBand systems, so this issue may only occur when special user settings are applied to force the use of UCX with the affected transports.
During simulation startup, an attempt is made to detect user settings potentially triggering this bug. When Simcenter STAR-CCM+ does detect such settings, it generates a warning in the Output window.