GPGPU Computation
The idea behind general-purpose computing on graphics processing units (GPGPU computation) is that you offload computations that are traditionally performed on a CPU (central processing unit) to a graphics processing unit (GPU). The nature of operations on the GPU means that a single unit can deliver computational performance comparable to that of many CPUs working in parallel. GPGPU computation is only supported for Simcenter STAR-CCM+ servers running under Linux.
The GPGPU capabilities of Simcenter STAR-CCM+ allow end-to-end computations on General-Purpose Graphics Processing Units (GPGPUs) for specific core physics solvers. In particular, the segregated flow solver is supported for GPGPU computation along with a wide array of turbulence models. Additional models and solvers are expected to become GPGPU-compatible in future releases.
End-to-end computation means that both of the following operations are performed on the GPGPU:
- the discretization of the partial differential equations that describe the physics (for example, the Navier-Stokes equations)
- the solution of the resulting linear systems using the Krylov-accelerated algebraic multigrid (AMG) solver
GPGPU computation is active when both of the following conditions are met:
- Simcenter STAR-CCM+ is launched with the GPGPU feature enabled.
- Only GPGPU-compatible models and solvers are chosen for the simulation. See Supported Solvers and Models.
Due to architectural differences between CPUs and GPGPUs, digit-by-digit reproducibility cannot be achieved in GPGPU runs. This difference should not affect the overall convergence behavior.
Selection of GPGPUs
Several options are available for specifying which GPGPUs Simcenter STAR-CCM+ uses. These are detailed in the Command Line Reference, GPGPU Options. Simcenter STAR-CCM+ does not support restricting the available GPGPUs through the environment variables CUDA_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL, or ROCR_VISIBLE_DEVICES. GPGPU restriction is only supported through the -gpgpu command line argument.
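For example, a server launch that lets Simcenter STAR-CCM+ assign GPGPUs automatically might look like the following sketch; the process count and simulation file name are illustrative, and the full -gpgpu syntax, including explicit device selection, is described in the Command Line Reference, GPGPU Options:

    # Illustrative launch: 8 CPU processes with automatic GPGPU assignment
    starccm+ -np 8 -gpgpu auto mySimulation.sim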
GPGPU Supported Hardware
For GPGPU-enabled simulations, certain card models from NVIDIA and AMD are supported. Models with HBM (High Bandwidth Memory) are preferred as GDDR (graphics double data rate) memory tends to have insufficient bandwidth for CFD applications.
Cards with NVIDIA's Volta, Turing, Ampere, Hopper, and Ada Lovelace architectures are recommended. CUDA 11.8 or newer drivers are required (CUDA driver version 520.61.05 or newer). Hardware partitioning using NVIDIA Multi-Instance GPU (MIG) is not supported.
Cards from the AMD Instinct MI100 and MI200 series, as well as the Radeon PRO W6800, V620, W7800, and W7900, are recommended. The Instinct MI300 series is additionally supported, but may not deliver optimum performance. The AMDGPU driver is required, while the AMDGPU-PRO driver is unsupported.
If you encounter hangs on AMD GPUs and the kernel log (dmesg) reports messages like [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, set the HSA_ENABLE_SDMA=0 environment variable.
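A minimal sketch of applying this workaround from the shell before launching the server (the launch command itself is illustrative):

    # Disable SDMA transfers as a workaround for the timeout above
    export HSA_ENABLE_SDMA=0
    starccm+ -np 8 -gpgpu auto mySimulation.sim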
GPGPU Licensing
GPGPU computation requires one of the following licenses:
- Simcenter STAR-CCM+ Power Session Plus
- Simcenter STAR-CCM+ Power on Demand
Using MPS
On NVIDIA GPGPUs, Simcenter STAR-CCM+ automatically starts the CUDA Multi-Process Service (MPS) when both of the following conditions are met:
- More CPU processes than GPGPUs are specified on the host
- MPS is not already running on the host
In managed compute environments, such as HPC clusters, MPS can be handled system-wide. If Simcenter STAR-CCM+ takes over MPS management, messages are printed to the console about MPS being automatically started and terminated. You can avoid automatic MPS handling by appending the :nomps qualifier to the -gpgpu <...> command line parameter, or through the user interface.
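For instance, building on the automatic device assignment shown earlier, MPS management can be left to the system by appending the qualifier (launch details are illustrative):

    # Illustrative launch: GPGPU enabled, automatic MPS handling disabled
    starccm+ -np 8 -gpgpu auto:nomps mySimulation.sim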
MPS is not relevant for AMD GPUs: no automatic MPS handling is performed and :nomps is silently ignored.
MPS Limitations
- On GPGPUs that predate the Volta architecture, only the MPS process itself appears in the GPGPU process report in nvidia-smi. On newer architectures, all Simcenter STAR-CCM+ processes appear directly.
- When using an external version of Intel MPI, automatic MPS is not guaranteed to work. Observe the command line output for a warning that MPS is not running when it is expected to be.
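To inspect the GPGPU process report mentioned above, you can run the nvidia-smi utility on the compute host while the simulation is running:

    # List GPU utilization and the processes attached to each GPU
    nvidia-smi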
GPGPU-aware CPU Binding
When enabling GPGPUs using any of the options available on the -gpgpu parameter, the CPU binding policy can be changed to the "gpgpuaware" policy (instead of the default "bandwidth" policy) using the command line option -cpubind gpgpuaware. The gpgpuaware binding policy places CPU processes on cores that are physically close to the GPGPU they are driving, which minimizes data transfers over slow paths.
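A sketch combining automatic GPGPU selection with the GPGPU-aware binding policy (process count and file name are illustrative):

    # Illustrative launch: bind each CPU process close to the GPGPU it drives
    starccm+ -np 8 -gpgpu auto -cpubind gpgpuaware mySimulation.sim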
For fully populated hosts, that is, when all available CPU cores are used, all CPU binding policies result in the same binding, so the specific policy passed on the command line makes no difference.
Process Distribution on Multiple Hosts
When running on multiple hosts with GPGPU, it can be beneficial to under-subscribe each compute node, for example, so there is only one process per GPGPU on each host. For more information and recommendations regarding the number of compute processes per GPGPU, see Running Simulations with GPGPU.
You can achieve the required ratio of CPU processes to GPGPUs by specifying the correct number of processes per host in the machinefile, or by setting the relevant attributes when submitting a job via a batch system. For more information, see Defining a Machine File for Parallel Hosts and Using Batch Management Systems.
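For example, on two hosts with four GPGPUs each, a machinefile that requests one process per GPGPU might look like the following sketch; the host names are illustrative and the host:processes notation is an assumption, so check Defining a Machine File for Parallel Hosts for the exact format:

    node01:4
    node02:4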
When running with -gpgpu auto, Simcenter STAR-CCM+ automatically distributes the processes evenly across all hosts to improve GPGPU utilization. To return to the default process distribution, use the -gpgpu auto:oversubscribe option (see the Command Line Reference).
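A sketch of both variants (process counts and file name are illustrative):

    # Even distribution of processes across hosts for better GPGPU utilization
    starccm+ -np 16 -gpgpu auto mySimulation.sim
    # Default process distribution
    starccm+ -np 16 -gpgpu auto:oversubscribe mySimulation.sim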