Using MPICH on Crusher@OLCF


This page describes how to build and use MPICH on the 'Crusher' machine at Oak Ridge. Crusher is an AMD CPU/GPU machine with a Slingshot interconnect. The 'Libfabric' (ch4:ofi) device works best here; its performance is on par with Cray MPI on Crusher.

Prerequisite

  • MPICH dev
  • Libfabric 1.15.0.0 (part of Cray PE)
  • Cray PMI (part of Cray PE, only required if building with srun support)

MPICH needs the following tools (and their default versions on Crusher as of 09/27/2022) to build on Crusher with GPU support; a quick version check is sketched after the list.

  • gcc (gcc/7.5.0)
  • ROCm (rocm/5.1.0)
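
As a sanity check before configuring, you can confirm that the toolchain versions match the ones above. A minimal sketch; the module names follow the defaults listed in this section:

module load gcc rocm
module list            # expect gcc/7.5.0 and rocm/5.1.0 (or the current defaults)
gcc --version
hipcc --version        # hipcc is provided by the rocm module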

Build MPICH

Build ROCm-enabled MPICH with Cray PMI (srun/PALS)

module load rocm 
./configure --with-device=ch4:ofi --with-libfabric=/opt/cray/libfabric/1.15.0.0 --with-pmi=pmi2 --with-pmilib=oldcray --with-craypmi=/opt/cray/pe/pmi/default \
  --with-hip=$ROCM_PATH/hip 
make -j 8
make install

# $ROCM_PATH is set by the rocm module.

Build ROCm-enabled MPICH with hydra

module load rocm
./configure --with-device=ch4:ofi --with-libfabric=/opt/cray/libfabric/1.15.0.0 \
  --with-hip=$ROCM_PATH/hip 
make -j 8
make install

# $ROCM_PATH is set by the rocm module.
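
Neither configure line above sets an installation prefix, so make install defaults to a system location that is normally not writable on Crusher. A minimal sketch, assuming an install into the home directory ($HOME/sw/mpich is an arbitrary example path):

# add --prefix=$HOME/sw/mpich to the configure line, rerun make && make install,
# then point the environment at the install:
export PATH=$HOME/sw/mpich/bin:$PATH
export LD_LIBRARY_PATH=$HOME/sw/mpich/lib:$LD_LIBRARY_PATH
which mpicc   # should now resolve to $HOME/sw/mpich/bin/mpicc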

A correctly configured MPICH build should print the following summary in the configure output.

*****************************************************
***
*** device      : ch4:ofi
*** shm feature : auto
*** gpu support : HIP
***
*****************************************************
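
The same information can be double-checked against an installed build; a sketch, assuming the install's bin directory is on PATH:

mpichversion    # prints the MPICH version and the configure options used for the build
mpicc -show     # prints the underlying compile/link command for sanity checking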

Running MPI Application

Running MPI with srun

module load rocm
export MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_STRIPING=0
# Launch two ranks each on a separate node and a separate GPU
srun -n2 --ntasks-per-node=1 --gpus-per-node=1 --gpu-bind=closest \
    ./test/mpi/pt2pt/pingping \
    -type=MPI_INT -sendcnt=512 -recvcnt=1024 -seed=78 -testsize=4  -sendmem=device -recvmem=device

For more srun options, see the Crusher User Guide's Running Jobs section.
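
For batch jobs, the same launch can be wrapped in a Slurm batch script. A minimal sketch; the account name and walltime below are placeholders to replace with your own:

#!/bin/bash
#SBATCH -A ABC123           # placeholder project account
#SBATCH -N 2
#SBATCH -t 00:10:00
#SBATCH -J mpich-pingping

module load rocm
export MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_STRIPING=0
srun -n2 --ntasks-per-node=1 --gpus-per-node=1 --gpu-bind=closest \
    ./test/mpi/pt2pt/pingping \
    -type=MPI_INT -sendcnt=512 -recvcnt=1024 -seed=78 -testsize=4 -sendmem=device -recvmem=device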

Running MPI with hydra

module load rocm
export MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_STRIPING=0
mpiexec -np 2 -ppn 1 -gpus-per-proc=1 \
    -genv MPIR_CVAR_CH4_OFI_ENABLE_MULTI_NIC_STRIPING=0 \
    ./test/mpi/pt2pt/pingping \
    -type=MPI_INT -sendcnt=512 -recvcnt=1024 -seed=78 -testsize=4  -sendmem=device -recvmem=device
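
When launching with hydra from inside a Slurm allocation, hydra needs a list of the allocated nodes. A sketch, assuming an interactive allocation (e.g. from salloc -N 2):

scontrol show hostnames "$SLURM_JOB_NODELIST" > hosts.txt
# add -f hosts.txt to the mpiexec command above to spread ranks across the allocated nodes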

Common Issues

  1. "key [-NONEXIST-KEY] was not found" message. It is common to see error messages like the following. It is expected.
Wed Sep 28 12:01:00 2022: [PE_0]:_pmi2_kvs_get:key [-NONEXIST-KEY] was not found.
Wed Sep 28 12:01:00 2022: [PE_0]:PMI2_KVS_Get:_pmi2_kvs_get failed
Wed Sep 28 12:01:00 2022: [PE_1]:_pmi2_kvs_get:key [-NONEXIST-KEY] was not found.
Wed Sep 28 12:01:00 2022: [PE_1]:PMI2_KVS_Get:_pmi2_kvs_get failed
Wed Sep 28 12:01:00 2022: [PE_0]:_pmi2_kvs_get:key [-NONEXIST-KEY] was not found.
Wed Sep 28 12:01:00 2022: [PE_0]:PMI2_KVS_Get:_pmi2_kvs_get failed
Wed Sep 28 12:01:00 2022: [PE_1]:_pmi2_kvs_get:key [-NONEXIST-KEY] was not found.
Wed Sep 28 12:01:00 2022: [PE_1]:PMI2_KVS_Get:_pmi2_kvs_get failed