Merge pull request #25 from jjhursey/runtime-ci-testing
Open MPI Runtime Testing Harness
jjhursey authored Feb 24, 2023
2 parents 08ab736 + f4bdde1 commit dcc6ec7
Showing 19 changed files with 1,131 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -38,3 +38,5 @@ comm_split_type/cmsplit_type
singleton/hello_c
singleton/simple_spawn
singleton/simple_spawn_multiple

.vscode
15 changes: 15 additions & 0 deletions runtime/.ci-configure
@@ -0,0 +1,15 @@
#
# Open MPI is built with the following options:
# ./configure --prefix=/opt/ci/support/exports/ompi --disable-cuda --disable-nvml --with-cuda=no --without-hcoll
#
# Additional options can be added by listing them.
# Options can be listed as either:
# - One option per line
# - Multiple options per line
# Note: Line continuations are not supported
#
# Enable Debug
--enable-debug
# With Python Bindings
# Need to install Cython on the CI machine
#--enable-python-bindings
2 changes: 2 additions & 0 deletions runtime/.ci-tests
@@ -0,0 +1,2 @@
# Start with the basics
hello_world
41 changes: 41 additions & 0 deletions runtime/README.md
@@ -0,0 +1,41 @@
# Test suite for Open MPI runtime

This test suite can be run stand-alone or under CI.

All of the tests that are intended for CI must be listed in the `.ci-tests` file.

If the Open MPI build needs additional `configure` options, add them to the `.ci-configure` file.

## Running tests stand-alone

1. Make sure that Open MPI and other required libraries are in your `PATH`/`LD_LIBRARY_PATH`
2. Drop into a directory:
- Use the `build.sh` script to build any test articles
- Use the `run.sh` script to run the test program
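
The two steps above can be wrapped in a small helper. This is a minimal sketch, not part of the repository: the function name `run_test_dir` is hypothetical, and it assumes only that the test directory contains executable `build.sh` and `run.sh` scripts.

```shell
#!/bin/bash
# Hypothetical helper (not part of the repository): build and then run one
# test directory, returning the first non-zero exit status encountered.
run_test_dir() {
    local dir="$1"
    (
        # Work inside a subshell so the caller's working directory is untouched.
        cd "$dir" || exit 1
        ./build.sh || exit 1
        ./run.sh
    )
}
```

For example, `run_test_dir hello_world` would build and run the `hello_world` test listed in `.ci-tests`.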


## CI Environment Variables

The CI infrastructure defines the following environment variables to be used in the test programs. These are defined during the `run.sh` phase and not the `build.sh` phase.

* `CI_HOSTFILE` : Absolute path to the hostfile for this run.
* `CI_NUM_NODES` : Number of nodes in this cluster.
* `CI_OMPI_SRC` : Top-level directory of the Open MPI repository checkout.
* `CI_OMPI_TESTS_PUBLIC_DIR` : Top-level directory of the [Open MPI Public Test](https://github.com/open-mpi/ompi-tests-public) repository checkout.
* `OMPI_ROOT` : Open MPI install directory.
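
Because these variables exist only under CI, a `run.sh` that should also work stand-alone needs fallbacks. The sketch below shows one way to do that. The function name `load_ci_settings`, the variable names `NUM_NODES` and `HOSTFILE_ARGS`, and the fallback values (one node, no hostfile) are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: use the CI settings when present, otherwise fall back to
# local defaults. The defaults chosen here are assumptions.
load_ci_settings() {
    NUM_NODES="${CI_NUM_NODES:-1}"
    HOSTFILE_ARGS=()
    # Only pass a hostfile when CI provided one and it actually exists.
    if [ -n "${CI_HOSTFILE:-}" ] && [ -f "${CI_HOSTFILE}" ]; then
        HOSTFILE_ARGS=(--hostfile "${CI_HOSTFILE}")
    fi
}

load_ci_settings
echo "nodes=${NUM_NODES} hostfile_args=${HOSTFILE_ARGS[*]}"
```

A test could then launch with something like `mpirun "${HOSTFILE_ARGS[@]}" -np 2 ./my_test`, where `./my_test` stands in for the test binary.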


## Adding a new test for CI

1. Create a directory with your test.
- **Note**: Please make your test scripts such that they can be easily run with or without the CI environment variables.
2. Create a build script named `build.sh`
- CI will call this exactly one time (with a timeout in case it hangs).
- If the script returns `0` then it is considered successful. Otherwise it is considered failed.
3. Create a run script named `run.sh`
- The script is responsible for running your test including any runtime setup/shutdown and test result inspection.
- CI will call this exactly one time (with a timeout in case it hangs).
- If the script returns `0` then it is considered successful. Otherwise it is considered failed.
4. Add your directory name to the `.ci-tests` file in this directory. Tests are executed in the order in which they are listed.
- Note that adding the directory is not sufficient to have CI run the test; it must also be listed in the `.ci-tests` file.
- Comments (starting with `#`) are allowed.
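
Since `run.sh` is responsible for inspecting the test result, a small checking helper can keep the exit-status contract simple. The following is a sketch only: `run_and_check` is a hypothetical name, and matching on a fixed substring of the output is just one possible way to inspect results.

```shell
#!/bin/bash
# Hypothetical helper for a run.sh: run a command, capture its output, and
# succeed only if the command exits 0 and the output contains the expected text.
run_and_check() {
    local expected="$1"
    shift
    local output
    output=$("$@" 2>&1) || return 1
    printf '%s\n' "$output" | grep -qF -- "$expected"
}
```

In a `run.sh` this could be used as `run_and_check "Hello" mpirun -np 2 ./hello || exit 1`, followed by `exit 0`, so that CI sees `0` only when the expected text was printed. Here `./hello` stands in for whatever `build.sh` produced.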
30 changes: 30 additions & 0 deletions runtime/bin/cleanup-scrub-local.sh
@@ -0,0 +1,30 @@
#!/bin/bash

PROGS="prte prted prun mpirun timeout"

clean_files()
{
    # Keep the patterns quoted so that find, not the shell, expands them.
    FILES=("pmix-*" "core*" "openmpi-sessions-*" "pmix_dstor_*" "ompi.*" "prrte.*")

    for fn in "${FILES[@]}"; do
        find /tmp/ -maxdepth 1 \
            -user "$USER" -a \
            -name "$fn" \
            -exec rm -rf {} \;

        if [ -n "$TMPDIR" ] ; then
            find "$TMPDIR" -maxdepth 1 \
                -user "$USER" -a \
                -name "$fn" \
                -exec rm -rf {} \;
        fi
    done
}

killall -q ${PROGS} > /dev/null
clean_files
killall -q -9 ${PROGS} > /dev/null

exit 0


62 changes: 62 additions & 0 deletions runtime/bin/cleanup.sh
@@ -0,0 +1,62 @@
#!/bin/bash

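clean_server()
{
    SERVER=$1
    ITER=$2
    MAX=$3
    QUIET=$4

    # Resolve the script directory to an absolute path. This also works
    # when the script is invoked with an absolute path.
    SCRIPTDIR=$(cd "$(dirname "$0")" && pwd)

    if [[ $QUIET == 0 ]] ; then
        echo "Cleaning server ($ITER / $MAX): $SERVER"
    fi
    ssh -oBatchMode=yes "${SERVER}" "${SCRIPTDIR}/cleanup-scrub-local.sh"
}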

if [[ "x" != "x$CI_HOSTFILE" && -f "$CI_HOSTFILE" ]] ; then
ALLHOSTS=(`cat $CI_HOSTFILE | sort | uniq`)
else
ALLHOSTS=(`hostname`)
fi
LEN=${#ALLHOSTS[@]}

# Use a background mode if running at scale
USE_BG=0
if [ $LEN -gt 10 ] ; then
USE_BG=1
fi

for (( i=0; i<${LEN}; i++ ));
do
if [ $USE_BG == 1 ] ; then
if [ $(($i % 100)) == 0 ] ; then
echo "| $i"
else
if [ $(($i % 10)) == 0 ] ; then
echo -n "|"
else
echo -n "."
fi
fi
fi

if [ $USE_BG == 1 ] ; then
clean_server ${ALLHOSTS[$i]} $i $LEN $USE_BG &
sleep 0.25
else
clean_server ${ALLHOSTS[$i]} $i $LEN $USE_BG
echo "-------------------------"
fi
done

if [ $USE_BG == 1 ] ; then
echo ""
echo "------------------------- Waiting"
wait
fi

echo "------------------------- Done"

exit 0
72 changes: 72 additions & 0 deletions runtime/bin/pretty-print-hwloc/.gitignore
@@ -0,0 +1,72 @@
# Prerequisites
*.d

# Object files
*.o
*.ko
*.obj
*.elf

# Linker output
*.ilk
*.map
*.exp

# Precompiled Headers
*.gch
*.pch

# Libraries
*.lib
*.a
*.la
*.lo

# Shared objects (inc. Windows DLLs)
*.dll
*.so
*.so.*
*.dylib

# Executables
*.exe
*.out
*.app
*.i*86
*.x86_64
*.hex

# Debug files
*.dSYM/
*.su
*.idb
*.pdb

# Kernel Module Compile Results
*.mod*
*.cmd
.tmp_versions/
modules.order
Module.symvers
Mkfile.old
dkms.conf

# Autoconf/Automake leftovers
autom4te.cache/
compile
depcomp
aclocal.m4
config.log
config.status
configure
install-sh
missing
.deps
.libs
*.in
Makefile
src/include/autogen/config.h*
src/include/autogen/stamp-h1

# Binary leftovers
src/get-pretty-cpu
9 changes: 9 additions & 0 deletions runtime/bin/pretty-print-hwloc/Makefile.am
@@ -0,0 +1,9 @@
#
# High level Makefile
#
headers =
sources =
nodist_headers =
EXTRA_DIST =

SUBDIRS = . src
66 changes: 66 additions & 0 deletions runtime/bin/pretty-print-hwloc/README.md
@@ -0,0 +1,66 @@
# Pretty Print HWLOC Process Binding

## Building

```shell
./autogen.sh
./configure --prefix=${YOUR_INSTALL_DIR} --with-hwloc=${HWLOC_INSTALL_PATH}
make
make install
```

## Running

### Default: Print HWLOC bitmap

```shell
shell$ get-pretty-cpu
0/ 0 on c660f5n18) Process Bound : 0xffffffff,0xffffffff,0xffffffff,0xffffffff,0xffffffff
```

```shell
shell$ hwloc-bind core:2 get-pretty-cpu
0/ 0 on c660f5n18) Process Bound : 0x00ff0000
```

```shell
shell$ mpirun -np 2 get-pretty-cpu
0/ 2 on c660f5n18) Process Bound : 0x000000ff
1/ 2 on c660f5n18) Process Bound : 0x0000ff00
```
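
The bitmaps in these samples can be checked by hand. The sketch below assumes 8 hardware threads per core, as the sample machine appears to have, and a core whose bits fit in the low 32-bit word. The function name `core_mask` is hypothetical and is not part of `get-pretty-cpu` or hwloc.

```shell
#!/bin/bash
# Hypothetical helper: compute the bitmap covering every hardware thread
# of one core, assuming hwt_per_core consecutive bits per core.
core_mask() {
    local core="$1"
    local hwt_per_core="${2:-8}"
    local bits=$(( (1 << hwt_per_core) - 1 ))
    printf '0x%08x\n' $(( bits << (core * hwt_per_core) ))
}

core_mask 2   # prints 0x00ff0000
```

Under those assumptions, core 2 occupies bits 16 through 23, which matches the `hwloc-bind core:2` output above. Cores 0 and 1 give `0x000000ff` and `0x0000ff00`, matching the two `mpirun -np 2` ranks.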

### Full descriptive output

```shell
shell$ get-pretty-cpu -b -f
0/ 0 on c660f5n18) Process Bound : socket 0[core 0[hwt 0-7]],socket 0[core 1[hwt 0-7]],socket 0[core 2[hwt 0-7]],socket 0[core 3[hwt 0-7]],socket 0[core 4[hwt 0-7]],socket 0[core 5[hwt 0-7]],socket 0[core 6[hwt 0-7]],socket 0[core 7[hwt 0-7]],socket 0[core 8[hwt 0-7]],socket 0[core 9[hwt 0-7]],socket 1[core 10[hwt 0-7]],socket 1[core 11[hwt 0-7]],socket 1[core 12[hwt 0-7]],socket 1[core 13[hwt 0-7]],socket 1[core 14[hwt 0-7]],socket 1[core 15[hwt 0-7]],socket 1[core 16[hwt 0-7]],socket 1[core 17[hwt 0-7]],socket 1[core 18[hwt 0-7]],socket 1[core 19[hwt 0-7]]
```
```shell
shell$ hwloc-bind core:2 get-pretty-cpu -b -f
0/ 0 on c660f5n18) Process Bound : socket 0[core 2[hwt 0-7]]
```
```shell
shell$ mpirun -np 2 get-pretty-cpu -b -f
1/ 2 on c660f5n18) Process Bound : socket 0[core 1[hwt 0-7]]
0/ 2 on c660f5n18) Process Bound : socket 0[core 0[hwt 0-7]]
```
### Full descriptive bracketed output
```shell
shell$ get-pretty-cpu -b -m
0/ 0 on c660f5n18) Process Bound : [BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB][BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB/BBBBBBBB]
```
```shell
shell$ hwloc-bind core:2 get-pretty-cpu -b -m
0/ 0 on c660f5n18) Process Bound : [......../......../BBBBBBBB/......../......../......../......../......../......../........][......../......../......../......../......../......../......../......../......../........]
```
```shell
shell$ mpirun -np 2 get-pretty-cpu -b -m
1/ 2 on c660f5n18) Process Bound : [......../BBBBBBBB/......../......../......../......../......../......../......../........][......../......../......../......../......../......../......../......../......../........]
0/ 2 on c660f5n18) Process Bound : [BBBBBBBB/......../......../......../......../......../......../......../......../........][......../......../......../......../......../......../......../......../......../........]
```
3 changes: 3 additions & 0 deletions runtime/bin/pretty-print-hwloc/autogen.sh
@@ -0,0 +1,3 @@
#!/bin/bash -e

autoreconf -ivf