diff --git a/reproduce_container_bug.md b/reproduce_container_bug.md
new file mode 100644
index 0000000..2707119
--- /dev/null
+++ b/reproduce_container_bug.md
@@ -0,0 +1,30 @@
+Create the container on the login node:
+
+```bash
+export WS_DIR=$(ws_find workspace_dir) # adjust this to your workspace name
+cd "$WS_DIR"
+wget https://fex.hlrs.de/fop/FYaJqyzw/ray.tar # download the container archive
+export CONTAINER_NAME=ray
+export CONTAINER_TAG=latest
+export UDOCKER_DIR="$WS_DIR/.udocker/" # where udocker stores the image layers
+udocker images -l # creates the local repository the first time you run it
+udocker rmi $CONTAINER_NAME:$CONTAINER_TAG # fails harmlessly if the image does not exist yet
+udocker load -i "$WS_DIR/$CONTAINER_NAME.tar" $CONTAINER_NAME
+rm "$WS_DIR/$CONTAINER_NAME.tar" # the tar archive is no longer needed
+```
+
+Allocate a CPU node, then start the container on it:
+
+```bash
+module load bigdata/udocker/1.3.4
+export WS_DIR=$(ws_find benchmarks) # adjust this to your workspace name
+udocker run --volume "$WS_DIR":/workspace --volume /run/user/$PBS_JOBID/tmp:/tmp $CONTAINER_NAME
+```
+
+You should see a Python shell.
+
+```python
+import ray
+# ray.init(num_cpus=4) # works with a small number of CPUs
+ray.init() # but it cannot use all the available CPUs
+```