dask_template/reproduce_container_bug.md
2024-02-07 17:19:05 +01:00

Create the container on the login node:

export WS_DIR=$(ws_find workspace_dir) # replace workspace_dir with the name of your workspace
cd $WS_DIR
wget https://fex.hlrs.de/fop/FYaJqyzw/ray.tar # download the container archive
export CONTAINER_NAME=ray
export CONTAINER_TAG=latest
export UDOCKER_DIR="$WS_DIR/.udocker/" # to store the image layers
udocker images -l # creates the local udocker repository on first use
udocker rmi $CONTAINER_NAME:$CONTAINER_TAG # removes a stale image; on the first run this fails because the image does not exist yet, which is fine
udocker load -i $WS_DIR/$CONTAINER_NAME.tar $CONTAINER_NAME
rm $WS_DIR/$CONTAINER_NAME.tar # the tar archive is no longer needed

Allocate a CPU node, then run the following on it:

module load bigdata/udocker/1.3.4
export WS_DIR=$(ws_find workspace_dir) # replace workspace_dir with the name of your workspace
export UDOCKER_DIR="$WS_DIR/.udocker/"
export UDOCKER_CONTAINERS="/run/user/$PBS_JOBID/udocker/containers"
mkdir -p $UDOCKER_CONTAINERS
mkdir -p /run/user/$PBS_JOBID/tmp
export CONTAINER_NAME=ray
export CONTAINER_TAG=latest
udocker create --name=$CONTAINER_NAME $CONTAINER_NAME:$CONTAINER_TAG
udocker ps
udocker run --volume $WS_DIR:/workspace --volume /run/user/$PBS_JOBID/tmp:/tmp $CONTAINER_NAME

You should now see a Python shell. In it, run:

import ray
# ray.init(num_cpus=4) # works: a small, explicit number of CPUs
ray.init() # bug: with the default (all available CPUs), Ray cannot start
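To narrow the bug down, it may help to find the largest `num_cpus` value that still works. The following is a minimal sketch, not part of the original reproduction: `largest_working` and `ray_starts_with` are hypothetical helper names, and the search assumes that once a given CPU count fails, every larger count fails too. It also assumes the failure shows up as a Python exception; if `ray.init()` hangs or kills the interpreter instead, each probe would have to run in a separate process with a timeout.

```python
from typing import Callable


def largest_working(probe: Callable[[int], bool], upper: int) -> int:
    """Return the largest n in [1, upper] for which probe(n) is True, or 0.

    Assumption: success is monotonic, i.e. if probe(n) fails,
    probe(m) also fails for every m > n.
    """
    lo, hi = 0, upper  # the answer always lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            lo = mid  # mid works, so the answer is at least mid
        else:
            hi = mid - 1  # mid fails, so the answer is below mid
    return lo


def ray_starts_with(num_cpus: int) -> bool:
    """Try ray.init(num_cpus=...) once and report whether it succeeded."""
    import ray  # imported lazily so largest_working is usable without Ray

    try:
        ray.init(num_cpus=num_cpus)
        return True
    except Exception:
        return False
    finally:
        ray.shutdown()  # reset Ray state before the next probe
```

In the container's Python shell, `largest_working(ray_starts_with, os.cpu_count() or 1)` (after `import os`) would then report the highest CPU count Ray accepts, which can be compared with the number of cores on the allocated node.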