# Dask: How to execute python workloads using a Dask cluster on Vulcan

This repository covers the deployment of a Dask cluster on Vulcan and how to execute your programs on it.

## Table of Contents

- [Getting Started](#getting-started)
- [Usage](#usage)
- [Notes](#notes)

## Getting Started

### 1. Build and transfer the Conda environment to Vulcan:

Only the `main` and `r` channels are available through the Conda module on the clusters. To use custom packages, we need to move the local Conda environment to Vulcan. Follow the instructions in [the Conda environment builder repository](https://code.hlrs.de/SiVeGCS/conda-env-builder), which includes a YAML file for building a test environment.

### 2. Allocate workspace on Vulcan:

Proceed to the next step if you have already configured your workspace. Use the following command to create a workspace on the high-performance filesystem, which will expire in 10 days. For more information, such as how to enable reminder emails, refer to the [workspace mechanism](https://kb.hlrs.de/platforms/index.php/Workspace_mechanism) guide.

```bash
ws_allocate dask_workspace 10
ws_find dask_workspace # find the path to the workspace, which is the destination directory in the next step
```

### 3. Clone the repository on Vulcan to use the deployment scripts and project structure:

```bash
cd <workspace_directory>
git clone <repository_url>
```

### 4. Send all the code to the appropriate directory on Vulcan using `scp`:

```bash
scp <script_name>.py <username>@<destination_host>:<destination_directory>
```

### 5. SSH into Vulcan and start a job interactively using:

```bash
qsub -I -N DaskJob -l select=1:node_type=clx-21 -l walltime=02:00:00
```

Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`. See the [Multiple Nodes](#multiple-nodes) section for more information.

### 6. Go into the directory with all the code:

```bash
cd <destination_directory>
```

### 7. Initialize the Dask cluster:

```bash
source deploy-dask.sh "$(pwd)"
```

Note: At the moment, the deployment is verbose, and there is no option to silence the logs.

Note: Make sure all scripts are executable, e.g. by running `chmod +x` on them.

## Usage

### Single Node

To run the application interactively on a single node, follow steps 4, 5, 6, and 7 from [Getting Started](#getting-started), and execute the following commands after the job has started:

```bash
# Load the Conda module
module load bigdata/conda
source activate # activates the base environment

# List available Conda environments for verification purposes
conda env list

# Activate a specific Conda environment
conda activate dask_environment # you need to execute `source activate` first, or use `source [ENV_PATH]/bin/activate`
```

After the environment is activated, you can run the Python interpreter:

```bash
python
```

Or run a full script:

```bash
python <script_name>.py
```

### Multiple Nodes

To run the application on multiple nodes, you need to write a `.pbs` script and submit it using `qsub`. Follow steps 1-4 from the [Getting Started](#getting-started) section.

Write a `submit-dask-job.pbs` script:

```bash
#!/bin/bash
#PBS -N dask-job
#PBS -l select=3:node_type=clx-21
#PBS -l walltime=1:00:00

# Go to the directory where the code is
cd <destination_directory>

# Deploy the Dask cluster
source deploy-dask.sh "$(pwd)"

# Run the python script
python <script_name>.py
```

A more thorough example is available in the `deployment_scripts` directory under `submit-dask-job.pbs`.

Then execute the following commands to submit and monitor the job:

```bash
qsub submit-dask-job.pbs
qstat -anw # Q: Queued, R: Running, E: Ending
ls -l # list files after the job finishes
cat dask-job.o... # inspect the output file
cat dask-job.e... # inspect the error file
```
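For reference, below is a minimal sketch of what `<script_name>.py` could look like. The connection details are assumptions: replace `<scheduler_address>` with the address that `deploy-dask.sh` reports when the cluster starts (8786 is Dask's default scheduler port), and adapt the workload to your project.

```python
# <script_name>.py: a minimal sketch of a Dask workload, not a verbatim part of this repository
from dask.distributed import Client
import dask.array as da

# Connect to the running cluster. <scheduler_address> is a placeholder for the
# address reported by deploy-dask.sh; silence_logs is suggested in the Notes below.
client = Client("tcp://<scheduler_address>:8786", silence_logs="error")
print(client)  # shows the connected workers and their total memory

# Distribute a 10000 x 10000 random array in 1000 x 1000 chunks
# and compute its mean across the workers.
x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print("mean:", x.mean().compute())

client.close()
```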
## Notes

Note: The Dask cluster deployment is verbose. To reduce the log output in your own code, silence the worker logs while connecting to the Dask cluster:

```python
client = Client(..., silence_logs='error')
```

Note: Replace all placeholders within `<>` with the actual values applicable to your project.
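To sanity-check a freshly deployed cluster from the interactive Python interpreter (see [Single Node](#single-node)), a short sketch along the same lines; again, `<scheduler_address>` is a placeholder for the address reported by `deploy-dask.sh`:

```python
from dask.distributed import Client

client = Client("tcp://<scheduler_address>:8786")

# scheduler_info() lists the workers currently registered with the scheduler.
workers = client.scheduler_info()["workers"]
print(f"{len(workers)} workers connected")

# Round-trip a few trivial tasks through the cluster.
futures = client.map(lambda n: n ** 2, range(8))
print(client.gather(futures))  # -> [0, 1, 4, 9, 16, 25, 36, 49]

client.close()
```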