From 1e56496f56e41ac08f418eae1a7d8b9e75d44602 Mon Sep 17 00:00:00 2001
From: Rishabh Saxena
Date: Thu, 25 Apr 2024 14:18:00 +0200
Subject: [PATCH] added changes according to #1

---
 README.md                                     | 57 +++++++++++--------
 deployment_scripts/dask-worker.sh             |  6 +-
 deployment_scripts/deploy-dask.sh             |  2 +-
 .../deployment_scripts_reference.md           | 55 +-----------------
 4 files changed, 38 insertions(+), 82 deletions(-)

diff --git a/README.md b/README.md
index efffa1f..ef21140 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,12 @@
 # Dask: How to execute python workloads using a Dask cluster on Vulcan
 
-Motivation: This document aims to show users how to launch a Dask cluster in our compute platforms and perform a simple workload using it.
-
-Structure:
-- [ ] [Tutorial](https://diataxis.fr/tutorials/)
-- [x] [How-to guide](https://diataxis.fr/how-to-guides/)
-- [ ] [Reference](https://diataxis.fr/reference/)
-- [ ] [Explanation](https://diataxis.fr/explanation/)
-
-To do:
-- [x] Made scripts for environment creation and deployment in the folder `local_scripts`
-- [x] Changed scripts to `deployment_scripts`
-- [x] Added step about sending python file
-
----
-
 This repository looks at a deployment of a Dask cluster on Vulcan, and executing your programs using this cluster.
 
 ## Table of Contents
-- [Prerequisites](#prerequisites)
 - [Getting Started](#getting-started)
 - [Usage](#usage)
 - [Notes](#notes)
 
-## Prerequisites
-
-Before running the application, make sure you have the following prerequisites installed in a conda environment:
-- [Conda Installation](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html): Ensure that Conda is installed on your local system. For more information on, look at the documentation for Conda on [HLRS HPC systems](https://kb.hlrs.de/platforms/index.php/How_to_move_local_conda_environments_to_the_clusters).
-- [Dask](https://dask.org/): Install Dask using conda.
-- [Conda Pack](https://conda.github.io/conda-pack/): Conda pack is used to package the Conda environment into a single tarball. This is used to transfer the environment to Vulcan.
-
 ## Getting Started
 
 ### 1. Build and transfer the Conda environment to Hawk:
@@ -65,7 +42,7 @@ scp .py :
 
 ```bash
 qsub -I -N DaskJob -l select=1:node_type=clx-21 -l walltime=02:00:00
 ```
-Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`.
+Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`. See the [Multiple Nodes](#multiple-nodes) section for more information.
 
 ### 6. Go into the directory with all code:
@@ -83,7 +60,8 @@ source deploy-dask.sh "$(pwd)"
 
 ## Usage
 
-To run the application interactively, execute the following command after all the cluster's nodes are up and running:
+### Single Node
+To run the application interactively on a single node, execute the following command after all the cluster's nodes are up and running:
 
 ```bash
 python
@@ -100,6 +78,35 @@ conda activate
 ```
 
 Do this before using the python interpretor.
+### Multiple Nodes
+To run the application on multiple nodes, write a `.pbs` script and submit it using `qsub`. Follow steps 1-4 from the [Getting Started](#getting-started) section, then write a `submit-dask-job.pbs` script:
+
+```bash
+#!/bin/bash
+#PBS -N dask-job
+#PBS -l select=3:node_type=rome-ai
+#PBS -l walltime=1:00:00
+
+# Go to the directory where the code is
+cd
+
+# Deploy the Dask cluster
+source deploy-dask.sh "$(pwd)"
+
+# Run the Python script
+python .py
+```
+
+Then execute the following commands to submit the job and inspect its output:
+
+```bash
+qsub submit-dask-job.pbs
+qstat -anw          # Q: Queued, R: Running, E: Ending
+ls -l               # list files after the job finishes
+cat dask-job.o...   # inspect the output file
+cat dask-job.e...   # inspect the error file
+```
+
 ## Notes
 
 Note: Dask Cluster is set to verbose, add the following to your code while connecting to the Dask cluster:
diff --git a/deployment_scripts/dask-worker.sh b/deployment_scripts/dask-worker.sh
index 4c806e8..767e9fd 100644
--- a/deployment_scripts/dask-worker.sh
+++ b/deployment_scripts/dask-worker.sh
@@ -6,13 +6,13 @@ export DASK_SCHEDULER_HOST=$2
 
 # Path to localscratch
 echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Setting up Dask environment"
-export DASK_ENV="/localscratch/${PBS_JOBID}/dask"
+export DASK_ENV="$HOME/dask"
 mkdir -p $DASK_ENV
 
 # Extract Dask environment in localscratch
 echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Extracting Dask environment to $DASK_ENV"
-tar -xzf $CURRENT_WORKSPACE/dask-env.tar.gz -C $DASK_ENV
-chmod -R 700 $DASK_ENV
+#tar -xzf $CURRENT_WORKSPACE/dask-env.tar.gz -C $DASK_ENV
+#chmod -R 700 $DASK_ENV
 
 # Start the dask environment
 echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Setting up Dask environment"
diff --git a/deployment_scripts/deploy-dask.sh b/deployment_scripts/deploy-dask.sh
index 1d47302..9ddcc4a 100644
--- a/deployment_scripts/deploy-dask.sh
+++ b/deployment_scripts/deploy-dask.sh
@@ -24,7 +24,7 @@ export DASK_UI_PORT=8787
 
 echo "[$(date '+%Y-%m-%d %H:%M:%S') - Master] INFO: Starting Dask cluster with $NUM_NODES nodes."
 
 # Path to localscratch
-export DASK_ENV="/localscratch/${PBS_JOBID}/dask"
+export DASK_ENV="$HOME/dask"
 mkdir -p $DASK_ENV
 
 echo "[$(date '+%Y-%m-%d %H:%M:%S') - Master] INFO: Extracting Dask environment to $DASK_ENV"
diff --git a/deployment_scripts/deployment_scripts_reference.md b/deployment_scripts/deployment_scripts_reference.md
index ac0afc2..2baef14 100644
--- a/deployment_scripts/deployment_scripts_reference.md
+++ b/deployment_scripts/deployment_scripts_reference.md
@@ -1,17 +1,5 @@
 # Reference Guide: Dask Cluster Deployment Scripts
 
-Motivation: This document aims to show users how to use additional Dask deployment scripts to streamline the deployment and management of a Dask cluster on a high-performance computing (HPC) environment.
-
-Structure:
-- [ ] [Tutorial](https://diataxis.fr/tutorials/)
-- [ ] [How-to guide](https://diataxis.fr/how-to-guides/)
-- [x] [Reference](https://diataxis.fr/reference/)
-- [ ] [Explanation](https://diataxis.fr/explanation/)
-
-To do:
-
----
-
 ## Overview
 
 This repository contains a set of bash scripts designed to streamline the deployment and management of a Dask cluster on a high-performance computing (HPC) environment. These scripts facilitate the creation of Conda environments, deployment of the environment to a remote server, and initiation of Dask clusters on distributed systems. Below is a comprehensive guide on how to use and understand each script:
@@ -34,40 +22,7 @@ Before using these scripts, ensure that the following prerequisites are met:
 
 3. **SSH Setup**: Ensure that SSH is set up and configured on your system for remote server communication.
 
-## 1. create-env.sh
-
-### Overview
-
-`create-env.sh` is designed to create a Conda environment. It checks for the existence of the specified environment and either creates it or notifies the user if it already exists.
-Note: Define your Conda environment in `environment.yaml` before running this script.
-
-### Usage
-
-```bash
-./create-env.sh
-```
-
-### Note
-
-- This script is intended to run on a local system where Conda is installed.
-
-## 2. deploy-env.sh
-
-### Overview
-
-`deploy-env.sh` is responsible for deploying the Conda environment to a remote server. If the tar.gz file already exists, it is copied; otherwise, it is created before being transferred.
-
-### Usage
-
-```bash
-./deploy-env.sh
-```
-
-### Note
-
-- This script is intended to run on a local system.
-
-## 3. deploy-dask.sh
+## 1. deploy-dask.sh
 
 ### Overview
 
@@ -84,7 +39,7 @@
 - This script is designed for an HPC environment with PBS job scheduling.
 - Modifications may be necessary for different job schedulers.
 
-## 4. dask-worker.sh
+## 2. dask-worker.sh
 
 ### Overview
 
@@ -94,9 +49,3 @@
 
 - Execute this script on each allocated node to connect them to the Dask scheduler.
 - Designed for use with PBS job scheduling.
-
-## Workflow
-
-1. **Create Conda Environment**: Execute `create-env.sh` to create a Conda environment locally.
-2. **Deploy Conda Environment**: Execute `deploy-env.sh` to deploy the Conda environment to a remote server.
-3. **Deploy Dask Cluster**: Execute `deploy-dask.sh` to start the Dask cluster on an HPC environment.
\ No newline at end of file
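The README's Notes hunk mentions code "to add while connecting to the Dask cluster", but the snippet itself lies outside the diff context. As a hedged illustration only (not part of this patch), connecting a client with `dask.distributed` can be sketched as follows; the `LocalCluster` stand-in and all names here are assumptions for local testing, since the real scheduler address is produced by `deploy-dask.sh`:

```python
# Hypothetical sketch: connect a Dask client and run a trivial task.
# A LocalCluster stands in for the PBS-deployed scheduler; on the cluster
# you would instead pass the scheduler's address, e.g. Client("tcp://<host>:8786").
from dask.distributed import Client, LocalCluster

if __name__ == "__main__":
    # Start a throwaway local cluster (assumes dask.distributed is installed).
    cluster = LocalCluster(n_workers=1, threads_per_worker=1, dashboard_address=None)
    client = Client(cluster)

    # Submit a task to the workers and fetch the result.
    future = client.submit(sum, range(10))
    print(future.result())  # 45

    client.close()
    cluster.close()
```

The `if __name__ == "__main__"` guard matters because `LocalCluster` spawns worker processes by default; 8786 is Dask's default scheduler port, while 8787 (exported as `DASK_UI_PORT` in `deploy-dask.sh`) serves the dashboard.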