added changes according to #1
parent
3ceddcaed1
commit
1e56496f56
4 changed files with 38 additions and 82 deletions
57
README.md
|
@ -1,35 +1,12 @@
|
|||
# Dask: How to execute python workloads using a Dask cluster on Vulcan
|
||||
|
||||
Motivation: This document shows users how to launch a Dask cluster on our compute platforms and run a simple workload on it.
|
||||
|
||||
Structure:
|
||||
- [ ] [Tutorial](https://diataxis.fr/tutorials/)
|
||||
- [x] [How-to guide](https://diataxis.fr/how-to-guides/)
|
||||
- [ ] [Reference](https://diataxis.fr/reference/)
|
||||
- [ ] [Explanation](https://diataxis.fr/explanation/)
|
||||
|
||||
To do:
|
||||
- [x] Made scripts for environment creation and deployment in the folder `local_scripts`
|
||||
- [x] Changed scripts to `deployment_scripts`
|
||||
- [x] Added step about sending python file
|
||||
|
||||
---
|
||||
|
||||
This repository covers deploying a Dask cluster on Vulcan and running your programs on that cluster.
|
||||
|
||||
## Table of Contents
|
||||
- [Prerequisites](#prerequisites)
|
||||
- [Getting Started](#getting-started)
|
||||
- [Usage](#usage)
|
||||
- [Notes](#notes)
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before running the application, make sure you have the following prerequisites installed in a conda environment:
|
||||
- [Conda Installation](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html): Ensure that Conda is installed on your local system. For more information, see the documentation for Conda on [HLRS HPC systems](https://kb.hlrs.de/platforms/index.php/How_to_move_local_conda_environments_to_the_clusters).
|
||||
- [Dask](https://dask.org/): Install Dask using conda.
|
||||
- [Conda Pack](https://conda.github.io/conda-pack/): Conda pack is used to package the Conda environment into a single tarball. This is used to transfer the environment to Vulcan.
|
||||
|
||||
## Getting Started
|
||||
|
||||
### 1. Build and transfer the Conda environment to Hawk:
|
||||
|
@ -65,7 +42,7 @@ scp <your_script>.py <destination_host>:<destination_directory>
|
|||
```bash
|
||||
qsub -I -N DaskJob -l select=1:node_type=clx-21 -l walltime=02:00:00
|
||||
```
|
||||
Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`.
|
||||
Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`. Follow section [Multiple Nodes](#multiple-nodes) for more information.
|
||||
|
||||
### 6. Go into the directory with all code:
|
||||
|
||||
|
@ -83,7 +60,8 @@ source deploy-dask.sh "$(pwd)"
|
|||
|
||||
## Usage
|
||||
|
||||
To run the application interactively, execute the following command after all the cluster's nodes are up and running:
|
||||
### Single Node
|
||||
To run the application interactively on a single node, execute the following command after all the cluster's nodes are up and running:
|
||||
|
||||
```bash
|
||||
python
|
||||
|
@ -100,6 +78,35 @@ conda activate <your-env>
|
|||
```
|
||||
Do this before using the Python interpreter.
|
||||
|
||||
### Multiple Nodes
|
||||
To run the application on multiple nodes, you need to write a `.pbs` script and submit it using `qsub`. Follow steps 1-4 from the [Getting Started](#getting-started) section, then write a `submit-dask-job.pbs` script:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#PBS -N dask-job
|
||||
#PBS -l select=3:node_type=rome-ai
|
||||
#PBS -l walltime=1:00:00
|
||||
|
||||
#Go to the directory where the code is
|
||||
cd <destination_directory>
|
||||
|
||||
#Deploy the Dask cluster
|
||||
source deploy-dask.sh "$(pwd)"
|
||||
|
||||
#Run the python script
|
||||
python <your-script>.py
|
||||
```
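
Before deploying the cluster inside the job, it can help to confirm how many distinct nodes PBS actually allocated. The following is a minimal sketch under stated assumptions, not part of the repository's scripts: it counts the unique hostnames in the file named by `$PBS_NODEFILE`, which PBS sets inside a running job. The helper name `count_nodes` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of deploy-dask.sh): count the distinct
# hosts listed in a PBS node file, which holds one hostname per line.
count_nodes() {
    local nodefile="$1"
    # sort -u collapses repeated hostnames; tr strips padding some wc
    # implementations add around the number.
    sort -u "$nodefile" | wc -l | tr -d ' '
}

# Inside a PBS job, PBS_NODEFILE points at the list of allocated hosts.
# Outside a job the variable is unset, so nothing is printed.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Allocated nodes: $(count_nodes "$PBS_NODEFILE")"
fi
```

Placed before the `source deploy-dask.sh "$(pwd)"` line, this would print the node count into the job's output file, which makes a mismatch with the `select=3` request easy to spot.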
|
||||
|
||||
And then execute the following commands to submit the job:
|
||||
|
||||
```bash
|
||||
qsub submit-dask-job.pbs
|
||||
qstat -anw # Q: Queued, R: Running, E: Ending
|
||||
ls -l # list files after the job finishes
|
||||
cat dask-job.o... # inspect the output file
|
||||
cat dask-job.e... # inspect the error file
|
||||
```
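
PBS names the job's output and error files after the job name with the job ID appended, so the exact file names are only known after submission. As a sketch, the hypothetical helper `latest_job_file` below picks the most recently modified file for a given job name and stream suffix (`o` for output, `e` for error). The job name `dask-job` comes from the script above; the helper itself is an assumption and not part of the repository.

```bash
#!/bin/bash
# Hypothetical helper: print the newest file named <job_name>.<stream>*
# in the current directory, or an empty line if none exists.
latest_job_file() {
    local job_name="$1" stream="$2" newest="" f
    for f in "${job_name}.${stream}"*; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$f" ] || continue
        # Keep the candidate with the latest modification time
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    printf '%s\n' "$newest"
}

# Show the most recent output file of the job, if there is one
out_file=$(latest_job_file dask-job o)
if [ -n "$out_file" ]; then
    cat "$out_file"
fi
```

Calling `latest_job_file dask-job e` in the same way would select the newest error file.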
|
||||
|
||||
## Notes
|
||||
|
||||
Note: The Dask cluster is set to verbose. Add the following to your code when connecting to the Dask cluster:
|
||||
|
|
|
@ -6,13 +6,13 @@ export DASK_SCHEDULER_HOST=$2
|
|||
|
||||
# Path to localscratch
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Setting up Dask environment"
|
||||
export DASK_ENV="/localscratch/${PBS_JOBID}/dask"
|
||||
export DASK_ENV="$HOME/dask"
|
||||
mkdir -p $DASK_ENV
|
||||
|
||||
# Extract Dask environment in localscratch
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Extracting Dask environment to $DASK_ENV"
|
||||
tar -xzf $CURRENT_WORKSPACE/dask-env.tar.gz -C $DASK_ENV
|
||||
chmod -R 700 $DASK_ENV
|
||||
#tar -xzf $CURRENT_WORKSPACE/dask-env.tar.gz -C $DASK_ENV
|
||||
#chmod -R 700 $DASK_ENV
|
||||
|
||||
# Start the dask environment
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S') - Worker $HOSTNAME] INFO: Setting up Dask environment"
|
||||
|
|
|
@ -24,7 +24,7 @@ export DASK_UI_PORT=8787
|
|||
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S') - Master] INFO: Starting Dask cluster with $NUM_NODES nodes."
|
||||
# Path to localscratch
|
||||
export DASK_ENV="/localscratch/${PBS_JOBID}/dask"
|
||||
export DASK_ENV="$HOME/dask"
|
||||
mkdir -p $DASK_ENV
|
||||
|
||||
echo "[$(date '+%Y-%m-%d %H:%M:%S') - Master] INFO: Extracting Dask environment to $DASK_ENV"
|
||||
|
|
|
@ -1,17 +1,5 @@
|
|||
# Reference Guide: Dask Cluster Deployment Scripts
|
||||
|
||||
Motivation: This document aims to show users how to use additional Dask deployment scripts to streamline the deployment and management of a Dask cluster on a high-performance computing (HPC) environment.
|
||||
|
||||
Structure:
|
||||
- [ ] [Tutorial](https://diataxis.fr/tutorials/)
|
||||
- [ ] [How-to guide](https://diataxis.fr/how-to-guides/)
|
||||
- [x] [Reference](https://diataxis.fr/reference/)
|
||||
- [ ] [Explanation](https://diataxis.fr/explanation/)
|
||||
|
||||
To do:
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
This repository contains a set of bash scripts designed to streamline the deployment and management of a Dask cluster on a high-performance computing (HPC) environment. These scripts facilitate the creation of Conda environments, deployment of the environment to a remote server, and initiation of Dask clusters on distributed systems. Below is a comprehensive guide on how to use and understand each script:
|
||||
|
@ -34,40 +22,7 @@ Before using these scripts, ensure that the following prerequisites are met:
|
|||
|
||||
3. **SSH Setup**: Ensure that SSH is set up and configured on your system for remote server communication.
|
||||
|
||||
## 1. create-env.sh
|
||||
|
||||
### Overview
|
||||
|
||||
`create-env.sh` is designed to create a Conda environment. It checks for the existence of the specified environment and either creates it or notifies the user if it already exists.
|
||||
Note: Define your Conda environment in `environment.yaml` before running this script.
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
./create-env.sh <conda_environment_name>
|
||||
```
|
||||
|
||||
### Note
|
||||
|
||||
- This script is intended to run on a local system where Conda is installed.
|
||||
|
||||
## 2. deploy-env.sh
|
||||
|
||||
### Overview
|
||||
|
||||
`deploy-env.sh` is responsible for deploying the Conda environment to a remote server. If the tar.gz file already exists, it is copied; otherwise, it is created before being transferred.
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
./deploy-env.sh <environment_name> <destination_directory>
|
||||
```
|
||||
|
||||
### Note
|
||||
|
||||
- This script is intended to run on a local system.
|
||||
|
||||
## 3. deploy-dask.sh
|
||||
## 1. deploy-dask.sh
|
||||
|
||||
### Overview
|
||||
|
||||
|
@ -84,7 +39,7 @@ Note: Define your Conda environment in `environment.yaml` before running this sc
|
|||
- This script is designed for an HPC environment with PBS job scheduling.
|
||||
- Modifications may be necessary for different job schedulers.
|
||||
|
||||
## 4. dask-worker.sh
|
||||
## 2. dask-worker.sh
|
||||
|
||||
### Overview
|
||||
|
||||
|
@ -94,9 +49,3 @@ Note: Define your Conda environment in `environment.yaml` before running this sc
|
|||
|
||||
- Execute this script on each allocated node to connect them to the Dask scheduler.
|
||||
- Designed for use with PBS job scheduling.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Create Conda Environment**: Execute `create-env.sh` to create a Conda environment locally.
|
||||
2. **Deploy Conda Environment**: Execute `deploy-env.sh` to deploy the Conda environment to a remote server.
|
||||
3. **Deploy Dask Cluster**: Execute `deploy-dask.sh` to start the Dask cluster on an HPC environment.
|