# Reference Guide: Dask Cluster Deployment Scripts

## Overview

This repository contains a set of bash scripts that streamline the deployment and management of a Dask cluster in a high-performance computing (HPC) environment. The scripts cover creating a Conda environment, deploying that environment to a remote server, and starting a Dask cluster across the allocated nodes. The sections below describe each script and how to use it.
## Note: Permissions

Ensure that execution permissions (`chmod +x`) are granted to these scripts before attempting to run them. This can be done with the following command:

```bash
chmod +x script_name.sh
```
## Prerequisites

Before using these scripts, ensure that the following prerequisites are met:

- **Conda installation:** Conda must be installed on your local system. Follow the official Conda installation guide if it is not already installed.
- **PBS job scheduler:** The deployment scripts (`deploy-dask.sh` and `dask-worker.sh`) are designed for the PBS job scheduler. Modify them accordingly if you use a different scheduler.
- **SSH setup:** SSH must be set up and configured on your system for communication with the remote server.
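The workflow described in the overview assumes a Conda environment that has been packaged locally and copied to the HPC system before deployment. The exact commands live in the repository's environment scripts; the snippet below is only a minimal sketch of that step, assuming `conda-pack` is available and using a hypothetical environment name, archive name, remote host, and path.

```bash
# Sketch only: environment name, host, and destination path are placeholders.
# Package the local Conda environment into a relocatable tarball.
conda pack -n dask-env -o dask-env.tar.gz

# Copy the packed environment to the HPC system.
scp dask-env.tar.gz user@hpc-login:/scratch/$USER/dask-env.tar.gz
```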
## 1. deploy-dask.sh

### Overview

`deploy-dask.sh` initiates the Dask cluster in an HPC environment using the PBS job scheduler. It extracts the Conda environment, activates it, and starts the Dask scheduler and workers on the allocated nodes.

### Usage

```bash
./deploy-dask.sh <current_workspace_directory>
```
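The internal flow described above roughly follows the pattern below. This is a hedged sketch rather than the repository's actual code: the archive name, scheduler port, and the assumption that the worker script takes the workspace and scheduler address as arguments are all placeholders, and it assumes a `conda-pack` archive plus an active PBS allocation (so that `$PBS_NODEFILE` is set).

```bash
#!/bin/bash
# Sketch only: archive name, port, and worker-script arguments are assumptions.

WORKSPACE="$1"                     # <current_workspace_directory>
cd "$WORKSPACE" || exit 1

# Extract and activate the packed Conda environment (conda-pack convention).
mkdir -p dask-env
tar -xzf dask-env.tar.gz -C dask-env
source dask-env/bin/activate
conda-unpack

# Start the Dask scheduler on the current (head) node.
dask-scheduler --port 8786 &
SCHEDULER="tcp://$(hostname -f):8786"

# Start a worker on every node allocated by PBS.
for NODE in $(sort -u "$PBS_NODEFILE"); do
    ssh "$NODE" "bash '$WORKSPACE/dask-worker.sh' '$WORKSPACE' '$SCHEDULER'" &
done
wait
```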
### Notes

- This script is designed for an HPC environment with PBS job scheduling.
- Modifications may be necessary for different job schedulers.
## 2. dask-worker.sh

### Overview

`dask-worker.sh` is a worker script executed on each allocated node. It sets up the Dask environment by extracting the Conda environment, activating it, and starting a Dask worker that connects to the scheduler. This script is not executed directly by the user.
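A minimal sketch of what such a worker script typically looks like is shown below. The argument interface (workspace directory and scheduler address) and the archive name are assumptions for illustration, not the repository's exact interface.

```bash
#!/bin/bash
# Sketch only: argument interface and archive name are assumptions.
WORKSPACE="$1"          # workspace directory containing the packed environment
SCHEDULER="$2"          # scheduler address, e.g. tcp://<head-node>:8786

cd "$WORKSPACE" || exit 1

# Extract and activate the packed Conda environment on this node.
mkdir -p dask-env
tar -xzf dask-env.tar.gz -C dask-env
source dask-env/bin/activate
conda-unpack

# Start a Dask worker and point it at the scheduler.
dask-worker "$SCHEDULER" --nthreads "$(nproc)"
```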
### Notes

- This script runs on each allocated node to connect it to the Dask scheduler.
- Designed for use with PBS job scheduling.