# Dask: How to execute Python workloads using a Dask cluster on Vulcan
Motivation: This document shows users how to launch a Dask cluster on our compute platforms and run a simple workload on it.
Structure:
- [ ] [Tutorial](https://diataxis.fr/tutorials/)
- [x] [How-to guide](https://diataxis.fr/how-to-guides/)
- [ ] [Reference](https://diataxis.fr/reference/)
- [ ] [Explanation](https://diataxis.fr/explanation/)
To do:
- [x] Made scripts for environment creation and deployment in the folder `local_scripts`
- [x] Changed scripts to `deployment_scripts`
- [x] Added step about sending python file
---
This repository covers deploying a Dask cluster on Vulcan and executing your programs on that cluster.
## Table of Contents
- [Prerequisites](#prerequisites)
- [Getting Started](#getting-started)
- [Usage](#usage)
- [Notes](#notes)
## Prerequisites
Before running the application, make sure you have the following prerequisites installed in a conda environment:
- [Conda Installation](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html): Ensure that Conda is installed on your local system. For more information, see the documentation for Conda on [HLRS HPC systems](https://kb.hlrs.de/platforms/index.php/How_to_move_local_conda_environments_to_the_clusters).
- [Dask](https://dask.org/): Install Dask using conda.
- [Conda Pack](https://conda.github.io/conda-pack/): Conda pack is used to package the Conda environment into a single tarball. This is used to transfer the environment to Vulcan.
## Getting Started
1. Build and transfer the Conda environment to Vulcan:
Only the `main` and `r` channels are available using the Conda module on the clusters. To use custom packages, we need to move the local Conda environment to Vulcan.
Follow the instructions in [the Conda environment builder repository](https://code.hlrs.de/SiVeGCS/conda-env-builder). The YAML file to create a test environment is available in the `deployment_scripts` directory.
2. Allocate workspace on Vulcan:
If you have already configured a workspace, skip to the next step. Otherwise, create a workspace on the high-performance filesystem with the following commands; it expires after 10 days. For more information, such as how to enable reminder emails, refer to the [workspace mechanism](https://kb.hlrs.de/platforms/index.php/Workspace_mechanism) guide.
```bash
ws_allocate dask_workspace 10
ws_find dask_workspace # find the path to workspace, which is the destination directory in the next step
```
3. Clone the repository on Vulcan to use the deployment scripts and project structure:
```bash
cd <workspace_directory>
git clone <repository_url>
```
4. Send all the code to the appropriate directory on Vulcan using `scp`:
```bash
scp <your_script>.py <destination_host>:<destination_directory>
```
5. SSH into Vulcan and start a job interactively using:
```bash
qsub -I -N DaskJob -l select=1:node_type=clx-21 -l walltime=02:00:00
```
Note: For multiple nodes, it is recommended to write a `.pbs` script and submit it using `qsub`.
6. Change into the directory that contains the code:
```bash
cd <destination_directory>
```
7. Initialize the Dask cluster:
```bash
source deploy-dask.sh "$(pwd)"
```
Note: At the moment, the deployment is verbose, and there is no implementation to silence the logs.
Note: Make sure all scripts are executable; set the permission with `chmod +x` where needed.
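For the multi-node case mentioned in step 5, the `.pbs` script could look like the output of the helper below. This is a minimal sketch and not part of the repository: the function name `render_pbs_script`, the file name `my_script.py`, and the assumption that `deploy-dask.sh` also works inside a non-interactive batch job are all illustrative. The `#PBS` directives mirror the options of the interactive `qsub` command above.

```python
def render_pbs_script(nodes, script, walltime="02:00:00",
                      node_type="clx-21", job_name="DaskJob"):
    """Return the text of a PBS job script (hypothetical helper).

    The directives mirror the interactive qsub call from step 5;
    only the number of nodes is meant to change.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l select={nodes}:node_type={node_type}",
        f"#PBS -l walltime={walltime}",
        "",
        # Return to the directory qsub was called from
        # (assumed to be the directory that holds the code).
        'cd "$PBS_O_WORKDIR"',
        # Assumption: the deployment script also runs in a batch job.
        'source deploy-dask.sh "$(pwd)"',
        f"python {script}",
    ]
    return "\n".join(lines) + "\n"


# Show the script for a two-node job running a placeholder script.
print(render_pbs_script(nodes=2, script="my_script.py"))
```

Writing the returned text to a file such as `dask_job.pbs` and running `qsub dask_job.pbs` on Vulcan would then submit the job.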
## Usage
To run the application interactively, execute the following command after all the cluster's nodes are up and running:
```bash
python
```
Or to run a full script:
```bash
python <your-script>.py
```
Note: If your environment is not available in the Python interpreter, activate it manually using:
```bash
conda activate <your-env>
```
Do this before starting the Python interpreter.
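As an illustration of what `<your-script>.py` might contain, the sketch below squares a range of numbers on the workers and sums the results. It is not taken from this repository. How `deploy-dask.sh` exposes the scheduler address is not described here, so the sketch reads it from a hypothetical environment variable `SCHEDULER_ADDRESS` and falls back to a local in-process cluster when the variable is unset. The names `square`, `sum_of_squares` and `main` are likewise illustrative.

```python
import os


def square(x):
    """Toy task that runs on a worker."""
    return x * x


def sum_of_squares(client, n):
    """Square 0..n-1 as separate tasks, then add the results up."""
    futures = client.map(square, range(n))  # one task per input value
    total = client.submit(sum, futures)     # runs once the squares are done
    return total.result()                   # block until the result is back


def main():
    # Imported here so the helpers above can be reused without Dask installed.
    from dask.distributed import Client

    # Hypothetical variable; replace with however you obtain the address.
    address = os.environ.get("SCHEDULER_ADDRESS")
    # Without an address, start a local, thread-based cluster instead.
    client = Client(address) if address else Client(processes=False)
    try:
        print("sum of squares:", sum_of_squares(client, 100))
    finally:
        client.close()
```

Calling `main()` at the end of the script, or from the interactive interpreter, runs the workload once the cluster is up. `Client.map` and `Client.submit` follow the interface of Python's `concurrent.futures` executors, so `sum_of_squares` only needs an object that provides those two methods.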
## Notes
Note: The Dask cluster logs verbosely. To reduce the output, pass `silence_logs` when connecting to the Dask cluster from your code:
```python
from dask.distributed import Client
client = Client(..., silence_logs='error')  # keep your own connection arguments in place of ...
```
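Log records emitted inside your own Python process can also be filtered with the standard `logging` module. The sketch below is an alternative under stated assumptions, not part of the deployment scripts: it assumes that the Dask distributed components log under logger names starting with `distributed`, and the helper name `quiet_dask_logs` is illustrative. It only affects the process in which it is called, not the scheduler or worker processes started by the deployment.

```python
import logging


def quiet_dask_logs(level=logging.ERROR):
    """Raise the log threshold for Dask's loggers in the current process."""
    # Assumed logger names; each one is set explicitly in case a child
    # logger already has its own, lower level configured.
    for name in ("distributed", "distributed.client", "distributed.worker"):
        logging.getLogger(name).setLevel(level)


quiet_dask_logs()
```

Calling the helper after importing `dask.distributed` should keep any logging setup that runs at import time from resetting the levels.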
Note: Replace all placeholders within `<>` (filenames, hosts, directories, URLs) with the actual values applicable to your project.