multiple node test successful
parent ef8058ea34
commit 5a8bf27936
2 changed files with 8 additions and 4 deletions
10 README.md
@@ -131,14 +131,16 @@ Then, launch Firefox web browser using the configured profile. Open `localhost:8
 
 ## Launch a Ray Cluster in Batch Mode
 
-1. Add execution permissions to `start-ray-worker.sh`
+Let us [estimate the value of π](https://docs.ray.io/en/releases-2.8.0/ray-core/examples/monte_carlo_pi.html) as an example application.
+
+**Step 1.** Add execution permissions to `start-ray-worker.sh`
 
 ```bash
 cd deployment_scripts
 chmod +x start-ray-worker.sh
 ```
 
-2. Submit a job to launch the head and worker nodes.
+**Step 2.** Submit a job to launch the head and worker nodes.
 
 You must modify the following lines in `submit-ray-job.sh`:
 - Line 3 changes the cluster size. The default configuration launches a 3 node cluster.
@@ -155,4 +157,6 @@ qstat -anw # Q: Queued, R: Running, E: Ending
 ls -l # list files after the job finishes
 cat ray-job.o... # inspect the output file
 cat ray-job.e... # inspect the error file
 ```
+
+If you need to delete the job, use `qdel <job-id>`. If this doesn't work, use the `-W force` option: `qdel -W force <job-id>`
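The `qdel` advice added to the README in this hunk could be wrapped in a small helper. The following is only a minimal sketch, assuming a PBS environment where `qdel` is on the `PATH`; the function name `delete_job` is hypothetical and does not exist in this repository.

```bash
# delete_job is a hypothetical helper, not part of this repository.
# It tries a plain qdel first and retries with -W force only if that fails.
delete_job() {
    local job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: delete_job <job-id>" >&2
        return 2
    fi
    if qdel "$job_id"; then
        echo "deleted $job_id"
    else
        echo "qdel failed, retrying with -W force" >&2
        qdel -W force "$job_id" && echo "force-deleted $job_id"
    fi
}
```

It would be called as `delete_job <job-id>`, with the job ID taken from the `qstat -anw` listing. Trying the plain form first keeps the forced deletion as a last resort, which matches the order the README gives.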
@@ -40,7 +40,7 @@ ray start --disable-usage-stats \
 export NUM_NODES=$(sort $PBS_NODEFILE |uniq | wc -l)
 
 for ((i=1;i<$NUM_NODES;i++)); do
-pbsdsh -n $i -- bash -l -c "'$DEPLOYMENT_SCRIPTS/ray-start-worker.sh' '$WS_DIR' '$ENV_ARCHIVE' '$RAY_ADDRESS' '$REDIS_PASSWORD' '$OBJECT_STORE_MEMORY'" &
+pbsdsh -n $i -- bash -l -c "'$DEPLOYMENT_SCRIPTS/start-ray-worker.sh' '$WS_DIR' '$ENV_ARCHIVE' '$RAY_ADDRESS' '$REDIS_PASSWORD' '$OBJECT_STORE_MEMORY'" &
 done
 
 python3 $PYTHON_FILE
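The context lines of this hunk count the distinct hosts in `$PBS_NODEFILE` and start one worker per node index from 1 upward. Index 0 is skipped, presumably because the head node is started there by the preceding `ray start` call. The sketch below reproduces only that counting and indexing logic so it can be tried outside a PBS job. The temporary file and the host names `node01` to `node03` are made-up stand-ins for `$PBS_NODEFILE`, `echo` replaces `pbsdsh`, and a POSIX `while` loop replaces the C-style `for` loop so that it also runs under `sh`.

```bash
# Stand-in for $PBS_NODEFILE: hypothetical host names, one line per slot,
# so a host with several slots appears more than once.
nodefile=$(mktemp)
printf '%s\n' node01 node01 node02 node03 node03 > "$nodefile"

# Same pipeline as the script; tr strips the padding some wc builds add.
NUM_NODES=$(sort "$nodefile" | uniq | wc -l | tr -d ' ')
echo "NUM_NODES=$NUM_NODES"

# Index 0 is skipped, as in the script's loop; echo replaces pbsdsh here.
i=1
while [ "$i" -lt "$NUM_NODES" ]; do
    echo "would start worker on node index $i"
    i=$((i + 1))
done

rm -f "$nodefile"
```

With the five sample lines there are three distinct hosts, so the loop visits indices 1 and 2. This is why a 3 node cluster ends up with one head and two workers.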