Add guide about storage on Hawk
parent ffdd2f08c8
commit 5eab5cc4a4
2 changed files with 136 additions and 1 deletions
10 README.md
@@ -1,3 +1,11 @@
# SPMT

Anything related to SPMT.

## Dashboard

Guides:
- [Best Practice -- Storage on Hawk](guides/Best_practice--Storage_on_Hawk.md)

127 guides/Best_practice--Storage_on_Hawk.md
Normal file
@@ -0,0 +1,127 @@
# Best practice -- Storage on Hawk

Change history:
- Initial version; Jose Gracia, 7 May 2024

TODOs:
- [Cleanup after parallel job](Best_practice--Storage_on_Hawk.md#Cleanup%20after%20parallel%20job): check whether there is an HLRS recommendation for copying files from workspaces.

## Available filesystems

`$HOME`
- get current quota: `$ na_quota`
- group quota: 200 GB, no limit on the number of files
- user quota: 50 GB, no limit on the number of files
- mounted via NFS; relatively slow

Workspaces
- get current quota: `$ ws_quota`
- group quota: 3 TB, 100k files
- user quota: none
- Lustre parallel file system; metadata operations are slow, parallel access is fast
- hitting the quota limit disables the queues for the whole group

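
Because the workspace quota also limits the number of files, it can help to check how many files a directory tree holds before a job produces more. The following is a minimal sketch only: `count_files`, `check_file_count`, and `FILE_LIMIT` are hypothetical names, not HLRS tools, and the quota applies to the whole group, so the count for one directory is only a lower bound. `ws_quota` remains the authoritative source.

```shell
# Hypothetical helpers (not HLRS tools) for a rough file-count check.
FILE_LIMIT=100000   # group file quota on workspaces, taken from the list above

count_files() {
    # Print the number of regular files below directory $1.
    # (File names containing newlines would be counted more than once.)
    find "$1" -type f | wc -l | tr -d ' '
}

check_file_count() {
    # Warn and return 1 when directory $1 holds at least $2 percent
    # (default: 80) of FILE_LIMIT; otherwise report OK and return 0.
    local dir=$1 percent=${2:-80} n
    n=$(count_files "$dir")
    if [ $(( n * 100 )) -ge $(( FILE_LIMIT * percent )) ]; then
        echo "WARNING: $n files in $dir (limit $FILE_LIMIT)"
        return 1
    fi
    echo "OK: $n files in $dir"
}

# Usage (path is an assumption): check_file_count /path/to/my/workspace
```
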

`/localscratch/$UID`
- total size: `df -h /localscratch` -> 22 TB
- temporary scratch space
- deleted at logout
- local SSD; fast
- available only on login nodes

---

## What to put where?

Persistent data on `$HOME`. E.g.:
- source code
- installed programs

Temporary data/builds on `/localscratch/$UID`. E.g.:
- anything temporary which does not fit on `$HOME`

Data for parallel jobs on workspace. E.g.:
- inputs and outputs of jobs
- anything which is accessed through MPI-IO or similar

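
Since `/localscratch` exists only on login nodes, a script that hard-codes it as build location fails elsewhere. One possible way to handle this is sketched below. The function name `pick_buildfs` and the fallback directory `$HOME/tmp` are assumptions made for illustration, not part of this guide's recommendations; choose a fallback that suits your quota.

```shell
pick_buildfs() {
    # Print a directory for temporary builds.
    # $1: preferred scratch directory (default: /localscratch/$UID)
    # $2: fallback directory (default: $HOME/tmp, an assumption of this sketch)
    local scratch=${1:-/localscratch/$UID} fallback=${2:-$HOME/tmp}
    if mkdir -p "$scratch" 2>/dev/null && [ -w "$scratch" ]; then
        echo "$scratch"
    else
        # Scratch is missing or not writable, e.g. when not on a login node.
        mkdir -p "$fallback" && echo "$fallback"
    fi
}

# Usage: BUILDFS=$(pick_buildfs)
```
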

---

## Typical workflows

---

### Large software project with autotools, e.g. OpenMPI

```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

# sources live on $HOME
git clone --depth=1 git@github.com:open-mpi/ompi.git $SRCFS/ompi.git
cd $SRCFS/ompi.git && autoreconf -fiv

mkdir -p $INSTALLFS/ompi_test_347; mkdir -p $BUILDFS/build_ompi

# out-of-tree build on the fast local scratch, install back to $HOME
cd $BUILDFS/build_ompi
$SRCFS/ompi.git/configure --prefix=$INSTALLFS/ompi_test_347
make && make install
```

---

### Large software project with CMake, e.g. targetDART

```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

git clone --depth=1 git@github.com:targetDART/llvm-project.git $SRCFS/TD.git

mkdir -p $INSTALLFS/TD_buggy_again; mkdir -p $BUILDFS/build_TD

cd $BUILDFS/build_TD
# select the Ninja generator with -G, then build and install with ninja
cmake -G Ninja $SRCFS/TD.git/llvm -DCMAKE_INSTALL_PREFIX=$INSTALLFS/TD_buggy_again
ninja && ninja install
```

---

### Avoiding large git checkouts

Do you really need the full history? If not, make a shallow clone.
```bash
git clone --depth=1 ...
```
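
A shallow clone does not lock you out of the history: it can be fetched later with `git fetch --unshallow`. The sketch below demonstrates this on a throw-away repository created in a temporary directory; the repository names and the three empty commits are placeholders for illustration only. It should report one commit for the shallow clone and three after deepening.

```shell
demo=$(mktemp -d)

# Build a throw-away repository with three commits (stand-in for a real project).
git init -q "$demo/full.git"
for i in 1 2 3; do
    git -C "$demo/full.git" -c user.name=demo -c user.email=demo@example.org \
        commit -q --allow-empty -m "commit $i"
done

# Shallow clone: only the most recent commit is transferred.
# The file:// prefix matters; a plain local path would ignore --depth.
git clone -q --depth=1 "file://$demo/full.git" "$demo/shallow.git"
shallow_count=$(git -C "$demo/shallow.git" rev-list --count HEAD)

# If the history is needed after all, fetch it on demand.
git -C "$demo/shallow.git" fetch -q --unshallow
full_count=$(git -C "$demo/shallow.git" rev-list --count HEAD)

echo "shallow: $shallow_count commit(s), after --unshallow: $full_count"
```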

---

### Mirror git repo to $BUILDFS

Assume you have a (possibly large) git repo `my_repo.git` on `$HOME`.
Copy/clone the source to `$BUILDFS` for faster access.
```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

ls $SRCFS/my_repo.git
git clone --depth=1 file://$SRCFS/my_repo.git $BUILDFS/mirror_repo.git
mkdir -p $BUILDFS/build

cd $BUILDFS/build
# either, for a CMake project:
cmake $BUILDFS/mirror_repo.git/ -DCMAKE_INSTALL_PREFIX=$INSTALLFS/whatever
# or, for an autotools project:
$BUILDFS/mirror_repo.git/configure --prefix=$INSTALLFS/whatever
```

---

### Cleanup after parallel job

Rsync the results of a parallel job into `$HOME`.
```bash
# contents of job.pbs
PERMANENTFS=$HOME/results

cd $(ws_allocate my_job 1)
mpirun ./app --resultsdir=results
# delete the results from the workspace only if the copy succeeded
mkdir -p $PERMANENTFS
rsync -a results $PERMANENTFS/ && rm -rf results
```
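
The copy-then-delete step can be wrapped in a small function so that the workspace copy is removed only after a successful transfer. This is a sketch under assumptions, not an HLRS recommendation: `archive_results` is a hypothetical name, and the `cp -a` fallback is only there for systems without `rsync`. Pass the source without a trailing slash so that both tools copy the directory itself rather than its contents.

```shell
archive_results() {
    # Copy directory $1 into directory $2, then delete $1.
    # $1 is left untouched if creating $2 or the copy fails.
    local src=$1 dest=$2
    mkdir -p "$dest" || return 1
    if command -v rsync >/dev/null 2>&1; then
        rsync -a "$src" "$dest/" || return 1
    else
        cp -a "$src" "$dest/" || return 1   # fallback where rsync is missing
    fi
    rm -rf "$src"
}

# Usage inside a job script: archive_results results "$HOME/results"
```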

---