# Best practice -- Storage on Hawk

Change history:

- Initial version; Jose Gracia, 7 May 2024

TODOs:

- [Cleanup after parallel job](Best_practice--Storage_on_Hawk.md#Cleanup%20after%20parallel%20job): check whether there is an HLRS recommendation for copying files from workspaces.

## Available filesystems

`$HOME`

- get current quota: `$ na_quota`
- group quota: 200 GB, no limit on the number of files
- user quota: 50 GB, no limit on the number of files
- mounted via NFS; relatively slow

Workspaces

- get current quota: `$ ws_quota`
- group quota: 3 TB, 100k files
- user quota: none
- parallel file system (Lustre); slow metadata operations, fast parallel access
- exceeding the quota limit disables the batch queues for the whole group

`/localscratch/$UID`

- total size: `df -h /localscratch` -> 22 TB
- temporary scratch space
- deleted at logout
- local SSD; fast
- available only on login nodes

---

## What to put where?

Persistent data goes on `$HOME`, e.g.:

- source code
- installed programs

Temporary data and builds go on `/localscratch/$UID`, e.g.:

- anything temporary which does not fit on `$HOME`

Data for parallel jobs goes on a workspace, e.g.:

- input and output files of jobs
- anything which is accessed through MPI-IO or similar

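The three tiers can be captured as shell variables. A minimal sketch; the names `SRCFS`, `BUILDFS`, and `INSTALLFS` follow the workflow examples in this guide and are a local convention, not an HLRS standard:

```shell
# Convenience variables for the three storage tiers (local convention,
# not an HLRS standard); add them to ~/.bashrc if useful.
export SRCFS=$HOME                 # persistent: sources, repos
export BUILDFS=/localscratch/$UID  # fast local scratch: temporary builds
export INSTALLFS=$HOME/opt         # persistent: installation prefixes

# Create the persistent install prefix once; $BUILDFS is recreated per login.
mkdir -p "$INSTALLFS"
```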
---

## Typical workflows

---

### Large software project with autotools, e.g. OpenMPI

```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

git clone --depth=1 git@github.com:open-mpi/ompi.git $SRCFS/ompi.git
cd $SRCFS/ompi.git; autoreconf -fiv

mkdir -p $INSTALLFS/ompi_test_347; mkdir -p $BUILDFS/build_ompi

cd $BUILDFS/build_ompi
$SRCFS/ompi.git/configure --prefix=$INSTALLFS/ompi_test_347
make && make install
```

---

### Large software project with CMake, e.g. targetDART

```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

git clone --depth=1 git@github.com:targetDART/llvm-project.git $SRCFS/TD.git

mkdir -p $INSTALLFS/TD_buggy_again; mkdir -p $BUILDFS/build_TD

cd $BUILDFS/build_TD
cmake -G Ninja $SRCFS/TD.git/llvm -DCMAKE_INSTALL_PREFIX=$INSTALLFS/TD_buggy_again
ninja && ninja install
```

---

### Avoiding large git checkouts

Do you really need the full history? If not, make a shallow clone:

```bash
git clone --depth=1 ...
```
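A shallow clone can be deepened later if the history turns out to be needed after all. A minimal demonstration with a throwaway local repository; the `file://` URL only stands in for your real upstream:

```shell
# Demo: shallow-clone, then fetch the full history later with
# --unshallow. The throwaway repo exists only to make this runnable;
# replace the file:// URL with your real upstream.
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "first"
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "second"
cd "$tmp"
git clone -q --depth=1 "file://$tmp/upstream" shallow
cd shallow
git rev-list --count HEAD     # the shallow clone sees a single commit
git fetch -q --unshallow      # retrieve the remaining history on demand
git rev-list --count HEAD     # now the full history is present
```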

---

### Mirror git repo to $BUILDFS

Assuming you have a (possibly large) git repo `my_repo.git` on `$HOME`, copy/clone the source to `$BUILDFS` for faster access:

```bash
SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt

ls $SRCFS/my_repo.git
git clone --depth=1 file://$SRCFS/my_repo.git $BUILDFS/mirror_repo.git
mkdir $BUILDFS/build

cd $BUILDFS/build
# either configure with CMake ...
cmake $BUILDFS/mirror_repo.git/ -DCMAKE_INSTALL_PREFIX=$INSTALLFS/whatever
# ... or with autotools
$BUILDFS/mirror_repo.git/configure --prefix=$INSTALLFS/whatever
```

---

### Cleanup after parallel job

Rsync the results of the parallel job back into `$HOME`:

```bash
# job.pbs
PERMANENTFS=$HOME/results/

cd $(ws_allocate my_job 1)
mpirun ./app --resultsdir=results
rsync -a results $PERMANENTFS/
rm -rf results
```
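The copy-then-delete pattern from the job script can be tried locally. A sketch with throwaway temporary directories; all paths here are illustrative only:

```shell
# Local dry run of the copy-then-delete pattern: rsync -a preserves
# permissions and timestamps; without a trailing slash on the source,
# the 'results' directory itself is copied into the destination.
tmp=$(mktemp -d)
mkdir -p "$tmp/results" "$tmp/permanent"
echo "output data" > "$tmp/results/out.txt"
rsync -a "$tmp/results" "$tmp/permanent/"
rm -rf "$tmp/results"
ls "$tmp/permanent/results"
```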

---