diff --git a/README.md b/README.md
index 6e06c62..b381b30 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,11 @@
 # SPMT
 
-Anything related to SPMT
\ No newline at end of file
+Anything related to SPMT.
+
+## Dashboard
+
+Guides:
+ - [Best Practice -- Storage on Hawk](guides/Best_practice--Storage_on_Hawk.md)
+
+
+
diff --git a/guides/Best_practice--Storage_on_Hawk.md b/guides/Best_practice--Storage_on_Hawk.md
new file mode 100644
index 0000000..4c20481
--- /dev/null
+++ b/guides/Best_practice--Storage_on_Hawk.md
@@ -0,0 +1,127 @@
+# Best practice -- Storage on Hawk
+
+Change history:
+ - Initial version; Jose Gracia, 7 May 2024
+
+TODOs:
+ - [Cleanup after parallel job](Best_practice--Storage_on_Hawk.md#Cleanup%20after%20parallel%20job): check whether there is an HLRS recommendation for copying files from workspaces.
+
+
+## Available filesystems
+
+`$HOME`
+ - get current quota: `$ na_quota`
+ - group quota: 200 GB, no file-count limit
+ - user quota: 50 GB, no file-count limit
+ - mounted via NFS; relatively slow
+
+Workspaces
+ - get current quota: `$ ws_quota`
+ - group quota: 3 TB, 100k files
+ - user quota: none
+ - parallel file system (Lustre); metadata operations are slow, parallel access is fast
+ - hitting the quota limit disables the queues for the whole group
+
+`/localscratch/$UID`
+ - total size: `df -h /localscratch` -> 22 TB
+ - temporary scratch space
+ - deleted at logout
+ - local SSD; fast
+ - available only on login nodes
+
+---
+
+## What to put where?
+
+Persistent data on `$HOME`. E.g.:
+ - source code
+ - installed programs
+
+Temporary data/builds on `/localscratch/$UID`. E.g.:
+ - anything temporary which does not fit on `$HOME`
+ -
+
+Data for parallel jobs on a workspace. E.g.:
+ - inputs and outputs of jobs
+ - anything which is accessed through MPI-IO or similar
+
+---
+
+## Typical workflows
+
+---
+
+### Large software project with autotools, e.g. OpenMPI
+
+```bash
+SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt
+
+git clone --depth=1 git@github.com:open-mpi/ompi.git $SRCFS/ompi.git
+cd $SRCFS/ompi.git; autoreconf -fiv
+
+mkdir -p $INSTALLFS/ompi_test_347; mkdir -p $BUILDFS/build_ompi
+
+cd $BUILDFS/build_ompi
+$SRCFS/ompi.git/configure --prefix $INSTALLFS/ompi_test_347
+make && make install
+```
+
+---
+
+### Large software project with CMake, e.g. targetDART
+
+```bash
+SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt
+
+git clone --depth=1 git@github.com:targetDART/llvm-project.git $SRCFS/TD.git
+
+mkdir -p $INSTALLFS/TD_buggy_again; mkdir -p $BUILDFS/build_TD
+
+cd $BUILDFS/build_TD
+cmake -G Ninja $SRCFS/TD.git/llvm -DCMAKE_INSTALL_PREFIX=$INSTALLFS/TD_buggy_again
+ninja && ninja install
+```
+
+---
+
+### Avoiding large git checkouts
+
+Do you really need the full history? If not, use a shallow clone.
+```bash
+git clone --depth=1 ...
+```
+
+---
+
+### Mirror git repo to $BUILDFS
+
+Assume you have a (possibly large) git repo `my_repo.git` on `$HOME`.
+Copy/clone the source to `$BUILDFS` for faster access, then configure with either CMake or autotools.
+```bash
+SRCFS=$HOME; BUILDFS=/localscratch/$UID; INSTALLFS=$HOME/opt
+
+ls $SRCFS/my_repo.git
+git clone --depth=1 file://$SRCFS/my_repo.git $BUILDFS/mirror_repo.git
+mkdir -p $BUILDFS/build
+
+cd $BUILDFS/build
+cmake $BUILDFS/mirror_repo.git/ -DCMAKE_INSTALL_PREFIX=$INSTALLFS/whatever   # CMake project
+$BUILDFS/mirror_repo.git/configure --prefix $INSTALLFS/whatever   # or: autotools project
+
+```
+
+---
+
+### Cleanup after parallel job
+
+Rsync the results of the parallel job into `$HOME`, e.g. at the end of `job.pbs`:
+```bash
+PERMANENTFS=$HOME/results
+
+cd "$(ws_allocate my_job 1)"
+mpirun ./app --resultsdir=results
+rsync -a results $PERMANENTFS/ || exit 1   # keep the workspace copy if the transfer fails
+rm -rf results
+```
+
+---