Slurm

Starting in May 2019, we're testing our new Slurm setup. Slurm is similar to GridEngine: it manages a cluster, distributing user jobs in (hopefully) a fair and efficient way.

The concepts are comparable, but the syntax is not.

This page will hopefully grow organically. Feel free to make corrections and add your tips, tricks and insights.

Defaults

Some defaults:

  • There is one "partition" (like a "queue" in GridEngine). It is called "one" (suggestions welcome; it cannot be called "default").
  • Default runtime is 10 minutes.
  • Default memory is 10 GB.
  • By default, your job gets 1 GB of local "scratch" disk space in "$TMPDIR".
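The defaults above can be overridden per job. Below is a minimal sketch of a job script that asks for a longer runtime and less memory than the defaults, and does its intermediate work in the scratch space. The workload (writing numbers to a file) and the "demo" directory name are made up for illustration. The fallback to /tmp is only there so the script also runs outside Slurm.

```shell
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --mem=2G
# The two lines above override the 10-minute and 10 GB defaults.

# Use the job's scratch space; fall back to /tmp when run outside Slurm.
scratch="${TMPDIR:-/tmp}"
workdir="$(mktemp -d "$scratch/demo.XXXXXX")"

# Hypothetical workload: write intermediate data to scratch.
seq 1 1000 > "$workdir/numbers.txt"
line_count="$(wc -l < "$workdir/numbers.txt" | tr -d ' ')"
echo "wrote $line_count lines to scratch"

# Clean up our own directory so we stay within the scratch allowance.
rm -rf "$workdir"
```

Anything you want to keep should be copied out of "$TMPDIR" to your own storage before the script ends.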

Running jobs

You can run jobs using "srun" (interactively) or "sbatch" (like qsub).

srun

srun will execute the command given, and wait for it to finish. Some examples:

  • srun sleep 60

  • srun -n 4 bash -c "hostname; stress -c 10". This will start 4 separate "tasks", each getting 1 CPU (2 cores on each). Eight threads in total.

This is different from:

  • srun -c 4 bash -c "hostname; stress -c 10". This will start 1 task, getting 4 cores (2 CPUs, 2 cores on each).

To me, the number of tasks, CPUs and cores is sometimes slightly surprising. I guess it will make sense after a while...
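One way to see what a job actually received is to have each task report it. The sketch below reads the environment variables Slurm sets inside a job (SLURM_NTASKS, SLURM_CPUS_PER_TASK, SLURM_PROCID). The script name show_alloc.sh is hypothetical. SLURM_CPUS_PER_TASK is only set when -c is given, and the other variables may be unset in some contexts, so the values after ":-" are fallbacks that also let the script run outside Slurm. Whether nproc reflects the allocation depends on how CPU binding is configured on the cluster, which is an assumption here.

```shell
#!/bin/bash
# show_alloc.sh (hypothetical name): report what this task was given.

# Values set by Slurm inside a job; the fallbacks apply outside Slurm
# or when the corresponding option was not given.
ntasks="${SLURM_NTASKS:-1}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-1}"
task_id="${SLURM_PROCID:-0}"

# Number of processing units this process may run on, if nproc is available.
usable="$(nproc 2>/dev/null || echo unknown)"

summary="task ${task_id} of ${ntasks}: cpus-per-task=${cpus_per_task}"
echo "$(hostname): ${summary}, nproc reports ${usable}"
```

Running it once as "srun -n 4 ./show_alloc.sh" and once as "srun -c 4 ./show_alloc.sh" should print four lines in the first case and one line in the second.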

You can also use srun to get an interactive shell on a compute node (like qlogin):

  • srun -n 2 --mem 5G --time 01:00:00 --pty bash

Or on a specific node:

  • srun -n 2 --mem 5G --time 01:00:00 --nodelist n0014 --pty bash

sbatch

sbatch is like qsub. Command-line options are similar to those of srun, and can be embedded in a script file:

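#!/bin/bash

# Maximum runtime (hh:mm:ss)
#SBATCH -t 00:05:00
# Memory required per node
#SBATCH --mem=20G
# Files for standard output and standard error
#SBATCH -o log.out
#SBATCH -e errlog.out
# Send an email if the job fails
#SBATCH --mail-type=FAIL
# Email address to which notifications will be sent
#SBATCH --mail-user=youremail@some.where

# Print the job's environment, including the SLURM_* variables
env
echo "Hello World"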
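Submitting such a script could look like the sketch below. It writes a small job script (the file name job.sh is hypothetical), submits it with sbatch, and shows the job in the queue with squeue. The check for sbatch is only there so the example does nothing harmful on a machine without Slurm.

```shell
#!/bin/bash
# Write a small job script; job.sh is a hypothetical name.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH -t 00:05:00
#SBATCH --mem=1G
#SBATCH -o log.out
echo "Hello World"
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints only the job ID (plus ";cluster" on multi-cluster setups).
    job_id="$(sbatch --parsable job.sh)"
    job_id="${job_id%%;*}"
    echo "submitted job ${job_id}"
    # Show the job's current state in the queue.
    squeue -j "$job_id"
else
    echo "sbatch not found; run this on a machine with Slurm installed"
fi
```

A job that is no longer needed can be removed with "scancel" followed by the job ID.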

-- Martin Marinus - 2019-05-07

Topic revision: r1 - 2019-05-07 - MartinMarinus
 