Line: 1 to 1
Slurm
Line: 63 to 63
env
echo "Hello World"
Changed:
< < |
> > | sbatch
Force job to run on a compute node
#SBATCH --nodelist=n0065    # force the job to run on node n0065
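A minimal sketch of how this directive fits into a complete batch script (the walltime and memory values are illustrative; only the --nodelist line comes from this page):
#!/bin/bash
#SBATCH -t 00:05:00              # illustrative walltime
#SBATCH --mem=1G                 # illustrative memory request
#SBATCH --nodelist=n0065         # force the job to run on node n0065

hostname                         # should report n0065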
salloc/srun
Line: 131 to 138
Added:
> > |
examples :
submit job named 1.sh
Line: 1 to 1
Slurm
Line: 139 to 139
slurm = sbatch 1.sh
Deleted:
< < | More Slurm Info
https://wiki.bioinformatics.umcutrecht.nl/HPC/SlurmLimits
FAQ about SLURM
Line: 1 to 1
Slurm
Line: 139 to 139
slurm = sbatch 1.sh
Added:
> > | More Slurm Info
https://wiki.bioinformatics.umcutrecht.nl/HPC/SlurmLimits
FAQ about SLURM
Line: 1 to 1
Slurm
Line: 61 to 61
env
echo "Hello World"
Changed:
< < |
> > |
salloc/srun
Line: 125 to 125
Changed:
< < |
> > |
Line: 148 to 149
Slurm Tutorials
SGE to SLURM conversion
SLURM user statistics
Added:
> > | Slurm commands
Line: 1 to 1
Slurm
Line: 134 to 134
slurm = sbatch 1.sh
Added:
> > | FAQ about SLURM
Extra information :
Official Slurm Workload Manager documentation
Slurm Tutorials
Line: 1 to 1
Slurm
Line: 138 to 138
Official Slurm Workload Manager documentation
Slurm Tutorials
SGE to SLURM conversion
Added:
> > | SLURM user statistics
Line: 1 to 1
Slurm
Line: 113 to 113
Something like:
Changed:
< < | srun -p gpu -n 12 --gres=tmpspace:10G --gpus-per-node=1 --time 24:00:00 --mem 100G --pty bash
> > | srun -p gpu -n 2 --gres=tmpspace:10G --gpus-per-node=1 --time 24:00:00 --mem 100G --pty bash
will give you an interactive session with 1 GPU.
srun -p gpu --gpus-per-node=RTX2080Ti:1 --pty bash
Changed:
< < | will request a specific type of GPU. Currently we have 1 "RTX2080Ti" and 4 "TeslaV100".
> > | will request a specific type of GPU. Currently we have 1 "RTX2080Ti" and 4 "TeslaV100", and one node with 4 RTX6000s.
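The same request works in batch form. A minimal sketch of an sbatch script for a GPU job, assuming the "gpu" partition and the GPU type names mentioned above (the resource values are simply taken from the srun example):
#!/bin/bash
#SBATCH -p gpu                      # GPU partition
#SBATCH --gpus-per-node=1           # any GPU; use e.g. --gpus-per-node=TeslaV100:1 for a specific type
#SBATCH --gres=tmpspace:10G         # local scratch space in $TMPDIR
#SBATCH --time=24:00:00
#SBATCH --mem=100G
#SBATCH -n 2

nvidia-smi                          # show which GPU was assigned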
SGE versus SLURM
Line: 1 to 1
Slurm
Line: 14 to 14
Some defaults:
Changed:
< < |
> > |
Line: 113 to 113
Something like:
Changed:
< < | srun -n 12 --gres=tmpspace:10G --gpus-per-node=1 --time 24:00:00 --mem 100G --pty bash
> > | srun -p gpu -n 12 --gres=tmpspace:10G --gpus-per-node=1 --time 24:00:00 --mem 100G --pty bash
will give you an interactive session with 1 GPU.
Changed:
< < | srun --gpus-per-node=RTX2080Ti:1 --pty bash
> > | srun -p gpu --gpus-per-node=RTX2080Ti:1 --pty bash
will request a specific type of GPU. Currently we have 1 "RTX2080Ti" and 4 "TeslaV100".
Line: 1 to 1
Slurm
Line: 119 to 119
srun --gpus-per-node=RTX2080Ti:1 --pty bash
Changed:
< < | will request a specific type of GPU.
> > | will request a specific type of GPU. Currently we have 1 "RTX2080Ti" and 4 "TeslaV100".
SGE versus SLURM
Line: 1 to 1
Slurm
Line: 113 to 113
Something like:
Changed:
< < | srun -n 12 --gres=tmpspace:10G --gpus=1 --time 24:00:00 --mem 100G --pty bash
> > | srun -n 12 --gres=tmpspace:10G --gpus-per-node=1 --time 24:00:00 --mem 100G --pty bash
will give you an interactive session with 1 GPU.
Changed:
< < | srun --gpus=RTX2080Ti:1 --pty bash
> > | srun --gpus-per-node=RTX2080Ti:1 --pty bash
will request a specific type of GPU.
Line: 1 to 1
Slurm
Line: 113 to 113
Something like:
Changed:
< < | srun -n 12 --gres=tmpspace:10G --gres=gpu:1 --time 24:00:00 --mem 100G --pty bash
> > | srun -n 12 --gres=tmpspace:10G --gpus=1 --time 24:00:00 --mem 100G --pty bash
will give you an interactive session with 1 GPU.
Changed:
< < | srun --gres=gpu:RTX2080Ti:1 --pty bash
> > | srun --gpus=RTX2080Ti:1 --pty bash
will request a specific type of GPU.
Line: 1 to 1
Slurm
Line: 117 to 117
will give you an interactive session with 1 GPU.
Added:
> > | srun --gres=gpu:RTX2080Ti:1 --pty bash
will request a specific type of GPU.
SGE versus SLURM
Line: 1 to 1
Slurm
Line: 109 to 109
Of course, this works for all the commands. The scratch disk space will be made available in $TMPDIR (/scratch/$SLURM_JOB_ID) and will be erased automatically when your job is finished.
Added:
> > | Using a GPU
Something like:
srun -n 12 --gres=tmpspace:10G --gres=gpu:1 --time 24:00:00 --mem 100G --pty bash
will give you an interactive session with 1 GPU.
SGE versus SLURM
Line: 1 to 1
Slurm
Line: 118 to 118
examples :
submit job named 1.sh
Changed:
< < | qsub 1.sh = sbatch 1.sh
> > | sge = qsub 1.sh = slurm = sbatch 1.sh
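A few more SGE-to-Slurm equivalents as a rough sketch (only qsub/sbatch comes from this page; the rest are standard command mappings, and SGE option names can differ per site):
qsub 1.sh                    ->  sbatch 1.sh               # submit a job script
qstat                        ->  squeue                    # list your queued/running jobs
qdel <jobid>                 ->  scancel <jobid>           # cancel a job
qsub -l h_rt=01:00:00 1.sh   ->  sbatch -t 01:00:00 1.sh   # walltime limit
qsub -l h_vmem=4G 1.sh       ->  sbatch --mem=4G 1.sh      # memory limit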
Extra information :
Official Slurm Workload Manager documentation
Line: 1 to 1
Slurm
Line: 121 to 120
submit job named 1.sh
qsub 1.sh = sbatch 1.sh
Changed:
< < | Extra information :
https://srcc.stanford.edu/sge-slurm-conversion
> > | Extra information :
Official Slurm Workload Manager documentation
Slurm Tutorials
SGE to SLURM conversion
--
Deleted:
< < | Comments
Line: 1 to 1
Slurm
Line: 8 to 8
This page will hopefully grow organically. Feel free to make corrections and add your tips, tricks and insights.
Added:
> > |
Defaults
Some defaults:
Line: 15 to 17
Changed:
< < |
> > |
Running jobs
Changed:
< < | You can run jobs using "srun" (interactively) or "sbatch" (like qsub).
> > | You can run jobs using "srun" (interactively), "sbatch" (like qsub), or use "salloc" to allocate resources and then "srun" your commands in that allocation.
srun
Line: 61 to 63
echo "Hello World"
Changed:
< < | SGE versus SLURM
> > | salloc/srun
Quoting from the documentation: The final mode of operation is to create a resource allocation and spawn job steps within that allocation. The salloc command is used to create a resource allocation and typically start a shell within that allocation. One or more job steps would typically be executed within that allocation using the srun command to launch the tasks. Finally the shell created by salloc would be terminated using the exit command.
Be very careful to use srun to run the commands within your allocation. Otherwise, the commands will run on the machine that you're logged in on! See:
# Allocate two compute nodes:
[mmarinus@hpcm05 ~]$ salloc -N 2
salloc: Pending job allocation 1635
salloc: job 1635 queued and waiting for resources
salloc: job 1635 has been allocated resources
salloc: Granted job allocation 1635
salloc: Waiting for resource configuration
salloc: Nodes n[0009-0010] are ready for job
# I got n0009 and n0010
[mmarinus@hpcm05 ~]$ srun hostname
n0009.compute.hpc
n0010.compute.hpc
# But this command just runs on the machine I started the salloc command from!
[mmarinus@hpcm05 ~]$ hostname
hpcm05.manage.hpc
# Even if you "srun" something, be careful where (e.g.) variable expansion is done:
[mmarinus@hpcm05 ~]$ srun echo "running on $(hostname)"
running on hpcm05.manage.hpc
running on hpcm05.manage.hpc
# Exit the allocation
[mmarinus@hpcm05 ~]$ exit
exit
salloc: Relinquishing job allocation 1635
Local (scratch) disk space
If your job benefits from (faster) local disk space (like "qsub -l tmpspace=xxx"), request it like this:
srun --gres=tmpspace:250M --pty bash
Of course, this works for all the commands. The scratch disk space will be made available in $TMPDIR (/scratch/$SLURM_JOB_ID) and will be erased automatically when your job is finished.
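In a batch job the same request might look like this. A minimal sketch, assuming a hypothetical input file and tool name; only the tmpspace request and the $TMPDIR behaviour described above come from this page:
#!/bin/bash
#SBATCH -t 01:00:00
#SBATCH --mem=4G
#SBATCH --gres=tmpspace:10G                           # request 10G of local scratch space

# $TMPDIR points to /scratch/$SLURM_JOB_ID and is erased when the job finishes
cp input.dat "$TMPDIR"/                               # hypothetical input file
my_tool "$TMPDIR"/input.dat > "$TMPDIR"/result.dat    # hypothetical tool
cp "$TMPDIR"/result.dat .                             # copy results back before the job ends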
SGE versus SLURM
Changed:
< < |
> > |
examples :
Line: 1 to 1
Slurm
Line: 71 to 71
examples :
submit job named 1.sh
Changed:
< < | qsub 1.sh sbatch 1.sh
> > | qsub 1.sh = sbatch 1.sh
Added:
> > | Extra information :
https://srcc.stanford.edu/sge-slurm-conversion
Line: 1 to 1
Slurm
Line: 61 to 61
echo "Hello World"
Added:
> > | SGE versus SLURM
qsub 1.sh sbatch 1.sh
--
Comments
Line: 1 to 1
Added:
> > | Slurm
Starting in May 2019, we're testing our new Slurm setup. Slurm is similar to GridEngine: it manages a cluster, distributing user jobs in (hopefully) a fair and efficient way. The concepts are comparable, but the syntax is not. This page will hopefully grow organically. Feel free to make corrections and add your tips, tricks and insights.
Defaults
Some defaults:
Running jobs
You can run jobs using "srun" (interactively) or "sbatch" (like qsub).
srun
srun will execute the command given, and wait for it to finish. Some examples:
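For instance (a sketch; the resource values are illustrative, not cluster defaults):
srun hostname                                         # run a single command on a compute node and wait for it
srun -t 00:10:00 --mem=2G -c 4 echo "Hello World"     # request walltime, memory and 4 cores explicitly
srun --pty bash                                       # start an interactive shell on a compute node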
sbatch
sbatch is like qsub. Commandline options are similar to srun, and can be embedded in a script file:
#!/bin/bash
#SBATCH -t 00:05:00
#SBATCH --mem=20G
#SBATCH -o log.out
#SBATCH -e errlog.out
#SBATCH --mail-type=FAIL
#SBATCH --mail-user=youremail@some.where    # Email to which notifications will be sent
env
echo "Hello World"
--
Comments