This page describes the commands used to run jobs with Slurm on SRVOAD. Note that running a program on SRVOAD requires allocating resources so as not to interfere with other users' experiments; this resource allocation is managed by Slurm when you use one of the three commands below.

## [salloc](https://slurm.schedmd.com/salloc.html)

*salloc* is used to allocate a Slurm job allocation, which is a set of resources (CPUs), possibly with some set of constraints (e.g. time limit, memory per CPU). When salloc successfully obtains the requested allocation, it runs the command specified by the user. Once the user-specified command completes, salloc relinquishes the job allocation.

The command may be any program the user wishes; typical choices are xterm or a shell script containing srun commands. If no command is specified, salloc runs the user's default shell:

```
$ salloc --cpus-per-task=1 --mem-per-cpu=100mb
salloc: Granted job allocation 130214
$ exit
exit
salloc: Relinquishing job allocation 130214
salloc: Job allocation 130214 has been revoked.
```
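
You can also pass a command directly to salloc, in which case the allocation is relinquished as soon as the command finishes. A minimal sketch, assuming a hypothetical script `my_experiment.sh` (the job ID shown is illustrative):

```
$ salloc --cpus-per-task=2 --mem-per-cpu=100mb ./my_experiment.sh
salloc: Granted job allocation 130215
salloc: Relinquishing job allocation 130215
```
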
## [srun](https://slurm.schedmd.com/srun.html)

Run a parallel job on SRVOAD. If necessary, srun first creates a resource allocation in which to run the parallel job (more details [here](https://slurm.schedmd.com/cpu_management.html)).

To run your code, you need to specify parameter values that control the resource allocation:

| Option | Short | Description |
| ------ | ----- | ----------- |
| --nodes=1 | -N1 | Number of nodes used (SRVOAD has only one node) |
| --ntasks=1 | -n1 | Number of tasks to be run |
| --cpus-per-task=1 | -c1 | Number of CPUs used by each task |
| --time=0-00:30:00 | -t0-00:30:00 | Time limit (d-hh:mm:ss) |
| --mem-per-cpu=100mb | ... | Maximum amount of memory allocated per CPU |
| --mem=100mb | ... | Minimum amount of real memory for the job |

```
$ srun --nodes=1 --ntasks=1 --cpus-per-task=1 --mem-per-cpu=100mb --time=0-00:00:30 sleep 20 &

$ srun -N1 -n1 -c1 -t1 --mem=100mb sleep 20 &
```
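
When srun is called from inside an existing allocation (for example from a script passed to salloc, as mentioned above), it launches a job step within that allocation instead of creating a new one. A minimal sketch, assuming a hypothetical script `job.sh`:

```
#!/bin/bash
# job.sh -- run with, e.g.: salloc --ntasks=2 --mem-per-cpu=100mb ./job.sh
# Each srun below starts a job step inside the existing allocation.
srun --ntasks=1 sleep 20 &   # first task, backgrounded
srun --ntasks=1 sleep 20 &   # second task, runs concurrently
wait                         # wait for both job steps to finish
```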
## [sbatch](https://slurm.schedmd.com/sbatch.html)

*sbatch* submits a batch script to Slurm. The batch script may be given to sbatch through a file name on the command line; if no file name is specified, sbatch reads the script from standard input. The batch script may contain options preceded with "#SBATCH" before any executable commands in the script. sbatch stops processing further #SBATCH directives once the first non-comment, non-whitespace line has been reached in the script.

```
$ sbatch myscript.sh
```

sbatch submits the script and returns immediately; the job itself runs when resources become available.
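
Note that, per the rule above, directives must come before the first executable line. A minimal sketch of what is and is not processed (the commands are placeholders):

```
#!/bin/bash
#SBATCH --ntasks=1   # processed: appears before any executable command
echo "starting"      # first executable line
#SBATCH --mem=1gb    # NOT processed: appears after an executable line
```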

Here is an example batch file:

```
#!/bin/bash
#SBATCH --job-name=parallel_job_test         # Job name
#SBATCH --mail-type=END,FAIL                 # Mail events (not available on SRVOAD)
#SBATCH --mail-user=email@imt-atlantique.fr  # Where to send mail
#SBATCH --nodes=1                            # Run all processes on a single node
#SBATCH --ntasks=1                           # Number of processes
#SBATCH --cpus-per-task=4                    # Number of CPUs per task
#SBATCH --mem=1gb                            # Total memory limit
#SBATCH --time=01:00:00                      # Time limit hrs:min:sec
#SBATCH --output=example_%j.log              # Standard output and error log

./my-app arg1 arg2
```

This file will run one task using at most 4 CPUs and 1gb of memory for at most 1 hour, and will write its output to example_%j.log (%j is replaced by the job allocation ID).
\ No newline at end of file |