# Create Slurm job arrays

Authored by LOGER Benoit.
Another approach to running several independent tasks in parallel is to use Slurm job arrays. This only requires adding a single *#SBATCH* option to your sbatch file. Here is a simple example using the *--array* option:
```
#!/bin/bash
#SBATCH --job-name=myarray_job
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10%5
#SBATCH --mem-per-cpu=1gb
#SBATCH --time=0-00:30:00
#SBATCH --output=slurm-%j.log
./myapp arg1 arg2
```
This script (note the *--array=1-10%5* option) submits one *parent* job that spawns 10 *child* jobs, each performing a single task.
**What is different?**
- The *parent* job starts a new *child* job every time enough compute resources are available (instead of waiting until all of them can run in parallel)
- You can cap how many *child* jobs run in parallel: *--array=1-10%5* will run at most 5 *child* jobs simultaneously
- You can define the set of identifiers for your *child* jobs (e.g. *--array=1,5,6*)
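For reference, the *--array* option accepts several index forms. The fragment below lists the common ones side by side for comparison; an actual script would contain only one *--array* line:

```shell
#SBATCH --array=1-10        # indices 1 to 10
#SBATCH --array=1-10%5      # indices 1 to 10, at most 5 running at once
#SBATCH --array=1,5,6       # an explicit list of indices
#SBATCH --array=0-20:4      # indices 0,4,8,12,16,20 (a step of 4)
```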
## Separated outputs
Job arrays also make it easy to define a separate output file for each *child* job.
```
#SBATCH --output=slurm-%A-%a.log
```
This option creates one output file per *child* job (*%A* is replaced by the ID of the *parent* job and *%a* by the ID of the *child* job).
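As an illustration (outside Slurm), the shell loop below mimics the *%A*/*%a* substitution to show the file names this option would produce for a hypothetical parent job 130995 with child jobs 1 to 3:

```shell
# Mimic Slurm's %A/%a substitution (illustration only; Slurm does this itself).
A=130995                  # %A: parent job ID (hypothetical)
for a in 1 2 3; do        # %a: child job IDs
  logfile="slurm-${A}-${a}.log"
  echo "$logfile"
done
```

Each child job therefore writes to its own file, e.g. `slurm-130995-1.log`.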
## Using configuration files
Running a job array sets several Slurm environment variables, in particular *SLURM_ARRAY_TASK_ID*.
This variable can then be used in your sbatch file to look up per-task information in a configuration file.
Here is an example of how you can use Slurm variables to configure the execution of your jobs.
```
#!/bin/bash
#SBATCH --job-name=myarray_job # Name of the parent job
#SBATCH --ntasks=1 # Each child job runs 1 task
#SBATCH --cpus-per-task=1 # Each task requires 1 CPU
#SBATCH --array=1-10%5 # Run 10 child jobs with IDs in [1,10], at most 5 at a time
#SBATCH --mem-per-cpu=1gb # Using at most 1gb of memory per cpu
#SBATCH --time=0-00:30:00 # Child jobs will be killed if longer than 30 minutes
#SBATCH --output=logs/array_%A-%a.log # One log file per child job (the logs/ directory must exist beforehand)
# Specify the path to the config file
config=config.txt
# Extract the instance number for the current $SLURM_ARRAY_TASK_ID
inst=$(awk -v ArrayTaskID=$SLURM_ARRAY_TASK_ID '$1==ArrayTaskID {print $2}' $config)
# Extract the value of a parameter for the current $SLURM_ARRAY_TASK_ID
param=$(awk -v ArrayTaskID=$SLURM_ARRAY_TASK_ID '$1==ArrayTaskID {print $3}' $config)
# Execute my application/code with the parameters specified in my configuration file
./myapp $inst $param
```
And the corresponding configuration file *config.txt*:
```
ArrayTaskID Instance Parameter
1 1 15
2 2 15
3 3 15
4 4 15
5 5 20
6 6 20
7 7 20
8 8 30
9 9 30
10 10 30
```
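The awk lookup used above can be tested outside Slurm by setting *SLURM_ARRAY_TASK_ID* by hand (here to 5, against an excerpt of the configuration file); under Slurm the variable is set automatically for each child job:

```shell
# Standalone sketch of the awk lookup: SLURM_ARRAY_TASK_ID is set by hand here.
config=$(mktemp)
cat > "$config" <<'EOF'
ArrayTaskID Instance Parameter
1 1 15
2 2 15
3 3 15
4 4 15
5 5 20
EOF
SLURM_ARRAY_TASK_ID=5
# Print column 2 (Instance) and column 3 (Parameter) of the matching row
inst=$(awk -v ArrayTaskID=$SLURM_ARRAY_TASK_ID '$1==ArrayTaskID {print $2}' "$config")
param=$(awk -v ArrayTaskID=$SLURM_ARRAY_TASK_ID '$1==ArrayTaskID {print $3}' "$config")
echo "inst=$inst param=$param"   # prints: inst=5 param=20
rm -f "$config"
```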
## Running the example
```
$ sbatch script_array_job.sh
Submitted batch job 130995
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
130995_[6-10%5] ls2n myarray_ b19loger PD 0:00 1 (JobArrayTaskLimit)
130995_1 ls2n myarray_ b19loger R 0:02 1 srvoad
130995_2 ls2n myarray_ b19loger R 0:02 1 srvoad
130995_3 ls2n myarray_ b19loger R 0:02 1 srvoad
130995_4 ls2n myarray_ b19loger R 0:02 1 srvoad
130995_5 ls2n myarray_ b19loger R 0:02 1 srvoad
```
For more advanced usage and more information, check the [Slurm documentation](https://slurm.schedmd.com/job_array.html).