Batch Jobs
A Slurm cluster is primarily designed to execute batch jobs, i.e., non-interactive workloads that run within a time limit.
To submit a batch job, write it as a batch script (a standard shell script) and pass the script to the sbatch command. Once the requested resources become available, Slurm allocates them, creating a job allocation (referred to simply as a job), and runs the script on the first of the allocated compute nodes.
You specify all job requirements (time, CPUs, memory, GPUs, etc.) as #SBATCH directives at the top of your script, before the first executable command. Slurm ignores any directive that appears after that point.
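As a rough illustration of how the requirement types listed above map to directives, the sketch below requests a time limit, CPU cores, memory, and one GPU. The job name, the partition name gpu, and all of the amounts are assumptions chosen for the example; check which partitions and GPU resources your cluster actually provides.

```shell
#!/usr/bin/env bash
# All #SBATCH lines must come before the first executable command.
#SBATCH --job-name=resource_demo
# Wall-clock limit (HH:MM:SS).
#SBATCH --time=01:00:00
# One task with four CPU cores.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
# Memory per node.
#SBATCH --mem=8G
# One GPU; assumes the cluster exposes GPUs as a generic resource (GRES).
#SBATCH --gres=gpu:1
# Assumed partition name.
#SBATCH --partition=gpu

# Slurm sets these variables inside a job. The fallbacks apply only when
# the script is run outside Slurm, e.g. for a quick local check.
job_id="${SLURM_JOB_ID:-none}"
cpus="${SLURM_CPUS_PER_TASK:-1}"
echo "Job ID: ${job_id}, CPUs per task: ${cpus}"
```

Because the directives are ordinary shell comments, the script still runs under plain bash, which is a convenient way to catch syntax errors before submitting.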
Slurm will:
- Place your job in the queue.
- Wait for resources to become available.
- Run the job automatically on the assigned nodes.
Example:
#!/usr/bin/env bash
#SBATCH --job-name=test_job
#SBATCH --output=output.%j.out
#SBATCH --time=00:10:00
#SBATCH --partition=cpu
#SBATCH --ntasks=1
echo "Running on $HOSTNAME"
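Save the script as test_job.slurm and submit it with:
sbatch test_job.slurm
Slurm replaces %j in the output file name with the job ID.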
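The submit step can also be scripted. The sketch below assumes test_job.slurm is in the current directory and uses sbatch --parsable, which prints only the job ID (optionally followed by a semicolon and the cluster name). The helper extract_job_id is a name made up for this example, not part of Slurm.

```shell
#!/usr/bin/env bash
# Sketch: submit a batch script, then report its queue state.

# extract_job_id: read `sbatch --parsable` output on stdin and print the
# numeric job ID, dropping an optional ";cluster" suffix.
extract_job_id() {
    local line
    IFS= read -r line
    printf '%s\n' "${line%%;*}"
}

if command -v sbatch >/dev/null 2>&1 && [ -f test_job.slurm ]; then
    if raw=$(sbatch --parsable test_job.slurm); then
        job_id=$(printf '%s\n' "$raw" | extract_job_id)
        echo "Submitted job ${job_id}"
        # List the job while it is queued or running (PD = pending, R = running).
        squeue -j "${job_id}"
        # Matches the --output=output.%j.out directive in the example script.
        echo "Output will be written to output.${job_id}.out"
    else
        echo "sbatch rejected the submission" >&2
    fi
else
    echo "sbatch or test_job.slurm not found; nothing submitted" >&2
fi
```

Capturing the job ID this way makes it easy to find the matching output file later or to pass the ID to other Slurm commands.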
Use sbatch when:
- You want non-interactive job execution.
- You don’t need to monitor output in real time.
- You’re running longer or repeatable workloads (training, simulations, etc.).
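For repeatable workloads, one common pattern is to submit the same batch script several times with different arguments. The sketch below is a dry run that only prints the commands it would issue. The script name train.slurm, the seed values, and the helper submit_cmd are hypothetical and stand in for your own workload.

```shell
#!/usr/bin/env bash
# Sketch: build one sbatch command per parameter value.
# Arguments placed after the script name are passed to the script,
# where the first one is available as $1.

# submit_cmd: print the sbatch command line for one seed.
# Options given on the command line take precedence over the matching
# #SBATCH directives inside the script.
submit_cmd() {
    local seed=$1
    printf 'sbatch --job-name=train_seed%s train.slurm %s\n' "$seed" "$seed"
}

for seed in 1 2 3; do
    # Dry run: print the command instead of executing it.
    submit_cmd "$seed"
done
```

To submit for real, run the printed commands, or call sbatch directly inside the loop once the dry-run output looks right.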