The tutorial below shows you how to run a basic job on Doane's Onyx supercomputer. This tutorial is intended for users who are new to the HPC environment and leverages a Slurm batch (sbatch) script.
📝 Note: Do not execute jobs on the login nodes; only use the login nodes to access your compute nodes. Processor-intensive, memory-intensive, or otherwise disruptive processes running on login nodes will be killed without warning.
Open a Bash terminal (or MobaXterm for Windows users).
ssh [email protected].
If prompted by a security message, type yes to continue connecting.
When prompted, enter your password.
Once you have connected to the head node, you can proceed to Step 2 and begin assembling your sbatch script.
Below is the sbatch script we are using to run an MPI "hostname" program as a batch job. sbatch scripts use variables (the #SBATCH lines) to specify things like the number of nodes and cores used to execute your job, the estimated walltime for your job, and which compute resources to use (e.g., GPU vs. CPU). The sections below feature an example sbatch script for HPC resources, show you how to create and save your own sbatch script, and show you how to store the sbatch script on an HPC file system.
Consult the official Slurm documentation for a complete list of sbatch variables.
Here is an example sbatch script for running a batch job on Onyx. We break down each command in the section below.
    #!/bin/bash
    #SBATCH -n 16
    #SBATCH -o test_%A.out
    #SBATCH --error test_%A.err
    #SBATCH --mail-user $CHANGE_TO_YOUR_EMAIL
    #SBATCH --mail-type ALL
    srun -l hostname
    module avail
    module purge
    module load gnu/5.4.0
    module load spack-apps
    module load openmpi-3.0.0-gcc-5.4.0-clbdgmf
    module list
    mpirun -np 16 hostname
Here, we break down the essential elements of the above sbatch script.
#!/bin/bash: tells the system to run the script with the Bash shell
#SBATCH -n 16: sets the number of tasks (by default, one processor each) that you want to use to run your job; use -N to specify nodes instead of processors
#SBATCH -o test_%A.out: sets the name of the output file; here %A will be replaced by Slurm with the job number
#SBATCH --error test_%A.err: sets the name of the error output file; here %A will be replaced by Slurm with the job number
#SBATCH --mail-user $CHANGE_TO_YOUR_EMAIL: replace $CHANGE_TO_YOUR_EMAIL with your email address if you would like your job's status to be emailed to you
#SBATCH --mail-type ALL: specifies which job status changes you want to be notified about; options include: NONE, BEGIN, END, FAIL, REQUEUE, ALL
srun -l hostname: runs a parallel job on a cluster managed by Slurm; in this case it runs hostname in parallel, with -l prepending the task number to each line of stdout/stderr output.
module avail: Lists the currently available software modules.
module purge: Clears any modules currently loaded that might result in a conflict.
module load gnu/5.4.0: Loads the gnu module version 5.4.0.
module load spack-apps: Loads additional applications managed by the spack software.
module load openmpi-3.0.0-gcc-5.4.0-clbdgmf: Loads the openmpi module.
module list: Confirms the modules that were loaded.
mpirun -np 16 hostname: runs hostname again, this time through OpenMPI's mpirun, with the same number of processors we specified earlier (16).
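The count of 16 appears twice in the script above, once in #SBATCH -n and once in mpirun -np, so the two can drift apart when one is edited. A minimal sketch of one way to avoid that is shown here. It assumes the SLURM_NTASKS environment variable, which Slurm sets inside a job when -n is given; the helper name mpi_task_count and the fallback value of 1 are illustrative and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: report how many MPI processes to launch.
# Uses SLURM_NTASKS when Slurm has set it, otherwise falls back to 1.
mpi_task_count() {
    echo "${SLURM_NTASKS:-1}"
}

# Only prints the command here; inside an sbatch script the same expression
# could stand in for the hard-coded "mpirun -np 16 hostname".
echo "mpirun -np $(mpi_task_count) hostname"
```

Outside of a Slurm job the variable is normally unset, so the sketch prints a command with a single process.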
Now that we have covered the basics of an sbatch script in the context of an HPC, we will talk about actually creating and using the script on Onyx.
When creating and editing your sbatch script, we will be working on the head node using the text editor, nano.
From the login node, change your working directory to your home directory.
Use nano to create and edit your sbatch script.
Write your sbatch script within nano or paste the contents of your sbatch script into nano.
Copy the sbatch script from this page.
Press Control + V (Windows) or Command + V (macOS) to paste into nano; some terminals use right-click or Shift + Insert instead.
When finished, press ^X (Control + X) to exit, type Y to save your changes, and press Return to confirm the file name and return to the Bash shell.
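If you would rather not paste into an editor, the same file can be written from the shell with a here-document. The sketch below is one possible approach, not the procedure this tutorial prescribes. The file name myscript.sbatch is a hypothetical choice, and the file is written to whatever directory you run the commands from.

```shell
#!/bin/bash
# Hypothetical file name; use whatever name you prefer.
script_name="myscript.sbatch"

# The quoted 'EOF' stops the shell from expanding $CHANGE_TO_YOUR_EMAIL,
# so the text lands in the file exactly as written.
cat > "$script_name" <<'EOF'
#!/bin/bash
#SBATCH -n 16
#SBATCH -o test_%A.out
#SBATCH --error test_%A.err
#SBATCH --mail-user $CHANGE_TO_YOUR_EMAIL
#SBATCH --mail-type ALL
srun -l hostname
module avail
module purge
module load gnu/5.4.0
module load spack-apps
module load openmpi-3.0.0-gcc-5.4.0-clbdgmf
module list
mpirun -np 16 hostname
EOF

# Check the Bash syntax without running any of the commands.
bash -n "$script_name" && echo "Wrote ${script_name}"
```

Remember to replace $CHANGE_TO_YOUR_EMAIL with a real address before submitting.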
With the sbatch script in place, you can now move on to running the script in Step 3.
Before proceeding, ensure that you are still in your working directory (using pwd). We need to be in the same path/directory as our sbatch script. Use ls -al to confirm its presence.
Use sbatch, followed by the name of your script, to schedule your batch job in the queue.
This command will automatically queue your job using Slurm and produce a job number.
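Because the later steps all need the job number, it can help to capture it at submission time. The following is a minimal sketch that assumes sbatch prints its usual "Submitted batch job <jobnumber>" message; the helper name extract_job_id and the file name myscript.sbatch are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: pull the job number (the fourth word) out of
# the "Submitted batch job <jobnumber>" message.
extract_job_id() {
    printf '%s\n' "$1" | awk '{ print $4 }'
}

script_name="myscript.sbatch"   # hypothetical file name

# Submit only when sbatch is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f "$script_name" ]; then
    message="$(sbatch "$script_name")"
    jobid="$(extract_job_id "$message")"
    echo "Queued job ${jobid}"
fi
```

sbatch also offers a --parsable option that prints the job number without the surrounding text, which may be simpler if your Slurm version supports it.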
You can check the status of your job at any time with the squeue command:
squeue --job <jobnumber>
Because this test job takes only a second to complete, squeue --job <jobnumber> is unlikely to return any information.
You can also stop your job at any time with the scancel command:
scancel <jobnumber>
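For longer jobs, checking squeue by hand gets tedious. The sketch below polls the queue until the job no longer appears. It assumes that squeue prints nothing for the job once it has left the queue; the function name wait_for_job and the default five-second interval are illustrative choices.

```shell
#!/bin/bash
# Hypothetical helper: block until the given job is no longer listed by squeue.
# Usage: wait_for_job <jobnumber> [seconds between checks]
wait_for_job() {
    jobid="$1"
    interval="${2:-5}"
    # -h suppresses the header line, so empty output means the job is not listed.
    while [ -n "$(squeue -h -j "$jobid" 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "Job ${jobid} is no longer in the queue"
}

# Example call (replace 12345 with your job number):
#   wait_for_job 12345 10
```

Keep the interval generous on a shared cluster so the scheduler is not queried more often than necessary.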
View your results.
Once your job completes, Slurm will produce an output/data file. Unless otherwise specified in the sbatch script, this file is placed in the directory from which you submitted the job.
The file (test_<jobnumber>.out) contains the results of the binary you just executed.
Replace "myscript" with the name of your script and "<jobnumber>" with your job number.
You can view the contents of these files using the more command followed by the file name.
Your output should look something like this:
    2: compute-1
    10: compute-3
    7: compute-2
    13: compute-4
    1: compute-1
    3: compute-1
    0: compute-1
    8: compute-3
    9: compute-3
    11: compute-3
    5: compute-2
    6: compute-2
    4: compute-2
    15: compute-4
    12: compute-4
    14: compute-4
    --------------------- /opt/ohpc/pub/moduledeps/gnu-openmpi ---------------------
    scipy/0.19.1
    ------------------------- /opt/ohpc/pub/moduledeps/gnu -------------------------
    R_base/3.3.3     openblas/0.2.20       python3/3.7.1
    numpy/1.12.1     openmpi/1.10.7 (L)
    -------------------------- /opt/ohpc/pub/modulefiles ---------------------------
    cmake/3.11.1     ohpc (L)        singularity/2.5.1
    gnu/5.4.0 (L)    pmix/2.1.1      spack-apps/0.11.2
    gnu7/7.3.0       prun/1.2 (L)    valgrind/3.13.0
    Where:
    L: Module is loaded
    Use "module spider" to find all possible modules.
    Use "module keyword key1 key2 ..." to search for all possible modules matching
    any of the "keys".
    Currently Loaded Modules:
    1) gnu/5.4.0 2) openmpi/1.10.7
    compute-1
    compute-1
    compute-1
    compute-3
    compute-1
    compute-3
    compute-3
    compute-3
    compute-2
    compute-2
    compute-2
    compute-2
    compute-4
    compute-4
    compute-4
    compute-4
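To see at a glance how the tasks were spread across nodes, the "task: hostname" lines written by srun -l can be tallied per host. This is a minimal sketch; the helper name tasks_per_node is hypothetical, and it assumes output files named test_<jobnumber>.out in the current directory, as set by the -o line of the example script.

```shell
#!/bin/bash
# Hypothetical helper: count "N: hostname" lines (the srun -l format) per host.
# Lines without a leading task number, such as the mpirun output, are ignored.
tasks_per_node() {
    awk '/^[0-9]+: / { count[$2]++ }
         END { for (host in count) print host, count[host] }' "$1" | sort
}

# Summarize every matching output file in the current directory, if any exist.
for file in test_*.out; do
    [ -f "$file" ] || continue
    echo "== ${file}"
    tasks_per_node "$file"
done
```

For the sample output shown here, each of the four compute nodes would be reported with four tasks.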
📝 Note: The number and order of the hostnames will be different for you. If you see any errors, try typing in the sbatch script by hand instead of copying and pasting it. Sometimes the clipboard of your OS will bring along extra hidden characters that confuse Bash and Slurm.
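Before retyping the whole script, it may be quicker to check whether hidden characters are actually present. The sketch below flags carriage returns (common after copying from Windows) and any bytes that are neither printable ASCII nor ordinary whitespace. The helper names and the file name myscript.sbatch are hypothetical, and this is one possible check rather than an exhaustive one.

```shell
#!/bin/bash
# Hypothetical helper: exit status 0 if the file contains carriage returns
# or bytes that are neither printable ASCII nor ordinary whitespace.
has_hidden_chars() {
    cr="$(printf '\r')"
    LC_ALL=C grep -q -e "$cr" -e '[^[:print:][:space:]]' "$1"
}

# Hypothetical helper: print the file with carriage returns removed.
strip_carriage_returns() {
    tr -d '\r' < "$1"
}

script_name="myscript.sbatch"   # hypothetical file name
if [ -f "$script_name" ] && has_hidden_chars "$script_name"; then
    echo "${script_name} contains hidden characters"
fi
```

If only carriage returns are the problem, redirecting the output of strip_carriage_returns to a new file gives a cleaned copy; other stray characters still need to be removed by hand.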
Download your results (using the scp command or an SFTP client) or move them to persistent storage. See our moving data section for help.