Execute a Job
The tutorial below shows you how to run a basic job on Doane's Onyx supercomputer. This tutorial is intended for users who are new to the HPC environment and leverages a Slurm batch (sbatch) script.
Additional examples can be found in the C++, Fortran, or Python sections.
📝 Note: Do not execute jobs on the login nodes; only use the login nodes to access your compute nodes. Processor-intensive, memory-intensive, or otherwise disruptive processes running on login nodes will be killed without warning.
Step 1: Access the Onyx HPC
Open a Bash terminal (or MobaXterm for Windows users).
Execute `ssh doaneusername@onyx.doane.edu`.
If prompted by a security message, type `yes` to continue the connection.
When prompted, enter your password.
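For example (the exact wording of the first-connection security prompt may vary):

```bash
ssh doaneusername@onyx.doane.edu
# Are you sure you want to continue connecting (yes/no)?   -> type yes
# doaneusername@onyx.doane.edu's password:                 -> enter your Doane password
```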
Once you have connected to the head node, you can proceed to Step 2 and begin assembling your sbatch script.
Step 2: Create an sbatch Script
Below is the sbatch script we are using to run an MPI "hostname" program as a batch job. sbatch scripts use variables to specify things like the number of nodes and cores used to execute your job, the estimated walltime for your job, and which compute resources to use (e.g., GPU vs. CPU). The sections below feature an example sbatch script for HPC resources, show you how to create and save your own sbatch script, and show you how to store the sbatch script on an HPC file system.
Consult the official Slurm documentation for a complete list of sbatch variables.
Example sbatch Script
Here is an example sbatch script for running a batch job on Onyx. We break down each command in the section below.
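The following is a sketch of such a script, assembled from the directives explained in the breakdown that follows; replace `$CHANGE_TO_YOUR_EMAIL` with your email address, and note that the module names and versions shown here may differ on your system.

```bash
#!/bin/bash
#SBATCH -n 8
#SBATCH -o test_%A.out
#SBATCH --error test_%A.err
#SBATCH --mail-user $CHANGE_TO_YOUR_EMAIL
#SBATCH --mail-type ALL

# Run hostname in parallel through Slurm, labeling each output line with its task number
srun -l hostname

# Set up the software environment
module avail
module purge
module load gnu
module load spack-apps
module load openmpi-3.0.0-gcc-5.4.0-clbdgmf
module list

# Run hostname again, this time through OpenMPI
mpirun -np 16 hostname
```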
sbatch Script Breakdown
Here, we break down the essential elements of the above sbatch script.
- `#!/bin/bash`: sets the script type (Bash).
- `#SBATCH -n 8`: sets the number of processors that you want to use to run your job; use `-N` to specify nodes instead of processors.
- `#SBATCH -o test_%A.out`: sets the name of the output file; here `%A` will be replaced by Slurm with the job number.
- `#SBATCH --error test_%A.err`: sets the name of the error output file; here `%A` will be replaced by Slurm with the job number.
- `#SBATCH --mail-user $CHANGE_TO_YOUR_EMAIL`: add your email address if you would like your job's status emailed to you.
- `#SBATCH --mail-type ALL`: specifies which job status changes you want to be notified about; options include NONE, BEGIN, END, FAIL, REQUEUE, and ALL.
- `srun -l hostname`: runs a parallel job on the cluster managed by Slurm; in this case it runs `hostname` in parallel, with `-l` prepending task numbers to the lines of stdout/stderr.
- `module avail`: lists the currently available software modules.
- `module purge`: clears any currently loaded modules that might result in a conflict.
- `module load gnu`: loads the GNU module (version 5.4.0).
- `module load spack-apps`: loads additional applications managed by Spack.
- `module load openmpi-3.0.0-gcc-5.4.0-clbdgmf`: loads the OpenMPI module.
- `module list`: confirms the modules that were loaded.
- `mpirun -np 16 hostname`: runs `hostname` again, this time through OpenMPI with the number of processors specified by `-np 16`.
sbatch Procedure
Now that we have covered the basics of an sbatch script in the context of an HPC, we will talk about actually creating and using the script on Onyx.
You will create and edit your sbatch script on the head node using the text editor nano.
1. From the login node, change your working directory to your home directory.
2. Use nano to create and edit your sbatch script (see the sketch after this list).
3. Write your sbatch script within nano, or paste the contents of your sbatch script into nano:
   - Copy the sbatch script from this page.
   - Paste it into nano using your terminal's paste shortcut (for example, right-click or Shift + Insert in MobaXterm on Windows, or Command + V in the macOS Terminal).
4. When finished, hit `^X` (Control + X) to exit, enter `Y` to save your changes, and press `Return` to save your file and return to the Bash shell.
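For example, using the hypothetical file name `myscript.sbatch` (any name will do):

```bash
cd ~                    # change to your home directory
nano myscript.sbatch    # create and open the script in nano
```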
With the sbatch script in place, you can now move on to running the script in Step 3.
Step 3: Run the Job
Before proceeding, ensure that you are still in your working directory (check with `pwd`); you need to be in the same directory as your sbatch script. Use `ls -al` to confirm that the script is present.
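For example:

```bash
pwd      # print the current directory
ls -al   # the sbatch script (e.g., myscript.sbatch) should appear in the listing
```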
Use `sbatch` followed by the script's file name to schedule your batch job in the queue. This command will automatically queue your job using Slurm and produce a job number.
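For example, using the hypothetical file name from Step 2 (the job number shown is only a placeholder):

```bash
sbatch myscript.sbatch
# Submitted batch job 123456
```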
You can check the status of your job at any time with the `squeue` command; `squeue --job <jobnumber>` will likely not return any information, as this test job takes only a second to complete. You can also stop your job at any time with the `scancel` command.
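For example (substitute your own job number):

```bash
squeue --job <jobnumber>   # check the job's status
scancel <jobnumber>        # cancel the job if it is still queued or running
```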
View your results. Once your job completes, Slurm will produce an output file. Unless otherwise specified in the sbatch script, this file is placed in the same directory as your script. The file (`test_<jobnumber>.out`) contains the results of the job you just executed; replace `<jobnumber>` with your job number. You can view the contents of the file with the `more` command followed by the file name. Your output should look something like this:
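As a rough, abridged illustration only, with placeholder node names and any module listing output omitted (the numbered lines come from `srun -l`, the unnumbered ones from `mpirun`):

```
0: node01
1: node01
2: node02
3: node02
node01
node02
node01
node02
```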
📝 Note: The number and order of the hostnames will be different for you. If you see any errors, try typing in the sbatch script by hand instead of copying and pasting it. Sometimes the clipboard of your OS will bring along extra hidden characters that confuse Bash and Slurm.
Download your results (using the `scp` command or an SFTP client) or move them to persistent storage. See our moving data section for help.
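For example, run from a terminal on your local machine (the job number and remote path are placeholders):

```bash
scp doaneusername@onyx.doane.edu:~/test_123456.out .
```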
Additional Examples
See the C++, Fortran, and Python sections for more example job scripts.