PBS jobs and Gadi-specific commands

Overview

Teaching: 20 min
Exercises: 0 min
Questions
  • PBS jobs on Gadi and compute node local disk

Objectives
  • Understand what PBS directives are required on Gadi

  • Become familiar with some Gadi-specific commands for job submission and monitoring

  • Use of jobfs

Key differences between Gadi and Artemis

Like Artemis, Gadi uses the PBS Pro job scheduler. Scripts that you have used on Artemis are thus easily portable to Gadi. There are some key directive and command differences on Gadi compared to Artemis, as described below:


PBS job arrays: these are not permitted on Gadi. Other means of parallel task execution are required. An example using OpenMPI and the custom utility ‘nci-parallel’ is demonstrated at the end of this material in Section 8, Example parallel job.
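
For simple cases, one alternative (a minimal single-node sketch only, distinct from the nci-parallel approach shown in Section 8) is to background several tasks within one job and wait for them all to finish:

for sample in sampleA sampleB sampleC; do
    <command> ${sample} &
done
wait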


Max walltime: 48 hours (lower for larger core counts). Requests to lift the walltime limit on a per-job basis are rigorously assessed and not encouraged. Gadi is best suited to “shorter, wider” jobs, ie those that use more nodes for fewer hours rather than fewer nodes for more hours. For some jobs, Artemis may be more suitable given its 14-day walltime limit. Another alternative is UQ Flashlite - contact SIH for more information on this facility.


lstorage directive: this essential directive specifies the filesystem locations your job needs read/write access to. Omitting a location that your job requires will cause the job to fail. The example below (the two forms are equivalent) requests three locations. Note the lack of leading ‘/’ on the paths! Also note that the path gdata/ is only valid within this directive - use /g/data/ for other commands such as ls and cd.

#PBS -l storage=scratch/er01+scratch/ch81+gdata/er01

#PBS -lstorage=scratch/er01+scratch/ch81+gdata/er01
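
For reference, the same locations are accessed on the command line with their leading ‘/’, for example:

ls /scratch/er01
cd /g/data/er01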


wd directive: this optional directive is similar to the Artemis command cd $PBS_O_WORKDIR. By default, PBS sets the working directory to the user’s home directory, which is rarely where job data resides (/home is only 10 GB, although backed up). Including this directive changes the working directory to the directory from which the qsub command was issued.

#PBS -l wd
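
Without this directive, the same effect can typically be achieved with the explicit command familiar from Artemis at the top of the script body:

cd ${PBS_O_WORKDIR}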


jobfs directive: this optional directive specifies the amount of node-local storage your job requires. This provides scratch space on the compute nodes that persists only for the duration of the job, so resumable jobs should not write checkpoint files to jobfs. Jobs that perform large numbers of I/O operations per second can benefit, as I/O to jobfs can be faster than to the shared filesystems. For multi-node jobs, the total jobfs requested is divided among the nodes. To refer to the jobfs path within the job script, use the PBS environment variable $PBS_JOBFS; standard commands such as mkdir ${PBS_JOBFS}/temp work as usual. Note that for multi-node jobs, although the jobfs path appears to be the same on every node (ie $PBS_JOBFS), each node’s jobfs is a physically separate drive, so a compute node cannot read the jobfs of any other node, even within the same job. When requesting entire nodes, you may as well request all of the available jobfs on those nodes, as it cannot be used by other jobs or users while your job is running.

#PBS -l jobfs=500GB

out=${PBS_JOBFS}/<myoutfile>
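
A minimal sketch of a typical jobfs workflow (the input, output and command names below are placeholders) is to stage data onto the node-local disk, work there, then copy results back to a shared filesystem before the job ends:

mkdir ${PBS_JOBFS}/temp
cp /scratch/<project>/<myinput> ${PBS_JOBFS}/temp/
cd ${PBS_JOBFS}/temp
<command> <myinput> > <myoutfile>
cp <myoutfile> /scratch/<project>/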


qls command: this command lists the files inside $PBS_JOBFS on each node. For jobs with many outputs, this can produce a lot of STDOUT, so use more, grep etc to manage the output.

qls <jobID>

qls <jobID> | grep -A 20 "Node <N>"


qcp command: this command enables you to copy files from jobfs via the login nodes while a job is running. The node number can be obtained from the qls output. To copy files from jobfs to one of the shared filesystems from within the running job itself, cp can be used as usual.

qcp -n <NodeNumber> <jobID>/<myoutfile> <destinationPath> 
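
For example, to copy an output file (paths are placeholders) from within the job script itself:

cp ${PBS_JOBFS}/<myoutfile> /scratch/<project>/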


qps command: this command displays the status (% CPU, time, RSS) of each process running inside a job. For massively parallel jobs, this can produce a lot of STDOUT, so use more, grep etc to manage the output.

qps <jobID>

qps <jobID> | grep -A 20 "Node <N>"


nqstat_anu: this command reports % CPU, time and memory information for each of your running jobs across all queues.

nqstat_anu


Example job script

Other PBS directives that you are familiar with from Artemis (N, P, walltime etc) remain the same on Gadi. The example script below requests 66 nodes (48 CPUs per node in the ‘normal’ queue) with 500 GB jobfs for 1.5 hours, uses the submission directory as the working directory, and redirects the .o and .e PBS logs to a specific location and file name (optional). The umask=022 directive changes the permissions of the PBS job logs so that others in the group can read (but not write) them; this is optional, but recommended when working collaboratively.

Please note that the select statement used in Artemis jobs is not accepted on Gadi; resources are requested as ncpus rather than nodes. When requesting more than one node, you must request entire nodes, eg a job requiring 80 CPUs in the normal queue would need to request 96 CPUs (2 nodes). Your job will be rejected immediately by the scheduler if this condition is not met.
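
To illustrate this rounding (a sketch only, assuming the normal queue’s 48 CPUs and 190 GB of requestable memory per node, as used in the full script below), the 80-CPU job mentioned above could be requested with:

#PBS -l ncpus=96
#PBS -l mem=380GB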

#!/bin/bash

#PBS -P <project>
#PBS -N align
#PBS -l walltime=01:30:00
#PBS -l ncpus=3168
#PBS -l mem=12540GB
#PBS -l jobfs=500GB
#PBS -l wd
#PBS -q normal
#PBS -W umask=022
#PBS -o ./Logs/align-N43.o 
#PBS -e ./Logs/align-N43.e 
#PBS -l storage=scratch/<project>

module load <software>/<version>

<commands to run the job>


The above directives could also be written in this way:

#PBS -P <project>
#PBS -N align
#PBS -l walltime=01:30:00,ncpus=3168,mem=12540GB,jobfs=500GB,wd,storage=scratch/<project>
#PBS -q normal
#PBS -W umask=022
#PBS -o ./Logs/align-N43.o 
#PBS -e ./Logs/align-N43.e 

Job submission and monitoring

Submitting a job on Gadi uses the same command as submitting a job on Artemis:

qsub <jobscript> 


or, any number of the PBS directives can be included on the qsub command line:

qsub -P <project> -N <jobName> -l walltime=01:30:00,ncpus=3168,mem=12540GB,jobfs=500GB,wd,storage=scratch/<project> -q <queue> <script>


Along with the job monitoring commands described above, standard qstat commands are available, eg

qstat <jobID>
qstat -xf <jobID>
qstat -u <user>
qdel <jobID>
qselect -u <user> | xargs qdel


Questions

:raised_hand: Does anyone have any questions about PBS jobs on Gadi or Gadi-specific commands?


Key Points

  • Your Artemis PBS scripts will be easily portable to Gadi, with a few small changes

  • Gadi has some custom commands and directives