Software options on Gadi
Global apps
On Gadi, there are shared (global) apps that are installed and managed by the system administrators. On Artemis these were in the /usr/local/ directory; on Gadi they are in the /apps/ directory.
You can use the same module commands that you are familiar with on Artemis to query and load apps on Gadi.
# List all global apps starting with 'p':
ls /apps/p*
# List all modules for python3:
module avail python3
# Load a specific version of python3:
module load python3/3.12.1
Each global app has a default version, so if you load a module without specifying a version, the default will be loaded. While this is OK in some circumstances, it is typically recommended to specify a version you know works for your code. Default versions of global apps change over time without warning, so reproducibility and functionality are best maintained by explicitly stating the version when you load a module within your script.
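If you do rely on a default version, you can at least record exactly what was loaded at run time:
# Loading without a version picks up the current default, which may change:
module load python3
# Confirm and record exactly which version was loaded:
module list
python3 --version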
Software groups and specialised environments
NCI provides a range of software groups and specialised environments for different research fields. Before attempting to self-install a tool, you should check if the tool you need is available through global apps or one of these environments.
Please visit the NCI pages for more details.
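As one example, bioinformatics users whose NCI project has joined the Australian BioCommons shared tools project (if89) can expose that repository to the module commands (the path below is the if89 modulefiles location; access requires if89 membership):
# Make the if89 shared tool modules visible alongside the global apps:
module use /g/data/if89/apps/modulefiles
module avail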
Self-installed tools
Unlike Artemis, requests for new apps to be installed are not always granted. NCI limits global apps to those with a high user base. This is to ensure good maintenance and curation of global apps.
Users are encouraged to either self-install apps from source into their /home or /g/data locations, or (recommended) use singularity containers. Installing into /scratch is not recommended due to the 100-day file purge policy. Installing into /g/data is ideal when other members of your project need to use the same tool.
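As a rough sketch, a from-source install into /g/data might look like the following, assuming a typical configure/make build (the project code aa00 and tool mytool-1.2.3 are placeholders; build steps vary by tool):
# Hypothetical from-source install under /g/data:
cd /g/data/aa00/apps
tar -xzf mytool-1.2.3.tar.gz
cd mytool-1.2.3
./configure --prefix=/g/data/aa00/apps/mytool/1.2.3
make
make install
# Make the tool findable in the current session:
export PATH=/g/data/aa00/apps/mytool/1.2.3/bin:$PATH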
NCI may provide support to users through the self-install process. To request assistance, please email a detailed request that includes the software tool and version you are attempting to install and describes the issues you are having with the installation.
Once a tool has been installed, you can make that tool available to the module commands by following the steps described here. This is not essential, but can be helpful when managing tools that multiple group members will use.
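As a minimal sketch, assuming the hypothetical install location above, this involves writing a small modulefile into a directory of your own and pointing module at that directory:
# Hypothetical Tcl modulefile at /g/data/aa00/modulefiles/mytool/1.2.3, containing:
#   #%Module1.0
#   prepend-path PATH /g/data/aa00/apps/mytool/1.2.3/bin
# Then, in your session or job script:
module use /g/data/aa00/modulefiles
module load mytool/1.2.3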
To avoid the burden of installing software that is not available through global apps or specialised environments, the use of singularity containers is recommended.
Running singularity containers on Gadi
Singularity can be used to execute containerised applications on Gadi. It is installed as a global app:
module load singularity
singularity version
# 3.11.3
Note that the singularity project was recently migrated to apptainer. Please continue to run singularity commands on Gadi for now.
Singularity (Apptainer) is a “container platform. It allows you to create and run containers that package up pieces of software in a way that is portable and reproducible. You can build a container using Apptainer on your laptop, and then run it on many of the largest HPC clusters in the world, local university or company clusters, a single server, in the cloud, or on a workstation down the hall. Your container is a single file, and you don’t have to worry about how to install all the software you need on each different operating system.” (from apptainer.org)
There are numerous container repositories, for example Docker Hub or quay.io. You can search these repositories, or build your own container if the tool or tool version you need is not yet available.
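If you do build your own, the recipe is a short definition file. Below is a minimal hypothetical example (mytool is a placeholder; note that building an image generally requires root or fakeroot, so containers are typically built on your own machine and copied to Gadi):
# mytool.def - a hypothetical minimal definition file
Bootstrap: docker
From: ubuntu:22.04

%post
    apt-get update && apt-get install -y mytool

%runscript
    exec mytool "$@"
You would then build it with singularity build mytool.sif mytool.def on a machine where you have root, and copy the resulting .sif file to Gadi.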
Seqera have greatly simplified the process of building custom containers with their build-your-own container tool! Simply search for the tool(s) you want in your container, and click Get Container. This tool will manage the build for you, and host the created container file.
Example
Let’s assume you want to use the tool FoldSeek. Below are the steps to search for, obtain, test, and use this container in a PBS job script.
- Run module load singularity in your Gadi terminal (or copyq container download script).
- Visit quay.io and search for this tool by typing the tool name into the search bar at the top right of the page. There may be multiple containers available. These may reflect different contributors, different tool versions, etc. Look for containers with recent updates and a high star rating where possible.
- Select the biocontainers container, and then select the Tags page from the icons on the left (options are Information, Tags, Tag history).
- On the far right of the most recent tag, select the Fetch tag icon and then change the image format to Docker Pull (by tag).
- Copy the command. We need to make some changes to the command before we execute it.
- Paste the command into your terminal (or copyq container download script), change docker to singularity, and add the prefix docker:// to the container path, as shown below:
# Default command:
docker pull quay.io/biocontainers/foldseek:10.941cd33--h5021889_1
# Change to:
singularity pull docker://quay.io/biocontainers/foldseek:10.941cd33--h5021889_1
- Run the pull command (note: you need to have the singularity module loaded in your terminal or download script). This will download the docker container to a Singularity Image File (.sif). Most containers are lightweight enough to pull directly on the login node. If a container is bulky and slow to download (or you need many containers), you may need to submit the download as a copyq job; see the example copyq script after this list.
- Test the container with a basic help or version command. When running a container, typical usage involves the following command structure:
singularity exec <container> <command> [args]
So everything after the container name is the same as you would run when using a locally installed version of the tool. For foldseek, we would use foldseek version to view the tool version or foldseek help to view the help menu. To run these commands via the container:
singularity exec foldseek_10.941cd33--h5021889_1.sif foldseek version
singularity exec foldseek_10.941cd33--h5021889_1.sif foldseek help
Note that the version of foldseek corresponds to the tag name: in this case, v. 10.941cd33.
- Test run the full tool command that you need to use in your analysis. Where possible, use a small subset of your data to test the command (as you would routinely do for new software). You may need to use the interactive job queue or a compute job script for testing if the test command exceeds resource limits on the login nodes.
- Add the command to your job script. Ensure you include module load singularity and call the container using the structure singularity exec <container> <command> [args].
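As referenced in the list above, a minimal copyq download script might look like the following (the project code aa00 and resource requests are placeholders; copyq nodes have external network access, which standard compute nodes do not):
#!/bin/bash
#PBS -P aa00
#PBS -q copyq
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=00:30:00
#PBS -l storage=gdata/aa00
#PBS -l wd

module load singularity
singularity pull docker://quay.io/biocontainers/foldseek:10.941cd33--h5021889_1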
Below is an example Gadi job script running the foldseek container:
#!/bin/bash
#PBS -P aa00
#PBS -q normal
#PBS -l ncpus=48
#PBS -l mem=190GB
#PBS -l walltime=02:00:00
#PBS -l storage=scratch/aa00+gdata/aa00
#PBS -l wd
module load singularity
# Inputs: query file and target database
input=/g/data/aa00/foldseek-inputs/input.fasta
database=/g/data/aa00/foldseek-inputs/database
# Outputs: temporary working directory, results file and log file
output_dir=./foldseek-run/output
results=./foldseek-run/foldseek_result.tsv
log=./foldseek-run/foldseek.log

# Ensure the run directory exists before writing to it
mkdir -p ./foldseek-run

# foldseek easy-search arguments are: <query> <targetDB> <resultsFile> <tmpDir>
singularity exec \
    foldseek_10.941cd33--h5021889_1.sif \
    foldseek easy-search \
    ${input} \
    ${database} \
    ${results} \
    ${output_dir} \
    --threads ${PBS_NCPUS} > ${log} 2>&1
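Assuming the script above is saved as foldseek.pbs (a placeholder name), submit it and monitor its progress with:
qsub foldseek.pbs
qstat -u $USER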