Why use Artemis
Artemis is ideal for calculations that require:
- A long time to complete (long walltime)
- High RAM usage
- Big data inputs or outputs
- Multiple iterations of program inputs
- Specific software that is designed to be run on HPC systems (multiple cores or nodes running in parallel)
Artemis is available free of charge to all University of Sydney researchers. You require a Unikey and a valid Research Data Management Plan (via DASHR) with Artemis HPC access enabled.
Access to Artemis can also encourage funding bodies to view your projects favourably, as they know you have the resources required to get the work done.
What is Artemis
Artemis is the University of Sydney’s High Performance Computer. Technically, Artemis is a computing cluster, which is a whole lot of individual computers networked together. At present, Artemis consists of:
- 7,636 cores (CPUs)
- 45 TB of RAM
- 108 NVIDIA V100 GPUs
- 1 PB of storage
- 56 Gbps FDR InfiniBand (networking)
Artemis computers (which we call machines or nodes) run a Linux operating system, CentOS v6.10. Computing performed on Artemis nodes is managed by a scheduler, and ours is an instance of PBS Pro.
Navigating Artemis
As Artemis runs on a Linux operating system, its folder structure follows the standard Linux file system layout. This includes a root folder `/` and a tree-like structure beneath it containing, for example, binaries in `/bin`, temporary files in `/tmp`, and configuration files in `/etc`. There are a few directories that are specific to Artemis, notably `/home`, `/project`, and `/scratch`.
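To see this layout for yourself, you can list these directories from a login shell (the exact contents will vary):

```bash
ls /          # standard Linux folders such as bin, etc, tmp ...
ls /home      # one home directory per user
ls /project   # one directory per project
ls /scratch   # one scratch directory per project
```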
/home
`/home` is your home directory, with 10 GB of storage. Every user on Artemis (and any Linux system) has a home directory located at `/home/<username>`. Your username may be your Unikey, or it may be one of the training accounts we're using today. Every time you log in to Artemis or submit a job, by default you will start here, in `/home/<username>`.
/project
The `/project` branch is where researchers on a project should keep the data, programs, and scripts that are currently being used. We say 'currently' because project directories are allocated only 1 TB of storage space. That may sound like a lot, but this space is shared between all users of your project, and it runs out faster than you think. Data that you're not currently working on should be deleted or moved back to its permanent storage location (which should be the Research Data Store). Because the space is shared, everyone working on your project can see all files in the project directory by default.
Project directories are all subfolders of `/project` and have names like `/project/RDS-CORE-Training-RW`, which take the form `RDS-<Faculty>-<Short_name>-RW`.
/scratch
Every project also has a `/scratch` directory, at `/scratch/<Short_name>`, where `<Short_name>` is the short name of the project as defined in DASHR. `/scratch` is where you should actually perform your computations. What does this mean? If your workflow:
- Has big data inputs
- Generates big data outputs
- Generates lots of intermediate data files which are not needed afterwards
then you should put this data in your `/scratch` space. The reasons are that you are not limited to 1 TB of space in `/scratch` as you are in `/project`, and there is no reason to clutter up your shared project directories with temporary files. Once your computation is complete, simply copy the important or re-used inputs and outputs back to your `/project` folder (or better yet, your `/rds` folder), and delete your data in `/scratch`. That's why it's called scratch!
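As a rough sketch of this workflow (the project name, short name, and file names below are placeholders, not real paths on Artemis):

```bash
# Placeholder paths -- substitute your own project and short name
PROJECT=/project/RDS-CORE-Training-RW
SCRATCH=/scratch/Training

# 1. Stage inputs into scratch and run the computation there
cp -r "$PROJECT"/inputs "$SCRATCH"/
cd "$SCRATCH"
# ... run your job(s) here, writing outputs and intermediates under $SCRATCH ...

# 2. Copy back only what you need to keep, then clean up scratch
cp -r "$SCRATCH"/results "$PROJECT"/
rm -rf "$SCRATCH"/inputs "$SCRATCH"/intermediate_files
```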
Key Point: Inactive data in `/project` is wiped after 6 months, and in `/scratch` after 3 months! And Artemis is not backed up, at all.
Automatic cleanup happens because HPC systems are shared resources: although Artemis is large, if its storage fills up the system stops running properly for all users.
Data you want to keep should be moved onto the Research Data Store (RDS); more on that later.
Artemis-specific applications
While not exhaustive, the list below covers the main applications available on Artemis for submitting and managing your scripts. We will use these commands in the Running Code on Artemis section; a short example script follows the list.
`module` - loads software modules pre-installed on Artemis so your scripts can use them. Used in conjunction with `avail` (list available modules), `load` (load a module), and `list` (list the modules you have loaded), e.g. `module load R` to load R, or `module avail py` to list available modules whose names start with py (e.g. python).
`qstat` - investigate the queue, e.g. `qstat -x [JOB NUMBER]` to check on a particular job.
`qsub` - submit a script to the scheduler, e.g. `qsub script.pbs`.
`pquota` - investigate storage usage and remaining quota for your project directories.
`dt-script` - a 'wrapper' for running `rsync` in a PBS script. It enables you to perform data transfers as a one-liner without needing to write a script, e.g. `dt-script -P <Project> -f <from> -t <to>`.
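Putting a few of these together, a minimal PBS script might look like the sketch below; the project name, resource requests, and R script are placeholders rather than recommended values:

```bash
#!/bin/bash
#PBS -P Training                    # project short name (placeholder)
#PBS -l select=1:ncpus=2:mem=8GB    # resources requested (illustrative)
#PBS -l walltime=02:00:00

cd "$PBS_O_WORKDIR"                 # start in the directory the job was submitted from
module load R                       # load pre-installed software
Rscript my_analysis.R               # placeholder for your own script
```

You would then submit it with `qsub script.pbs` and check on it with `qstat -x [JOB NUMBER]`.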
When specifying which queue your scripts should run in via the PBS directives (`defaultQ` is an option if you want Artemis to pick the queue based on the resources you request), it's useful to keep in mind the resource limits summarised below.
| Queue | Max walltime | Max cores (per job / per user / per node) | Memory (GB per node / GB per core) |
|-------------|----------|----------------|------------------|
| small | 1 day | 24 / 128 / 24 | 123 / < 20 |
| normal | 7 days | 96 / 128 / 32 | 123 / < 20 |
| large | 21 days | 288 / 288 / 32 | 123 / < 20 |
| highmem | 21 days | 192 / 192 / 64 | 123-6144 / > 20 |
| gpu | 7 days | 252 / 252 / 36 | 185 / – |
| dtq | 10 days | 2 / 16 / 2 | 16 / – |
| interactive | 4 hours | 4 / 4 / 4 | 123 / – |
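For example, the queue is chosen with a `-q` directive in your PBS script; the resource values below are illustrative only:

```bash
# Let Artemis route the job based on the resources you request ...
#PBS -q defaultQ

# ... or target a specific queue from the table above, e.g.:
# #PBS -q highmem
# #PBS -l select=1:ncpus=4:mem=256GB
# #PBS -l walltime=48:00:00
```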