If you are new to bioinformatics and the world of high-performance computing, this page explains what cloud computing is, when to use it, and why BioShell makes it easy to get started.
What is a virtual machine?
A virtual machine is a computer that runs inside another computer, delivered to you over the internet. You connect to it from your laptop or desktop, and it behaves like a full computer you have complete control over. You do not need to worry about your own hardware specs, operating system, or software installation.
Think of it like a fully equipped lab bench that is always ready: you sit down, start working, and leave when you are done. The bench does not belong to you, but it is set up exactly the way you need it.
With a virtual machine you can:
- Get more computing power than your laptop for data analysis
- Share the same environment with colleagues or workshop participants
- Run long jobs overnight without worrying about power outages or closing your laptop
- Use a Linux environment even if you normally use Windows or macOS
- Have administrator access to install software and configure the system
- Host services such as databases or web servers
Cloud vs HPC - which should I use?
Both cloud computing and High-Performance Computing (HPC) systems such as NCI Gadi are powerful research tools, but they suit different types of work.
| | Cloud (BioShell) | HPC (e.g. NCI Gadi) |
|---|---|---|
| Session type | Interactive - work in real time | Batch - submit jobs and come back later |
| Interfaces | JupyterLab, RStudio, CLI | CLI only (PBS/Slurm schedulers) |
| Setup required | None - environment is preconfigured | Familiarity with job schedulers required |
| Best for | Learning, workshops, exploratory analysis | Large-scale, tightly coupled, or long-running workloads |
| Scaling | Flexible - choose your VM size on demand | Fixed allocations managed through project quotas |
Not sure which to use? If you are learning bioinformatics, running a workshop, or exploring a new analysis, start with the cloud: BioShell is the right choice. You can move to HPC later, once your workflow is established and you need more compute power.
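The rule of thumb above can be written down as a small decision function. This is only an illustrative sketch: the `Workload` class, its field names, and `suggest_platform` are hypothetical and are not part of BioShell or any HPC tooling.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an analysis (illustrative field names)."""
    workflow_established: bool  # is the pipeline already built and validated?
    large_scale: bool           # large-scale, tightly coupled, or long-running?


def suggest_platform(w: Workload) -> str:
    """Return "cloud" or "hpc" following the rule of thumb in the table above."""
    # An established workflow that needs large-scale compute is the HPC case.
    if w.workflow_established and w.large_scale:
        return "hpc"
    # Learning, workshops, and exploratory analysis default to cloud.
    return "cloud"


# Exploring a new analysis: start on cloud.
print(suggest_platform(Workload(workflow_established=False, large_scale=False)))
# A validated pipeline that now needs large-scale runs: move to HPC.
print(suggest_platform(Workload(workflow_established=True, large_scale=True)))
```

The function deliberately defaults to cloud: unless a workflow is both established and large in scale, starting interactively is the lower-risk choice.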
Why BioShell?
Getting started on the command line is often the hardest part. Software installs break. Environments differ between computers. Trainers spend hours configuring machines instead of teaching. BioShell solves this.
| Feature | What it means for you |
|---|---|
| Ready on any device | Every participant gets the same preconfigured environment, whether they are on Windows, macOS, or Linux, with no device-specific setup required |
| More power than your laptop | Use familiar tools like RStudio and JupyterLab, but backed by cloud compute that is not competing with your browser, email, or background applications |
| Tool discovery built in | Bio-Shelley, BioShell’s command-line assistant, makes finding and installing tools from a catalogue of over 20,000 containers as simple as `shelley-bio build samtools`, with no container knowledge needed |
| Reproducible workshops | Version-controlled environments mean you can build a workshop once and rerun it identically next year, or share it with another trainer at another institution |
| Easier access than HPC | Getting time on a national HPC system requires a project allocation, queue estimates, and job scheduler knowledge. BioShell access is fast, self-service, and does not require you to know in advance exactly what you will run |
| Admin access, safely sandboxed | You have full administrator privileges on your own VM, so you can install anything and break things without affecting other users or needing HPC sysadmin approval |
How does BioShell fit with HPC?
BioShell is not a replacement for large-scale HPC; it complements it. Think of it as the place where you learn, develop, and test your workflows before scaling up. It helps fill the gap between working on your laptop and moving onto an HPC system.
- Learn and explore on BioShell - get comfortable with the command line, tools, and your data in a low-pressure environment.
- Develop and test your workflow - build and validate your Nextflow or Snakemake pipeline using BioShell’s pre-installed tools and example datasets.
- Scale up to HPC when ready - once your workflow is confirmed and your data needs are clear, move to NCI Gadi or another HPC system for large-scale runs.