Day 1: raw sequence reads to gene counts

In part 1 of this workshop series, we will introduce you to the data pre-processing pipeline for bulk RNA sequencing (RNA-seq) using the command line. To perform differential gene expression analysis, we first need to process raw sequence data to generate a gene count matrix that represents gene expression levels. To do this, we will run the nf-core/rnaseq Nextflow pipeline on Pawsey’s Nimbus Cloud.

Target audience

This workshop series is suitable for people who are familiar with working at command line interface and may be new to RNA-seq for differential expression analysis. The course is beginner friendly and intended for those interested in using the command line interface for their analysis. It is also suitable for those who want to learn about and use workflows.

How to navigate the webpages

Please use the Menu bar to move between lessons. We will start Day 1 with the Setup page and then naviate to the right. Each main menu header has sub-header which will be visible when you click on the main-menu, as shown below.

Follow-on courses

This course is part one of the two-part RNA sequencing training series. The second part of the series uses R/R Studio on Pawsey Nimbus Cloud to perform differential expression and pathway analysis using raw count data generated in part 1.

We recommend attending both courses if you would like an end-to-end overview of how to perform the analysis. If you are only interested in the technologies used, you may be able to get away with attending only the parts of interest.

Course survey!

Please fill out our course survey before you leave! Help us help you!


This workshop series developed by the Sydney Informatics Hub, University of Sydney in partnership with the Pawsey Supercomputing Research Centre, AARnet, QCIF, and the Australian BioCommons.

All materials copyright Sydney Informatics Hub, University of Sydney