Basic pathway analysis using ToppGene
Overview
Teaching: 30 min
Exercises: 10 min
Questions
How do we quickly assess what the gene list we obtained actually means?
What online tools can we use for pathway analysis?
What are the limitations of online pathway analysis tools?
Objectives
Quickly assess whether our experiment worked, i.e. whether it enriched for miR29b targets
Introduce ToppGene and interpret its output
Mention other approaches to pathway analysis
The most basic approach to pathway analysis of RNA-seq data involves:
- Calculating overlaps between the list of differentially expressed genes and the list of genes annotated to a specific pathway
- Adjusting for the number of genes in each list
- Calculating a statistic that reflects how likely an overlap of this size would be by chance alone, and adjusting it for multiple comparisons across all the pathways tested
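The three steps above can be sketched in base R with a hypergeometric test followed by a Benjamini-Hochberg correction. This is only an illustration of the general idea: the gene set names and all counts are made up, and ToppGene's own scoring may differ in detail.

```r
# Toy numbers, chosen only for illustration (not taken from the lesson data)
universe_size <- 20000  # genes in the background
query_size    <- 300    # differentially expressed genes submitted

# Hypothetical gene sets: size in the background and overlap with the query list
gene_sets <- data.frame(
  id       = c("set_A", "set_B", "set_C"),
  set_size = c(500, 150, 1000),
  overlap  = c(40, 3, 15),
  stringsAsFactors = FALSE
)

# Overlap expected by chance, given the sizes of both lists
gene_sets$expected <- query_size * gene_sets$set_size / universe_size

# P(X >= overlap) when query_size genes are drawn at random from the universe
gene_sets$p_value <- phyper(
  q = gene_sets$overlap - 1,
  m = gene_sets$set_size,
  n = universe_size - gene_sets$set_size,
  k = query_size,
  lower.tail = FALSE
)

# Benjamini-Hochberg correction across all gene sets tested
gene_sets$fdr <- p.adjust(gene_sets$p_value, method = "BH")
gene_sets
```

In this toy table, set_A overlaps the query list far more than the 7.5 genes expected by chance, while set_C overlaps exactly as much as expected, so only set_A would be worth a closer look.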
We will use ToppGene as a first-pass tool to explore our data.
- Go to the ToppGene website
- Enter the list of upregulated genes (their HGNC ids)
- Run the analysis
- Explore the results online, download them as a text file, or both
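To get a list that can be pasted into the ToppGene input box, one option is to write the upregulated gene identifiers to a plain text file, one per line. The sketch below uses a made-up results table: the object name `de_results`, its column names, the placeholder gene names and the thresholds are all assumptions, so substitute whatever your own differential expression table uses.

```r
# Hypothetical differential expression table; names and values are placeholders
de_results <- data.frame(
  symbol    = c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5"),
  logFC     = c(2.1, 1.4, 0.1, 1.8, -0.2),
  adj_p_val = c(0.001, 0.03, 0.90, 0.004, 0.75),
  stringsAsFactors = FALSE
)

# Keep significantly upregulated genes (thresholds are illustrative)
is_up <- de_results$adj_p_val < 0.05 & de_results$logFC > 0
upregulated <- de_results$symbol[is_up]

# One identifier per line, ready to paste into the input box
out_file <- file.path(tempdir(), "upregulated_symbols.txt")
writeLines(upregulated, out_file)
```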
Challenge 1
Which of the results do you think are most important for this experiment? Do you think the experiment worked?
Solution
# Load readr (read_tsv), dplyr (%>%, mutate) and ggplot2
library(tidyverse)

# Read the results table downloaded from ToppGene
toppgene_upreg <- read_tsv("finaltables/toppgeneresults_upregulated.txt")

# For the first 20 microRNA gene sets, plot the proportion of annotated targets found in our list
toppgene_upreg %>%
  dplyr::filter(Category == "MicroRNA") %>%
  mutate(proportion_detected = `Hit Count in Query List` / `Hit Count in Genome`) %>%
  dplyr::select(ID, proportion_detected) %>%
  head(n = 20L) %>%
  ggplot(aes(y = proportion_detected, x = ID, fill = ID)) +
  geom_bar(stat = "identity") + coord_flip() + theme(legend.position = "none")
If we were preparing this for publication, we could further filter the list of genes annotated as targets to only include genes expressed in our study.
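A minimal sketch of that extra filtering step, using made-up gene names: restricting the annotated targets to genes that were expressed in the study changes the denominator of the proportion detected. All three vectors here are hypothetical placeholders for your own lists.

```r
# Hypothetical inputs, for illustration only
annotated_targets <- c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5", "GENE6")
expressed_genes   <- c("GENE1", "GENE2", "GENE4", "GENE7", "GENE8")
upregulated       <- c("GENE1", "GENE4", "GENE8")

# Targets that could have been detected at all in this experiment
expressed_targets <- intersect(annotated_targets, expressed_genes)

# Proportion of targets detected, using all annotated targets as the denominator
hits_all <- intersect(upregulated, annotated_targets)
proportion_all <- length(hits_all) / length(annotated_targets)

# Same proportion, using only the expressed targets as the denominator
hits_expressed <- intersect(upregulated, expressed_targets)
proportion_expressed <- length(hits_expressed) / length(expressed_targets)
```

With these placeholder lists the proportion rises from 2/6 to 2/3, because targets that are not expressed in the study can never appear in the upregulated list.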
If we run the analysis on the “depleted” genes, no such result is observed, and very few miRNA targets are detected.
Note that more advanced and more appropriate tools exist for pathway analysis, including:
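- goana: can take gene length bias into account; tests Gene Ontology terms (molecular function, biological process and cellular component)
- CAMERA: a competitive gene set test that accounts for inter-gene correlation; can be run on gene sets from the Broad Institute’s MSigDB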
- specialised tools for miRNA analysis
- IPA/GeneGo MetaCore
Key Points
Exploratory pathway analysis can be performed using a wide range of online tools
ToppGene allows us to quickly assess what’s going on in our data
If a formal pathway analysis needs to be carried out, tools like goana and camera nicely fit within the limma ecosystem