Understanding GO Analysis in RNA-Seq: How to Conduct GO Analysis

GO analysis, short for Gene Ontology analysis or Gene Ontology enrichment analysis, is a method for identifying gene functions that are significantly overrepresented in a specific gene set compared to a background gene set. It is commonly utilized to interpret the functions of differentially expressed genes obtained from RNA-Seq analysis results.

What is Gene Ontology (GO)?

Gene Ontology, abbreviated as GO, is a common vocabulary for describing gene functions.

It is divided into three categories:

Biological Process (BP)

This category describes the metabolic and signaling pathways in which the gene product is involved, such as apoptosis and the cell cycle.

Cellular Component (CC)

This category describes the location where the gene product can be found, such as the cell membrane and mitochondria.

Molecular Function (MF)

This category describes the biochemical activity of the gene product, such as enzyme activity and ligand binding.

Each concept in Gene Ontology is called a GO term and is assigned a unique identifier (GO ID) such as "GO:0007156". GO terms are hierarchically structured: when a gene is associated with a lower-level (more specific) GO term, it is also considered to be associated with the related higher-level (more general) GO terms.

The relationships between GO terms are also defined, and the following relations are primarily used:

is a: The relation "B is a A" means that B is a subtype of A. Example: mitotic cell cycle is a cell cycle.

part of: The part of relation is used to represent part-whole relationships. Example: inner mitochondrial membrane is part of mitochondrion.
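
The way annotations carry upward through these relations can be sketched in a few lines of Python. This is a minimal illustration, not how any particular GO tool stores the ontology: the two edges are the examples given above, while the gene names (geneA, geneB) and the dictionary layout are hypothetical.

```python
# Toy ontology: child term -> list of (relation, parent term).
# The two edges are the examples from the text; the data layout is an assumption.
PARENTS = {
    "mitotic cell cycle": [("is_a", "cell cycle")],
    "inner mitochondrial membrane": [("part_of", "mitochondrion")],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following is_a / part_of edges upward."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def propagate(direct_annotations, parents=PARENTS):
    """Expand each gene's direct annotations to include all ancestor terms."""
    expanded = {}
    for gene, terms in direct_annotations.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        expanded[gene] = full
    return expanded


# Hypothetical genes, each annotated only to a lower-level term.
direct = {
    "geneA": {"mitotic cell cycle"},
    "geneB": {"inner mitochondrial membrane"},
}
annotations = propagate(direct)
print(sorted(annotations["geneA"]))
```

After propagation, geneA counts as annotated to both "mitotic cell cycle" and "cell cycle", which is why higher-level terms usually have many more associated genes than lower-level ones.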

Example of Gene Ontology

Example of GO Analysis

What is GO Analysis?

GO analysis identifies GO terms that are significantly overrepresented in a specific gene set compared to the background gene set. Suppose that an RNA-Seq analysis reveals 357 differentially expressed genes, of which 13 (about 3.6%) are associated with the GO term "GO:0007156", while in a background gene set of 9975 genes, 71 (about 0.7%) are associated with "GO:0007156". A random sample of 357 genes from the background set would be expected to contain only about 2.5 genes with this term, so observing 13 is roughly a fivefold excess. In this case, the term "GO:0007156" would be considered significantly overrepresented.
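
The numbers in this example can be checked with a short calculation. The sketch below assumes a one-sided hypergeometric test, which is a common choice for overrepresentation analysis; the text above does not say which test is used, so treat the choice of test and the function name `hypergeom_upper_tail` as assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which are annotated with the term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


N = 9975  # genes in the background set
K = 71    # background genes annotated with GO:0007156
n = 357   # differentially expressed genes
k = 13    # differentially expressed genes annotated with GO:0007156

expected = n * K / N            # hits expected from random sampling
fold_enrichment = k / expected  # observed hits relative to expectation
p_value = hypergeom_upper_tail(k, n, K, N)

print(f"expected hits: {expected:.2f}")
print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"p-value: {p_value:.3g}")
```

The expected count comes out at about 2.54 and the fold enrichment at about 5.12. In a real analysis the same test is repeated for every GO term, so the resulting p-values are normally adjusted for multiple testing before terms are reported as significant.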

Example of GO Analysis Result


How to Conduct GO Analysis

GO analysis can be conducted using R packages such as clusterProfiler, topGO, and GOseq. When starting from raw RNA-Seq data, it is necessary to quantify gene expression and identify differentially expressed genes before proceeding with GO analysis. We provide RNA-Seq Data Analysis Software that can perform GO analysis starting from raw RNA-Seq data.