How to Use Ballgown: Detecting Differentially Expressed Genes in RNA-Seq Analysis
Introduction
RNA-Seq analysis with next-generation sequencing produces raw reads stored in FASTQ files. The reads are mapped to a reference genome, and gene expression is quantified by counting the reads that map to each gene. Comparing these expression levels between groups of samples then reveals differentially expressed genes.
This page explains how to use Ballgown, a software tool for detecting differentially expressed genes. If you would like to understand the overall workflow of RNA-Seq analysis, please refer to the RNA-Seq analysis workflow overview.
Installation
First, if R is not installed on your system, install it using the following command:
Launch R and run the following:
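As a minimal sketch of this step, assuming the standard Bioconductor installation route (the author's exact commands are not reproduced here), Ballgown can be installed through BiocManager:

```r
# Install BiocManager from CRAN if it is not available yet.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install Ballgown and its dependencies from Bioconductor.
BiocManager::install("ballgown")
```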
If the following command runs without errors, the installation was successful:
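A simple check, sketched here on the assumption that the package was installed into the library R is currently using, is to attach it and print its version:

```r
# Attaching the package fails with an error if the installation is broken.
library(ballgown)

# Print the installed version as an extra confirmation.
packageVersion("ballgown")
```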
In my environment, however, the installation failed with the following error:
The cause appeared to be an active conda environment.
After deactivating conda and retrying, Ballgown installed successfully.
Preparing the Data
Use StringTie to generate transcript expression data that Ballgown can import. StringTie writes these Ballgown input tables when it is run with the -B option. For more details, please refer to this page about StringTie.
After running the analysis, the output directory will have the following structure. Although GTF files are included, Ballgown uses only the *.ctab files.
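With the -B option, StringTie writes five tables per sample folder: e2t.ctab, e_data.ctab, i2t.ctab, i_data.ctab, and t_data.ctab. The sketch below checks that every sample folder contains them before loading. The helper name missing_ctab and the directory name in the usage comment are hypothetical, chosen only for illustration.

```r
# The five tables StringTie writes for each sample when run with -B.
expected_ctab <- c("e2t.ctab", "e_data.ctab", "i2t.ctab",
                   "i_data.ctab", "t_data.ctab")

# For each sample folder whose name matches `sample_pattern`, return the
# expected .ctab files that are missing (an empty vector means none).
missing_ctab <- function(data_dir, sample_pattern) {
  sample_dirs <- list.files(data_dir, pattern = sample_pattern,
                            full.names = TRUE)
  sample_dirs <- sample_dirs[dir.exists(sample_dirs)]
  missing <- lapply(sample_dirs, function(d) {
    setdiff(expected_ctab, list.files(d))
  })
  names(missing) <- basename(sample_dirs)
  missing
}

# Usage with a hypothetical results directory:
# missing_ctab("ballgown_out", "sample")
```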
Loading the Data
Run the following command to load the data:
Specify the results directory using dataDir, and define the sample folder name pattern using a regular expression with samplePattern.
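The call described above might look like the following sketch. So that it runs as written, it points at the example data bundled with the Ballgown package; that substitution is an assumption made for illustration, and in practice data_dir would be your StringTie results directory and sample_pattern a pattern matching your own sample folder names.

```r
library(ballgown)

# Assumption: the example data shipped with Ballgown stands in for your own
# StringTie output. Replace both values for a real analysis.
data_dir <- system.file("extdata", package = "ballgown")
sample_pattern <- "sample"

# meas = "all" loads every available expression measurement.
bg <- ballgown(dataDir = data_dir, samplePattern = sample_pattern,
               meas = "all")

# Printing the object summarizes how many transcripts and samples it holds.
bg
```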
Let’s display transcript-level FPKM values:
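One way to do this is with texpr(). In the sketch below, the toy object bundled with Ballgown (loaded with data(bg)) stands in for the object created from your own data, and the variable names are assumptions.

```r
library(ballgown)

# Assumption: the toy ballgown object shipped with the package replaces the
# object you loaded from your own StringTie results.
data(bg)

# Transcript-level FPKM: one row per transcript, one column per sample.
transcript_fpkm <- texpr(bg, meas = "FPKM")
head(transcript_fpkm)
```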
Let’s display gene-level FPKM values:
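Gene-level values can be retrieved with gexpr(), which reports FPKM only. As before, this sketch uses the toy object bundled with Ballgown as a stand-in for your own data, and the variable names are assumptions.

```r
library(ballgown)

# Assumption: the toy ballgown object shipped with the package replaces the
# object you loaded from your own StringTie results.
data(bg)

# Gene-level FPKM: one row per gene, one column per sample.
gene_fpkm <- gexpr(bg)
head(gene_fpkm)
```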
Extracting Differentially Expressed Genes
In this example, we perform a comparison between two groups. Add group information as follows:
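Group information is stored in the object's phenotype table, accessed with pData(). The sketch below assigns a hypothetical design in which the first half of the samples is "control" and the second half "treated"; the labels and the split are assumptions, so adjust them to your actual experiment. The rows must follow the same order as sampleNames(bg).

```r
library(ballgown)

# Assumption: the toy ballgown object shipped with the package replaces the
# object you loaded from your own StringTie results.
data(bg)

samples <- sampleNames(bg)

# Hypothetical two-group design (assumes an even number of samples):
# first half "control", second half "treated".
pheno <- data.frame(
  id = samples,
  group = rep(c("control", "treated"), each = length(samples) / 2)
)

# Attach the phenotype table to the ballgown object and display it.
pData(bg) <- pheno
pData(bg)
```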
Next, perform the statistical test:
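Ballgown's stattest() function runs the test. The sketch below tests at the gene level using the group column as the covariate. It relies on the toy object bundled with Ballgown, whose phenotype table already contains a two-level group column; getFC = TRUE is an optional addition, not taken from the original text, that adds a fold-change column for two-group comparisons.

```r
library(ballgown)

# Assumption: the toy ballgown object shipped with the package replaces your
# own object. Its phenotype table already has a two-level `group` column.
data(bg)

# Test each gene for differential expression between the two groups.
# getFC = TRUE (optional) also reports a fold change.
results <- stattest(bg, feature = "gene", meas = "FPKM",
                    covariate = "group", getFC = TRUE)
head(results)
```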
Sort the results by p-value:
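Sorting can be done in base R with order(). This sketch repeats the test on the toy object bundled with Ballgown so that it runs on its own; the variable names are assumptions.

```r
library(ballgown)

# Assumption: the toy ballgown object shipped with the package replaces your
# own object, and `results` is recomputed here so the snippet is standalone.
data(bg)
results <- stattest(bg, feature = "gene", meas = "FPKM",
                    covariate = "group", getFC = TRUE)

# Order rows by ascending p-value; rows with NA p-values are placed last.
results_sorted <- results[order(results$pval), ]
head(results_sorted)
```

The table returned by stattest() has the columns feature, id, pval, and qval, plus fc when getFC = TRUE is set.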
The results will look like this:
About the Author
BxINFO LLC
A research support company specializing in bioinformatics.
We provide tools and information to support life science research, with a focus on RNA-Seq analysis.