RSEM Tutorial: Gene & Isoform Quantification for RNA-Seq

Last updated: March 13, 2026

Introduction

When performing RNA-Seq analysis with a next-generation sequencer, you obtain raw data in the form of FASTQ files. After mapping these reads to a reference sequence, gene expression levels are quantified by counting the reads that align to each gene.

This page explains how to use RSEM, a tool for estimating expression levels at both the gene and isoform levels from alignment results.

Installation

RSEM can be installed via Bioconda.

$ conda install -c bioconda rsem

Try displaying the help message to verify the installation.

$ rsem-prepare-reference -h

If you see output similar to the following, the installation was successful.

NAME
    rsem-prepare-reference - Prepare transcript references for RSEM and optionally build BOWTIE/BOWTIE2/STAR/HISAT2(transcriptome) indices.

SYNOPSIS
    rsem-prepare-reference [options] reference_fasta_file(s) reference_name

ARGUMENTS
    reference_fasta_file(s)
        Either a comma-separated list of Multi-FASTA formatted files OR a directory name. If a directory name is specified, RSEM will read all files with suffix ".fa" or ".fasta" in this directory. The files should contain either the sequences of transcripts or an entire genome, depending on whether the '--gtf' option is used.
    ...
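The rest of this page uses three RSEM programs plus an aligner, so it can help to confirm that all of them are on your PATH before starting. The following is a minimal sketch; check_tools is a hypothetical helper name, not part of RSEM.

```shell
# Report which of the given commands cannot be found on PATH.
# Prints one "missing: <name>" line per absent tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For the workflow on this page, you could call it as: check_tools rsem-prepare-reference rsem-calculate-expression rsem-generate-data-matrix hisat2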

Building the Index

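Build the RSEM reference with the following command. Because RSEM can run the aligner for you during quantification, the aligner's index can be built in the same step. Bowtie, Bowtie2, STAR, and HISAT2 are supported; this example uses HISAT2. With the --gtf option, RSEM treats reference.fa as a genome sequence and extracts the transcript sequences using the annotation. The last argument, index/reference, is the prefix for the generated reference files.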

$ rsem-prepare-reference --gtf annotation.gtf --hisat2-hca --hisat2-path [path/to/hisat2] reference.fa index/reference
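Since the reference name index/reference is a file prefix inside a directory, one small precaution is to check the inputs and create that directory before building. This is a sketch under the assumption that RSEM does not create the directory itself; creating it first is harmless either way. prepare_reference_inputs is a hypothetical helper name.

```shell
# Confirm the annotation and FASTA files exist and are non-empty, then
# create the directory part of the reference prefix
# (for example "index" for "index/reference").
prepare_reference_inputs() {
    gtf=$1
    fasta=$2
    prefix=$3
    for f in "$gtf" "$fasta"; do
        if [ ! -s "$f" ]; then
            echo "input file missing or empty: $f" >&2
            return 1
        fi
    done
    mkdir -p "$(dirname "$prefix")"
}
```

It could be run as prepare_reference_inputs annotation.gtf reference.fa index/reference, followed by the rsem-prepare-reference command above.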

Read Counting

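The following command performs both the mapping and the read counting in a single step. The two FASTQ files are the paired-end reads, index/reference is the reference prefix built in the previous section, and sample1 is the prefix for the output files.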

$ rsem-calculate-expression --hisat2-hca --hisat2-path [path/to/hisat2] --paired-end read_1.fastq.gz read_2.fastq.gz index/reference sample1

The gene-level results are saved in "sample1.genes.results", and the isoform-level results are saved in "sample1.isoforms.results".
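When you have several samples, the same command can be repeated in a loop. The sketch below assumes a hypothetical naming scheme in which the reads for each sample are stored as <sample>_1.fastq.gz and <sample>_2.fastq.gz in the current directory; adjust it to your own file names. run and quantify_samples are illustrative helper names, and setting DRY_RUN=1 only prints the commands instead of running them.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Quantify each sample with the same options used on this page.
# Arguments: <hisat2 directory> <reference prefix> <sample name>...
# Assumption (hypothetical layout): reads are named
# <sample>_1.fastq.gz and <sample>_2.fastq.gz.
quantify_samples() {
    hisat2_path=$1
    reference=$2
    shift 2
    for sample in "$@"; do
        run rsem-calculate-expression --hisat2-hca --hisat2-path "$hisat2_path" \
            --paired-end "${sample}_1.fastq.gz" "${sample}_2.fastq.gz" \
            "$reference" "$sample" || return 1
    done
}
```

For example, quantify_samples [path/to/hisat2] index/reference sample1 sample2 sample3 sample4 would produce one pair of results files per sample, stopping at the first failure.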

[Figure: RSEM results (gene level)]
[Figure: RSEM results (isoform level)]
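The results files are tab-delimited text with a header row, so they can be inspected with standard command-line tools. As a minimal sketch, the hypothetical helper below lists the most highly expressed entries by TPM. It finds the TPM column from the header rather than assuming a fixed position, and it prints the first column (gene_id or transcript_id) next to the TPM value.

```shell
# Print the identifier (first column) and TPM of the N entries with the
# highest TPM in an RSEM results file. N defaults to 10.
top_genes_by_tpm() {
    results=$1
    n=${2:-10}
    awk -F '\t' '
        # Header row: remember which column is named "TPM".
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TPM") col = i; next }
        # Data rows: emit identifier and TPM if the column was found.
        col     { print $1 "\t" $col }
    ' "$results" | sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}
```

For example, top_genes_by_tpm sample1.genes.results 20 prints the 20 genes with the highest TPM in sample1.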

Merging Results

Once you have analyzed multiple samples, you can merge the results into a single matrix with rsem-generate-data-matrix, which collects the expected_count column from each results file. Here we merge the results for sample1 through sample4.

$ rsem-generate-data-matrix sample1.genes.results sample2.genes.results sample3.genes.results sample4.genes.results > all.genes.results
$ rsem-generate-data-matrix sample1.isoforms.results sample2.isoforms.results sample3.isoforms.results sample4.isoforms.results > all.isoforms.results

The merged results will look like this:

[Figure: RSEM merged results (gene level)]
[Figure: RSEM merged results (isoform level)]
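If you want a matrix of a different column, such as TPM, one option is to assemble it from the per-sample results files with awk. The following is a sketch, not an RSEM feature: rsem_column_matrix is a hypothetical helper, and it assumes every file was produced against the same reference so that the identifiers match. Rows follow the order of the first file, and the header row carries the input file names.

```shell
# Build a tab-delimited matrix of one named column across RSEM results files.
# Arguments: <column name> <results file>...
# Returns non-zero if the column is not present in one of the files.
rsem_column_matrix() {
    column=$1
    shift
    awk -F '\t' -v OFS='\t' -v column="$column" '
        # Header row of each file: locate the requested column.
        FNR == 1 {
            nfiles++
            names[nfiles] = FILENAME
            col = 0
            for (i = 1; i <= NF; i++) if ($i == column) col = i
            if (!col) { bad = 1; exit 1 }
            next
        }
        # Data rows: remember row order from the first file, store values.
        {
            if (nfiles == 1) ids[++nids] = $1
            value[$1, nfiles] = $col
        }
        END {
            if (bad) exit 1
            line = "id"
            for (f = 1; f <= nfiles; f++) line = line OFS names[f]
            print line
            for (r = 1; r <= nids; r++) {
                line = ids[r]
                for (f = 1; f <= nfiles; f++) line = line OFS value[ids[r], f]
                print line
            }
        }
    ' "$@"
}
```

For example, rsem_column_matrix TPM sample1.genes.results sample2.genes.results sample3.genes.results sample4.genes.results > all.genes.tpm.tsv writes a TPM matrix; the output file name here is only an illustration.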
