RSEM Tutorial: Gene & Isoform Quantification for RNA-Seq

Last updated: February 8, 2026

Introduction

RNA-Seq analysis using next-generation sequencing produces raw reads, which are stored in FASTQ files. After these reads are mapped to a reference sequence, gene expression levels are quantified from the number of reads mapped to each gene.

This page explains how to use RSEM, a software package for estimating gene and isoform expression levels from mapping results. If you would like to understand the overall workflow of RNA-Seq analysis, please refer to the RNA-Seq analysis workflow overview.

Installation

RSEM can be installed via Bioconda.

$ conda install -c bioconda rsem

Check the help message:

$ rsem-prepare-reference -h

If you see the following output, the installation was successful:

NAME
    rsem-prepare-reference - Prepare transcript references for RSEM and optionally build BOWTIE/BOWTIE2/STAR/HISAT2(transcriptome) indices.

SYNOPSIS
    rsem-prepare-reference [options] reference_fasta_file(s) reference_name

ARGUMENTS
    reference_fasta_file(s)
        Either a comma-separated list of Multi-FASTA formatted files OR a directory name. If a directory name is specified, RSEM will read all files with suffix ".fa" or ".fasta" in this directory. The files should contain either the sequences of transcripts or an entire genome, depending on whether the '--gtf' option is used.
    ...

Preparing the Index

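Use the following command to create an index. Since RSEM can also handle the mapping process, you can build the alignment index at this stage. While you can choose from Bowtie, Bowtie2, STAR, or HISAT2 for mapping, this example uses HISAT2. Replace [path/to/hisat2] with the directory that contains the HISAT2 executables. The output directory (index/ in this example) should exist before you run the command.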

$ rsem-prepare-reference --gtf annotation.gtf --hisat2-hca --hisat2-path [path/to/hisat2] reference.fa index/reference
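If you are unsure what to pass to --hisat2-path, the directory can usually be looked up from your PATH. The following is a minimal sketch, not part of RSEM itself; the helper name tool_dir is hypothetical, and it assumes HISAT2 was installed so that the hisat2 command is on PATH.

```shell
# Print the directory that contains an executable found on PATH.
# Returns a non-zero status if the tool cannot be found.
tool_dir() {
    tool_path=$(command -v "$1") || return 1
    dirname "$tool_path"
}

# Look up HISAT2 and report the directory to use for --hisat2-path.
if hisat2_dir=$(tool_dir hisat2); then
    echo "HISAT2 executables found in: $hisat2_dir"
else
    echo "hisat2 is not on PATH; install it or pass its directory explicitly" >&2
fi
```

The printed directory can then be used in place of [path/to/hisat2] in the command above.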

Read Counting

The following command performs both the mapping and the read counting.

$ rsem-calculate-expression --hisat2-hca --hisat2-path [path/to/hisat2] --paired-end read_1.fastq.gz read_2.fastq.gz index/reference sample1

The gene-level results are saved in "sample1.genes.results", and the isoform-level results are saved in "sample1.isoforms.results".

Gene-level Results

Isoform-level Results
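Both result files are tab-separated text with a header row. The gene-level file has the columns gene_id, transcript_id(s), length, effective_length, expected_count, TPM, and FPKM. As a rough sketch of how to inspect such a file from the shell, the example below builds a tiny mock file with that layout and lists the expressed genes by descending TPM. The gene IDs and numbers are invented for illustration, and the function name top_genes is hypothetical.

```shell
# Create a tiny mock file with the RSEM gene-level column layout.
# All IDs and values below are invented for illustration only.
results=$(mktemp)
printf 'gene_id\ttranscript_id(s)\tlength\teffective_length\texpected_count\tTPM\tFPKM\n' > "$results"
printf 'geneA\ttxA1,txA2\t1500.00\t1350.00\t120.00\t35.20\t28.10\n' >> "$results"
printf 'geneB\ttxB1\t900.00\t750.00\t0.00\t0.00\t0.00\n' >> "$results"
printf 'geneC\ttxC1\t2100.00\t1950.00\t480.00\t97.50\t77.90\n' >> "$results"

# Print gene_id, expected_count, and TPM for genes with TPM > 0,
# sorted so that the highest TPM comes first.
top_genes() {
    tab=$(printf '\t')
    awk -F '\t' 'NR > 1 && $6 > 0 { print $1 "\t" $5 "\t" $6 }' "$1" |
        sort -t "$tab" -k3,3nr
}

top_genes "$results"
```

For real data, call top_genes sample1.genes.results instead of using the mock file.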

Merging Results

After analyzing multiple samples, you can merge the results into a single matrix with rsem-generate-data-matrix, which collects the expected_count column from each result file. In this example, we merge results for sample1 through sample4.

$ rsem-generate-data-matrix sample1.genes.results sample2.genes.results sample3.genes.results sample4.genes.results > all.genes.results
$ rsem-generate-data-matrix sample1.isoforms.results sample2.isoforms.results sample3.isoforms.results sample4.isoforms.results > all.isoforms.results
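With more than a few samples, typing these commands by hand becomes error-prone. The sketch below generates the quantification and merge commands for a list of samples and only prints them (a dry run), so you can review them before running anything. The FASTQ naming scheme (<sample>_1.fastq.gz and <sample>_2.fastq.gz) and the helper names quant_cmd and merge_cmd are assumptions made for this example.

```shell
# Hypothetical layout: each sample has <name>_1.fastq.gz and <name>_2.fastq.gz
# in the current directory, and the index prefix is index/reference.
samples="sample1 sample2 sample3 sample4"
hisat2_dir="path/to/hisat2"   # placeholder; set to your HISAT2 directory

# Print (do not run) the quantification command for one sample.
quant_cmd() {
    echo "rsem-calculate-expression --hisat2-hca --hisat2-path $hisat2_dir" \
         "--paired-end ${1}_1.fastq.gz ${1}_2.fastq.gz index/reference $1"
}

# Print (do not run) the merge command for a result type: genes or isoforms.
merge_cmd() {
    files=""
    for s in $samples; do
        files="$files $s.$1.results"
    done
    echo "rsem-generate-data-matrix$files > all.$1.results"
}

for s in $samples; do
    quant_cmd "$s"
done
merge_cmd genes
merge_cmd isoforms
```

Once the printed commands look right, they can be pasted into the terminal or saved to a script file and executed.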

The merged results will look like this:

Merged Gene-level Results

Merged Isoform-level Results

About the Author

BxINFO LLC

A research support company specializing in bioinformatics.

We provide tools and information to support life science research, with a focus on RNA-Seq analysis.
