How to Use TopHat2: Mapping in RNA-Seq Analysis


Last updated: January 20, 2026

Introduction

When quantifying gene expression from RNA-Seq sequencing data, a mapping step is generally required. Mapping is the process of aligning read sequences (FASTQ files) to their matching positions on a reference sequence. Commonly used mapping software for RNA-Seq includes HISAT2, STAR, and Bowtie2. This page explains how to use TopHat2, which maps reads from spliced transcripts to a genome and uses Bowtie2 for the underlying alignment.

Note that TopHat2 is an older software tool, so in most cases it is recommended to use HISAT2 or STAR instead.

For where mapping fits in the overall analysis, please refer to the RNA-Seq analysis workflow overview.

Installation

TopHat2 relies on Bowtie2 and requires a Bowtie2 index of the reference sequence. If you install TopHat2 using Conda, Bowtie2 is installed automatically as a dependency.

$ conda install -c bioconda tophat

Let’s check the Bowtie2 help message.

$ bowtie2 -h

If the following output is displayed, the installation was successful.

Bowtie 2 version 2.4.1 by Ben Langmead (langmea@cs.jhu.edu, www.cs.jhu.edu/~langmea)
Usage: bowtie2 [options]* -x <bt2-idx> {-1 <m1> -2 <m2> | -U <r> | --interleaved <i> | -b <bam>} [-S <sam>]
...

Next, display the TopHat2 help message.

$ tophat

If the following output is shown, TopHat2 is ready to use.

tophat: TopHat maps short sequences from spliced transcripts to whole genomes.
Usage: tophat [options] <bowtie_index> <reads1[,reads2,...]> [reads1[,reads2,...]]
...
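
The two checks above can also be scripted. The following is a minimal sketch: `require_tools` is a hypothetical helper name, not part of TopHat2 or Bowtie2, and it only tests whether each command can be found on the PATH, not whether it runs correctly.

```shell
# require_tools: hypothetical helper (not part of TopHat2 or Bowtie2).
# Reports each command that cannot be found on PATH and returns
# non-zero if at least one is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so the snippet has no side effects):
# require_tools bowtie2 bowtie2-build tophat && echo "ready"
```

If the function returns non-zero, the messages on standard error show which command still needs to be installed.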

Index Construction (Build)

First, build an index of the reference genome with bowtie2-build, the indexing command that ships with Bowtie2.

$ bowtie2-build -f genome.fa genome

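genome.fa is the reference sequence you want to map reads to, provided as a FASTA file (the -f option indicates FASTA input). Gzip-compressed files can also be used. The final argument, genome, is the prefix used for the names of the index files.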

This command generates several files such as genome.1.bt2 through genome.4.bt2, as well as genome.rev.1.bt2 and genome.rev.2.bt2. Index files are required for fast sequence searching and must be created in advance for almost all mapping software, not just Bowtie2.
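
Before mapping, it can be useful to confirm that all six index files were actually written. The sketch below assumes the standard small-index `.bt2` file names listed above (very large genomes produce `.bt2l` files instead, which this check does not cover). `check_bt2_index` is a hypothetical helper name.

```shell
# check_bt2_index: hypothetical helper that verifies the six index
# files exist and are non-empty for a given index prefix.
# Assumes small-index ".bt2" names; ".bt2l" large indexes are not checked.
check_bt2_index() {
    prefix=$1
    status=0
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -s "${prefix}.${suffix}.bt2" ]; then
            echo "missing or empty: ${prefix}.${suffix}.bt2" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call (commented out so the snippet has no side effects):
# check_bt2_index genome && echo "index looks complete"
```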

Mapping

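Next, map the read sequences to the reference genome using TopHat2. The arguments are the output directory (-o), the prefix of the Bowtie2 index, and the FASTQ files. The two FASTQ files here are the two mates of a paired-end library.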

$ tophat -o output genome reads1.fastq.gz reads2.fastq.gz

As a result, a BAM file named accepted_hits.bam is generated in the output directory.
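
When mapping several samples, the command can be wrapped in a small function. This is a sketch under stated assumptions: `run_tophat`, `THREADS`, and `DRY_RUN` are illustrative names introduced here, not TopHat2 settings. The only option added to the command shown above is `-p`, TopHat's option for the number of threads.

```shell
# run_tophat: hypothetical wrapper around the tophat command.
# Arguments: output directory, index prefix, mate 1 FASTQ, mate 2 FASTQ.
# THREADS and DRY_RUN are illustrative environment variables.
run_tophat() {
    outdir=$1; index=$2; reads1=$3; reads2=$4
    threads=${THREADS:-1}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "tophat -p $threads -o $outdir $index $reads1 $reads2"
        return 0
    fi
    tophat -p "$threads" -o "$outdir" "$index" "$reads1" "$reads2" || return 1
    # Succeed only if the alignment file exists and is non-empty.
    [ -s "$outdir/accepted_hits.bam" ]
}

# Example call (commented out so the snippet has no side effects):
# THREADS=4 run_tophat output genome reads1.fastq.gz reads2.fastq.gz
```

Setting `DRY_RUN=1` prints the command without running it, which is a convenient way to check the arguments before starting a long mapping job.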

You can visualize the mapping results using a genome browser such as IGV, as shown below.

Figure: Mapping result
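
IGV expects a BAM index (.bai) file next to the BAM file before it can display the alignments. The sketch below assumes that samtools is installed (it is not covered by the installation steps on this page) and that TopHat2 was run with its default settings, which produce a coordinate-sorted BAM file. `index_bam` is a hypothetical helper name.

```shell
# index_bam: hypothetical helper; assumes samtools is installed and
# the BAM file is coordinate-sorted (TopHat2's default output).
# Creates the .bai index that IGV looks for next to the BAM file.
index_bam() {
    bam=$1
    if [ ! -s "$bam" ]; then
        echo "BAM file not found or empty: $bam" >&2
        return 1
    fi
    samtools index "$bam"
}

# Example call (commented out so the snippet has no side effects):
# index_bam output/accepted_hits.bam
```

After indexing, load output/accepted_hits.bam in IGV together with the same reference genome that was used to build the index.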
