BWA Tutorial: Read Mapping for Genome Analysis

Last updated: February 17, 2026

Introduction

BWA (Burrows-Wheeler Aligner) is a software tool for mapping read sequences (FASTQ files) obtained from next-generation sequencers to a reference genome. BWA is widely used for mapping DNA sequencing data, and is particularly common in genome resequencing and ChIP-Seq analysis.

BWA includes several algorithms:

  • BWA-backtrack: An algorithm suitable for Illumina reads of 100 bp or shorter.
  • BWA-SW: Supports reads from 70 bp to 1 Mbp in length, with support for long reads and split alignment.
  • BWA-MEM: Supports reads from 70 bp to 1 Mbp in length, and like BWA-SW, supports long reads and split alignment, but is faster and more accurate. It also outperforms BWA-backtrack for 70-100 bp Illumina reads, making it the most recommended algorithm today.
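Because the choice of algorithm depends on read length, it can help to check the read lengths in a FASTQ file first. The following is a minimal sketch and not part of BWA: fastq_length_stats is a hypothetical helper name, and it assumes standard four-line FASTQ records (the sequence is the second line of every four) supplied on standard input.

```shell
# fastq_length_stats: hypothetical helper, not part of BWA.
# Reads uncompressed FASTQ on stdin and prints read count, min, max and
# mean read length. Assumes four-line records (no wrapped sequences).
fastq_length_stats() {
    awk 'NR % 4 == 2 {
             n++
             len = length($0)
             sum += len
             if (n == 1 || len < min) min = len
             if (len > max) max = len
         }
         END {
             if (n > 0)
                 printf "reads=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n
         }'
}
```

For a gzipped file, decompress it into the helper:

$ gzip -cd reads1.fastq.gz | fastq_length_stats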

This page focuses on how to use BWA-MEM, the most commonly used algorithm.

For mapping in RNA-Seq, splice-junction-aware tools such as HISAT2 and STAR are commonly used. Since BWA does not account for splice junctions, it is primarily used for mapping DNA sequencing data such as whole genome sequencing (WGS), exome sequencing, and ChIP-Seq.

Installation

You can install BWA via Bioconda.

$ conda install -c bioconda bwa

Let's display the help message.

$ bwa

If the following output is displayed, the installation was successful.

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.17-r1188
Contact: Heng Li <lh3@sanger.ac.uk>

Usage:   bwa <command> [options]

Command: index         index sequences in the FASTA format
         mem           BWA-MEM algorithm
         fastmap       identify super-maximal exact matches
         pemerge       merge overlapping paired ends (EXPERIMENTAL)
         aln           gapped/ungapped alignment
         samse         generate alignment (single ended)
         sampe         generate alignment (paired ended)
         bwasw         BWA-SW for long queries

         shm           manage indices in shared memory
         fa2pac        convert FASTA to PAC format
         pac2bwt       generate BWT from PAC
         pac2bwtgen    alternative algorithm for generating BWT
         bwtupdate     update .bwt to the new format
         bwt2sa        generate SA from BWT and Occ

Note: To use BWA, you need to first index the genome with `bwa index`.
      There are three alignment algorithms in BWA: `mem`, `bwasw`, and
      `aln/samse/sampe`. If you are not sure which to use, try `bwa mem`
      first. Please `man ./bwa.1` for the manual.
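For scripted checks, the version can be pulled out of the help text instead of being read by eye. This is a minimal sketch: bwa_version is a hypothetical helper name, and it merges standard error into the pipe on the assumption that bwa prints its help text there when run without arguments.

```shell
# bwa_version: hypothetical helper, not part of BWA.
# Prints the value of the "Version:" line from the bwa help text.
bwa_version() {
    # 2>&1 merges stderr into the pipe so the help text is captured
    # whichever stream bwa writes it to.
    bwa 2>&1 | awk '$1 == "Version:" { print $2 }'
}
```

With the installation shown above, bwa_version would print 0.7.17-r1188.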

Building the Index

Before mapping, you need to build an index of the reference sequence.

$ bwa index genome.fa

genome.fa is the FASTA file of the reference sequence you want to map to.

This operation creates five files: genome.fa.amb, genome.fa.ann, genome.fa.bwt, genome.fa.pac, and genome.fa.sa. Index files are necessary for fast string searching, and pre-building them is required for virtually all mapping software, not just BWA.
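Before starting a long mapping job, a script can confirm that all five index files are present. This is a minimal sketch: check_bwa_index is a hypothetical helper name, and the extension list is taken from the files named above.

```shell
# check_bwa_index: hypothetical helper, not part of BWA.
# Returns 0 if all five index files exist and are non-empty next to the
# given reference FASTA, and 1 otherwise (listing what is missing).
check_bwa_index() {
    local ref="$1"
    local ext missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "${ref}.${ext}" ]; then
            echo "missing: ${ref}.${ext}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One way to use it is to build the index only when it is absent:

$ check_bwa_index genome.fa || bwa index genome.fa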

Mapping (BWA-MEM)

Use the following command to map paired-end reads.

$ bwa mem -t 4 genome.fa reads1.fastq.gz reads2.fastq.gz > output.sam

The -t option specifies the number of threads. BWA-MEM writes its alignments in SAM format to standard output, which the command above redirects to output.sam.

For single-end reads, specify only one FASTQ file.

$ bwa mem -t 4 genome.fa reads.fastq.gz > output.sam

Downstream tools generally work with sorted, indexed BAM files, so convert the SAM output to BAM, sort it, and index it with the following commands.

$ samtools view -b output.sam > output.bam
$ samtools sort -o output.sorted.bam output.bam
$ samtools index output.sorted.bam

The index created by samtools index is required for viewing in genome browsers such as IGV and for many downstream analysis tools.
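The intermediate SAM and unsorted BAM files can be skipped by piping bwa mem straight into samtools sort, which reads from standard input when given - as the input file. This is a minimal sketch: map_and_sort and the THREADS variable are hypothetical names used only for illustration.

```shell
# map_and_sort: hypothetical wrapper, not part of BWA or samtools.
# Usage: map_and_sort <reference.fa> <output.sorted.bam> <reads.fq> [mates.fq]
map_and_sort() {
    local ref="$1" out="$2"
    shift 2
    local threads="${THREADS:-4}"   # default of 4 matches the commands above
    # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-")
    # and writes a coordinate-sorted BAM to the -o path, which is then indexed.
    bwa mem -t "$threads" "$ref" "$@" |
        samtools sort -@ "$threads" -o "$out" - &&
        samtools index "$out"
}
```

For paired-end reads this would be called as:

$ map_and_sort genome.fa output.sorted.bam reads1.fastq.gz reads2.fastq.gz

The pipeline's exit status comes from samtools sort, so in bash consider enabling set -o pipefail if a failure in bwa mem should also stop the script.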

About BWA-MEM2

BWA-MEM2 is the successor to BWA-MEM and maps reads faster. Its usage is nearly identical to that of BWA-MEM.

You can install it via Bioconda.

$ conda install -c bioconda bwa-mem2

Build the index and run mapping as follows.

$ bwa-mem2 index genome.fa
$ bwa-mem2 mem -t 4 genome.fa reads1.fastq.gz reads2.fastq.gz > output.sam

It produces the same results as BWA-MEM while running significantly faster, so it is worth considering for large-scale data.
