What is a FASTA File and the FASTA Format?
Introduction
A FASTA file is a plain-text file in FASTA format that stores nucleotide or amino acid sequences. Common file extensions include ".fasta", ".fa", ".fna", and ".fas".
A line that starts with ">" is a header containing the sequence name, ID, or description. The lines that follow the header, up to the next line starting with ">", together make up a single sequence.
Line breaks can be inserted freely within a sequence. When a sequence is wrapped, each line typically holds a few dozen characters (60 to 80 is common). Because of this wrapping, searching for a sequence in a standard text editor will fail whenever the search string spans a line break.
A file that includes not only nucleotide sequences but also their quality scores is called a FASTQ file. FASTQ files are explained in detail in a separate article.
Example of a FASTA File
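As an illustration of the rules described in the introduction, the following minimal Python sketch reads FASTA-formatted text and joins the wrapped lines of each record into one string. The sample records, their names, and the function name parse_fasta are made up for this example and are not taken from any real dataset or library.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated, which removes the line breaks.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical two-record FASTA text (illustrative data only).
SAMPLE = """>seq1 hypothetical example record
ACGTACGTAC
GTTTGACCAA
>seq2 another made-up record
GGGCCCAAAT
"""

records = parse_fasta(SAMPLE)
for name, seq in records:
    print(name, len(seq))

# "ACGTT" spans the line break in seq1, so a plain text search misses it,
# while a search on the joined sequence finds it.
print("ACGTT" in SAMPLE)          # raw text search
print("ACGTT" in records[0][1])   # search after joining lines
```

In this sketch the raw text search fails while the search on the joined sequence succeeds, which is the text-editor pitfall mentioned in the introduction. For real work, an established parser such as the one in Biopython is usually a safer choice than a hand-written one.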
Nucleic Acid Bases Used in FASTA Files
Characters | Nucleic Acid Bases |
---|---|
G | Guanine |
C | Cytosine |
A | Adenine |
T | Thymine |
M | Adenine or Cytosine |
R | Adenine or Guanine |
W | Adenine or Thymine |
S | Cytosine or Guanine |
Y | Cytosine or Thymine |
K | Guanine or Thymine |
V | Adenine or Cytosine or Guanine |
H | Adenine or Cytosine or Thymine |
D | Adenine or Guanine or Thymine |
B | Cytosine or Guanine or Thymine |
N | Adenine or Cytosine or Guanine or Thymine |
- | Gap |
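The ambiguity codes in the table above can be used to search for motifs that allow more than one base at a position. The sketch below is one possible approach, not a standard API: it translates each character into a regular-expression character class. The names IUPAC, iupac_to_regex, and find_motif, as well as the example sequence, are hypothetical. The gap character "-" is deliberately left out of the mapping, so a pattern containing it raises a KeyError.

```python
import re

# Mapping taken from the table above: each code lists the bases it can stand for.
IUPAC = {
    "G": "G", "C": "C", "A": "A", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Translate a pattern of IUPAC codes into a regular expression string."""
    parts = []
    for char in pattern.upper():
        bases = IUPAC[char]  # KeyError for unsupported characters such as "-"
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return 0-based start positions of non-overlapping matches."""
    regex = re.compile(iupac_to_regex(pattern))
    return [m.start() for m in regex.finditer(sequence.upper())]


# Y stands for cytosine or thymine, so GAYTC matches both GATTC and GACTC.
print(iupac_to_regex("GAYTC"))                 # GA[CT]TC
print(find_motif("GAYTC", "AAGATTCGGACTCA"))   # [2, 8]
```

Note that re.finditer reports non-overlapping matches only, so overlapping occurrences of a motif would need a different search strategy.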