What is a FASTA File and the FASTA Format?

Introduction

A FASTA file is a text file in FASTA format that stores nucleotide sequences or amino acid sequences. Common file extensions include ".fasta", ".fa", ".fna", and ".fas".

Lines that start with ">" contain the sequence name, ID, or description. The lines following this header, up until the next line starting with ">", contain a single sequence's information.
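The header and sequence rule described above can be sketched in Python. This is a minimal illustration, not a reference parser: the function name parse_fasta and the two short records are made up for the example, and real files may call for a dedicated library.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]  # drop the leading ">"
            chunks = []
        elif header is not None:
            # Sequence lines are joined, so line breaks disappear.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input, used only for illustration.
example = """\
>seq1 hypothetical first record
ACGTAC
GTTA
>seq2 hypothetical second record
GGCC
"""

records = list(parse_fasta(example.splitlines()))
for header, sequence in records:
    print(header, len(sequence))
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of example.splitlines().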

Line breaks can be inserted freely within a sequence. When a sequence is wrapped, each line typically holds 60 to 80 characters. Because of this, searching for a sequence in a standard text editor can fail when the match spans a line break.
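One way around the search problem is to remove the line breaks before searching. The sketch below assumes the sequence lines are already in a string; find_in_wrapped is a hypothetical helper name, and the 10-character wrapping is chosen only to keep the example short.

```python
def find_in_wrapped(wrapped_sequence: str, query: str) -> int:
    """Return the 0-based position of query in the unwrapped sequence, or -1."""
    # str.split() with no arguments splits on any whitespace, including newlines.
    joined = "".join(wrapped_sequence.split())
    return joined.find(query)


# A sequence wrapped at 10 characters per line (illustrative only).
wrapped = "CCCTAAACCC\nTAAACCCTAA\nACCTCTGAAT"
query = "TAAACCTCTG"  # spans the break between the second and third lines

naive_position = wrapped.find(query)                   # search on the raw text
joined_position = find_in_wrapped(wrapped, query)      # search after joining

print(naive_position, joined_position)
```

The raw-text search misses the match because a newline sits in the middle of it, while the joined search finds it.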

A file that stores quality scores alongside the nucleotide sequences is called a FASTQ file.

Example of a FASTA File

>NC_003070.9 Arabidopsis thaliana chromosome 1 sequence
CCCTAAACCCTAAACCCTAAACCCTAAACCTCTGAATCCTTAATCCCTAAATCCCTAAATCTTTAAATCC
TACATCCATGAATCCCTAAATACCTAATTCCCTAAACCCGAAACCGGTTTCTCTGGTTGAAAATCATTGT
GTATATAATGATAATTTTATCGTTTTTATGTAATTGCTTATTGTTGTGTGTAGATTTTTTAAAAATATCA
...
>NC_003076.8 Arabidopsis thaliana chromosome 5 sequence
TATACCATGTACCCTCAACCTTAAAACCCTAAAACCTATACTATAAATCTTTAAAACCTATACTCTAAAC
CATAGGGTTTGTGAGTTTGCATAAAGTGTCACGTATAAGTGTTTCTAACATGTGAGTTTGCATAAGAGTC
TCGACTATGTGTTTGTTCAAAAGTGACGTAAGTGTTTAGACTAGAGCCGGCCGTGAGCACAAGCGGGCCA
...

Nucleic Acid Bases Used in FASTA Files

Character    Nucleic Acid Base
G            Guanine
C            Cytosine
A            Adenine
T            Thymine
M            Adenine or Cytosine
R            Adenine or Guanine
W            Adenine or Thymine
S            Cytosine or Guanine
Y            Cytosine or Thymine
K            Guanine or Thymine
V            Adenine or Cytosine or Guanine
H            Adenine or Cytosine or Thymine
D            Adenine or Guanine or Thymine
B            Cytosine or Guanine or Thymine
N            Adenine or Cytosine or Guanine or Thymine
-            Gap
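The ambiguity codes in the table can be expressed as a lookup table and used to search a sequence for a pattern that contains them. The following is a sketch under stated assumptions: the names IUPAC_BASES and iupac_to_regex are hypothetical, the gap character "-" is deliberately left out, and the target sequence is assumed to contain only G, C, A, and T.

```python
import re

# Each character maps to the set of bases it can stand for (from the table).
IUPAC_BASES = {
    "G": "G", "C": "C", "A": "A", "T": "T",
    "M": "AC", "R": "AG", "W": "AT",
    "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate a pattern with ambiguity codes into a regular expression."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC_BASES[code]  # raises KeyError for unsupported characters
        # A single base is used as-is; several bases become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


regex = iupac_to_regex("GAYTN")
match = re.search(regex, "TTGACTAG")
print(regex, match.start() if match else None)
```

Here "Y" expands to "[CT]" and "N" to "[ACGT]", so the pattern "GAYTN" matches "GACTA" in the example sequence.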

RNA-Seq Data Analysis Software

With our RNA-Seq data analysis software, you won't need to outsource or rely on collaborators. You can start analyzing the data yourself right away, without the need for high-spec computers or knowledge of Linux commands.

Overview

Users can perform gene expression quantification, identification of differentially expressed genes, gene ontology (GO) analysis, and pathway analysis, and can draw volcano plots, MA plots, and heatmaps.