FASTA Format
A description line starting with '>', followed by a name and then a
description;
The sequence in standard IUB/IUPAC amino acid and nucleic acid codes,
starting on the next line and continuing until the description line of
the next sequence or the end of the file. '-' often represents a gap of
indeterminate length.
Example:
>albumin of human origin
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKL
VNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDD
NPNLPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADK
[...]
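The record layout described above can be read with a few lines of code. The following is a minimal sketch in Python, not a reference parser: the function name read_fasta is hypothetical, and skipping blank lines and ignoring any text before the first '>' are assumptions rather than part of the format description.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of text lines.

    The header is the description line without its leading '>'.
    The sequence is all following lines joined, up to the next '>' line
    or the end of the input.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence line; gap characters such as '-' are kept as-is.
            chunks.append(line)
        # Lines before the first '>' are ignored (assumption).
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    sample = io.StringIO(">seq1 first test record\nMKWV\nTFIS\n>seq2\nAC-GT\n")
    for name, seq in read_fasta(sample):
        print(name, seq)
```

The sketch does no validation of the sequence alphabet; it only concatenates lines, so checking against the IUB/IUPAC codes would be a separate step.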
The description line (or header line) often carries additional
information, but there is no clear consensus on its layout; here are a
few common usages:
>name
>name description
>name accession description
>namespace|accession.version|name description
>gi|identifier|namespace|accession.version|name description (NCBI)
Example:
>gi|412163|emb|CAA00606.1| albumin [Homo sapiens]
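Because the header layouts listed above differ, code that reads them usually has to guess which one it is looking at. The sketch below, in Python, separates the first whitespace-delimited token from the description and then, only for the NCBI-style 'gi|' layout, splits that token on '|'. The function names split_header and parse_gi_identifier and the dictionary keys are hypothetical, chosen to mirror the field names in the list above.

```python
def split_header(header):
    """Split a description line (without the leading '>') into
    (identifier, description) at the first run of whitespace."""
    parts = header.strip().split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def parse_gi_identifier(identifier):
    """Interpret an identifier of the form
    gi|identifier|namespace|accession.version|name.

    Returns a dict of fields, or None if the identifier does not
    follow this layout.
    """
    fields = identifier.split("|")
    if len(fields) < 4 or fields[0] != "gi":
        return None
    return {
        "gi": fields[1],
        "namespace": fields[2],
        "accession_version": fields[3],
        # The name field may be empty, as in the example header.
        "name": fields[4] if len(fields) > 4 else "",
    }


if __name__ == "__main__":
    ident, desc = split_header("gi|412163|emb|CAA00606.1| albumin [Homo sapiens]")
    print(parse_gi_identifier(ident), desc)
```

Headers in the simpler layouts (such as '>name description') pass through split_header unchanged and make parse_gi_identifier return None, so a caller can fall back to treating the whole first token as the name.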