API Overview

The API is designed for sharing genomic data. It currently supports sharing sequencing reads, genetic variants, and reference genomes. The following sections give an overview of the API, including general API patterns, individual data types, and links to detailed schema documentation.


Reads

Reads are genetic data generated by a DNA sequencing instrument, including nucleotides and quality scores. Reads may optionally be aligned to a reference sequence. (The data model for reads is similar to SAM/BAM.)
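As a rough illustration of the shape of a read record, here is a minimal Python sketch. The class and field names (Read, Alignment, bases, qualities, and so on) are hypothetical and chosen for illustration only; they are not taken from the API schema. The 0-based position and the Phred-scaled quality scores are also assumptions, borrowed from common SAM/BAM practice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alignment:
    """Hypothetical placement of a read on a reference sequence."""
    reference_name: str   # name of the reference sequence, e.g. a chromosome
    position: int         # 0-based start on the reference (an assumption)
    mapping_quality: int  # confidence in the placement


@dataclass
class Read:
    """Hypothetical read record: bases plus one quality score per base."""
    name: str
    bases: str                             # nucleotides reported by the instrument
    qualities: List[int]                   # assumed Phred-scaled, one per base
    alignment: Optional[Alignment] = None  # None means the read is unaligned

    def __post_init__(self) -> None:
        # Reject records where bases and quality scores do not line up.
        if len(self.bases) != len(self.qualities):
            raise ValueError("expected one quality score per base")

    @property
    def is_aligned(self) -> bool:
        return self.alignment is not None


unaligned = Read("read1", "ACGT", [30, 31, 29, 40])
aligned = Read("read2", "ACGT", [30, 31, 29, 40], Alignment("chr1", 100, 60))
```

Making the alignment optional mirrors the statement above that reads may, but need not, be aligned to a reference sequence.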


Variants

Variants are genetic differences between an experimental sample and a reference sequence. (The data model for variants is similar to VCF.)
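To make "a difference relative to a reference sequence" concrete, the following Python sketch models a variant as a reference allele at a position plus one or more alternate alleles, and applies it to a toy reference string. The Variant class, its field names, the apply_variant helper, and the 0-based coordinates are all assumptions made for this example, not the API's schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """Hypothetical variant record, loosely modeled on a VCF row."""
    reference_name: str
    start: int                  # 0-based start on the reference (an assumption)
    reference_bases: str        # bases expected in the reference at `start`
    alternate_bases: List[str]  # alleles observed in the sample instead

    @property
    def end(self) -> int:
        # Exclusive end of the span covered by the reference allele.
        return self.start + len(self.reference_bases)


def apply_variant(reference: str, variant: Variant, allele_index: int = 0) -> str:
    """Return the sample sequence implied by one alternate allele."""
    observed = reference[variant.start:variant.end]
    if observed != variant.reference_bases:
        raise ValueError(
            f"reference has {observed!r}, variant expects {variant.reference_bases!r}"
        )
    alternate = variant.alternate_bases[allele_index]
    return reference[:variant.start] + alternate + reference[variant.end:]


snv = Variant("chr1", 2, "G", ["T"])
sample_sequence = apply_variant("ACGTAC", snv)  # "ACTTAC"
```

The check against the reference allele is a cheap guard against applying a variant to the wrong reference or the wrong coordinate.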


References

References are standard genome sequences, used to provide a coordinate system for reads and variants.
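One practical consequence of a shared coordinate system is that the positions of reads and variants can be compared directly. The Python sketch below illustrates the idea with a hypothetical Interval type; the half-open, 0-based convention and all names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical span on a named reference sequence."""
    reference_name: str
    start: int  # 0-based, inclusive (an assumption)
    end: int    # exclusive

    def overlaps(self, other: "Interval") -> bool:
        # Spans are comparable only when they lie on the same reference.
        return (
            self.reference_name == other.reference_name
            and self.start < other.end
            and other.start < self.end
        )


read_span = Interval("chr1", 100, 150)
variant_span = Interval("chr1", 120, 121)
elsewhere = Interval("chr2", 120, 121)

covers_variant = read_span.overlaps(variant_span)  # True
covers_elsewhere = read_span.overlaps(elsewhere)   # False
```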

Sequence Annotations

Sequence annotations describe genomic features such as genes and exons, using terms from an established sequence ontology.

Allele Annotations

Allele annotations are additional data, often generated by algorithms, that help to describe, classify, and understand variants.

RNA Quantification

RNA quantification provides a means of obtaining feature-level quantifications derived from a set of RNA reads.


Peer Service

GA4GH services can tell each other which services they offer, communicating over network protocols. The Peer Service is part of this mechanism.