Variants API

See Variants schema for a detailed reference.

Variants Data Model

The Variants data model, although based on the VCF format, allows for more versatile interaction with the data. Rather than sending whole VCF files, the server can send information on specific variants or genomic regions. Likewise, instead of retrieving the whole genotype matrix, a client can request details for only one or more specified individuals.
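As a rough illustration of querying by region and by individual, the sketch below builds the body of a region-restricted search request. The helper `build_variant_search` and the example IDs are hypothetical, and the field names (`variantSetId`, `referenceName`, `start`, `end`, `callSetIds`) are assumptions modelled on the Variants schema; check the schema reference for the authoritative names and semantics.

```python
import json


def build_variant_search(variant_set_id, reference_name, start, end, call_set_ids=None):
    """Build a JSON-serialisable body for a region-restricted variant search.

    Hypothetical helper: the field names are assumed, not taken from this page.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    body = {
        "variantSetId": variant_set_id,   # which collection of calls to search
        "referenceName": reference_name,  # e.g. a chromosome name
        "start": start,                   # region start
        "end": end,                       # region end
    }
    if call_set_ids is not None:
        # Restrict the returned calls to the listed individuals (call sets)
        # instead of asking for the whole genotype matrix.
        body["callSetIds"] = list(call_set_ids)
    return body


# Made-up IDs, for illustration only.
request_body = build_variant_search("vs-1", "1", 10000, 20000, call_set_ids=["cs-a"])
print(json.dumps(request_body, indent=2))
```

Leaving out `call_set_ids` in this sketch omits the restriction entirely, which mirrors the idea that per-individual filtering is optional.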

The API uses four main entities to represent variants. The following diagram illustrates how these entities relate to each other to constitute the genotype matrix.


The lowest-level entity is a Call:

  • a Call encodes the genotype of an individual with respect to a variant, as determined by some analysis of experimental data.

The other entities can be thought of as collections of Calls that have something in common:

  • a VariantSet supports working with a collection of Calls intended to be analyzed together.

  • a Variant supports working with the subset of Calls in a VariantSet that are at the same site and are described using the same set of alleles. The Variant entity contains:

    • a variant description: a potential difference between experimental DNA and a reference sequence, including the site (position of the difference) and alleles (how the bases differ)
    • variant observations: a collection of Calls describing evidence for actual instances of that difference, as seen in analyses of experimental data
  • a CallSet supports working with the subset of Calls in a VariantSet that were generated by the same analysis of the same sample. The CallSet includes information about which sample was analyzed and how it was analyzed, and is linked to information about what differences were found.
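The relationship between the four entities can be sketched with a few plain data classes. This is a simplified, hypothetical model, not the API's actual record definitions: the snake_case field names, the reduced field set, and the `genotype_matrix` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Call:
    call_set_id: str      # ID of the CallSet (sample + analysis) that produced this call
    genotype: List[int]   # allele indices: 0 = reference, 1.. = alternate alleles


@dataclass
class Variant:
    id: str
    variant_set_id: str   # ID of the VariantSet this variant belongs to
    reference_name: str
    start: int
    reference_bases: str
    alternate_bases: List[str]
    calls: List[Call] = field(default_factory=list)  # observations at this site


@dataclass
class CallSet:
    id: str
    name: str
    variant_set_id: str


@dataclass
class VariantSet:
    id: str


def genotype_matrix(
    variant_set: VariantSet, variants: List[Variant], call_sets: List[CallSet]
) -> Dict[Tuple[str, str], Optional[List[int]]]:
    """Map (variant ID, call set ID) to a genotype; None where no call exists."""
    rows = [v for v in variants if v.variant_set_id == variant_set.id]
    cols = [c for c in call_sets if c.variant_set_id == variant_set.id]
    matrix: Dict[Tuple[str, str], Optional[List[int]]] = {}
    for variant in rows:
        by_call_set = {call.call_set_id: call.genotype for call in variant.calls}
        for call_set in cols:
            matrix[(variant.id, call_set.id)] = by_call_set.get(call_set.id)
    return matrix


# Made-up example data: two variants (rows) and two call sets (columns).
vs = VariantSet(id="vs-1")
call_sets = [CallSet("cs-a", "sampleA", "vs-1"), CallSet("cs-b", "sampleB", "vs-1")]
variants = [
    Variant("v1", "vs-1", "1", 100, "A", ["G"], [Call("cs-a", [0, 1]), Call("cs-b", [1, 1])]),
    Variant("v2", "vs-1", "1", 250, "C", ["T"], [Call("cs-a", [0, 0])]),
]
matrix = genotype_matrix(vs, variants, call_sets)
```

In this sketch each Variant is a row of the matrix, each CallSet is a column, and each Call is a single cell. The entities are linked by IDs rather than by nesting, so a Variant or CallSet can be fetched on its own and still be traced back to its VariantSet.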

The following diagram shows the relationship of these four entities to each other and to other GA4GH API entities. It shows which entities contain other entities (such as VariantSetMetadata), and which contain IDs that can be used to get information from other entities (such as Variant’s variantSetId). The arrow points from the entity that contains the ID to the entity that can be identified by that ID.

FIXME: remove the Sample object from the graphic; that object isn’t (yet) defined in the API.