This protocol defines associations between genotype and phenotype (G2P). Associations may be derived from literature curation, computational modeling, inference, or similar sources, and are modeled and shared using this schema.
Here, we follow the dogma: Genotype + Environment = Phenotype
That is, a G2P association links a G(enotype), in the context of some E(nvironment), to the P(henotype) it gives rise to. Each association also carries evidence, provenance, and attribution. We reuse the GenomicFeature from the sequenceAnnotation schema because it can accommodate any genomic feature, from a single nucleotide variation (SNV) through a gene to complex rearrangements. Each of these can be modeled as a genomic feature and, in general, linked to a phenotype. Collections of these features can represent a genotype at different levels of completeness. The schema can therefore represent a single allelic variation, an allelic complement, or multiple variants in a genotype, each of which can be associated with a phenotype individually or collectively. To enable standardized integration, this schema relies heavily on OntologyTerms for typing phenotypes, genomic features, and levels of evidence. Suggested ontologies (with browser links) include:
- Human Phenotype Ontology (HPO): http://www.ontobee.org/browser/index.php?o=hp
- Disease Ontology (DO): http://purl.obolibrary.org/obo/DOID_4
- Sequence Ontology (SO): http://www.sequenceontology.org/browser/
- Evidence Code Ontology (ECO): http://www.ontobee.org/browser/index.php?o=ECO
- Phenotypic Qualities (PATO): http://www.ontobee.org/browser/index.php?o=PATO
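
The association model described above can be sketched roughly as follows. This is only an illustrative Python sketch, not the schema itself: the class and field names (`OntologyTerm`, `GenomicFeature`, `G2PAssociation`, `term_id`, `feature_id`, `environment`, and so on) are hypothetical, the environment is simplified to an optional free-text string, and the ontology identifiers are example values that should be checked against the ontology browsers listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class OntologyTerm:
    """A term from an ontology such as HPO, SO, or ECO (hypothetical shape)."""
    term_id: str   # e.g. "SO:0001483"
    label: str     # human-readable name of the term


@dataclass
class GenomicFeature:
    """Any genomic feature, from an SNV up to a gene or rearrangement."""
    feature_id: str
    feature_type: OntologyTerm  # typed with a Sequence Ontology term


@dataclass
class G2PAssociation:
    """Genotype (one or more features) + optional environment -> phenotype."""
    features: List[GenomicFeature]
    phenotype: OntologyTerm                                     # e.g. an HPO term
    evidence: List[OntologyTerm] = field(default_factory=list)  # e.g. ECO terms
    environment: Optional[str] = None   # simplified to free text in this sketch

    def is_single_feature(self) -> bool:
        """True when the genotype is represented by exactly one feature."""
        return len(self.features) == 1


# Example values are illustrative only; verify identifiers before reuse.
snv = GenomicFeature("feature-1", OntologyTerm("SO:0001483", "SNV"))
association = G2PAssociation(
    features=[snv],
    phenotype=OntologyTerm("HP:0001250", "Seizure"),
    evidence=[OntologyTerm("ECO:0000269",
                           "experimental evidence used in manual assertion")],
)
```

Holding the features in a list mirrors the idea that a genotype can be represented at different levels of completeness: one feature for a single allelic variation, several for an allelic complement or a set of variants.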