Feb 22, 2014

The Seventh International Biocuration Conference (ISB2014) will be held at the University of Toronto in Toronto, Canada, from April 6-9, 2014.

Hosted by the Ontario Institute for Cancer Research, this meeting will provide a forum for curators and developers of biological databases to discuss their work, promote collaboration and foster a sense of community in this very active and growing area of research. Participants from academia, government and industry interested in the methods and tools employed in curation of biological and medical data are encouraged to attend.

Early bird registration ends March 7, 2014

Registration ends March 24, 2014

The Conference will be preceded by a workshop on the practical use of Uberon, the integrated cross-species anatomy ontology.

Feb 21, 2014

Dear all,

We are pleased to announce a Uberon workshop, satellite of the Biocuration2014 meeting, to be held in Toronto, Canada, on the 5th and 6th April 2014.

Uberon is an integrated cross-species anatomy ontology, representing a variety of entities classified according to traditional anatomical criteria such as structure, function, and developmental lineage. Uberon provides a necessary bridge between anatomical structures in different taxa for cross-species inference, allowing integration of model organism, human, and comparative morphology data.

This workshop will be user-oriented, and will be devoted to introducing participants to Uberon and training them in its use. For more information and the schedule, please see the workshop website:

Attendance is limited, and places will be allocated on a first come, first served basis. The estimated registration fee is C$200. The registration page is available at:

For more information about the Biocuration conference, please see:

Looking forward to meeting you in Toronto,

The organizers: Chris Mungall, Melissa Haendel, and Frederic Bastian

Feb 17, 2014

By Pier Luigi Buttigieg

We’re happy to announce that a description of the Environment Ontology (ENVO) has been published in the Biomedical Ontologies series of the Journal of Biomedical Semantics:

Capturing the environmental context of biological and biomedical entities is key to fully understanding their qualities, behaviour, and composition. That being said, concisely but meaningfully describing an “environment” with a small number of ontology classes quickly becomes a complex undertaking. This is further complicated by variability in the understanding of the boundaries and parts of environments as well as their relations to other entities of interest, even within a single discipline.

ENVO aims to offer an approachable and easily applicable ontology of environments, with classes that capture key elements of a given environmental context. In particular, ENVO focuses on the environmental system an entity is embedded in, the environmental features that causally influence it, and the environmental material that surrounds it. These complementary perspectives provide a compact but informative contextualisation that can readily aid, for example, data discovery and comparative studies. Work is underway to represent entities such as habitats, niches, and environmental conditions in order to further tease apart and define aspects of the environment. Further, instances of ENVO classes are being linked to classes in the Gazetteer (GAZ) to support (among other efforts) environmental contextualisation of place names.

To promote interoperability across existing ontologies, ENVO is being developed towards compliance with the OBO Foundry Principles. Linking ENVO with phenotypic ontologies and data offers great potential in enabling biological investigation using ontological resources, and we look forward to exploring this at Phenotype RCN 2014!

Feb 10, 2014


The Phenotype Day at the International Conference on Intelligent Systems for Molecular Biology (ISMB) is an initiative developed jointly with the Bio-Ontologies and BioLINK Special Interest Groups.

The systematic description of phenotype variation has gained increasing importance since the discovery of the causal relationship between a genotype placed in a certain environment and a phenotype. It plays a role not only in accessing and mining medical records but also in the analysis of model organism data, genome sequence analysis, and the translation of knowledge across species. Accurate phenotyping has the potential to be the bridge between studies that aim to advance the science of medicine (such as a better understanding of the genomic basis of diseases) and studies that aim to advance the practice of medicine (such as phase IV surveillance of approved drugs).

Various research activities that attempt to understand the underlying domain knowledge exist, but they are rather restrictively applied and not very well synchronized. In this Phenotype Day we propose to trigger a comprehensive and coherent approach to studying (and ultimately facilitating) the process of knowledge acquisition and support for Deep Phenotyping by bringing together researchers and practitioners that include but are not limited to the following fields:

• biology as well as computational biology

• genomics, clinical genetics, pharmacogenomics, healthcare

• text/data mining and knowledge discovery

• knowledge representation and ontology engineering

For more information including paper submission deadlines and instructions, please go to

Jan 16, 2014

Last week, the Phenotype RCN hosted a cross-working group call featuring presentations by Ramona Walls on the Plant Ontology and Cross-Species Reasoning [pdf] and Laurel Cooper on Common Reference Ontologies for Plants.

Dr. Walls (The iPlant Collaborative, University of Arizona, and New York Botanical Garden) demonstrated how the PO defines anatomical terms so that they can be used across all green plant species. After an overview of the ontology, which can be searched and browsed online, she discussed its evolution, main branches (“plant anatomical entity” and “plant structure developmental stage”), and characteristics shared with CARO (the Common Anatomy Reference Ontology). She talked about specific changes to the ontology that make it work better for all green plants and presented use cases concerning comparison of gene expression, traits, and phenotypes across species.

Dr. Cooper (Oregon State University) followed with a talk about the PO and cROP, the Common Reference Ontologies for Plants. She identified problems arising from free-text phenotype descriptions and scattered data resources, and demonstrated how the PO fits into the centralized cROP platform, where reference ontologies for plants will be used to access data sources for plant traits, phenotypes, diseases, and genomes linked to gene expression and genetic diversity data across a wide range of plant species. The cROP Ontology Database may be accessed via its web portal.

Many thanks to Ramona and Laurel for their outstanding talks!

The Phenotype RCN plans to host monthly calls on the first Monday of every month at 8 a.m. Pacific / 11 a.m. Eastern time. If you would like to receive invitations to join via WebEx, please email Erik Segerdell. Suggestions for topics and volunteers for presenters are welcome!

Nov 26, 2013

A handful of new papers of interest are available at the Journal of Biomedical Semantics, which is publishing a collection of articles related to biomedical ontologies and ontology updates:

• P. E. Midford et al: The vertebrate taxonomy ontology: a framework for reasoning across model organism and species phenotypes

• R. Nigam et al: Rat Strain Ontology: structured controlled vocabulary designed to facilitate access to strain data at RGD

• P. Ciccarese et al: PAV ontology: provenance, authoring and versioning

• K. M. Livingston et al: Representing annotation compositionality and provenance for the Semantic Web

Nov 05, 2013

via Robin Haw

On behalf of the Organizing Committee for the 7th International Biocuration Conference, I am delighted to announce that our website and registration are now open:

The conference will be held at Hart House in Toronto from 6th to 9th April 2014, and it would be wonderful to see you there. We have four keynote speakers confirmed:

Dr. Tim Hubbard, Wellcome Trust Sanger Institute

Dr. Suzanna Lewis, Lawrence Berkeley National Laboratory

Dr. Patricia Babbitt, California Institute for Quantitative Biosciences (QB3)

Dr. Lincoln Stein, Ontario Institute for Cancer Research

Early bird registration rates apply until 7th March 2014. We have secured discount rates at three hotels in Toronto; please see the Biocuration 2014 website for more information on booking.

Please note that the paper submission deadline is 15th November 2013, so there is limited time to put your paper together.

The abstract submission deadline for presenting at the conference is 10th February 2014.

Oct 29, 2013

by Karen Eilbeck

One of our tasks at the SO-GENO phenotype workshop in Portland this fall was to formalize the description of phenotypic data in genomic annotation. Previously we had written instructions on the use of phenotype ontologies such as HPO when creating variant file annotations in Genome Variation Format (GVF). GVF is a tab-delimited variant file format for the detailed annotation of sequence variants, and the specification is managed as part of the Sequence Ontology. Our revised guidelines were split into human and non-human recommendations to reflect the diversity in phenotypic annotation resources. We address best practices for annotation, provide easy-to-follow examples, and discuss the process for requesting new terms from the phenotype resources. The recommendations are available here and have been registered with Biosharing as a reporting guideline. Biosharing is a website to register and track well-constituted efforts to develop standards for describing and sharing biosciences experiments; see more here.
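
To make the tab-delimited layout concrete, here is a minimal sketch of reading one GVF feature line in Python. It assumes the nine-column layout that GVF inherits from GFF3 and semicolon-separated `key=value` attributes; the function name `parse_gvf_line` and the sample record are made up for illustration and are not taken from the recommendations themselves.

```python
from urllib.parse import unquote

# Column names follow the GFF3 layout that GVF builds on.
GVF_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gvf_line(line):
    """Split one tab-delimited feature line into a dict (illustrative helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GVF_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GVF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Attributes are key=value pairs separated by ';'; values may be
    # comma-separated lists and may contain URL-escaped characters.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        attributes[key] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


# Made-up sample record, for illustration only.
example = (
    "chr16\texample\tSNV\t49291141\t49291141\t.\t+\t.\t"
    "ID=ID_1;Variant_seq=A,G;Reference_seq=G"
)
record = parse_gvf_line(example)
print(record["type"], record["attributes"]["Variant_seq"])
```

Phenotype annotations would need additional handling on top of this; the sketch only shows how the column and attribute structure can be read.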

Oct 29, 2013

by Matthew Brush

In September 2013, the Phenotype RCN sponsored a three-day workshop at Oregon Health & Science University to align sequence feature and genetic variation representation and thereby support phenotype data integration. Participants included developers of the Sequence Ontology  (SO) [1] (Karen Eilbeck, Mike Bada, and Bret Heale), and members of the ontology team from the Monarch Initiative [2] who have been developing a genotype ontology called GENO (Matthew Brush, Melissa Haendel, and Chris Mungall).


One of the goals of the Phenotype RCN is to promote coordination and standardization of phenotype-related data. A standardized representation of genotype information is required for integrating genetically-linked phenotype data from diverse sources  including model organism, human variation, livestock, and evolutionary databases.  A particular challenge relates to harmonizing phenotype annotations where they are linked to genetic variations at different levels of granularity – from complete strain genotypes, to specific gene alleles, to single nucleotide polymorphisms.

Monarch and SO Projects

The Monarch Initiative is a new effort that aims to integrate genotype-to-phenotype and related data from numerous sources under a common semantic framework, and develop tools and services for user-guided exploration and analysis. Towards this end, Monarch required development of new modeling for genotypes (housed in GENO), which was lacking in the ontology landscape. The scope of GENO necessarily overlaps with that of the Sequence Ontology, but has a unique perspective on sequence features as they relate to linking different scales of genetic variation and to organismal phenotypes. The need to align modeling between SO and GENO motivated our collaboration, which was particularly timely as the SO had recently initiated a refactoring to accommodate use cases beyond its initial charge of genome annotation. This refactoring aimed to  define the context of the SO with respect to the Basic Formal Ontology (BFO) and other OBO ontologies, enhance representation of sequence variation, and develop a parallel representation of material sequence features (MSO) to complement the abstract feature representation in the existing SO. These goals were consistent with those of Monarch to support better phenotype data integration and therefore a workshop was funded by the Phenotype RCN.

Genetic Variation in GENO

The genotype information modeled in GENO is broadly conceived to include any variation in gene expression that is tied to an observed phenotypic effect. Two types of ‘genetic variation’ are explicitly distinguished in GENO: (1) ‘Sequence-variation’ describes changes in the sequence of an organism’s genome, which are captured in the traditional genotypes shared by biologists. In this context, ‘sequence variant genes’ are heritable changes in genomic DNA, and include things like point mutations, SNPs, or transgenic insertions that are represented in SO. (2) ‘Expression-variation’ relates to experimental alterations in the expression level of genes that are not due to changes in the sequence of the subject’s genome. Here, ‘expression variant genes’ are genes whose level of expression is altered as a result of some experimental intervention, such as targeted gene knock-down using reagents such as morpholinos and RNAi, or transient expression from DNA constructs. Like sequence variants, these expression variants change what is expressed in an organism and can lead to measurable phenotypic outcomes. The GENO ontology aims to re-use and co-develop the SO sequence variation model, but the notion of expression variation was concluded to be outside the SO scope. Modeling in GENO will extend and be logically consistent with the SO approach and will leverage links to orthogonal ontologies to place variation in a broader biological context [3]. Additional information about the SO and GENO models and their interaction can be found in the presentation posted here [4].

Workshop Goals and Outcomes

One of the immediate goals of our workshop was to find consensus on high-level ontological issues that have yet to be resolved in the development of these and other OBO Foundry ontologies and document these decisions for the community.  Many such issues have been broadly debated for years, and our outcomes may be relevant for other domains or applications in biomedical research. Much progress was made in resolving key issues, and a plan was established for ongoing collaborative work.  Some outcomes are below, and more detailed notes can be found here [5].

  1. Terminological standardization of core terms.  Terms such as ‘sequence’, ‘gene’, ‘allele’, ‘variant’, ‘reference’, ‘mutant’, and ‘genetic’ are variably and ambiguously used across communities, and require precise definitions and consistent use.  Work is ongoing to craft such definitions, which will be reflected in our respective ontologies as they are refined and vetted.
  2. The ontological nature of sequences and sequence features (and their place in the BFO/IAO framework).  Specific topics included: (1) the merits and implications of modeling sequence features as generically dependent continuants, or more specifically as information content entities, (2) defining identity criteria for sequence features to include their sequence and their position (as opposed to sequence only), (3) how to model attributes of sequence features such as biological activity, experimental provenance, reference status, and zygosity, and (4) the ways in which sequence features are considered to vary with respect to each other (e.g. wild-type vs mutant sequences, reference vs alternate sequences).
  3. Gene representation, and modeling the central dogma. We debated strategies to provide an OWL-based ontological representation and identifiers for genes and their variants, that would serve SO, Monarch, and the broader phenotype community.  Related discussions focused on how to build from this gene representation to link to derived sequences at RNA and protein levels, and describe properties that emerge in this derivation.
  4. Variant representation.  A precise and explicit account of how the concept of ‘sequence variation’ should be defined across SO and GENO was established. In this model, a ‘variant’ is any sequence feature that varies_with some other instance of the same feature.  So sequence variants are considered to be ‘variant_with’ any other version of that feature, rather than ‘variants_of’ some reference. We will also represent more specific types of the ‘variant_with’ relation that describe the different ways biologists consider sequences to vary with each other, based on the roles that the variants in this relation hold (for example, where one is the reference and another an alternate version, or one is wild-type and the other mutant). This is a critical facet of relating phenotypes to genotypes.
  5. Integration of expression-level variation modeling in GENO with sequence-variation modeling in SO.  Here, the high level approach for representing expression variation in terms of genetic sequences that are altered in their expression was reviewed and vetted by members of Monarch and SO teams.  Several approaches for conceptual integration of the expression and sequence variation models are under consideration.
  6. Technical approaches for coordinated development.  Topics included how to manage parallel construction and coordination of abstract SO and physical MSO ontologies – where strategies for automated derivation of the SO from the MSO were reviewed.  In addition, we discussed how to manage community development of SO and GENO as integrated but separate ontologies, using existing platforms, tools, and standards for software development (Google projects, trackers, list-serves, build and QA tools, etc).

As noted above, more details on each of these topics, as well as many others, can be found in the document here [5].  Participation of the broader community is encouraged through feedback on this document or participation in ongoing coordination calls (contact for info).


  3. ICBO 2013 conference paper –
  4. Presentation to the Phenotype RCN, October 2013 –
  5. Google doc summarizing workshop outcomes –
Oct 22, 2013

The Journal of Biomedical Semantics is publishing a collection of articles related to biomedical ontologies and ontology updates, including some that address phenotype representation. As of today, the following have been published and provisional PDFs are available at the JBMS website:

• The Drosophila anatomy ontology

• The Drosophila phenotype ontology

• Enhanced XAO: the ontology of Xenopus anatomy and development underpins more accurate annotation of gene expression and queries on Xenbase

• Automatically transforming pre- to post-composed phenotypes: EQ-lising HPO and MP

• Function of dynamic models in systems biology: linking structure to behaviour

• Developing a semantically rich ontology for the biobank-administration domain

• Functional tissue units and their primary tissue motifs in multi-scale physiology

• Enrichment analysis applied to disease prognosis

• The Vertebrate Trait Ontology: a controlled vocabulary for the annotation of trait data across species