CARO/PCO Oregon Summit 2014
Before the post-Thanksgiving haze had lifted, a small group of ontologists (Melissa Haendel, Chris Mungall, David Osumi-Sutherland, and Ramona Walls) converged on the lovely small town of Brownsville, Oregon, to work on the Common Anatomy Reference Ontology (CARO), the Population and Community Ontology (PCO), and PATO, an ontology of biological qualities. This work was done within the context of the larger group of ontologies that make use of or are used by CARO (UBERON, GO, CL).
CARO is a relatively small upper ontology with ~165 classes and a few core relations that is used to link taxon-specific anatomy ontologies ranging from fruit flies to vertebrates to plants. The 1.0 release of CARO has been widely used, but usage has been quite inconsistent and sometimes incorrect. This is partly due to lack of clarity in some definitions, but also because it was written at a time when we lacked the tools to provide automated reports of incorrect usage.
PCO is a recently developed ontology focussing on populations, communities and the relationships between organisms. The definitions of organism types in CARO are critically important for this ontology, as are the biological qualities applying to groups of organisms in PATO.
PATO, an ontology of biological qualities, has been very widely used by the community brought together by the Phenotype RCN as well as in defining classes in a wide range of other ontologies used by this community (covering phenotypes, anatomy, cell types and populations). So far, PATO has had limited axiomatisation, but there are many obvious cases where axiomatisation could improve its integration with ontologies that use it – including the PCO and anatomy ontologies.
A major aim of our work on CARO at this meeting was to redraft textual definitions so that they could be understood by any competent biologist and to redraft logical definitions so that they could be used for automated classification and error checking. For both logical and textual definitions, we aimed to focus on distinctions that are important to biologists – either directly, or indirectly by making biologically useful queries possible. We also aimed to take into account new use cases that have arisen since CARO 1.0 was released, as a result of work on the PCO as well as on anatomy ontologies and the ontologies and tools that use them. In parallel with this work, we aimed to improve related axiomatisation of PATO.
Over two and a half days of leftover turkey, home-fermented vegetables, and farm-fresh eggs, we took care of operational issues such as repository maintenance, as well as more hard-core ontologizing. A highlight of the meeting was an informal gathering on Monday night when we were joined by Laurel Cooper and John Campbell from Oregon State University and Joe Fontaine from Murdoch University to discuss the intersections of ontologies, ecology, plant traits, and biodiversity.
Key outputs of the meeting were:
- A fresh github repo (https://github.com/obophenotype/caro/) for CARO, with cleaned up imports.
- New CARO terms, including terms for multicellular anatomical structure and expression pattern, and a general term for organs.
- Revised text and logical definitions for most CARO terms, including anatomical structure, cellular organism, and organ (figure 1, Vue file that shows the key classes and which files they live in).
- Draft ontology design patterns (ODPs) for expression patterns and for anatomical structures with internal spaces (lumens).
- Further development of PCO, including updating import files and testing ODPs for defining collections of organisms and species/organism interactions.
- A pending beta release of CARO2.0 and plans for how to announce it.
- Better formalization of PATO through general class axioms (GCIs) necessary for CARO and PCO.
- A Jenkins job that reports on and verifies ontologies that use CARO (FBBT, PO, XAO, and ZFA).
- A draft paper on CARO2.0.
One of the key use cases for anatomy ontologies is annotation of gene expression, and we wanted a way to help curators avoid the pitfall of annotating expression to the (immaterial) space that is part of a structure rather than the (material) structure that surrounds it. We propose a design pattern in which any structure that has an interior space (such as stomach) would be modeled using four classes: one for the entire structure (which includes both the surrounding structure and the space that is part of it), one for the space, one for the wall (which is just the surrounding structure without the space) and one for “wall region”. A wall region is any portion of the wall that spans the full thickness of the wall for its entire lateral extent, whereas the wall is the mereotopological sum of all wall regions. Following this pattern, an ontology that wished to include a stomach would have classes for “stomach”, “stomach lumen”, “stomach wall”, and “region of stomach wall”. We opted against including very general classes such as “wall” or “wall region” in CARO, and instead plan to document the pattern and provide a template for its use in anatomy ontologies.
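As a rough illustration of what a template for this pattern might look like, here is a minimal Python sketch that generates the four class labels for a given structure. The names `PatternClass`, `lumen_pattern`, and `expression_targets`, the `role` values, and the simple `part_of` links are all hypothetical and chosen for illustration; they are not part of CARO or of any released template.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PatternClass:
    label: str               # human-readable class label
    role: str                # "whole", "space", "wall", or "wall region"
    part_of: Optional[str]   # label of the containing class, if any


def lumen_pattern(structure: str) -> List[PatternClass]:
    """Return the four classes the pattern calls for, given a structure name."""
    wall = f"{structure} wall"
    return [
        PatternClass(structure, "whole", None),
        PatternClass(f"{structure} lumen", "space", structure),
        PatternClass(wall, "wall", structure),
        PatternClass(f"region of {wall}", "wall region", wall),
    ]


def expression_targets(classes: List[PatternClass]) -> List[str]:
    """Labels suitable for expression annotation.

    Assumption for this sketch: every class except the immaterial
    space is an acceptable annotation target.
    """
    return [c.label for c in classes if c.role != "space"]


stomach = lumen_pattern("stomach")
print([c.label for c in stomach])
print(expression_targets(stomach))
```

A real template would emit OWL classes with proper identifiers and definitions; the point here is only that the four labels and their part relationships follow mechanically from the name of the structure.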
One way of specifying structures that have a geometric component, such as a stomach, is through the use of GCIs in PATO. PATO includes a number of classes for qualities describing shape. Of these, lumenized, tubular, and saccular are the most relevant to CARO. We began adding GCIs to PATO of the form:
- bearer_of some lumenized subClassOf ‘has part’ some lumen
- bearer_of some unlumenized subClassOf not (‘has part’ some lumen)
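To show the effect these two axioms are meant to have, here is a toy Python sketch that applies them to a single structure. It is not an OWL reasoner: the function name `apply_gcis` and the plain-string quality labels are assumptions made for illustration, and a real check would be run with a reasoner over the ontology itself.

```python
from typing import Set, Tuple


def apply_gcis(qualities: Set[str], has_lumen_asserted: bool) -> Tuple[bool, bool]:
    """Apply the two GCIs to one structure.

    qualities          -- labels of the PATO qualities the structure bears
    has_lumen_asserted -- True if a lumen part is explicitly asserted

    Returns (has_lumen_entailed, consistent).
    """
    # GCI 1: bearer_of some lumenized  =>  'has part' some lumen
    has_lumen = has_lumen_asserted or "lumenized" in qualities
    # GCI 2: bearer_of some unlumenized  =>  not ('has part' some lumen)
    consistent = not ("unlumenized" in qualities and has_lumen)
    return has_lumen, consistent


print(apply_gcis({"lumenized"}, False))    # lumen is entailed
print(apply_gcis({"unlumenized"}, True))   # contradiction is flagged
```

Note that a result of `False` for the first value only means that a lumen is not entailed by these inputs; under OWL's open-world semantics it does not mean the structure lacks one.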
An open question remains on how to document these patterns (in CARO or as separate patterns). One possibility is for CARO to include abstract geometrical classes such as “anatomical tube” or “anatomical tube wall” and “tube lumen”.
Stay tuned for another post soon, with the upcoming release of CARO!