The Journal of Biomedical Semantics is publishing a collection of articles related to biomedical ontologies and ontology updates, including some that address phenotype representation. As of today, the following have been published and provisional PDFs are available at the JBMS website:
• P. Ciccarese et al: PAV ontology: provenance, authoring and versioning
• K. M. Livingston et al: Representing annotation compositionality and provenance for the Semantic Web
On Monday, October 7, the Phenotype RCN hosted a cross-working group call featuring presentations by Pier Luigi Buttigieg on the Environment Ontology and by Matthew Brush on a workshop focused on genetic variation representation.
Dr. Buttigieg, a post-doctoral research associate at the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, gave an overview of the structure and applications of the Environment Ontology (EnvO), a community ontology for the concise, controlled description of environments. It contains terms at the levels of biomes, environmental features, and environmental material. He talked about EnvO’s sister project, GAZ, an open-source gazetteer built on ontological principles, and EnvO’s adoption by the Encyclopedia of Life (EOL) to provide environmental context to taxon information. EnvO is of particular interest to the RCN because we would like to further explore capturing environments in relation to phenotype.
Matthew Brush, a member of the Ontology Development Group at Oregon Health & Science University, reported on an RCN-sponsored workshop to align genetic variation representation across the Sequence Ontology and Monarch Initiative, in support of efforts to link phenotypes to genotype data. There will be more posted on the RCN blog soon about the workshop and its outcomes.
The Phenotype RCN plans to host monthly calls the first Monday of every month at 8 a.m. Pacific / 11 a.m. Eastern time. If you would like to receive invitations to join via WebEx, please email Erik Segerdell. Suggestions for topics and volunteers for presenters are welcome!
An announcement for the next meeting, November 4, is coming soon.
The final day of the course was presentation day, where we got to hear about everyone’s research questions and how they were going to use (or not!) ontologies for their work. As a group, we brainstormed solutions and next steps for everyone’s projects. The group was very synergistic and we believe that we’ll see some nice contributions and connections being made in upcoming months in the ontology community. Highlights from the presentations are below:
Developmental comparative morphology of the chondrocranium in turtles
Developing a knowledge base of North American butterfly monitoring data
Comparative morphology of nematode anatomy and behavior
Connectivity and function of cranial and ventral musculature in catfishes
Representing diversity of form in The Reptile Database
Phenomics of Prokaryota for Tree of Life
Use of CharaParser and ontologies to generate plant character matrices from publications
Maria Christina Diaz:
Development of the Porifera ontology to represent sponge comparative anatomy and biodiversity
Characterizing gene retention after whole genome duplication, triplication
Semanticizing the Darwin Core
Gene regulation of fin positioning in fishes
Representing the semantic types in the Embryo Project
Phenomics of Eukaryotic microbes
We spent all of Day 3 learning Protege and exploring the use of various OWL2 capabilities. The students worked at their own pace, but all made it to the light at the end of the tunnel. They are all now enlightened (bearers_of some instance_of PATO:0001291). We had also converted the students’ VUE files into OWL, and some students were able to start work on their OWL files. Day 4 of the course was focused on developing skills around reuse of other ontologies and data interoperability. We learned techniques for performing OWL imports and the use of MIREOT (Minimum Information to Reference an External Ontology Term), which is basically a way to use a subset of another ontology in your own.
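For readers who have not met MIREOT before, here is a minimal sketch of the idea in plain Python. It is not what we ran in the course, and it does not use Protege or any OWL library. It assumes the usual reading of MIREOT's "minimum information": the source ontology, the source term's identifier, and the parent the term should sit under in your own ontology. The dictionary of external terms, all IRIs, and the function name `mireot_reference` are invented for illustration.

```python
# Toy stand-in for an external ontology: term IRI -> label.
# Every IRI and label below is invented for illustration.
EXTERNAL_LABELS = {
    "http://example.org/ext/EXT_0001": "fin",
    "http://example.org/ext/EXT_0002": "pectoral fin",
}


def mireot_reference(source_ontology, source_term, target_parent):
    """Build the minimal record for borrowing one external term."""
    if source_term not in EXTERNAL_LABELS:
        raise KeyError(f"unknown external term: {source_term}")
    return {
        "source_ontology": source_ontology,     # ontology the term comes from
        "source_term": source_term,             # original IRI, reused unchanged
        "target_parent": target_parent,         # where it sits in our ontology
        "label": EXTERNAL_LABELS[source_term],  # extra, for human readers
    }


# Borrow a single term instead of importing the whole external ontology.
borrowed = mireot_reference(
    "http://example.org/ext.owl",
    "http://example.org/ext/EXT_0002",
    "http://example.org/mine/MINE_0007",
)
print(borrowed["label"], "->", borrowed["target_parent"])
```

The point of the design is that the borrowed term keeps its original identifier, so annotations made with it stay interoperable with the source ontology, while only the handful of terms you need end up in your own file.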
We saved a discussion of homology for the end of the last full day, knowing full well that a) the discussion would be vigorous, b) people would have to break for food at some time, and c) the conversation could continue into the evening. Kudos to our students, as they seemed to immediately understand our community’s general approach to this subject, to the degree that nobody even foamed at the mouth.
Day two started with a “speed-dating” approach, with instructors pairing off for short periods with participants to strategize and work on individual projects. VUE files representing participants’ projects continued to be formalized; some now contain many nodes, and some are even very pretty. These visual representations will be translated into OWL files shortly, and further refined in Protege. The morning progressed into a presentation on annotations, where tools like Phenex and Phenote were outlined.
In the afternoon we had a great overview from Karen Cranston, PI on the OpenTreeOfLife project, and we discussed how ontologies may or may not be useful for projects like OToL. We continued with a survey of web-based resources related to evolutionary biology, with participants auditing well-known websites for their use, or lack thereof, of ontologies. The day concluded with the last bit of preparation prior to our big practical exercise on Protege on Wednesday: a nice overview of OWL, with specific reference to the (very nice) primer. We’re looking forward to the first real taste of formalization with the Protege tutorial, and the creation of individuals’ own ontologies.
Matt Yoder, Melissa Haendel, Erik Segerdell and Jim Balhoff
The first day of the NESCent- and Phenotype RCN-sponsored Ontologies Course started with introductions. What a diverse group we have this year, with expertise in: phylogenetics and muscle anatomy of South American fishes, gene retention in plants, reptile limb development, microbiology, systematics of nematodes, natural history collections, sponge taxonomy and evolution, herpetology and turtle anatomy, biodiversity standards and Darwin Core, ecology and Lepidoptera traits, text mining species and character matrices, mathematical models in evolutionary biology, developmental biology of fishes, and evolution of habitat and physiological traits in Cyanobacteria and Archaea. WOW!
The day was packed with a lot of lecturing about logic and how it can be your friend, true path violations, ontology best practices, and the community that we are convincing our students that they are now part of. However, we promised that on Tuesday we’d get them using what they had been taught and it would all make more sense once they got their hands dirty. The students had homework last night – they started working on modeling their own ontology project for the course in VUE, and we plan to convert these files to OWL with Jim’s new script so that they can continue their work in OWL following Wednesday’s Protege tutorial. We also had a very interesting discussion about the differences between specimens and samples, intent to collect, and whether or not populations or tissues can be target populations for sample collection.
Melissa Haendel, Matt Yoder, Erik Segerdell and Jim Balhoff
Are you interested in describing and linking biological data? Apply by June 1, 2013, for the Ontologies for Evolutionary Biology course at the National Evolutionary Synthesis Center, July 29 – August 3 in Durham, NC:
Evolutionary research has been revolutionized by the explosion of genetic information available, and ontologies must play a central role in relating this knowledge to observable diversity. Ontologies provide scaffolding that interconnects many kinds of observations; across species, they provide evolutionary, developmental, and mechanistic insights. The theme for this year’s course is “enrichment”. We aim to help participants enrich their research through the use of ontologies, to enrich existing ontologies with new content, and to bring new domain expertise to the ontology community.
There is a wealth of phenotypic information in the evolutionary literature that comes in the form of semi-structured character state descriptions. Getting that information into computable form is, right now, an awfully slow process. In Phenoscape I, we estimated that it took about five person-years in total to curate semantic phenotype annotations from 47 papers. If we are to get computable evolutionary phenotypes from a larger slice of the literature, we really need to figure out ways to speed this up.
One promising approach is to use text-mining. This could contribute in a few different ways. First, one could efficiently identify all the terms in the text that are not currently represented in ontologies and add them en masse, so that data curation does not have to stop and resume whenever such terms are encountered. Second, one could present a human curator with suggestions for what terms to use and what relations those terms have to one another, speeding the process of composing an annotation.
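To make the first of those two uses concrete, here is a toy sketch in Python of flagging words in character descriptions that have no matching ontology label. It is not how CharaParser works: it only looks at single lowercase words, with no grammar, synonyms or multi-word phrases. The stopword list, the sample descriptions, the labels and the function name `missing_terms` are all made up for illustration.

```python
import re

# Hypothetical stopword list; a real workflow would use a fuller one.
STOPWORDS = {"a", "an", "and", "in", "of", "or", "the", "with"}


def missing_terms(descriptions, ontology_labels):
    """Return words absent from the ontology labels, most frequent first."""
    known = {label.lower() for label in ontology_labels}
    counts = {}
    for text in descriptions:
        # Split into lowercase words, keeping internal hyphens together.
        for word in re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()):
            if word not in known and word not in STOPWORDS:
                counts[word] = counts.get(word, 0) + 1
    # Sort by descending frequency, then alphabetically for stable output.
    return sorted(counts, key=lambda w: (-counts[w], w))


# Invented character state descriptions and ontology labels.
descriptions = ["Pectoral fin spine serrated", "Dorsal fin spine absent"]
labels = ["fin", "absent", "pectoral", "dorsal"]
candidates = missing_terms(descriptions, labels)
print(candidates)
```

A list like this could be handed to ontology editors in one batch, which is the "add them en masse" step described above.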
CharaParser, developed by Hong Cui at the University of Arizona, is an expert-based system that decomposes character descriptions into recognizable grammatical components, and it is now being used in several different biodiversity informatics projects. Baseline evaluation results from BioCreative III showed that a naive workflow combining CharaParser and Phenex, the software curators use to compose ontological annotations and relate them to character states, was capable of identifying candidate entity and quality phrases (it outperformed biocurators by 20% in recall on average) but had difficulty translating those into ontological annotations. This first-iteration workflow was also not yet reducing curation time.
In March, a small contingent from NESCent (Jim Balhoff, Hilmar Lapp and Todd Vision) visited Hong Cui’s group in Tucson. We talked through improvements to CharaParser and the curation workflow, brainstormed plans for a more thorough set of evaluation tests, began refactoring of the code so that it can be more easily shared across projects, and gained a better understanding of what features make a character difficult to curate for humans vs. text-mining. We made substantial progress on all fronts, and are looking forward to seeing how much improvement in the accuracy and efficiency of curation will be achieved in the next round of testing.
We are also pleased to report that the CharaParser codebase will now be available from GitHub under an open source (MIT) license.
By Peter Midford and George Gkoutos
We held a one-day behavior ontologies workshop on Sunday, February 24, immediately prior to this year’s RCN summit. Our goals were to bring ontology developers and behavioral biologists together to review the NBO (NeuroBehavior Ontology) as well as discuss its use and interoperability with other ontologies. We started the day with a series of short talks: George Gkoutos and Robert Hoehndorf explaining the development and initial applications of the NBO, followed by six speakers who volunteered to discuss related topics.
Björn Brembs presented a data workflow that captured Drosophila movements in the course of a ‘choice’ experiment. The flow went from raw video to depositing data in figshare, via R, and finished by showing the role of NBO annotations in the final deposit. Melissa Haendel raised several issues related to capturing behavior observations using ontologies: What does behavior inhere in? How to relate observations across species? How do measurements and observations relate to phenotypes or conditions? David Osumi-Sutherland discussed the application of behavior terms in annotations within the Virtual Fly Brain (http://www.virtualflybrain.org). Janna Hastings discussed two new ontologies for Emotions (https://code.google.com/p/emotion-ontology/) and Mental Functioning (https://code.google.com/p/mental-functioning-ontology/) and both their relationships to the NBO and their application to mental disease. Christine Wall introduced an ontology of processes involved in mammalian feeding, which looked like a good candidate for inclusion in NBO and raised important questions of representation of sequential behavior events and behaviors existing on a continuum. Finally, Allan Kalueff introduced a community-developed catalog of zebrafish behavior.
We followed this with a morning breakout session with groups selected by areas of taxonomic focus: arthropods, non-mammalian vertebrates, non-human mammals, and humans. When the breakout groups reported out, there were some common concerns about taxon specificity of terms, both in text definitions and in their placement in the hierarchy – the latter potentially leading to incorrect inferences for taxa not considered during development of the ontology. There were questions about behaviors, social and otherwise, involving more than one organism, and the role of abnormal and ‘clinic’ behavior phenotypes. Finally, one group looked at several previous efforts to construct behavior ontologies (e.g., the ABO constructed at a series of workshops, and David Shotton’s SABO project).
After lunch, we proposed and discussed topics for a new set of breakouts, and settled on Application to Behavioral Ecology, Representing Affective Behavior, and a group reviewing the behavior process branch, with NBO developers George Gkoutos and Robert Hoehndorf soliciting suggestions for high
The Behavioral Ecology session brought a group of behavioral ecologists together with Chris Mungall to discuss the ABO ontology and how it might be integrated with the NBO. Anne Clark and Sue Margulis discussed how the ABO had been used in the development of the Ethosearch tool, an online collection of text ethograms indexed with terms from the ABO. They also offered to contribute a collection of text definitions they had written during the Ethosearch effort. The consensus was that the ontologies were fairly compatible and that it would be desirable to graft portions of the NBO into the ABO. The group also agreed that the learning and cognition sections of the NBO should be a priority area for review, as both structure and definitions suffered from species specificity.
The review group wound up focussing on terms for voluntary and involuntary movement, an issue that came up in the invertebrate morning breakout as well. There was discussion of reflexes, of which the NBO has a large number, many of which are human or mammal specific, but of significant clinical
The report-out from the affective behavior group generated a lively discussion that started by addressing the conflation between observable behavior (e.g., smiling) and an inferred diagnosis (emotional happiness). Although this distinction between observable behavior and inferred emotion (which might belong in the emotion-ontology) is straightforward, other behavior terms (‘agoraphobic behavior’) conflate behavior and diagnosis. There was also discussion of fear-related terms in general, whether these might be too human-centric, and what the scope of the NBO was; in particular, whether the NBO would apply to plants or even paramecia, which have been the subject of multiple ethograms in the past 15 years. The consensus appeared to be that NBO should apply to animals with nervous systems, and that other types of behavior ought to be welcome additions to the Biological Process branch of the Gene Ontology. There was also discussion of whether terms of the form ‘behavioral control of x’, where x was a process such as defecation or lacrimation, were meaningfully different from the underlying
The discussion of affective terms provided a nice transition to Barry Smith’s presentation ‘On the Future of the NeuroBehavior Ontology and Its Relation to the Mental Functioning Ontology.’ After reviewing the partitioning of domains of biological knowledge by various OBO ontologies, Barry made the case that the portion of Biological Process that applied to whole organisms needed to be split between the NBO for observable behavior and the complementary Mental Functioning Ontology (MFO). The MFO will cover terms related to mental states and processes, for example sensory perception. Perception is not an observable behavior, though there are behaviors associated with perception (e.g., head turning, flehmen response). He recommended that NBO retain the prefix NBO, but be considered the (narrow) Behavior Ontology. He also recommended that the feeding ontology developed by the FEED project be incorporated into NBO, that merging the ABO ontology be explored, perhaps scoping behavioral terms taxonomically (e.g., with ‘occurs-in-taxon’) when appropriate, and that a separate version of the NBO be created that marks the human-specific terms. He also thought we shouldn’t be spending a lot of time discussing what is and isn’t behavior.
We finished the day by discussing next steps and deciding on the best routes for providing feedback to the NBO developers. In regard to feedback routes, we looked at several options: the OBO-behavior list, the tracker associated with the NBO repository on Google Code, as well as the notes mechanism within the NCBO Bioportal and Ontobee. We decided that the OBO-behavior list and the Google Code tracker were adequate at this time. George also said he would add some new committers to facilitate additions from the FEED ontology, the ABO, as well as terms from the community-developed list of zebrafish terms in collaboration with ZFIN (this has been done).
We discussed next steps in terms of ontology work, publications and funding. There was interest in proposing a behavior-ontology-focussed RCN to fund workshops. There is interest among the behavioral ecology attendees in proposing two followups, the first being a hackathon for ontology developers (perhaps 3 days) to clear up the ontology issues in the NBO (such as the relation between the process and phenotype branches), which could be followed by a workshop for a review from the perspective of behavioral biology, perhaps in the space between ISBE and ABS in summer 2014. There is a sense that prior to seeking major funding, we should generate more publications, and that the data and use cases are there to demonstrate the value of behavior ontologies, as several presentations during the day had already shown. One suggestion was to look at disease terms in the NBO and look for clusters of behavior phenotypes associated with those terms. Given the importance of behavior in model organism communities, the group expected that both the NIH and EU funding agencies would be interested in supporting further work with the NBO.
We broke up shortly before 6 PM, though the behavior thread continued throughout the RCN summit. The application of ontologies to behavior still lags behind the use of anatomy ontologies, but given the chance to learn from experience with anatomical ontologies and the enthusiasm expressed at the workshop, there is reason to be optimistic about the future development and application of behavior ontologies.