This spring saw three related meetings (two workshops and a hackathon) aimed at advancing development of the Biological Collections Ontology (BCO) and the Population and Community Ontology (PCO) and at developing tools to annotate data using those and other ontologies. The first two meetings were held February 18-20, 2014 in the iPlant offices in Tucson, AZ, right before the Phenotype RCN annual meeting, and were supported in part by the Phenotype RCN. The third meeting was held concurrently with the 16th Genomic Standards Consortium (GSC) Meeting at Pembroke College in Oxford, England from March 31 – April 2. Additional support for all three meetings was provided by EAGER: An Interoperable Information Infrastructure for Biodiversity Research, RCN4GSC: A Research Coordination Network for the Genomic Standards Consortium, and BiSciCol Tracker: Towards a tagging and tracking infrastructure for biodiversity science collections, with logistical support from iPlant and the GSC.
At the first meeting, ten in-person and three remote participants gathered use cases to help grow the PCO, a relatively new ontology that describes collections of organisms such as populations and communities, as well as qualities and processes related to those collections. The PCO can be used to describe any collection of organisms (or viruses or viroids), from microbes to humans, whether the collection consists of one or multiple taxa. Over one and a half days, we came up with a preliminary list of factors by which organisms are grouped into populations or communities, developed an ontology design pattern for describing membership in a group of organisms, defined several new PCO terms for specific use cases, made decisions about modeling challenging concepts such as ecological niche (which spans both PCO and ENVO), and decided to provide pre-composed terms for those characteristics of populations that are not taxon-specific and cannot be defined as derived from individual measurements. In addition, there were many lively discussions about the nature of an organism or population and how our expanding knowledge of the microbial world might turn everything we know on its head.
The second workshop focused on mapping datasets to ontology terms and converting them to Resource Description Framework (RDF), using the BCO, an ontology that describes field-based biological sampling processes and observations, as well as material entities and roles associated with those processes. During another intense one and a half days, 18 in-person participants and one remote participant coordinated development among BCO, OBI, and ENVO, created a concept map for DNA marker gene studies that led to new terms for OBI and a manuscript submitted to the International Conference on Biomedical Ontology, and did a first-pass mapping of Darwin Core terms to ontology terms. In addition, we mapped three datasets to the BCO, converted them to RDF triples, and ran preliminary queries. At the end of the third day, about half of the participants climbed into a van to take part in another three jam-packed days of meetings at Biosphere 2, hosted by the Phenotype RCN. Our northern European colleagues were particularly happy to see the sunshine for the first time in months.
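For readers unfamiliar with what "mapping a dataset to terms and converting it to RDF" involves, here is a minimal sketch in Python. It is not the workshop's actual tooling. It assumes a record with a handful of columns named after Darwin Core terms, uses the public Darwin Core term namespace as a stand-in for BCO class identifiers (which are not reproduced here), and mints subject identifiers under a hypothetical `example.org` base.

```python
# Illustrative only: turn one tabular record into N-Triples lines.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DWC = "http://rs.tdwg.org/dwc/terms/"          # Darwin Core term namespace
BASE = "http://example.org/occurrence/"        # hypothetical identifier base

# Assumed mapping from spreadsheet column names to property IRIs.
COLUMN_TO_PROPERTY = {
    "scientificName": DWC + "scientificName",
    "eventDate": DWC + "eventDate",
    "country": DWC + "country",
}


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(row):
    """Return a list of N-Triples lines describing one record."""
    subject = f"<{BASE}{row['id']}>"
    # Type the record; a BCO-based mapping would use an ontology class here.
    lines = [f"{subject} <{RDF_TYPE}> <{DWC}Occurrence> ."]
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row.get(column)
        if value:  # skip empty cells rather than emit empty literals
            lines.append(f'{subject} <{prop}> "{escape_literal(value)}" .')
    return lines


if __name__ == "__main__":
    record = {"id": "occ1", "scientificName": "Quercus alba",
              "eventDate": "", "country": "US"}
    print("\n".join(row_to_ntriples(record)))
```

The resulting lines can be loaded into any triple store and queried; swapping the stand-in IRIs for ontology terms is the part the workshop mapping exercise addressed.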
To help counteract the pleasant weather in Arizona and follow up on some of the ideas generated during the workshop in Tucson, we decided to hold a BCO hackathon in Oxford six weeks later. In our honor, temperatures in the UK jumped 20 degrees (Fahrenheit) the week we were there, sparing me total weather shock. The hackathon was smaller (seven full-time participants plus a few part-time) and focused on generating concrete products. Over the course of four days, we converted an additional dataset to RDF, developed a Material Sample Core for the Global Biodiversity Information Facility (GBIF), created a Web Ontology Language (OWL) file for importing Darwin Core classes and properties into BCO, developed a workflow for converting biodiversity data among formats, prepared an updated version of the BCO for release, and completed a proof-of-concept tool that converts existing RDF outputs to Darwin Core Archive format using an ontology specification. We also took part in several of the main meeting sessions of the GSC and reported on our work to the larger group.
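To give a sense of the reverse direction (RDF back to a Darwin Core style table), here is a rough Python sketch. It is not the hackathon's proof-of-concept tool: instead of reading an ontology specification, it simply assumes that any predicate in the Darwin Core term namespace becomes a column, it only handles plain literal objects in N-Triples, and it produces just the data table (a full Darwin Core Archive also needs a descriptor file, which is not shown).

```python
# Illustrative only: pivot simple N-Triples into a CSV table.
import csv
import io
import re

DWC = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core term namespace

# Matches: <subject> <predicate> "plain literal" .
# Literal escapes are kept as-is; typed or language-tagged literals are ignored.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"\s*\.\s*$')


def triples_to_rows(ntriples_text):
    """Group Darwin Core literal triples by subject, one dict per subject."""
    rows = {}
    for line in ntriples_text.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # skip IRI-valued triples, comments, and blank lines
        subject, predicate, literal = match.groups()
        if not predicate.startswith(DWC):
            continue  # assumption: only Darwin Core predicates become columns
        column = predicate[len(DWC):]
        rows.setdefault(subject, {"id": subject})[column] = literal
    return list(rows.values())


def rows_to_csv(rows):
    """Write rows as CSV text with 'id' first and other columns sorted."""
    columns = ["id"] + sorted({k for row in rows for k in row if k != "id"})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing cells are written as empty strings
    return buffer.getvalue()
```

Driving the column selection from an ontology file rather than a hard-coded namespace is what would make such a converter reusable across datasets.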
A more detailed report describing these three meetings has been submitted to Standards in Genomic Sciences.
Submitted by Ramona Walls, on behalf of co-organizers John Deck and Rob Guralnick and all of the participants.