Jul 21, 2014
 

Over the years, a number of different vertebrate anatomy ontologies have been developed. Some of these are dedicated to a single model species, or to human. Others have been developed to describe phenotypic variation across species, and these cover a broad range of species. In particular:

This led to considerable duplication of effort, as common anatomical
structures such as ‘pectoral girdle’ were represented in all five ontologies
(as well as their single-species counterparts):


Pectoral girdle and related concepts in Uberon, with cross-references to other ontologies shown (Fig 1, Haendel et al)

It was difficult for the Phenoscape group to integrate data across all these ontologies, as this required that curators kept mutual cross-references up to date, a time-consuming and error-prone task.
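
To see why hand-maintained mutual cross-references are fragile, here is a minimal sketch that clusters terms into equivalence groups by following cross-reference pairs (a plain union-find over the pairs). The term identifiers are made-up placeholders, not real IDs from any of the ontologies discussed, and the function name is an assumption for illustration; this is not how Phenoscape or Uberon actually store or process cross-references.

```python
def equivalence_groups(xref_pairs):
    """Cluster term IDs that are connected by cross-references."""
    parent = {}

    def find(term):
        # Return the representative of the group containing `term`.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    # Merge the groups of every cross-referenced pair.
    for left, right in xref_pairs:
        parent[find(left)] = find(right)

    # Collect the members of each group under its representative.
    groups = {}
    for term in parent:
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical cross-references between three placeholder ontologies.
pairs = [
    ("ONT_A:0001", "ONT_B:0107"),
    ("ONT_B:0107", "ONT_C:0042"),
]
print(equivalence_groups(pairs))
```

If a curator forgets the second pair, `ONT_C:0042` silently drops out of the group, and any data annotated with it is no longer integrated with the rest. A single shared ontology removes the need to keep every such pair in sync.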

As a result, the maintainers of these ontologies agreed to join forces and build a common ontology. This work is described in a new paper in the ontologies special issue of the Journal of Biomedical Semantics:

Haendel MA, Balhoff JP, Bastian FB, et al. Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon. Journal of Biomedical Semantics 2014, 5:21. doi:10.1186/2041-1480-5-21

The group selected Uberon as the core ontology, as it had the broadest coverage, was already well-integrated with the single-species ontologies, and was adapted for OWL reasoning. The curators of these ontologies worked long and hard to integrate their work, with input from anatomy experts and developers of single-species ontologies, revealing many interesting differences in the way structures are represented across species along the way. For example, the representation of teeth in the combined ontology had to be flexible enough to accommodate teeth that are in widely variable locations and configurations:

Figure 4. Diversity of tooth locations

The number of classes merged is shown in figure 2 of the paper:

Figure 2. Overlap and contributions from source ontologies. A) Venn diagram showing the extent of cross-referenced content between msAOs prior to the merge. B) Ontology evolution and integration into Uberon

 

As a result of this effort, we have a common anatomy ontology with broad and deep coverage for vertebrate anatomy. For a variety of viewing options, see the Uberon website. For examples of use for data integration see:

As with most ontologies, work is ongoing, and we are constantly striving to improve depth, coverage and quality. We’re currently improving the representation of facial muscles in the ontology based on the FEED ontology. We are also working on a federated approach for bringing in invertebrate anatomy ontologies, many of which are developed under the auspices of the Phenotype RCN, including the Arthropod Anatomy Ontology, the Poriferan anatomy ontology [Thacker et al, accepted, JBMS], the cephalopod ontology and the ctenophore ontology. We welcome feedback from everyone!

Jun 18, 2014
 

This spring saw three related meetings (two workshops and a hackathon) aimed at advancing development of the Biological Collections Ontology (BCO) and the Population and Community Ontology (PCO) and developing tools to annotate data using those and other ontologies. The first two meetings were held from February 18-20, 2014 in the iPlant offices in Tucson, AZ, right before the Phenotype RCN annual meeting, and were supported in part by the Phenotype RCN. The third meeting was held concurrently with the 16th Genomic Standards Consortium (GSC) Meeting at Pembroke College in Oxford, England from March 31 – April 2. Additional support for all three meetings was provided by EAGER: An Interoperable Information Infrastructure for Biodiversity Research, RCN4GSC: A Research Coordination Network for the Genomic Standards Consortium, and BiSciCol Tracker: Towards a tagging and tracking infrastructure for biodiversity science collections, with logistic support from iPlant and the GSC.

At the first meeting, ten in-person and three remote participants gathered use cases to help grow the PCO, a relatively new ontology that describes collections of organisms such as populations and communities as well as qualities and processes related to those collections. The PCO can be used to describe any collection of organisms (or viruses or viroids), from microbes to humans, whether the collection consists of one or multiple taxa. During one and a half days, we came up with a preliminary list of factors by which organisms are grouped into populations or communities, developed an ontology design pattern for how to describe membership in a group of organisms, defined several new PCO terms for specific use cases, made decisions about modeling challenging concepts such as ecological niche (spanning both PCO and ENVO), and decided to provide pre-composed terms for those characteristics of populations that are not taxon specific and cannot be defined as derived from individual measurements. In addition, there were many lively discussions about the nature of an organism or population and how our expanding knowledge of the microbial world might turn everything we know on its head.

The second workshop focused on mapping datasets to ontology terms and converting them to Resource Description Framework (RDF), using the BCO, an ontology that describes field-based biological sampling processes and observations, as well as material entities and roles associated with those processes. During another intense one and a half days, 18 in-person participants and one remote participant coordinated development among BCO, OBI, and ENVO, created a concept map for DNA marker gene studies that led to new terms for OBI and a manuscript submitted to the International Conference on Biomedical Ontology, and did a first-pass mapping of Darwin Core terms to ontology terms. In addition, we mapped three data sets to the BCO, converted them to RDF triple stores, and ran preliminary queries. At the end of the third day, about half of the participants climbed into a van to take part in another three jam-packed days of meetings at Biosphere 2, hosted by the Phenotype RCN. Our northern European colleagues were particularly happy to see the sunshine for the first time in months.
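
For readers who have not done this kind of conversion, the following is a minimal sketch of turning one tabular record into RDF, written out as N-Triples with only the Python standard library. The predicates are Darwin Core term IRIs; the column names, the sample record, the `example.org` subject IRI, and the function names are all assumptions made for illustration. It does not reproduce the workshop datasets or the BCO mappings, which are not described in this post.

```python
# Darwin Core terms namespace.
DWC = "http://rs.tdwg.org/dwc/terms/"

# Assumed mapping from spreadsheet column names to Darwin Core term IRIs.
COLUMN_TO_TERM = {
    "catalog_number": DWC + "catalogNumber",
    "scientific_name": DWC + "scientificName",
    "event_date": DWC + "eventDate",
}


def escape_literal(value):
    """Escape backslashes, double quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(subject_iri, record):
    """Return one N-Triples line for each mapped, non-empty column of the record."""
    lines = []
    for column, term in COLUMN_TO_TERM.items():
        value = record.get(column)
        if value:
            lines.append(f'<{subject_iri}> <{term}> "{escape_literal(value)}" .')
    return lines


# Hypothetical record and subject IRI, for illustration only.
record = {
    "catalog_number": "EX-0001",
    "scientific_name": "Bombus impatiens",
    "event_date": "2014-02-19",
}
for line in record_to_ntriples("http://example.org/sample/EX-0001", record):
    print(line)
```

In practice a library such as rdflib and a triple store would replace the string handling here; the point of the sketch is only that the mapping table, not the code, carries the semantics.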

To help counteract the pleasant weather in Arizona and follow up on some of the ideas generated during the workshop in Tucson, we decided to hold a BCO hackathon in Oxford six weeks later. In our honor, temperatures in the UK jumped 20 degrees (Fahrenheit) the week we were there, sparing me total weather shock. The hackathon was smaller (7 full-time participants plus a few part-time), and focused on generating concrete products. Over the course of four days, we coded an additional dataset to RDF, developed a Material Sample Core for the Global Biodiversity Information Facility (GBIF), created a Web Ontology Language (OWL) file for importing Darwin Core classes and properties into BCO, developed a workflow for converting biodiversity data among formats, prepared an updated version of the BCO for release, and completed a proof-of-concept conversion tool that converts existing RDF outputs to Darwin Core Archive format using an ontology specification. We also took part in several of the main meeting sessions of the GSC and reported on our work to the larger group.
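
The conversion tool itself is not described in this post, so the following is only a loose sketch of the general idea of flattening RDF triples back into a table. It assumes the triples are already available as in-memory `(subject, predicate, value)` tuples and derives each column name from the local name of the predicate IRI; the sample triples and function names are hypothetical. A real Darwin Core Archive also needs a `meta.xml` descriptor and zip packaging, both omitted here.

```python
import csv
import io

# Darwin Core terms namespace.
DWC = "http://rs.tdwg.org/dwc/terms/"


def local_name(iri):
    """Return the part of an IRI after the last '/' or '#'."""
    return iri.replace("#", "/").rstrip("/").rsplit("/", 1)[-1]


def triples_to_rows(triples):
    """Group (subject, predicate, value) triples into one dict per subject."""
    rows = {}
    for subject, predicate, value in triples:
        row = rows.setdefault(subject, {"id": subject})
        row[local_name(predicate)] = value
    return list(rows.values())


def rows_to_csv(rows):
    """Serialize rows to CSV text, using the union of all keys as columns."""
    columns = ["id"] + sorted({key for row in rows for key in row} - {"id"})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical triples, for illustration only.
triples = [
    ("http://example.org/sample/EX-0001", DWC + "scientificName", "Bombus impatiens"),
    ("http://example.org/sample/EX-0001", DWC + "eventDate", "2014-03-31"),
]
print(rows_to_csv(triples_to_rows(triples)))
```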

A more detailed report describing these three meetings has been submitted to Standards in Genomic Sciences.

Submitted by Ramona Walls, on behalf of co-organizers John Deck and Rob Guralnick and all of the participants.

Posted on June 18, 2014 at 2:14 pm
Jun 10, 2014
 

Calling all Phenotype RCNers and anyone else who works with phenotype data – We want your name on a manuscript supporting a computable phenotypes future! (If you read and agree, of course.)

Over the past four years of sponsoring meetings, courses, and exchanges, we have, with your help and participation, developed a comprehensive understanding of where the phenotype community stands, what is needed for integration of phenotypes with other data, and a vision of the science that could be achieved with this integration. In this article, we attempt to educate researchers, granting agencies, and policy makers on the current ‘non-computable’ state of phenotypic data across various life science domains, and we try to motivate them to use, develop, and advocate for semantic methods. Because this work is relevant to most areas of the biological sciences, because it relates specifically to creating interdisciplinary knowledge, and especially because the journal is open access, PLoS Biology is our target journal.

  1. The manuscript is here: http://bit.ly/PhenotypeMS (Google doc). A Word version (.docx) with line numbers is also available if you prefer.
  2. The form to add your comments, suggestions, references, and especially your author information is here: http://bit.ly/PhenotypeComments

Please respond by 18 June 2014 (next Wednesday). We will post updates here on our blog.

‘Branching’ phenotypes are not easily recovered from free text (far right column), the format in which most organismal phenotypes are recorded. (top row) Bee setae are usually modified in a way that presumably facilitates pollen collection, a €153 billion ecosystem service. This relatively simple phenotype has been described in myriad ways. Photo of bumble bee covered in pollen by Thomas Bresson (source). Photo of seta interacting with pollen grain by István Mikó (source). (middle row) Plant trichomes take on many forms and likewise are described using many lexicons. Photo of Arabidopsis plants covered in hair-like structures (trichomes) by BlueRidgeKitties (source). Scanning electron micrograph of Arabidopsis trichome by Heiti Paves (source). (bottom row) In zebrafish larvae, angiogenesis starts with vessels branching to form a network (right image) that is referred to in disparate ways. Zebrafish embryo photo by MichianaSTEM (source). Zebrafish blood vessels image is Figure 5A from Alvarez et al. 2009.

Posted on June 10, 2014 at 2:46 am
May 30, 2014
 

Dear Phenotype RCN community.  Please take a moment to help NSF identify priorities for investment in Genome-Phenome research.  These will be translated into funding solicitations relevant to you!

John Wingfield, Assistant Director of the National Science Foundation Directorate for Biological Sciences (BIO), is pleased to announce the posting of a Wiki to seek community input on the grand challenge of understanding the complex relationship between genomes and phenomes.  The Wiki is intended to facilitate discussion among researchers in diverse disciplines that intersect with biology, such as computation, mathematics, engineering, physics, and chemistry. The Wiki format encourages open communication, captures new viewpoints, and promotes free exchange of ideas about the bottlenecks that impede progress on the genomes-phenomes grand challenge and approaches or strategies to overcome these challenges. Information provided through the Wiki will help inform BIO’s future research investments and activities relevant to understanding genomes-phenomes relationships.

To provide comments, ask questions, view input from other community members, and interact with them, first-time users should sign up for an account via this link: Sign-up. Once registered, users will be directed to the main page of the NSF Wiki to accept the terms and conditions before proceeding. Additional guidance and subsequent visits can be accessed via this link: Genomes-Phenomes Wiki. Community members should feel free to forward notice of this to anyone they think might be interested in contributing to the discussion. Questions regarding the Wiki should be sent to bio-gen-phen@nsf.gov.

Posted on May 30, 2014 at 7:39 pm
Mar 07, 2014
 

Landscape at Catalina State Park, near Biosphere 2 in Arizona. A great place to observe arthropod phenotypes! Photo by Andy Deans (CC BY 2.0)

The Arthropod Working Group of the Phenotype RCN stayed an extra day at Biosphere 2, after the annual group summit meeting, so that we could take stock of our own progress and discuss future interactions. We’re a heterogeneous crowd, each working on a different taxon (non-Hexapod Pancrustacea, Araneae, Hymenoptera, Coleoptera), often on different systems (integument, circulatory, neuroanatomy, etc.), and with different motivations (taxonomy, gene expression, evolutionary questions, etc.). Our annual meeting is a chance to catch each other up on progress in our systems but also to discuss limitations and possible solutions. We’re also charged with developing a common anatomy ontology that bridges disparate lineages, some of which are represented in existing anatomy ontologies (e.g., see Costa et al. 2013 and Yoder et al. 2010). In attendance this year (L to R in photo below): Lars Vogt (Universität Bonn, Germany), Peter Grobe (Stiftung Zoologisches Forschungsmuseum Alexander Koenig Bonn, Germany), István Mikó (Penn State, USA), Stefan Richter (Rostock University, Germany), Martín Ramírez (Museo Argentino de Ciencias Naturales), Matt Yoder (Speciesfile, University of Illinois, USA), and, behind the camera, Andy Deans (Penn State, USA).

Rogues gallery of arthropod fanatics. Photo by Andy Deans (CC BY 2.0)

Wisely, we mostly steered clear of anatomical discussions—what’s this part here, and how do we define it?—which freed us up to talk about tools, progress, future proposals, and other news. That is, we had fewer tangents (and shouting) and more constructive conversations about collaboration. We captured most of the dialog in a Google doc (needs synthesis, for sure, and likely doesn’t capture ALL of our discussions, especially complex ideas articulated on the easel), but here are a few quick hits:

  • The MorphDBase project (Grobe & Vogt) recently received funding for further development, and there is now a lot of potential to integrate ontologies. We discussed ideas for annotations, workflows, and how our projects could interact more with this resource.
  • We talked about anatomical complexity more generally, especially in the context of essentialistic classes vs. those classes that are not so easy to define (cluster class). Our aim should be to develop user-friendly tools that make it easier to employ ontologies (i.e., that don’t require morphologists and taxonomists to overthink annotations or burden them with excessive evidence gathering).
  • The spider ontology (SPD) is being used in an ongoing effort to extract characters from the literature (Ramírez). The group discussed tools that could help facilitate this process (e.g., CharaParser) and continued development of the SPD (especially Web-based tools, like mx, that facilitate rapid, community development of ontologies).
  • The TaxonWorks project (Yoder) is looking for feedback regarding ontology tools. Should they integrate an ontology builder, à la mx? Perhaps one that interacts easily with Protégé (and the reasoners therein)? What about templates for certain kinds of taxonomic and phylogenetic characters? The user would plug in the anatomy and the phenotype, and TaxonWorks would write the semantics.
  • Of course there was also some groupthink about how to make progress towards our mandate: to build a common anatomy ontology for arthropods. More on that later, but the consensus is that we should develop system-based pieces of it separately, forging links between them later. This ontology cloud would be synthesized in a future manuscript.

It was an intense, 12-hour, pizza-fueled, beverage-driven marathon in an inspiring location. After what we universally felt was forward progress, though, we’re excited for the next round! Perhaps in Argentina, Martín …?

As a side note, it was a bit cool in Arizona in February, for most arthropods anyway, but I did see two very cool critters: a Scolopendra centipede, which was way too fast for me to photograph, and a Hadrurus scorpion, which I forgot entirely to photograph. So here’s a great image from Flickr that illustrates them both:

A Hadrurus scorpion consumes a Scolopendra centipede at San Tan Regional Park (somewhat close to Biosphere 2). Photo by Jasper Nance (CC BY-NC-ND 2.0).

Posted on March 7, 2014 at 1:54 pm
Mar 07, 2014
 

With its research emphasis on understanding the impact of climate change on the environment, Biosphere 2 turned out to be the perfect venue for our fourth annual Phenotype RCN meeting! More than 60 students, postdocs, and professionals from 7 countries participated in this inspirational event, and our expertise was evenly split between biology and informatics. We were particularly pleased to have the support and participation of the EDEN RCN (6 people), with their focus on understanding the impact of ecological factors on organismal development and evolution.

The goals for this summit meeting were to (1) understand the bioinformatics landscape of environmental ontologies and vocabularies (What resources exist? What acquisitions and mergers should happen?); (2) find out how (and whether) environment is represented with respect to phenotype in projects and annotation data sets; and (3) determine research that would benefit from the integration of environment ontologies. We frontloaded this work by initiating a group Google doc prior to the meeting, with the goal of refining and publishing it following the meeting. Combined with presentations from meeting participants, this activity was surprisingly effective (!), and the manuscript is progressing quickly. In short, we discovered that the ENVO ontology is likely to be the most widely used and supported, and though it needs to be provisioned with many concepts from the user community, participants felt that it would be sufficient for their needs. It doesn’t seem that environment has been formally represented with respect to phenotypes outside of the microbial realm (where it is very important), but many interesting research questions could be addressed if it were. Please let us know if you’d like to contribute to this doc.

Another huge accomplishment: Over a dozen new research collaborations were spawned by this meeting!  We’re still sorting these out, but the RCN hopes to support many of these activities through our Collaborative Exchange Opportunities mechanism.

On the social side, this meeting was very fun! To the relief and immense enjoyment of those of us from the North, who hadn’t seen warm weather in what seemed like an eternity, most meals were held outside on the patio of B2. One dinner was even inside the Biosphere itself. Some of our participants enjoyed antics in B2, including one who managed to get locked in (briefly). The clean and cozy casitas made for great breakout spaces, the fantastic catering kept our minds sharp, the fun and beautiful setting inspired interaction, and the care and attention to every organizational detail (thanks to Kim Land at B2 and Judy Logue for the RCN) made this meeting possibly our best. Thanks everyone!

 

 

 

Posted on March 7, 2014 at 1:07 am
Feb 22, 2014
 

The Seventh International Biocuration Conference (ISB2014) will be held at the University of Toronto in Toronto, Canada, from April 6-9, 2014.

Hosted by the Ontario Institute for Cancer Research, this meeting will provide a forum for curators and developers of biological databases to discuss their work, promote collaboration and foster a sense of community in this very active and growing area of research. Participants from academia, government and industry interested in the methods and tools employed in curation of biological and medical data are encouraged to attend.

Early bird registration ends March 7, 2014

Registration ends March 24, 2014

The Conference will be preceded by a workshop on the practical use of Uberon, the integrated cross-species anatomy ontology.

Feb 21, 2014
 

Dear all,

We are pleased to announce a Uberon workshop, a satellite of the Biocuration2014 meeting, to be held in Toronto, Canada, on the 5th and 6th of April 2014.

Uberon is an integrated cross-species anatomy ontology, representing a variety of entities classified according to traditional anatomical criteria such as structure, function, and developmental lineage. Uberon provides a necessary bridge between anatomical structures in different taxa for cross-species inference, allowing integration of model organism, human, and comparative morphology data.

This workshop will be user-oriented, and will be devoted to introducing Uberon and training participants in its use. For more information and the schedule, please see the workshop website:
http://edu.isb-sib.ch/course/view.php?id=167

Attendance is limited, and places will be allocated on a first come, first served basis. The estimated registration fee is C$200. The registration page is available at:
http://www.isb-sib.ch/edu/Registration/SIB_courses.php?id=234

For more information about the Biocuration conference, please see:
http://biocuration2014.events.oicr.on.ca/biocuration

Looking forward to meeting you in Toronto,

The organizers: Chris Mungall, Melissa Haendel, and Frederic Bastian

Feb 17, 2014
 

By Pier Luigi Buttigieg

We’re happy to announce that a description of the Environment Ontology (ENVO; www.environmentontology.org) has been published in the Biomedical Ontologies series of the Journal of Biomedical Semantics:
http://dx.doi.org/10.1186/2041-1480-4-43

Capturing the environmental context of biological and biomedical entities is key to fully understanding their qualities, behaviour, and composition. That being said, concisely but meaningfully describing an “environment” with a small number of ontology classes quickly becomes a complex undertaking. This is further complicated by variability in the understanding of the boundaries and parts of environments as well as their relations to other entities of interest, even within a single discipline.

ENVO aims to offer an approachable and easily applicable ontology of environments, with classes that capture key elements of a given environmental context. In particular, ENVO focuses on the environmental system an entity is embedded in, the environmental features that causally influence it, and the environmental material that surrounds it. These complementary perspectives provide a compact but informative contextualisation that can readily aid, for example, data discovery and comparative studies. Work is underway to represent entities such as habitats, niches, and environmental conditions in order to further tease apart and define aspects of the environment. Further, instances of ENVO classes are being linked to classes in the Gazetteer (GAZ) to support (among other efforts) environmental contextualisation of place names.

To promote interoperability across existing ontologies, ENVO is being developed towards compliance with the OBO Foundry Principles. Linking ENVO with phenotypic ontologies and data offers great potential in enabling biological investigation using ontological resources and we look forward to exploring this at Phenotype RCN 2014!

Feb 13, 2014
 

Dear Phenotype Community,

We are heading into our annual meeting next week, where we will be prioritizing our next year’s activities with the help of the Phenotype RCN Advisory Board. If you have an idea for a workshop, working group or collaborative exchange, please send me an email and/or fill out a short application with your idea. See our blog for previous posts by folks who have been funded, and email me if you would like to discuss an idea before you propose. Please get these to us by February 19th.

Thanks! Paula (pmabee@usd.edu)

Posted on February 13, 2014 at 2:32 am