Apr 10 2017

[This post was written by Anne Thessen and originally appeared at datadetektiv.com; we thank Dr. Thessen for sharing it with our community as well!]

One of the more difficult aspects of trying to apply “big data” thinking in ecology is the massive heterogeneity of terms. I stumble over this issue every time I work on a data set for the Encyclopedia of Life. The many different ways to describe the same habitat (among other things) and the varying granularity with which people describe habitats make it very difficult for data consumers to find, for example, all the beetles that live in the desert. It’s even more difficult to go a step further and ask for traits, such as color, of beetles that live in deserts.

As a side note, that example is very similar to some use cases I published with several colleagues about ways to combine phenotype and environment data.

Right now, we can ask Google “How much does a narwhal weigh?” and get the answer because of the fine work my EOL colleagues and I have been doing on TraitBank (go ahead, try it), but we’ve still got a way to go before we can ask “What color are beetles that live in the desert?”. We have a plan, though, and it involves semantic technology, i.e. ontologies.

Biology already has many ontologies of varying quality available for use. Most of them can be found at OBO Foundry. Not all domains of biology have good ontologies available; ecology, for example, has been left out. That means there is no standard, machine-readable way of expressing which organisms are autotrophs, or nocturnal, or use camouflage, etc. Including terms such as these in an ontology is one of the many necessary steps before we can ask “Which organisms are nocturnal in an alpine forest habitat?” or, if we want to get more complicated, “Is there a relationship between the phylogeny of terrestrial, nocturnal organisms and latitude or elevation?”.
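As a rough illustration of why a shared, machine-readable vocabulary matters, here is a minimal Python sketch. Everything in it is made up for the example: the habitat labels, the synonym table, the "is a kind of" links, and the beetle records are hypothetical and are not drawn from TraitBank or any real ontology. It only shows the general idea that once free-text labels resolve to terms in a hierarchy, a query for "desert" can also find records described at a finer granularity.

```python
# Toy example: every term, synonym, and record below is hypothetical.

# Map free-text habitat labels onto one canonical term each.
SYNONYMS = {
    "desert": "desert",
    "arid scrubland": "desert scrub",
    "desert scrub": "desert scrub",
    "sand dunes": "sand dune desert",
    "alpine forest": "alpine forest",
}

# Child -> parent ("is a kind of") links between canonical terms.
IS_A = {
    "desert scrub": "desert",
    "sand dune desert": "desert",
    "desert": "terrestrial habitat",
    "alpine forest": "forest",
    "forest": "terrestrial habitat",
}


def ancestors(term):
    """Return the term itself plus everything it is a kind of."""
    found = {term}
    while term in IS_A:
        term = IS_A[term]
        found.add(term)
    return found


def lives_in(record, habitat):
    """True if the record's habitat label resolves to `habitat` or a subtype."""
    canonical = SYNONYMS.get(record["habitat"].strip().lower())
    return canonical is not None and habitat in ancestors(canonical)


RECORDS = [
    {"taxon": "Beetle A", "habitat": "Sand dunes", "color": "black"},
    {"taxon": "Beetle B", "habitat": "arid scrubland", "color": "brown"},
    {"taxon": "Beetle C", "habitat": "Alpine forest", "color": "green"},
    {"taxon": "Beetle D", "habitat": "rocky place", "color": "grey"},
]

# "What color are beetles that live in the desert?"
desert_colors = {r["taxon"]: r["color"] for r in RECORDS if lives_in(r, "desert")}
print(desert_colors)
```

Note that "Beetle D" is silently skipped because its label has no mapping. In this sketch, that is exactly the kind of record a plain keyword search would also miss, and the gap can only be closed by adding the term to the shared vocabulary.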

Building an ontology is a large, never-ending, hugely complicated task. One of my clients at the University of Colorado, Boulder, is the ClearEarth project. The goal of this project is to repurpose NLP and ML algorithms developed for biomedicine for use in geology and biology. These algorithms can read text and automatically generate ontologies. We’ve made a lot of progress annotating domain-specific text and will have some “auto-ontologies” by this summer. Very exciting! To support this effort and make sure the ontologies resulting from this project are meshed in with existing bio-ontologies, we are hosting an “ontology-a-thon” in Boulder this summer. Please take a look and apply if you are interested in participating. We don’t have a detailed agenda just yet, but the idea is to get ontology and ecology experts in one room to curate the auto-ontology. All expenses paid, but space is limited.

 Posted by on April 10, 2017 at 5:24 pm
Sep 26 2016

The Phenotype RCN is wrapping up after five years of innovation and community-building. So many great ideas have come out of this community that we’ve been asked to produce a book called Application of Semantic Technologies in Biodiversity Science that showcases the state of the art in semantics for biodiversity, phylogeny, phenotypes, environments, and genomes. Would you like to participate? Please send your chapter idea, described in a single paragraph, and a list of potential co-authors to Anne Thessen by email (annethessen@gmail.com). Anne will be editing the book, which will be published by IOS Press in Berlin as part of a Semantic Web series edited by Pascal Hitzler. If you were at the 2016 Phenotype RCN meeting at Biosphere 2, you met him there. We need to get busy on the book, so please submit your chapter ideas within two weeks (by Oct 5).

This book will be an excellent product of the RCN and a great way to synthesize all the great ideas everyone has had over the years.

 Posted by on September 26, 2016 at 6:14 pm
Sep 22 2016

Written by Prashanti Manda

I attended the Pacific Symposium on Biocomputing (PSB) in Jan 2016. I presented a talk titled “Investigating the importance of anatomical homology for cross-species phenotype comparisons using semantic similarity.” This work explores the utility of including anatomical homology when computing semantic similarity of phenotype profiles. The majority of talks at PSB were focused on disease analytics and use of clinical phenotypes. There was a good balance of computer scientists and biologists at the meeting. An interesting session at the meeting was the social media session that was focused on large scale data analytics from sources such as Twitter and Instagram to track the spread of epidemics.

I also attended ICBO 2016, where I presented my work on the impact of annotation granularity on semantic similarity of phenotypes, served on the program committee, and was one of the poster judges. The title of my talk was “Measuring the importance of annotation granularity to the detection of semantic similarity between phenotype profiles”. Considerable human effort and time are invested in curating phenotypes in great detail from biological and medical literature using standardized ontologies. However, it is unclear whether this level of detail is important for effectively measuring semantic similarity between phenotype profiles. In my work, I measured the statistical sensitivity of widely used semantic similarity metrics at varying levels of annotation granularity to test whether higher annotation granularity improves the sensitivity of those metrics.
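For readers unfamiliar with the idea, the following Python sketch shows how annotation granularity can change a similarity score. It is only an illustration under stated assumptions: the tiny ontology and its term names are hypothetical, Jaccard similarity over ancestor sets is used as one common choice of metric, and "coarsening" is modeled simply as replacing each annotation with its parent term. None of this is taken from the talk itself.

```python
# Toy ontology: child -> list of parents. All term names are hypothetical.
PARENTS = {
    "pectoral fin ray absent": ["pectoral fin ray phenotype"],
    "pectoral fin ray branched": ["pectoral fin ray phenotype"],
    "pectoral fin ray phenotype": ["pectoral fin phenotype"],
    "pelvic fin absent": ["pelvic fin phenotype"],
    "pectoral fin phenotype": ["fin phenotype"],
    "pelvic fin phenotype": ["fin phenotype"],
    "fin phenotype": [],
}


def closure(terms):
    """Return the annotated terms plus all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS.get(term, []))
    return seen


def jaccard(profile_a, profile_b):
    """Jaccard similarity of two profiles, computed over ancestor closures."""
    a, b = closure(profile_a), closure(profile_b)
    return len(a & b) / len(a | b)


def coarsen(profile, levels=1):
    """Replace each annotation by its parent(s), `levels` times; roots stay put."""
    current = set(profile)
    for _ in range(levels):
        current = {p for t in current for p in (PARENTS.get(t) or [t])}
    return current


fine_a = {"pectoral fin ray absent"}
fine_b = {"pectoral fin ray branched"}
fine_c = {"pelvic fin absent"}

for levels in (0, 1):
    a, b, c = (coarsen(p, levels) for p in (fine_a, fine_b, fine_c))
    print(levels, round(jaccard(a, b), 3), round(jaccard(a, c), 3))
```

In this toy setup, the two pectoral fin profiles score 0.6 at full detail and 1.0 once coarsened by one level, so the coarser annotations can no longer tell them apart. Whether such differences matter in practice is the empirical question.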

Attending ICBO gave me the opportunity to present my work to a diverse group of scientists focused on varying ontological applications. I found that the conference featured scientists and researchers from a wide range of areas within biology, medicine, ecology, computer science and text-mining. Of particular interest to me were the BioCreative sessions which focused on a variety of natural language processing and text mining applications to extract knowledge from scientific literature.

Lastly, I would like to thank the Phenotype RCN for supporting my travel to these conferences.

 Posted by on September 22, 2016 at 12:42 am
Aug 18 2016

Suzanna Lewis recently attended Phenoday 2016 (http://phenoday2016.bio-lark.org/) and other phenotype-related sessions at the Bio-Ontologies SIG (http://www.bio-ontologies.org.uk/) at ISMB 2016, held 8-12 July 2016 in Orlando, FL. 

Talks of special interest at Phenoday included Melissa Haendel on adding natural language synonyms for medical terms in the HPO, Wendy Chapman on the definition of “cough” (knowledge representation to support phenotyping from text), and Chris Mungall on a Bayesian approach to ontology structure inference with applications to the Disease Ontology (being in Orlando he used Mickey Mouse to illustrate his points on phenotyping, e.g., HP_0100024 is a conspicuously happy disposition associated with a chromosome 15q24 deletion, and MP_0001284 is absent vibrissae (aka no whiskers)). 

Although Phenoday focused mostly on human health related phenotypes, related sessions during Bio-Ontologies SIG covered applications to other species. Seth Carbon described the Noctua annotation tool, which has a web-based configuration for associating genotypes to phenotypes, essentially a web-based reincarnation of Phenote. Chris Mungall also spoke in this session, this time on PhenoPackets and proposed data exchange standards for phenotype data. 

David Osumi-Sutherland (along with Owen Randlett and Paul Sternberg) organized a workshop at The Allied Genetics Conference 2016 on Informatics Resources to Aid the Genetic Dissection of Neural Circuitry. While the name of the workshop doesn’t mention phenotypes, they certainly are an integral part of what is needed for this work. The workshop was a showcase of carefully detailed work in worm, zebrafish, and fly brains and circuits.

Contact Suzi if you would like more information about these conferences.

 Posted by on August 18, 2016 at 8:45 pm
Aug 16 2016

A contingent of Phenotype RCN participants recently attended the 7th International Conference on Biological Ontology (ICBO) and BioCreative 2016 held over a stretch of pleasantly sunny days on the campus of Oregon State University in Corvallis, Oregon (August 1-4, 2016). The theme of the meeting was Food, Nutrition, Health, and Environment for the 9 billion and the meeting brought together folks interested in applying ontologies to innovative research in diverse domains including environment, biodiversity, biomedical sciences, plant biology, and agriculture.

The conference started off with a day of workshops covering text-mining, visualization, medicine, and tutorials on tools, techniques and standards. (Links to the program and abstracts are available here: http://icbo.cgrb.oregonstate.edu/program.) Talks and posters during subsequent sessions included a diverse mix of topics such as sustainability, obstetrics and neonatal health, trauma centers, social science, infectious disease, and biodiversity. Although the topics were wide-ranging in scope, a thread of common challenges emerged in working with ontology-based data, including the need for data harmonization/standardization, promoting shared resources, representation challenges for temporal or spatial reasoning, and improving descriptors/terminology.

The meeting ended with a panel discussion of the question “Have ontologies reached their peak?” This question was prompted by a noticeable decline since 2014 in PubMed papers matching the word “ontology” (and a marked increase in those matching “data mining”). The consensus of the panel was that while the publication of new ontologies in PubMed may have slowed, their use in biology was far from peaking. Rather, the community may have a more refined understanding of what an ontology is, which means fewer papers are being published that claim to be about ontologies.

Of particular interest to the phenotype community, here are the presentations given by recent RCN phenotypers:

  • James Balhoff, Wasila Dahdul, Prashanti Manda, and the Phenoscape team: The Phenoscape Knowledgebase: tools and APIs for computing across phenotypes from evolutionary diversity and model organisms
  • Pier-Luigi Buttigieg: Sustainable food systems and food in ecosystems
  • Pier-Luigi Buttigieg, Mark Jensen, Ramona Walls, Chris Mungall: Environmental semantics for sustainable development in an interconnected biosphere
  • Brian Stucky, Ramona Walls, Robert Guralnick: The Plant Phenology Ontology for Phenological Data Integration
  • Suzanna Lewis: Telling a genome’s story graphically
  • Prashanti Manda, Jim Balhoff, Todd Vision: Measuring the importance of annotation granularity to the detection of semantic similarity between phenotype profiles
  • Chris Mungall, M Jensen, M-A Laporte, P. Buttigieg: A sustainable approach to knowledge representation in the domain of sustainability: bridging SKOS and OWL
  • N Vasilevsky, M Engelstad, E Foster, C Mungall, P Robinson, S Köhler, M Haendel: Enhancing the Human Phenotype Ontology for Use by the Layperson
  • Melanie Courtot, James Malone, Chris Mungall: Ten simple rules for biomedical ontology development
  • Chelsea Specht: Evolution of Floral Form: The potential of ontologies across diverse plant lineages
  • Ramona Walls: Defining and sustaining populations and communities
  • Ramona Walls, Robert Guralnick: The Biological Collections Ontology for linking traditional and contemporary biodiversity data


We thank the Phenotype RCN for providing travel support to the meeting.

— Jim Balhoff, Pier-Luigi Buttigieg, Wasila Dahdul, Rob Guralnick, Suzi Lewis,  Prashanti Manda, Chris Mungall, Chelsea Specht, Brian Stucky, and Ramona Walls

 Posted by on August 16, 2016 at 1:26 am
May 17 2016

The Phenoscape project is recruiting a postdoc with training in bioinformatics and/or developmental biology who is interested in analyzing genomic and developmental data in relation to phenotypic data, with a focus on the vertebrate fin/limb.

The problem of how organismal phenotypes have evolved, are constrained, and acquire novelty, is one of the grand challenges in biology. The Phenoscape group has developed ontology-based methods for representing species phenotypes so that they can be integrated with model organism developmental and genetic data. The Phenoscape Knowledgebase (KB) contains over 500,000 vertebrate species phenotypes that are linked to ~16,000 genes associated with 320,000+ phenotypes and 37,000 genes with in situ expression data from model organisms (zebrafish, mouse, Xenopus, human). These data present a tremendous opportunity for integration with other data types to address questions about the evolution of phenotype.

We are seeking an individual with expertise in developmental biology and/or genomics, to (1) help evaluate results of bioinformatics methods being developed by Phenoscape and (2) leverage the Phenoscape Knowledgebase to study whole-organism phenotype and functional genomics in non-model organisms. The purpose of the methods is to improve prediction of the genetic basis of evolutionarily novel phenotypes by incorporating semantic similarity, homology, and phylogenetic propagation. Vertebrate fin and limb phenotypes and genes are enriched in the KB, and we are thus seeking candidates who ideally have knowledge of genes and networks involved in fin/limb development. Further, this position presents a unique opportunity to leverage the linked developmental and genetic data in the Phenoscape KB for large-scale analysis of patterns of phenotypic evolution.

The postdoc will work under the direction of Paula Mabee (University of South Dakota) in association with Todd Vision (University of North Carolina), as part of a distributed, multidisciplinary team that includes evolutionary and model organism biologists, computer scientists, and bioinformaticists. Ideally the applicant will be based in South Dakota (with opportunities to travel to other sites), but we will consider qualified applicants who are available remotely and/or half-time. The position is available immediately for an initial appointment of one year, with potential to renew.

Required qualifications:

  • Ph.D. degree with strong background in bioinformatics; previous experience with ontologies preferred
  • Experience in functional genomics or developmental biology, with preference for candidates with a background in vertebrate fin and/or limb development
  • Demonstrated communication and writing skills in English
  • Demonstrated ability to work in a team setting

How to apply:

  • Please contact Dr. Mabee (pmabee@usd.edu) for inquiries.
  • Applications should be directed to Dr. Mabee and include a cover letter, CV, a brief statement detailing your research interests and career goals, and three letters of reference.

Link to original post: http://wiki.phenoscape.org/wiki/Postdoc2016#Postdoctoral_Opportunity:_Evolutionary_Bioinformatics_and_Phenomics

 Posted by on May 17, 2016 at 5:17 pm
May 03 2016

UPDATE: Remaining Phenotype RCN funds have now been committed.

NEWS!!!! The Phenotype RCN has funds available to support graduate students, postdocs and other researchers and curators to attend ontology-related meetings this summer.

For example, the upcoming joint ICBO (International Conference on Biological Ontology) and BioCreative meeting (August 1 – 4, 2016; Oregon State University, Corvallis, OR, USA) would be an excellent venue to present phenotype ontology based work, as would various organismal and model-organism based meetings.

Please send an email to Andy Deans (adeans@gmail.com) or Eva Huala (evahuala@gmail.com) if you are interested, indicating the meeting proposed, whether you are presenting, your current position (student, faculty, etc.), the amount of funds requested, and a 200-word statement regarding the value of the opportunity to you and the relationship to phenotype ontologies.

Jan 20 2016

NBO-ABO Merger Workshop, Smithsonian, Washington, DC, October 25, 2015.

This post is a followup to our previous post about integrating the Animal Behavior Ontology (ABO) and the NeuroBehavior Ontology (NBO). This covers the second workshop, a conference call held in early December and the poster one of us (PM) presented at SICB 2016 on January 6.

With additional funding from the Phenotype RCN, on October 24–25, 2015 we held the second workshop to begin the process of merging the ABO and the NBO based on the first workshop’s recommendations. This workshop was held at the Smithsonian Museum in Washington. Attendees included Elissa Chesler, George Gkoutos, David Osumi-Sutherland, and Reid Rumelt (Cornell undergraduate working on media tagging-based research); and workshop organizers Anne Clark, Sue Margulis, Peter Midford, Cynthia Parr, and Katja Schultz (our Local Host). Melissa Haendel participated remotely.

We made good progress getting started on a use-case based paper for applications of a behavior ontology. We also have a real home for the ABO – we deposited the OWL rendering Peter Midford generated in 2006 as the initial commit in a GitHub repository (note that this is the same repository where NBO is maintained).

We started the process of merging the ABO and NBO, our central objective. One of ABO’s strengths is a clear division between observable behavior (acts, events, and processes) and functional interpretations (for example, running vs. fleeing from a predator). The NBO is organized rather differently and we would like the division in ABO to appear at least somewhere in NBO. NBO contains a sizable number of terms not relevant to the behavioral ecology community, just as ABO has terms that are not of current use to the model organism community. We identified a number of stakeholder projects that would be affected by, and could potentially benefit from, the merger, including Virtual Fly Brain, the Rat Genome Database, the International Mouse Phenotyping Consortium, and probably others.

Since the workshop we have had several conference calls with the NBO developers (George Gkoutos and Robert Hoehndorf) to refine the concerns of other stakeholders. Discussion made it clear that NBO is focussed on behavior phenotypes, rather than behavior processes. However, there was some interest in incorporating the ABO functional terms. The thought was that the remaining ABO terms (those referring to events, acts, and processes) should wind up in the Gene Ontology (GO). Several of us are working on the process of merging the functional terms into NBO, and separately, looking through the existing process terms in the GO. We may want to propose a behavior process ontology, at least as a parking place for terms that eventually are added to the GO.

Finally, we presented a poster at the SICB 2016 meeting in Portland, OR on January 6. We will continue to use opportunities like this to discuss the process and implications of this merger with the broader animal behavior and neuroscience communities. We are developing a set of case studies and have outlined a followup paper to highlight both the applications of the outcome of the merging process and lessons learned during that process.

 Posted by on January 20, 2016 at 4:40 pm
Dec 30 2015

Biosphere 2, the site of the final Phenotype RCN Summit meeting (February 2016). Photo (CC BY-NC 2.0) by pinkgranite. See original at https://flic.kr/p/52bMzk.

The Fifth Annual Summit of the Phenotype Ontology Research Coordination Network will be held at the University of Arizona’s Biosphere 2, about 40 miles north of Tucson, AZ, from February 26-28, 2016 (Friday through Sunday noon).

The theme of this meeting will be ‘Complex data integration with phenotypes’ with a focus on the integration of phenotype data with other data sets. We will summarize where our phenotype community is at with respect to integration with other data types, and we will highlight active projects. We will be looking to the future — what projects should be priorities for the future? Joining us this year will be folks from the newly funded ‘FuturePhy’ (futurephy.org), who are interested in how to integrate multiple data types, including phenotype, with phylogenetic trees.

We estimate that the costs for this meeting (transportation to meeting from airport, lodging, food) will be approximately $500, though we will be able to cover expenses for a small number of participants, particularly students and postdocs who have specific interests in using phenotypic data associated with environment in their research. Please contact one of us if you are interested in attending. It should be a great meeting!

Paula Mabee; pmabee@usd.edu
Eva Huala; huala@acoma.stanford.edu
Andy Deans; adeans@psu.edu
Suzanna Lewis; suzi@berkeleybop.org

The Phenotype Ontology RCN (http://phenotypercn.org) was funded by the U.S. NSF to establish a network of scientists who are interested in comparing phenotypes across species and in developing the tools and methods needed to enable comparisons. In contrast to the many well-established efforts in the molecular community, the representation of phenotypic traits using ontologies is in its infancy. Phenotype ontologies, however, have the potential to integrate these data across all levels of the biological hierarchy and to the environment. This RCN is building a community that, because of its expertise, fosters communications across disciplines to enable co-development of interoperable community standards and best practices for phenotype.

 Posted by on December 30, 2015 at 1:50 am
Dec 21 2015

The following post is from Peter Midford. – Andy Deans

As you may recall, at a Spring 2013 meeting of the Phenotype RCN in Durham, NC, the Behavior Breakout group discussed the existence of multiple behavioral ontologies, including the gaps in existing ontologies (such as the Neuro Behavior Ontology, or NBO) that preclude their widespread use in behavioral ecology and other sub-disciplines in animal behavior. The group felt it could be possible to merge two existing behavioral ontologies – the NBO, developed to serve studies of animal models of human behavioral dysfunction, and the Animal Behavior Ontology or ABO, developed to serve the field of comparative animal behavior, including behavioral ecology and other sub-disciplines. If successful, the merger would facilitate the broader integration of behavioral studies: applied with basic, model organism with comparative investigations, mechanistic with evolutionary, and human with non-human animal questions. At the same time, it would also need to continue to serve the specialized needs of subfields.

In late summer 2014, a small group of animal behaviorists who were present at the 2013 meeting in Durham (Anne Clark, Sue Margulis, Peter Midford, Cynthia Parr) received NSF funding to hold two workshops to accomplish these goals.

Our first workshop, held August 2014 at Princeton University, convened over a dozen animal behaviorists with a broad range of expertise in comparative behavior to develop specific recommendations on how to integrate the basic terms and concepts of the two ontologies. Key outcomes included a list of proposed changes in parent-child relations in the NBO to emphasize function, and ABO term definition improvements that together could serve as the basis of integrating the two ontologies.

Our second workshop, supported in part by additional funding from the Phenotype RCN, was held at the Smithsonian’s National Museum of Natural History, Washington, DC, on October 24-25, 2015. Its specific goal was to start the process of merging the ABO and the NBO based on the first workshop’s recommendations. Attendees, in addition to the four organizers, were our local host Katja Schultz (Encyclopedia of Life), Elissa Chesler (The Jackson Laboratory), George Gkoutos (NBO developer, University of Birmingham), David Osumi-Sutherland (European Bioinformatics Institute, Virtual Fly Brain), Melissa Haendel (Oregon Health and Science University), and Reid Rumelt (Cornell University undergraduate working with Macaulay Library and Encyclopedia of Life).

The workshop began with presentations about the histories of NBO and ABO. NBO had its roots in a phenotype vocabulary supporting the EUMORPHIA project (see http://empress.har.mrc.ac.uk/ and http://www.europhenome.org/). Behavior terms were initially included in the Gene Ontology, but NBO also maps to phenotype ontologies, such as the Mammalian Phenotype Ontology (MP) and the Human Phenotype Ontology, so as to enable the integration of data. The Neuro Behavior Ontology was created to concentrate effort specifically on behavior.


ABO was one of the first accomplishments of the EthoSource project¹, begun with an NSF-sponsored workshop in 2000 with the goal of developing integrated online resources for the discipline of Animal Behavior. Two NSF-sponsored Ontology Workshops followed in 2004-2005, at which an international group of animal behaviorists developed a basic metadata standard for the discipline, the ABO. The primary use of the ABO subsequent to 2005 was indexing an online ethogram repository, EthoSearch.org.

In our second blog post, we will summarize the progress we made in the October workshop, and outline our next steps.

¹ Martins, E. P. 2004. EthoSource: Storing, Sharing, and Combining Behavioral Data. BioScience 54 (10): 886. doi:10.1641/0006-3568(2004)054[0886:ESSACB]2.0.CO;2

 Posted by on December 21, 2015 at 3:49 pm