Tree of life and data integration challenges at the first FuturePhy workshop

Apr 06, 2016
 

What are the challenges in building, visualizing and using the Tree of Life? How can we best utilize and build on existing phylogenetic knowledge and look ahead to address the challenges of data integration? Recently, fellow Phenoscaper Jim Balhoff and I attended the first FuturePhy workshop in Gainesville, Florida (February 20-22, 2016). The workshop brought together three taxonomically defined working groups (catfish, beetles, barnacles) to build megatrees from existing phylogenetic studies, and to identify and begin applying diverse data layers for their respective groups. Open Tree and Arbor personnel were on hand to discuss and help solve issues in data integration.

The catfish team (John Lundberg, Mariangeles Arce, Jim Balhoff, Brian Sidlauskas, Ricardo Betancur, Laura Jackson, Kole Kubicek, Kyle Luckenbill, and myself, Wasila Dahdul) included participants with expertise in catfish anatomy, phylogenetics (molecular and morphological), development, bioinformatics, and digital imaging. We were motivated to build on the work of the All Catfish Species Inventory to achieve a more complete understanding of catfish diversification by integrating published phylogenies, 2D and 3D images in various online repositories, and thousands of computable phenotypes for catfishes in Phenoscape.


We held several hands-on sessions on tree grafting (using Mesquite, R, and Arbor), data annotation (using Phenex), and tree submission to Open Tree. We also examined an automatically generated supermatrix for 18 published catfish matrices in the Phenoscape KB (generated using the OntoTrace tool), and prototype data visualizations for supermatrices developed by Curt Lisle in Arbor. We used Mesquite to manually create a draft megatree and, in parallel, uploaded trees to Open Tree, which automatically synthesized a megatree for catfishes. Our plan is to compare the output of manual tree-building in Mesquite with the automated tree from Open Tree.
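To make the planned comparison concrete, here is a minimal sketch of one common way to compare two rooted trees: count the clades (sets of descendant taxa) they share. This is an illustration under stated assumptions, not the method we used at the workshop. The trees are written as nested Python tuples rather than read from Mesquite or Open Tree output, the taxon names are placeholders, and the function names `clades` and `compare` are hypothetical.

```python
def clades(tree):
    """Return (leaf set of this subtree, set of all clades within it)."""
    if isinstance(tree, str):          # a leaf is just a taxon name
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves         # accumulate descendant taxa
        found |= child_clades          # keep clades found deeper in the tree
    found.add(leaves)                  # this internal node defines a clade
    return leaves, found


def compare(tree_a, tree_b):
    """Count clades shared by, and unique to, two rooted trees on the same taxa."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same taxa")
    return {
        "shared": len(clades_a & clades_b),
        "only_a": len(clades_a - clades_b),
        "only_b": len(clades_b - clades_a),
    }


# Placeholder taxa standing in for a manually built and an automated tree.
manual = ((("A", "B"), "C"), ("D", "E"))
automated = (("A", ("B", "C")), ("D", "E"))
result = compare(manual, automated)
print(result)
```

Clades found in only one tree point to the places where the manual and automated approaches disagree, which is where a closer look at the source studies would be most useful. A real comparison would also need to handle trees with differing taxon sets, which this sketch simply rejects.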

Among the issues and priorities that emerged during the workshop were the need to include the authoritative Catalog of Fishes taxonomy in Open Tree and the need to allow the addition of unnamed or uncertainly identified taxa, which are commonly used in matrices. We also discussed challenges in automated character consolidation across multiple studies, and the reuse of images across multiple online archives.

We left with a plan to continue tree building and data layer integration post-workshop, with the aim of publishing the catfish megatree (including the methods and remaining challenges) and the integration of data layers via interactions between Arbor, Open Tree, and Phenoscape.


Filed under: Phenex, Phylogenetics, Teleosts, Workshops