Linked biology technical aspects – linking phenotypes and phylogenetic trees
Miranda, E.; Santanchè, A. (2014). Linked biology technical aspects – linking phenotypes and phylogenetic trees. Universidade Estadual de Campinas: Campinas. 20 pp., appendices.
Authors
- Miranda, E.
- Santanchè, A.
Abstract
Many studies in biology, including those involving phylogenetic tree reconstruction, produce large amounts of data, e.g., phenotype descriptions and morphological data matrices. Biologists increasingly face both the challenge and the opportunity of discovering useful knowledge by crossing and comparing several pieces of information that are not always linked and integrated. Ontologies are a promising way to address this challenge. However, existing digital phenotypic descriptions are stored in semi-structured formats that make extensive use of natural language. This technical report relates to research we developed [1] that addresses this problem by adding an intermediate step between semi-structured phenotypic descriptions and ontologies. That work remodels semi-structured descriptions into a graph abstraction in which the data are linked. Graph transformations support the transition from a semi-structured data representation to a more formalized representation with ontologies. The present technical report details the implementation of our system. It provides a module to ingest phylogenetic trees and phenotype descriptions represented in semi-structured formats into a graph database. Additionally, it presents two approaches to combining distinct data sources and an algorithm to trace changes in phylogenetic traits of trees.