Metagenomic environmental surveys such as the Global Ocean Sampling (GOS) expedition or Ocean Sampling Day generate huge amounts of data. They enable more holistic approaches to studying marine ecosystems and help uncover missing links in marine biological processes. Besides expanding our limited view of the diversity of the known protein universe, metagenomics has also revealed a large fraction of genes of unknown function. This fraction typically ranges from 40 to 60% and can reach 90% of the predicted open reading frames (ORFs). In most analytical pipelines, these genes of unknown function are not investigated further. To make better use of this unknown fraction, we developed a novel, rigorous approach that uses graphical models to extract valuable information from the co-occurrence of individual protein domains across large sets of metagenomes. Using an integrative approach, we derive statistically significant associations between known protein domain families and the unknown fraction. With the resulting known-unknown network and its associated tools, we can explore the dark side of metagenomes and unlock hidden functions of environmental and biotechnological significance.


Fernandez-Guerra, A., Max Planck Institute for Marine Microbiology, Germany, afernand@mpi-bremen.de

Kottmann, R., Max Planck Institute for Marine Microbiology, Germany, rkottman@mpi-bremen.de

Barberan Torrents, A., University of Colorado, USA, albert.barberan@colorado.edu

Casamayor, E. O., Centre d’Estudis Avançats de Blanes, CEAB–CSIC, Spain, casamayor@ceab.csic.es

Glöckner, F. O., Max Planck Institute for Marine Microbiology / Jacobs University, Germany, fog@mpi-bremen.de


Oral presentation

Session #: 075
Date: 2/26/2015
Time: 17:00
Location: Andalucia 1 (Floor 1)

Presentation is given by student: No