Citable statements
The metagenomic data life-cycle: standards and best practices
This paper presents the comparison of Tara and OSD samples with the same depth and salinity as a great achievement. We could do a lot better; maybe report how we handle far more metadata from more projects in PM.
Computational eco‐systems biology in Tara Oceans: translating data into knowledge
Analysis of the Tara Oceans' data is likely to continue for years, perhaps decades. Together with other data sources and types, the Tara Oceans' data sets should contribute to a comprehensive parts list of organisms, genes and genomes in our oceans, although challenges in data comparability still need to be addressed (Box 1).
Box 1: Tara Oceans: from parts lists towards an understanding of ecosystems.
Tara Oceans released a massive amount of primary and derived data along with the publication of their initial results. For example, a data volume of ca. 13 terabytes has already been archived at the EBI (PRJEB402); however, many data types can still not be easily compared as methodological details and context differ.
Due to differing biological features in the different organism classes and due to funding constraints, different methods were applied to capture biodiversity. For example, metagenomics could not be afforded for eukaryotes, since only a very small fraction of the large genomes are protein‐coding. Also, because of missing methodological standards, direct comparison of these data is challenging. For example, due to difficulties in delineating species based on molecular data alone, the term operational taxonomic unit (OTU) is commonly used to define a taxonomic group based on sequence similarity of select taxonomic marker genes. However, the 18S and 16S rRNA genes are used for eukaryotes and prokaryotes, respectively, which differ in diversification rates and operational taxonomic definitions. Moreover, as viruses lack any universal genes that could be used for consistent taxonomic classification, long contiguous sequences of assembled viral genomes were used as an alternative approach to quantify viral populations. On the other hand, for studying genetic diversity, similar gene definitions were used for metagenomically characterized prokaryotic genes and metatranscriptomically derived eukaryotic genes. However, sequencing depths, sample numbers, gene lengths, genome sizes and many other parameters are different and need normalization, before sensible comparisons can be made (Table 1). Thus, despite a 1,000‐fold increase of data over earlier ocean surveys (Rusch et al, 2007), the established Tara Oceans' resources are only the tip of an iceberg when attempting to collect planetary biodiversity. While representing a promising start to collect the molecular and taxonomic parts lists of the contemporary ocean, Tara Oceans has a lot of work ahead to connect these into species interactions and their functional meaning in the context of the environment.
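Note on the normalization point in the quote above: a minimal sketch of what correcting for gene length and sequencing depth can look like. It assumes a TPM-style scheme (reads per kilobase, rescaled to one million per sample), which is our illustration and not the method Tara Oceans used. The function name, gene names, counts and lengths are all hypothetical.

```python
def length_depth_normalize(counts, gene_lengths):
    """TPM-style normalization (an assumed scheme, for illustration only).

    counts:       dict mapping gene name -> raw read count in one sample
    gene_lengths: dict mapping gene name -> gene length in base pairs
    Returns a dict of values that sum to 1e6 within the sample.
    """
    # Step 1: correct for gene length (reads per kilobase of gene).
    rates = {g: counts[g] / (gene_lengths[g] / 1000.0) for g in counts}

    # Step 2: correct for sequencing depth by rescaling to a fixed total.
    total = sum(rates.values())
    if total == 0:
        return {g: 0.0 for g in counts}
    return {g: r / total * 1e6 for g, r in rates.items()}


# Hypothetical toy data: the same community sequenced at two depths.
lengths = {"geneA": 1000, "geneB": 2000}
sample_shallow = {"geneA": 10, "geneB": 20}
sample_deep = {"geneA": 100, "geneB": 200}

norm_shallow = length_depth_normalize(sample_shallow, lengths)
norm_deep = length_depth_normalize(sample_deep, lengths)
```

After normalization the two toy samples give the same values, even though the raw counts differ tenfold and geneB attracts twice as many reads only because it is twice as long. Sample numbers, genome sizes and the other parameters named in the quote are not addressed by this sketch.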