Metagenomics, or: more problems for bioinformaticians to solve!

Nature Reviews Microbiology had its special issue on metagenomics in 2005, with the closely related topic of horizontal gene transfer following shortly afterward. Now it is PLoS Biology’s turn, with several articles on advances in studying microbial communities in the ocean as part of its Oceanic Metagenomics collection. Not that metagenomics is, in theory, limited to microbes, but that is where the research focus is now (e.g. [1][2][3]), because scaling up genomics research isn’t easy or cheap, and think of all the data that needs to be stored, processed, and analysed.

For the non-biologist reader in three sentences (or see the synopsis [4]): metagenomics, or ‘high-throughput molecular ecology’ (also called community genomics, ecogenomics, environmental genomics, or population genomics), combines molecular biology with the study of ecosystems. It reveals community- and population-specific metabolisms, together with the interdependent biological behaviour of organisms in nature, which is affected by their micro-climate. Take a handful of soil (ocean water, mud, …) and figure out which microorganisms live there, who’s active (and what are they doing?), who’s dormant, what the ratios of the population sizes of the different types of microorganisms are, what a microbial community ‘looks’ like, and so on.

For the data enthusiast: all those microorganisms need to have their DNA and RNA sequenced, and the results, of course, go into databases. Then comes the analysis: putting the pieces from shotgun sequencing back together, comparing DNA with DNA, rRNA with rRNA, and with each other, deciding how to do the binning (assigning sequence fragments to the organisms they likely came from), and so forth [5]. Naively: more and faster algorithms wouldn’t hurt. And how can you visualize a community of microorganisms on your screen, or simulate those bacterial communities?
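
To give a flavour of what binning involves, here is a deliberately naive sketch in Python. It is not the method of any tool discussed in [5]: it simply assumes that fragments from the same organism have a similar k-mer composition, and the function names, the choice of k, and the similarity threshold are all made up for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Normalized k-mer frequencies of seq over the ACGT alphabet."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_bin(fragments, k=2, threshold=0.8):
    """Group fragments whose k-mer profiles resemble a bin's first member.

    fragments maps a fragment name to its sequence. The threshold is an
    arbitrary illustrative value, not a recommended setting.
    """
    bins = []  # list of (representative profile, member names)
    for name, seq in fragments.items():
        profile = kmer_profile(seq, k)
        for representative, members in bins:
            if cosine_similarity(profile, representative) >= threshold:
                members.append(name)
                break
        else:
            # No existing bin is similar enough: start a new one.
            bins.append((profile, [name]))
    return [members for _, members in bins]


# Toy input: two AT-rich and two GC-rich fragments (invented sequences).
toy_fragments = {
    "frag1": "ATATATATTAATATTA",
    "frag2": "ATTATATAATATATAT",
    "frag3": "GCGCGGCCGCGCGGCC",
    "frag4": "GGCCGCGCGCCGGCGC",
}
toy_bins = greedy_bin(toy_fragments)
```

On this toy input the AT-rich and GC-rich fragments end up in separate bins. Real data are far messier (short reads, sequencing errors, closely related organisms), which is exactly why better algorithms are welcome.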

And then, somewhere in this whole endeavor, bio-ontologists should be able to find their place, to help out with (and figure out) how best to represent all the new information in a usable and reusable way. Because metagenomics is a hot topic with much research and many novel results, ontology maintenance (tracking changes etc.) will likely deserve more attention than it currently receives in ODEs, as will reasoning over ontologies combined with massive amounts of data. Ouch. Some work has been and is being done on these topics (e.g. [6] [7]), and more can/will/does/should follow.

[1] DeLong, E.F. Microbial community genomics in the ocean. Nature Reviews Microbiology, 2005, 3:459-469.
[2] Lorenz, P., Eck, J. Metagenomics and industrial applications. Nature Reviews Microbiology, 2005, 3:510-516.
[3] Schleper, C., Jurgens, G., Jonuscheit, M. Genomic studies of uncultivated archaea. Nature Reviews Microbiology, 2005, 3:479-488.
[4] Gross, L. Untapped Bounty: Sampling the Seas to Survey Microbial Biodiversity. PLoS Biology, 2007, 5(3): e85.
[5] Eisen, J.A. Environmental Shotgun Sequencing: Its Potential and Challenges for Studying the Hidden World of Microbes. PLoS Biology, 2007, 5(3): e82.
[6] Klein, M., Noy, N.F. A Component-Based Framework for Ontology Evolution. Workshop on Ontologies and Distributed Systems at IJCAI-2003, Acapulco, Mexico, 2003.
[7] Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Poggi, A., Rosati, R. Linking data to ontologies: The description logic DL-Lite A. Proc. of the 2nd Workshop on OWL: Experiences and Directions (OWLED 2006), 2006.