REVIEW article

Front. Microbiol., 05 June 2015

Sec. Evolutionary and Genomic Microbiology

Volume 6 - 2015 | https://doi.org/10.3389/fmicb.2015.00563

Discovery of new protein families and functions: new challenges in functional metagenomics for biotechnologies and microbial ecology

  • 1. Université de Toulouse, Institut National des Sciences Appliquées (INSA), Université Paul Sabatier (UPS), Institut National Polytechnique (INP), Laboratoire d’Ingénierie des Systèmes Biologiques et des Procédés (LISBP), Toulouse, France

  • 2. INRA - UMR792 Ingénierie des Systèmes Biologiques et des Procédés, Toulouse, France

  • 3. CNRS, UMR5504, Toulouse, France

Abstract

The rapid expansion of new sequencing technologies has enabled large-scale functional exploration of numerous microbial ecosystems, by establishing catalogs of functional genes and by comparing their prevalence in various microbiota. However, sequence similarity does not necessarily reflect functional conservation, since just a few modifications in a gene sequence can have a strong impact on the activity and the specificity of the corresponding enzyme, or on recognition by the corresponding sensor. Similarly, some microorganisms harbor certain identified functions yet do not have the expected related genes in their genome. Finally, there are simply too many protein families whose function is not yet known, even though they are highly abundant in certain ecosystems. In this context, the discovery of new protein functions, using either sequence-based or activity-based approaches, is of crucial importance for the discovery of new enzymes and for improving the quality of annotation in public databases. This paper lists and explores the latest advances in this field, along with the challenges to be addressed, particularly where microfluidic technologies are concerned.

Introduction

The implications of the discovery of new protein functions are numerous, from both cognitive and applicative points of view. Firstly, it improves understanding of how microbial ecosystems function, in order to identify biomarkers and levers that will help optimize the services rendered, regardless of the field of application. Next, the discovery of new enzymes and transporters enables expansion of the catalog of functions available for metabolic pathway engineering and synthetic biology. Finally, the identification and characterization of new protein families, whose functions, three-dimensional structure and catalytic mechanism have never been described, furthers understanding of the protein structure/function relationship. This is an essential prerequisite if we are to draw full benefit from these proteins, both for medical applications (for example, designing specific inhibitors) and for relevant integration into biotechnological processes.

Many reviews on functional metagenomics have been published over the last 10 years. Many of them focus on library creation strategies and on bioinformatic developments (Di Bella et al., 2013; Ladoukakis et al., 2014), while others describe the various approaches set up to discover novel targets [like therapeutic molecules (Culligan et al., 2014)] for a specific application. In particular, several review papers have been written on the numerous activity-based metagenomics studies carried out to find new enzymes for biotechnological applications, without necessarily finding new functions or new protein families (Ferrer et al., 2009; Steele et al., 2009). The present review focuses on all the functional metagenomics approaches, sequence- or activity-based, that allow the discovery of new functions and families from the uncultured fraction of microbial ecosystems, and gives an overview of recent advances in microfluidics for ultra-fast microbial screening of metagenomes.

Sampling Strategies

The literature describes a wide variety of microbial environments sampled in the search for new enzymes. A large number of studies look at ecosystems with high taxonomic and functional diversity, such as soils or natural aquatic environments that are either undisturbed or exposed to various pollutants (Gilbert et al., 2008; Brennerova et al., 2009; Zanaroli et al., 2010). Extreme environments enable the discovery of enzymes that are naturally adapted to the constraints of certain industrial processes, such as glycoside hydrolases and halotolerant esterases (Ferrer et al., 2005; LeCleir et al., 2007), thermostable lipases (Tirawongsaroj et al., 2008), or even psychrophilic DNA-polymerases (Simon et al., 2009). Other microbial ecosystems, such as anaerobic digesters including both human and/or animal intestinal microbiota and industrial remediation reactors, are naturally specialized in metabolizing certain substrates. These are ideal targets for research into particular functions, such as the degrading activity of lignocellulosic plant biomass (Warnecke et al., 2007; Tasse et al., 2010; Hess et al., 2011; Bastien et al., 2013) or dioxygenases for the degradation of aromatic compounds (Suenaga et al., 2007).

Some studies refer to enrichment steps that occur before sampling, with the aim of increasing the relative abundance of micro-organisms that have the target function. This enrichment can be done by modifying the physical and chemical conditions of the natural environment (van Elsas et al., 2008) or by incorporating the substrate to be metabolized in vivo (Hess et al., 2011) or in vitro, in reactors (DeAngelis et al., 2010) or mesocosms (Jacquiod et al., 2013). Through stable isotopic probing and cloning of the DNA of micro-organisms able to metabolize a specifically labeled substrate for the creation of metagenome libraries, it is possible to increase the frequency of positive clones by several orders of magnitude (Chen and Murrell, 2010). These approaches require functional and taxonomic controls at the different stages of enrichment, which are often sequential, to prevent the proliferation of populations dependent on the activity of the populations preferred at the outset. These kinds of checks are difficult to do in vivo, where there would actually be an increased risk of selecting populations able to metabolize only the degradation products of the initial substrate, to the detriment of those able to attack the more resistant original substrate with its more complex structure.

Functional Screening: New Challenges for the Discovery of Functions

Two complementary approaches can be used to discover new functions and protein families within microbial communities. The first involves the analysis of nucleotide, ribonucleotide or protein sequences, and the other the direct screening of functions before sequencing (Figure 1).

FIGURE 1

The Sequence, Marker of Originality

There have been a number of large-scale random metagenome sequencing projects (Yooseph et al., 2007; Vogel et al., 2009; Gilbert et al., 2010; Qin et al., 2010; Hess et al., 2011) over the past few years, resulting in catalogs listing millions of genes from different ecosystems, the majority of which are recorded in the GOLD1 (RRID:nif-0000-02918), MG-RAST2 (RRID:OMICS_01456) and EMBL-EBI3 (RRID:nlx_72386) metagenomics databases. At the same time, the obstacles inherent to metatranscriptomic sampling (fragility of mRNA, difficulty with extraction from natural environments, separation of other types of RNA) have been removed, opening a window into the functional dynamics of ecosystems according to biotic or abiotic constraints (Saleh-Lakha et al., 2005; Warnecke and Hess, 2009; Schmieder et al., 2012). Metatranscriptome sequencing has thus enabled the identification of new gene families, such as those found in microbial communities (prokaryotes and/or eukaryotes) expressed specifically in response to variations in the environment (Bailly et al., 2007; Frias-Lopez et al., 2008; Gilbert et al., 2008), and of new enzyme sequences belonging to known carbohydrate active enzyme families (Poretsky et al., 2005; Tartar et al., 2009; Damon et al., 2012).

Regardless of the origin of the sequences (DNA or cDNA, with or without prior cloning in an expression host), the advances made with automatic annotation, most notably thanks to the IMG-M (RRID:nif-0000-03010) and MG-RAST (RRID:OMICS_01456) servers (Markowitz et al., 2007; Meyer et al., 2008), now make it possible to quantify and compare the abundance of the main functional families in the target ecosystems (Thomas et al., 2012), identified through comparison of sequences with the general functional databases: KEGG (RRID:nif-0000-21234) (Kanehisa and Goto, 2000), eggNOG (RRID:nif-0000-02789) (Muller et al., 2010), and COG/KOG (RRID:nif-0000-21313) (Tatusov et al., 2003). They also enable research into specific protein families, thanks to motif detection using Pfam (RRID:nlx_72111) (Finn et al., 2010), TIGRFAM (RRID:nif-0000-03560) (Selengut et al., 2007), CDD (RRID:nif-0000-02647) (Marchler-Bauer et al., 2009), Prosite (RRID:nif-0000-03351) (Sigrist et al., 2010), and HMM model construction (Hidden Markov Models; Söding, 2005). Other servers can be used to interrogate databases specialized in specific enzymatic families (Table 1).
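As a rough illustration of what motif detection involves, the sketch below translates a PROSITE-style pattern into a regular expression and scans a protein sequence with it. It is only a minimal example under stated assumptions: the function names `prosite_to_regex` and `find_motif` are hypothetical, only a simplified subset of the pattern syntax is handled, the test sequence is invented, and pattern matching of this kind is much cruder than the profile and HMM searches performed by the servers cited above.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (simplified subset) into a Python regex.

    Supported elements: single residues, x (any residue), [ABC] (one of),
    {ABC} (none of), repeats such as x(4) or x(2,4), and the < / > anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Split the element into its core and an optional repeat count
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # [ABC] classes and single residues are already valid regex
        if low and high:
            repeat = "{%s,%s}" % (low, high)
        elif low:
            repeat = "{%s}" % low
        else:
            repeat = ""
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

def find_motif(sequence: str, pattern: str):
    """Return (1-based start position, matched segment) for each match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]

# Illustrative pattern and an invented sequence (not taken from the review)
pattern = "[AG]-x(4)-G-K-[ST]"
print(prosite_to_regex(pattern))
print(find_motif("MKLGPSGAGKSTLLR", pattern))
```

A scan of this kind only reports exact pattern matches; divergent members of a family, which are often the most interesting ones in a metagenome, generally require profile-based searches.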

TABLE 1

Databases                     Enzymes                                                   References
MetaBioME                     Enzymes of industrial interest                            Sharma et al. (2010)
CAZy (RRID:OMICS_01677)       Carbohydrate active enzymes                               Cantarel et al. (2012)
                              Auxiliary redox enzymes for lignocellulose degradation    Levasseur et al. (2013)
CAT (RRID:OMICS_01676)        Carbohydrate active enzymes                               Park et al. (2010)
LccED                         Laccases                                                  Sirim et al. (2011)
LED (RRID:nif-0000-03084)     Lipases                                                   Pleiss et al. (2000)
MEROPS (RRID:nif-0000-03112)  Proteases                                                 Rawlings et al. (2012)
ThYme                         Thioesterases                                             Cantu et al. (2011)

Examples of databases specialized in enzymatic functions of biotechnological interest.

Finally, the performance of methods used to assemble next generation sequencing reads is set to open up access to a plethora of complete genes to feed expert databases, which currently only contain a tiny percentage of genes from uncultivated organisms—less than 1% for the CAZy database (RRID:OMICS_01677), for example—while the majority of metagenomic studies published target ecosystems with a high number of plant polysaccharide degradation activities by carbohydrate active enzymes (André et al., 2014).

Even when based on a large majority of truncated genes, the functional annotation of metagenomes and metatranscriptomes enables in silico estimation of the functional diversity of the ecosystem and identification of the most original sequences within a known protein family. It is then possible to use PCR (Polymerase Chain Reaction) to capture those sequences specifically, and to test their function experimentally to assess their applicative value. In this way, the sequencing of the rumen metagenome (268 Gb) enabled identification of 27,755 genes coding for carbohydrate active enzymes, and isolation of 51 active enzymes belonging to known families specifically involved in lignocellulose degradation (Hess et al., 2011).

PCR, and more generally DNA/DNA or DNA/cDNA hybridization, also make it possible to directly capture genes coding for protein families that are abundant and/or expressed in the target ecosystem, with no need for prior large-scale sequencing. This strategy requires the design of nucleic acid probes or PCR primers using consensus sequences specific to known protein families. There are plenty of examples of the discovery of enzymes in metagenomes using these approaches, for instance bacterial laccases (Ausec et al., 2011), dioxygenases (Zaprasis et al., 2009), nitrite reductases (Bartossek et al., 2010), hydrogenases (Schmidt et al., 2010), hydrazine oxidoreductases (Li et al., 2010), or chitinases (Hjort et al., 2010) from various ecosystems. The Gene-Targeted-metagenomics approach (Iwai et al., 2009) combines PCR screening and amplicon pyrosequencing to generate primers in an iterative manner and increase the structural diversity of the target protein families, for example the dioxygenases from the microbiota of contaminated soil. Elsewhere, the use of high-density functional microarrays considerably multiplies the number of probes and is therefore a low-cost way of obtaining a snapshot of the abundance and diversity of sequences within specific protein families and even, where the DNA or cDNA has been cloned (He et al., 2010; Weckx et al., 2010), of directly capturing targets of interest while rationalizing sequencing. Using a similar strategy, the solution hybrid selection method enables the selection of fragments of DNA coding for specific enzymatic families using 31-mer capture probes. Applied to the capture of cDNA, this method provides access to entire genes, which can then be cloned and their activity tested (Bragalini et al., 2014). Solution hybrid selection can therefore be used to explore the taxonomic and functional diversity of all protein families. More specifically, this approach opens the way for the selection and characterization of families that are highly represented in a microbiome but whose function remains unknown, in order to further the understanding of ecosystem functions and discover novel biocatalysts.
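The design of primers or probes from consensus sequences, as mentioned above, can be sketched in a few lines. The example below collapses a set of aligned nucleotide sequences into a degenerate consensus written with IUPAC ambiguity codes and reports how many distinct oligonucleotides the degenerate primer represents. It is a minimal sketch only: the function names and the three aligned sequences are hypothetical, gaps are not handled, and real primer design would also weigh melting temperature, degeneracy limits and 3' end stability.

```python
# IUPAC nucleotide codes, keyed by the set of bases seen in an alignment column
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def degenerate_consensus(aligned_sequences):
    """Collapse equal-length, gap-free aligned sequences into an IUPAC consensus."""
    sequences = [s.upper() for s in aligned_sequences]
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*sequences):
        bases = frozenset(column)
        if bases not in IUPAC:
            raise ValueError("unsupported characters in column: %s" % sorted(bases))
        consensus.append(IUPAC[bases])
    return "".join(consensus)

def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for code in primer.upper():
        total *= sizes[code]
    return total

# Invented alignment of a conserved region from three family members
aligned = ["ATGGCA", "ATGGCT", "ACGGCA"]
primer = degenerate_consensus(aligned)
print(primer, degeneracy(primer))
```

In an iterative scheme such as the gene-targeted approach described above, newly recovered amplicon sequences would be added to the alignment and the consensus recomputed, which is why keeping track of the degeneracy at each round is useful.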

Metaproteomics has recently proved its worth in identifying new protein families and/or functions. Paired with genomic, metagenomic and metatranscriptomic data (Erickson et al., 2012), it provides access to excellent biomarkers of the functional state of the ecosystem. Recent developments, such as high-throughput electrospray ionization paired with mass spectrometry, enable full metaproteome analysis after separation of proteins by liquid chromatography. It is thus possible to highlight hundreds of proteins with no associated function and new enzyme families playing a key functional role in the ecosystem (Ram et al., 2005).

This latter example illustrates the need for research and/or experimental proof of function for proteins where the function remains unknown (products of orphan genes or, on the contrary, genes highly prevalent in the microbial realm but that have never been characterized) or poorly annotated. In fact, annotation errors, which are especially common for multi-modular proteins such as carbohydrate active enzymes, are spread at an increasing rate as a result of the explosion in the number of functional genomics and meta-genomic, -transcriptomic and -proteomic projects. New annotation strategies, most notably based on the prediction of the three-dimensional structure of proteins, are also worth exploring (Uchiyama and Miyazaki, 2009). However, at the present time, it is very difficult to predict the specificity of substrate and the mechanism of action (and therefore the function of the protein) on the basis of sequence or even structure, especially where there is no homologue characterized from a structural and functional point of view. Functional screening can address this challenge.

Activity Screening: Speeding up the Discovery of Biotechnology Tools

There are three prerequisites for this approach: (i) the cloning of DNA or cDNA in an expression vector for the creation of, respectively, metagenomic or metatranscriptomic libraries, (ii) heterologous expression of cloned genes in a microbial host, and (iii) the design of efficient phenotypic screens to isolate the clones of interest that produce the target activity, also referred to as “hits.”

Using this approach, the functions of a protein can be accessed without any prior information on its sequence. It is therefore the only way of identifying novel protein families that have known functions or previously unseen functions (as long as an adequate screen can be developed). Finally, it helps to rationalize sequencing efforts and focus them only on the hits: for example, those that are of biotechnological interest. The expression potential of the selected heterologous host, the size of the DNA inserts and the type of vectors all determine the success of functional screening. Short fragments of metagenomic DNA (smaller than 15 kb, and most often between 2 and 5 kb), or cDNA for the metatranscriptomic libraries, cloned in plasmids under the control of a strong expression promoter, enable the overexpression of a single protein, and the easy recovery and sequencing of the hits’ DNA (Uchiyama and Miyazaki, 2009). On the other hand, fragments of bacterial DNA measuring between 15 and 40 kb, 25 and 45 kb or even 100 and 200 kb, cloned respectively in cosmids, fosmids or bacterial artificial chromosomes, can be used to explore a functional diversity of several Gb per library and, above all, provide access to operon-type multigene clusters, coding for complete catabolic or anabolic pathways. This is of major interest for the discovery of cocktails of synergistic activities that degrade complex substrates such as plant cell walls for biorefineries. This strategy also ensures high reliability for the taxonomic annotation of inserts, and can even be used to identify the mobile elements responsible for the plasticity of the bacterial metagenome, mediated by horizontal gene transfers (Tasse et al., 2010). However, it requires sensitive activity screens, since the target genes are only weakly expressed, controlled by their own native promoters.
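The relationship between insert size, number of clones and the amount of sequence explored per library can be made concrete with a short calculation. The sketch below uses the insert size ranges quoted above; the library size of 100,000 clones is a hypothetical figure chosen for illustration, and the function name `library_size_gb` is not taken from any published tool.

```python
# Insert size ranges in kb, as quoted in the text for each vector type
INSERT_RANGES_KB = {
    "plasmid": (2, 5),      # "most often between 2 and 5 kb"
    "cosmid": (15, 40),
    "fosmid": (25, 45),
    "BAC": (100, 200),
}

KB_PER_GB = 1_000_000

def library_size_gb(n_clones, vector):
    """Return the (min, max) amount of cloned DNA in Gb for a library."""
    low_kb, high_kb = INSERT_RANGES_KB[vector]
    return (n_clones * low_kb / KB_PER_GB, n_clones * high_kb / KB_PER_GB)

# Hypothetical library of 100,000 clones for each vector type
for vector in INSERT_RANGES_KB:
    low, high = library_size_gb(100_000, vector)
    print(f"{vector:8s} {low:6.1f} to {high:6.1f} Gb")
```

Under these assumptions, a fosmid library of 100,000 clones would hold between 2.5 and 4.5 Gb of metagenomic DNA, consistent with the "several Gb per library" mentioned above, whereas a plasmid library of the same size would hold roughly ten times less.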

Escherichia coli, whose transformation efficiency is exceptionally high, even for fosmids or bacterial artificial chromosomes, remains the host of choice in the vast majority of published studies. The first exhaustive functional screening study of a fosmid library revealed that E. coli can be used to express genes from bacteria that are taxonomically very distant, including a large number of Bacteroidetes and Gram-positive bacteria (Tasse et al., 2010), contrary to what had been predicted by in silico detection of expression signals compatible with E. coli (Gabor et al., 2004). However, the value of developing shuttle vectors to screen metagenomic libraries in hosts with different expression and secretion potentials, for example Bacillus, Sphingomonas, Streptomyces, Thermus, or the α-, β- and γ-proteobacteria (Taupp et al., 2011; Ekkers et al., 2012), must not be underestimated if we are to unlock the functional potential of varied taxa and increase the sensitivity of screens. Finally, it is still very difficult to get access to the uncultivated fraction of eukaryotic microorganisms, due to the lack of screening hosts that combine sufficient transformation efficiency for the creation of large clone libraries (and thus the exploration of a vast array of sequences) with compatibility with the post-translational modifications required to obtain functional recombinant proteins from eukaryotes. Thus, at the present time, only a few studies have been published on the enzyme activity-based screening of metatranscriptomic libraries (making it possible to do away with introns) of eukaryotes from soil, rumen and the gut of the termite (Bailly et al., 2007; Findley et al., 2011; Sethi et al., 2013).

Regardless of the type of library screened, the functional exploration of hundreds of thousands of clones is required, whereas the hit rate rarely exceeds 6‰ (Duan et al., 2009; Bastien et al., 2013). This requires very high throughput primary screens, in a solid medium before or after the automated organization of libraries in 96- or 384-well micro-plate format, in a liquid medium after enzymatic cell lysis and/or thawing and freezing (Bao et al., 2011), or using UV-inducible auto-lytic vectors (Li et al., 2007). This stage is very often followed by medium or low throughput characterization of the properties of the hits obtained, particularly to assess their biotechnological interest (Tasse et al., 2010).
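The need to screen hundreds of thousands of clones follows directly from such low hit rates, as a simple calculation shows. The sketch below assumes, purely for illustration, that each clone is an independent trial with the same probability of being a hit (a binomial model, which the review does not itself state). The 6‰ value is the upper bound quoted above; the library size and the function names are hypothetical.

```python
import math

def expected_hits(n_clones, hit_rate):
    """Expected number of positive clones in a screen."""
    return n_clones * hit_rate

def prob_at_least_one_hit(n_clones, hit_rate):
    """Probability of finding one or more hits among n_clones."""
    return 1.0 - (1.0 - hit_rate) ** n_clones

def clones_needed(hit_rate, confidence=0.95):
    """Smallest number of clones giving at least one hit with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_rate))

upper_rate = 0.006    # 6 per thousand, the upper bound quoted in the text
library = 200_000     # hypothetical library size

print(expected_hits(library, upper_rate))
print(clones_needed(upper_rate))
```

Rates well below this upper bound are common, and the number of clones required grows roughly in inverse proportion to the hit rate, which is what drives the demand for very high throughput primary screens.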

Two generic strategies, used at throughputs exceeding 400,000 tests per week, have been and continue to be applied widely. Positive selection on a medium containing, for example, substrates to be metabolized as the sole source of carbon, can be used to isolate enzymes (Henne et al., 1999), complete catabolic pathways (Cecchini et al., 2013), or membrane transporters (Majerník et al., 2001). This approach also helps to easily identify antibiotic resistance genes (Diaz-Torres et al., 2006). The use of chromogenic (Beloqui et al., 2010; Bastien et al., 2013; Nyyssönen et al., 2013), fluorescent (LeCleir et al., 2007), or opalescent substrates or reagents, such as insoluble polymers or proteins (Mayumi et al., 2008; Waschkowitz et al., 2009), or simply the observation of an original clone phenotype, has already enabled the isolation of several hundred catabolic enzymes, like the numerous hydrolases of very varied taxonomic origin (Simon and Daniel, 2009), some of which were coded by genes that are very abundant in the target ecosystem (Jones et al., 2008; Gloux et al., 2011), but also, although much less frequently, new oxidoreductases (Knietsch et al., 2003). Novel enzymes (laccases, esterases and oxygenases in particular) from microbial communities of very diverse origins (soil, water, activated sludge, digestive tracts) have been highlighted for their capacity to degrade pollutants such as nitriles (Robertson and Steer, 2004), lindane (Boubakri et al., 2006), styrene (Van Hellemond et al., 2007), naphthalene (Ono et al., 2007), aliphatic and aromatic hydrocarbons (Uchiyama et al., 2004; Brennerova et al., 2009; Lu et al., 2012), organophosphorus compounds (Kambiranda et al., 2009; Math et al., 2010), or plastic materials (Mayumi et al., 2008).

The discovery of proteins involved in prokaryote-eukaryote interactions (Lakhdari et al., 2010) or anabolic pathways is rarer, since it often requires the development of complex screens and lower throughputs. Nonetheless, a few examples of simple screens, based on the ability of metagenomic clones to inhibit the growth of a strain by producing antibacterial activity or to complement an auxotrophic strain for a specific compound, have enabled the identification of new pathways for the synthesis of antimicrobials (Brady and Clardy, 2004) or biotin (Entcheva et al., 2001). Nano-technologies, and in particular the latest developments focused on the medium-throughput screening of libraries obtained by combinatorial protein engineering, enable the design of custom microarrays covered with one to several hundred specific enzymatic substrates, the processing of which may be followed by fluorescence, chemiluminescence, immunodetection, surface plasmon resonance or mass spectrometry (André et al., 2014). Nanostructure-initiator mass spectrometry technology, combining fluorescence and mass spectrometry, is the first example of a functional metagenomic application for the discovery of anabolic enzymes, namely sialyltransferases (Northen et al., 2008).

The Immense Challenges of Ultra-fast Screening (Figure 2)

FIGURE 2

Microfluidic technologies are of undeniable interest when it comes to reaching screening rates of a million clones per day. The substrate-induced gene-expression screening method was developed to use fluorescence-activated cell sorting to isolate plasmid clones containing genes (or fragments of genes) that induce the expression of a fluorescent marker in response to a specific substrate. However, this technique is only suited to small substrates that are non-lethal and internalizable for the host strain (Uchiyama and Watanabe, 2008). Finally, the advances made over the past few years in cellular compartmentalization (Nawy, 2013), selective sorting based on the detection of sequences (Pivetal et al., 2014; Lim et al., 2015) or of specific metabolites (Kürsten et al., 2014), and the control of reaction kinetics (Mazutis et al., 2009) in microfluidic circuits should allow for a huge acceleration in the discovery of new proteins and metabolic pathways expressed in prokaryotes and eukaryotes in an intracellular, membrane or extracellular manner.

The very first examples of metagenome functional exploration applications have already been used to establish the proof of concept regarding the effectiveness of microfluidics in the discovery of new bioactive molecules and new enzymes. For example, droplet-based microfluidics technology was recently used by the teams of A. Griffiths and A. Drevelle to isolate new strains producing cellobiohydrolase and cellulase activities at a rate of 300,000 cells sorted per hour, using just a few microliters of reagent, i.e., 250,000 times less than with the conventional technologies mentioned above (Najah et al., 2014). Here, soil bacteria and a fluorescent substrate were co-encapsulated in micro-droplets in order to sort cells on the basis of the extracellular activity only. In fact, the strategy used, which requires the seeding of cells on a defined medium after sorting, is not compatible with the detection of intracellular enzymes, which require a lethal lysis step to convert the substrate. Applying a similar principle, the ultra-rapid sorting of eukaryote cells encapsulated with their substrate now also makes it possible to select yeast clones presenting extracellular enzymatic activities (Sjostrom et al., 2014). This technology should, in the short term, make it possible to explore the functional diversity of uncultivated eukaryotes at a very high throughput, by directly sorting fungal populations or libraries of metatranscriptomic clones. In the latter case, access to the sequence involved in the target activity will be easy, since the libraries are built using hosts whose culture is well managed, with insertion of the metatranscriptomic cDNA fragment into a specific region of the genome. Where sorting is done without cloning of the metagenome or metatranscriptome, only microorganisms capable of growth on a defined medium can be recovered, which hugely limits access to functional diversity.
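To put the throughput figures quoted in this review side by side, the sketch below converts them to a common unit and estimates how long each would take to process the same library. The three rates are those given in the text (400,000 tests per week, a million clones per day, 300,000 cells sorted per hour); the library size of one million clones and the assumption of continuous, uninterrupted operation are illustrative simplifications, not figures from the cited studies.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY

# Throughputs quoted in the text, converted to events per hour
RATES_PER_HOUR = {
    "plate-based screening (400,000 tests/week)": 400_000 / HOURS_PER_WEEK,
    "microfluidic target (1,000,000 clones/day)": 1_000_000 / HOURS_PER_DAY,
    "droplet sorting (300,000 cells/hour)": 300_000,
}

def hours_to_screen(n_clones, rate_per_hour):
    """Time in hours needed to process n_clones at a constant rate."""
    return n_clones / rate_per_hour

library = 1_000_000   # hypothetical library size

for label, rate in RATES_PER_HOUR.items():
    hours = hours_to_screen(library, rate)
    print(f"{label}: {hours:.1f} h ({hours / HOURS_PER_DAY:.2f} days)")
```

Under these assumptions the same library would occupy a plate-based platform for about two and a half weeks, against a few hours of droplet sorting, before any downstream hit characterization is taken into account.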

To increase the proportion of cultivable organisms, Kim Lewis’ team recently used the iChip to simultaneously isolate and cultivate soil bacteria thanks to the delivery of nutrients from the original medium, into which the iChip is introduced, via semi-permeable membranes. This method raises the proportion of cultivable organisms from 1% to as much as 50%. Using colonies cultivated in the chip, the clones isolated in a Petri dish were screened for the production of antimicrobial compounds (Ling et al., 2015). A novel antibiotic was thus identified, together with its biosynthesis pathway, after sequencing and functional annotation of the complete genome.

It is quite another matter when it comes to selecting, on the basis of intracellular activity, completely uncultivable organisms or metagenomic clones containing DNA inserts of several dozen kbp, which are difficult to amplify using PCR. In this case, to liberate the enzymes in question, we are required to include a cellular lysis step, preventing seeding after sorting. On the other hand, this approach is compatible with the sorting of plasmid clone libraries, where the metagenomic or metatranscriptomic inserts can easily be amplified using PCR, on the basis of just a few dozen lysed cells. For libraries with large DNA inserts, the barriers are now being broken down, most notably thanks to the development of the SlipChips microfluidic approach (Ma et al., 2014), which uses two culture microcompartments, where the content of one can be lysed for the detection of enzymatic activities, for example, and the other is used as a backup replicate for the culture and recovery of subsequent DNA for sequencing. In spite of these recent, highly encouraging developments, the proof of concept has not yet been established for the identification of new functions and intracellular metabolic pathways.

Conclusion

The rapid expansion of meta-omic technologies over the past decade has shed light on the functions of the uncultivated fraction of microbial ecosystems. A huge number of enzymes have been discovered, in particular through experimental approaches to functional metagenome exploration. Where their performance can be rapidly assessed within the framework of a known process, or where they catalyze new, previously undescribed reactions, many of them have provided new tools for industrial biotechnologies. However, several challenges still need to be addressed to speed up the rate at which new functions are discovered and to make optimal use of the functional diversity that so far remains unexplored. Firstly, while the uncultivated prokaryote fraction of microbial communities is still extensively studied, the functions of the eukaryote fraction are relatively unexplored from an experimental angle, even though they play a fundamental role in numerous ecosystems. Secondly, in the majority of cases, the functions discovered using meta-omic approaches play a catabolic role, mainly involved in the deconstruction of plant biomass or in bioremediation. It is thus necessary to develop functional screens to access anabolic functions and enrich the catalog of reactions available for synthetic biology. Finally, there are very few studies aimed at identifying the role of protein families that are highly prevalent in the target ecosystem but that have not yet been characterized, even though some of them could be considered as biomarkers of the functional state of the microbial community. Indeed, sequence-based functional metagenomic projects continuously highlight many sequences annotated as domains of unknown function in the Pfam database (RRID:nlx_72111) (Bateman et al., 2010; Finn et al., 2014), some with 3D structures solved thanks to structural genomics initiatives, and available in the Protein Data Bank (RRID:nif-0000-00135). With the goal of characterizing these new protein families and identifying previously unseen functions from a selection of the most prevalent protein families (those containing the highest number of homologous sequences without any associated function) in the target ecosystem, the integration of structural, biochemical, genomic and meta-omic data is now also possible (Ladevèze et al., 2013). This makes it possible to benefit from the huge number of long scaffolds now available in sequence databases, and to access the genomic context of the targeted genes in order to facilitate functional assignment. In the next few years, these strategies should enhance our understanding of how microbial ecosystems function and, at the same time, enable greater control over them.

Statements

Author contributions

LU, GPV, EL contributed equally to this work.

Acknowledgments

This research was funded by the Ministry of Education and Research (Ministère de l’Enseignement supérieur et de la Recherche, MESR), the Agence Nationale de la Recherche (Grant Number ANR 2011-Nano 007 03) and the INRA metaprogramme M2E (project Metascreen).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1

    André, I., Potocki-Véronèse, G., Barbe, S., Moulis, C., and Remaud-Siméon, M. (2014). CAZyme discovery and design for sweet dreams. Curr. Opin. Chem. Biol. 19, 17–24. doi: 10.1016/j.cbpa.2013.11.014

  • 2

    Ausec, L., van Elsas, J. D., and Mandic-Mulec, I. (2011). Two- and three-domain bacterial laccase-like genes are present in drained peat soils. Soil Biol. Biochem. 43, 975–983. doi: 10.1016/j.soilbio.2011.01.013

  • 3

    Bailly, J., Fraissinet-Tachet, L., Verner, M.-C., Debaud, J.-C., Lemaire, M., Wésolowski-Louvel, M., et al. (2007). Soil eukaryotic functional diversity, a metatranscriptomic approach. ISME J. 1, 632–642. doi: 10.1038/ismej.2007.68

  • 4

    Bao, L., Huang, Q., Chang, L., Zhou, J., and Lu, H. (2011). Screening and characterization of a cellulase with endocellulase and exocellulase activity from yak rumen metagenome. J. Mol. Catal. B Enzym. 73, 104–110. doi: 10.1016/j.molcatb.2011.08.006

  • 5

    Bartossek, R., Nicol, G. W., Lanzen, A., Klenk, H.-P., and Schleper, C. (2010). Homologues of nitrite reductases in ammonia-oxidizing archaea: diversity and genomic context. Environ. Microbiol. 12, 1075–1088. doi: 10.1111/j.1462-2920.2010.02153.x

  • 6

    Bastien, G., Arnal, G., Bozonnet, S., Laguerre, S., Ferreira, F., Fauré, R., et al. (2013). Mining for hemicellulases in the fungus-growing termite Pseudacanthotermes militaris using functional metagenomics. Biotechnol. Biofuels 6, 78. doi: 10.1186/1754-6834-6-78

  • 7

    Bateman, A., Coggill, P., and Finn, R. D. (2010). DUFs: families in search of function. Acta Crystallogr. Sect. F Struct. Biol. Cryst. Commun. 66, 1148–1152. doi: 10.1107/S1744309110001685

  • 8

    Beloqui, A., Polaina, J., Vieites, J. M., Reyes-Duarte, D., Torres, R., Golyshina, O. V., et al. (2010). Novel hybrid esterase-haloacid dehalogenase enzyme. Chembiochem 11, 1975–1978. doi: 10.1002/cbic.201000258

  • 9

    Boubakri, H., Beuf, M., Simonet, P., and Vogel, T. M. (2006). Development of metagenomic DNA shuffling for the construction of a xenobiotic gene. Gene 375, 87–94. doi: 10.1016/j.gene.2006.02.027

  • 10

    Brady, S. F., and Clardy, J. (2004). Palmitoylputrescine, an antibiotic isolated from the heterologous expression of DNA extracted from bromeliad tank water. J. Nat. Prod. 67, 1283–1286. doi: 10.1021/np0499766

  • 11

    Bragalini, C., Ribiere, C., Parisot, N., Vallon, L., Prudent, E., Peyretaillade, E., et al. (2014). Solution hybrid selection capture for the recovery of functional full-length eukaryotic cDNAs from complex environmental samples. DNA Res. 21, 685–694. doi: 10.1093/dnares/dsu030

  • 12

    Brennerova, M. V., Josefiova, J., Brenner, V., Pieper, D. H., and Junca, H. (2009). Metagenomics reveals diversity and abundance of meta-cleavage pathways in microbial communities from soil highly contaminated with jet fuel under air-sparging bioremediation. Environ. Microbiol. 11, 2216–2227. doi: 10.1111/j.1462-2920.2009.01943.x

  • 13

    Cantarel, B. L., Lombard, V., and Henrissat, B. (2012). Complex carbohydrate utilization by the healthy human microbiome. PLoS ONE 7:e28742. doi: 10.1371/journal.pone.0028742

  • 14

    Cantu, D. C., Chen, Y., Lemons, M. L., and Reilly, P. J. (2011). ThYme: a database for thioester-active enzymes. Nucleic Acids Res. 39, D342–D346. doi: 10.1093/nar/gkq1072

  • 15

    Cecchini, D. A., Laville, E., Laguerre, S., Robe, P., Leclerc, M., Doré, J., et al. (2013). Functional metagenomics reveals novel pathways of prebiotic breakdown by human gut bacteria. PLoS ONE 8:e72766. doi: 10.1371/journal.pone.0072766

  • 16

    Chen, Y., and Murrell, J. C. (2010). When metagenomics meets stable-isotope probing: progress and perspectives. Trends Microbiol. 18, 157–163. doi: 10.1016/j.tim.2010.02.002

  • 17

    Culligan, E. P., Sleator, R. D., Marchesi, J. R., and Hill, C. (2014). Metagenomics and novel gene discovery: promise and potential for novel therapeutics. Virulence 5, 399–412. doi: 10.4161/viru.27208

  • 18

    Damon, C., Lehembre, F., Oger-Desfeux, C., Luis, P., Ranger, J., Fraissinet-Tachet, L., et al. (2012). Metatranscriptomics reveals the diversity of genes expressed by eukaryotes in forest soils. PLoS ONE 7:e28967. doi: 10.1371/journal.pone.0028967

  • 19

    DeAngelis, K. M., Gladden, J. M., Allgaier, M., D’haeseleer, P., Fortney, J. L., Reddy, A., et al. (2010). Strategies for enhancing the effectiveness of metagenomic-based enzyme discovery in lignocellulolytic microbial communities. BioEnergy Res. 3, 146–158. doi: 10.1007/s12155-010-9089-z

  • 20

    Diaz-Torres, M. L., Villedieu, A., Hunt, N., McNab, R., Spratt, D. A., Allan, E., et al. (2006). Determining the antibiotic resistance potential of the indigenous oral microbiota of humans using a metagenomic approach. FEMS Microbiol. Lett. 258, 257–262. doi: 10.1111/j.1574-6968.2006.00221.x

  • 21

    Di Bella, J. M., Bao, Y., Gloor, G. B., Burton, J. P., and Reid, G. (2013). High throughput sequencing methods and analysis for microbiome research. J. Microbiol. Methods 95, 401–414. doi: 10.1016/j.mimet.2013.08.011

  • 22

    Duan, C.-J., Xian, L., Zhao, G.-C., Feng, Y., Pang, H., Bai, X.-L., et al. (2009). Isolation and partial characterization of novel genes encoding acidic cellulases from metagenomes of buffalo rumens. J. Appl. Microbiol. 107, 245–256. doi: 10.1111/j.1365-2672.2009.04202.x

  • 23

    Ekkers, D. M., Cretoiu, M. S., Kielak, A. M., and van Elsas, J. D. (2012). The great screen anomaly—a new frontier in product discovery through functional metagenomics. Appl. Microbiol. Biotechnol. 93, 1005–1020. doi: 10.1007/s00253-011-3804-3

  • 24

    Entcheva, P., Liebl, W., Johann, A., Hartsch, T., and Streit, W. R. (2001). Direct cloning from enrichment cultures, a reliable strategy for isolation of complete operons and genes from microbial consortia. Appl. Environ. Microbiol. 67, 89–99. doi: 10.1128/AEM.67.1.89-99.2001

  • 25

    Erickson, A. R., Cantarel, B. L., Lamendella, R., Darzi, Y., Mongodin, E. F., Pan, C., et al. (2012). Integrated metagenomics/metaproteomics reveals human host-microbiota signatures of Crohn’s disease. PLoS ONE 7:e49138. doi: 10.1371/journal.pone.0049138

  • 26

    Ferrer, M., Beloqui, A., Timmis, K. N., and Golyshin, P. N. (2009). Metagenomics for mining new genetic resources of microbial communities. J. Mol. Microbiol. Biotechnol. 16, 109–123. doi: 10.1159/000142898

  • 27

    Ferrer, M., Golyshina, O. V., Chernikova, T. N., Khachane, A. N., Martins Dos Santos, V. A. P., Yakimov, M. M., et al. (2005). Microbial enzymes mined from the Urania deep-sea hypersaline anoxic basin. Chem. Biol. 12, 895–904. doi: 10.1016/j.chembiol.2005.05.020

  • 28

    Findley, S. D., Mormile, M. R., Sommer-Hurley, A., Zhang, X.-C., Tipton, P., Arnett, K., et al. (2011). Activity-based metagenomic screening and biochemical characterization of bovine ruminal protozoan glycoside hydrolases. Appl. Environ. Microbiol. 77, 8106–8113. doi: 10.1128/AEM.05925-11

  • 29

    Finn, R. D., Bateman, A., Clements, J., Coggill, P., Eberhardt, R. Y., Eddy, S. R., et al. (2014). Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230. doi: 10.1093/nar/gkt1223

  • 30

    Finn, R. D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J. E., et al. (2010). The Pfam protein families database. Nucleic Acids Res. 38, D211–D222. doi: 10.1093/nar/gkp985

  • 31

    Frias-Lopez, J., Shi, Y., Tyson, G. W., Coleman, M. L., Schuster, S. C., Chisholm, S. W., et al. (2008). Microbial community gene expression in ocean surface waters. Proc. Natl. Acad. Sci. U.S.A. 105, 3805–3810. doi: 10.1073/pnas.0708897105

  • 32

    Gabor, E. M., Alkema, W. B. L., and Janssen, D. B. (2004). Quantifying the accessibility of the metagenome by random expression cloning techniques. Environ. Microbiol. 6, 879–886. doi: 10.1111/j.1462-2920.2004.00640.x

  • 33

    Gilbert, J. A., Field, D., Huang, Y., Edwards, R., Li, W., Gilna, P., et al. (2008). Detection of large numbers of novel sequences in the metatranscriptomes of complex marine microbial communities. PLoS ONE 3:e3042. doi: 10.1371/journal.pone.0003042

  • 34

    Gilbert, J. A., Field, D., Swift, P., Thomas, S., Cummings, D., Temperton, B., et al. (2010). The taxonomic and functional diversity of microbes at a temperate coastal site: a “Multi-Omic” study of seasonal and diel temporal variation. PLoS ONE 5:e15545. doi: 10.1371/journal.pone.0015545

  • 35

    Gloux, K., Berteau, O., El oumami, H., Beguet, F., Leclerc, M., and Dore, J. (2011). A metagenomic β-glucuronidase uncovers a core adaptive function of the human intestinal microbiome. Proc. Natl. Acad. Sci. U.S.A. 108(Suppl. 1), 4539–4546. doi: 10.1073/pnas.1000066107

  • 36

    He, S., Kunin, V., Haynes, M., Martin, H. G., Ivanova, N., Rohwer, F., et al. (2010). Metatranscriptomic array analysis of “Candidatus Accumulibacter phosphatis”-enriched enhanced biological phosphorus removal sludge: metatranscriptomic array analysis of EBPR sludge. Environ. Microbiol. 12, 1205–1217. doi: 10.1111/j.1462-2920.2010.02163.x

  • 37

    Henne, A., Daniel, R., Schmitz, R. A., and Gottschalk, G. (1999). Construction of environmental DNA libraries in Escherichia coli and screening for the presence of genes conferring utilization of 4-hydroxybutyrate. Appl. Environ. Microbiol. 65, 3901–3907.

  • 38

    Hess, M., Sczyrba, A., Egan, R., Kim, T.-W., Chokhawala, H., Schroth, G., et al. (2011). Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 331, 463–467. doi: 10.1126/science.1200387

  • 39

    Hjort, K., Bergström, M., Adesina, M. F., Jansson, J. K., Smalla, K., and Sjöling, S. (2010). Chitinase genes revealed and compared in bacterial isolates, DNA extracts and a metagenomic library from a phytopathogen-suppressive soil. FEMS Microbiol. Ecol. 71, 197–207. doi: 10.1111/j.1574-6941.2009.00801.x

  • 40

    Iwai, S., Chai, B., Sul, W. J., Cole, J. R., Hashsham, S. A., and Tiedje, J. M. (2009). Gene-targeted-metagenomics reveals extensive diversity of aromatic dioxygenase genes in the environment. ISME J. 4, 279–285. doi: 10.1038/ismej.2009.104

  • 41

    Jacquiod, S., Franqueville, L., Cécillon, S., Vogel, T. M., and Simonet, P. (2013). Soil bacterial community shifts after chitin enrichment: an integrative metagenomic approach. PLoS ONE 8:e79699. doi: 10.1371/journal.pone.0079699

  • 42

    Jones, B. V., Begley, M., Hill, C., Gahan, C. G. M., and Marchesi, J. R. (2008). Functional and comparative metagenomic analysis of bile salt hydrolase activity in the human gut microbiome. Proc. Natl. Acad. Sci. U.S.A. 105, 13580–13585. doi: 10.1073/pnas.0804437105

  • 43

    Kambiranda, D. M., Asraful-Islam, S. M., Cho, K. M., Math, R. K., Lee, Y. H., Kim, H., et al. (2009). Expression of esterase gene in yeast for organophosphates biodegradation. Pestic. Biochem. Physiol. 94, 15–20. doi: 10.1016/j.pestbp.2009.02.006

  • 44

    Kanehisa, M., and Goto, S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30. doi: 10.1093/nar/28.1.27

  • 45

    Knietsch, A., Waschkowitz, T., Bowien, S., Henne, A., and Daniel, R. (2003). Construction and screening of metagenomic libraries derived from enrichment cultures: generation of a gene bank for genes conferring alcohol oxidoreductase activity on Escherichia coli. Appl. Environ. Microbiol. 69, 1408–1416. doi: 10.1128/AEM.69.3.1408-1416.2003

  • 46

    Kürsten, D., Kothe, E., Wetzel, K., Bergmann, K., and Köhler, J. M. (2014). Micro-segmented flow and multisensor-technology for microbial activity profiling. Environ. Sci. Process. Impacts 16, 2362–2370. doi: 10.1039/C4EM00255E

  • 47

    Ladevèze, S., Tarquis, L., Cecchini, D. A., Bercovici, J., André, I., Topham, C. M., et al. (2013). Role of glycoside phosphorylases in mannose foraging by human gut bacteria. J. Biol. Chem. 288, 32370–32383. doi: 10.1074/jbc.M113.483628

  • 48

    Ladoukakis, E., Kolisis, F. N., and Chatziioannou, A. A. (2014). Integrative workflows for metagenomic analysis. Front. Cell Dev. Biol. 2:70. doi: 10.3389/fcell.2014.00070

  • 49

    Lakhdari, O., Cultrone, A., Tap, J., Gloux, K., Bernard, F., Ehrlich, S. D., et al. (2010). Functional metagenomics: a high throughput screening method to decipher microbiota-driven NF-κB modulation in the human gut. PLoS ONE 5:e13092. doi: 10.1371/journal.pone.0013092

  • 50

    LeCleir, G. R., Buchan, A., Maurer, J., Moran, M. A., and Hollibaugh, J. T. (2007). Comparison of chitinolytic enzymes from an alkaline, hypersaline lake and an estuary. Environ. Microbiol. 9, 197–205. doi: 10.1111/j.1462-2920.2006.01128.x

  • 51

    Levasseur, A., Drula, E., Lombard, V., Coutinho, P. M., and Henrissat, B. (2013). Expansion of the enzymatic repertoire of the CAZy database to integrate auxiliary redox enzymes. Biotechnol. Biofuels 6, 41. doi: 10.1186/1754-6834-6-41

  • 52

    Li, M., Hong, Y., Klotz, M. G., and Gu, J.-D. (2010). A comparison of primer sets for detecting 16S rRNA and hydrazine oxidoreductase genes of anaerobic ammonium-oxidizing bacteria in marine sediments. Appl. Microbiol. Biotechnol. 86, 781–790. doi: 10.1007/s00253-009-2361-5

  • 53

    Li, S., Xu, L., Hua, H., Ren, C., and Lin, Z. (2007). A set of UV-inducible autolytic vectors for high throughput screening. J. Biotechnol. 127, 647–652. doi: 10.1016/j.jbiotec.2006.07.030

  • 54

    Lim, S. W., Tran, T. M., and Abate, A. R. (2015). PCR-activated cell sorting for cultivation-free enrichment and sequencing of rare microbes. PLoS ONE 10:e0113549. doi: 10.1371/journal.pone.0113549

  • 55

    Ling, L. L., Schneider, T., Peoples, A. J., Spoering, A. L., Engels, I., Conlon, B. P., et al. (2015). A new antibiotic kills pathogens without detectable resistance. Nature 517, 455–459. doi: 10.1038/nature14098

  • 56

    Lu, Z., Deng, Y., Van Nostrand, J. D., He, Z., Voordeckers, J., Zhou, A., et al. (2012). Microbial gene functions enriched in the Deepwater Horizon deep-sea oil plume. ISME J. 6, 451–460. doi: 10.1038/ismej.2011.91

  • 57

    Ma, L., Datta, S. S., Karymov, M. A., Pan, Q., Begolo, S., and Ismagilov, R. F. (2014). Individually addressable arrays of replica microbial cultures enabled by splitting SlipChips. Integr. Biol. 6, 796–805. doi: 10.1039/C4IB00109E

  • 58

    Majerník, A., Gottschalk, G., and Daniel, R. (2001). Screening of environmental DNA libraries for the presence of genes conferring Na+(Li+)/H+ antiporter activity on Escherichia coli: characterization of the recovered genes and the corresponding gene products. J. Bacteriol. 183, 6645–6653. doi: 10.1128/JB.183.22.6645-6653.2001

  • 59

    Marchler-Bauer, A., Anderson, J. B., Chitsaz, F., Derbyshire, M. K., DeWeese-Scott, C., Fong, J. H., et al. (2009). CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 37, D205–D210. doi: 10.1093/nar/gkn845

  • 60

    Markowitz, V. M., Ivanova, N. N., Szeto, E., Palaniappan, K., Chu, K., Dalevi, D., et al. (2007). IMG/M: a data management and analysis system for metagenomes. Nucleic Acids Res. 36, D534–D538. doi: 10.1093/nar/gkm869

  • 61

    Math, R. K., Asraful Islam, S. M., Cho, K. M., Hong, S. J., Kim, J. M., Yun, M. G., et al. (2010). Isolation of a novel gene encoding a 3,5,6-trichloro-2-pyridinol degrading enzyme from a cow rumen metagenomic library. Biodegradation 21, 565–573. doi: 10.1007/s10532-009-9324-5

  • 62

    Mayumi, D., Akutsu-Shigeno, Y., Uchiyama, H., Nomura, N., and Nakajima-Kambe, T. (2008). Identification and characterization of novel poly (DL-lactic acid) depolymerases from metagenome. Appl. Microbiol. Biotechnol. 79, 743–750. doi: 10.1007/s00253-008-1477-3

  • 63

    Mazutis, L., Baret, J.-C., and Griffiths, A. D. (2009). A fast and efficient microfluidic system for highly selective one-to-one droplet fusion. Lab Chip 9, 2665–2672. doi: 10.1039/b903608c

  • 64

    Meyer, F., Paarmann, D., D’Souza, M., Olson, R., Glass, E. M., Kubal, M., et al. (2008). The metagenomics RAST server—a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9:386. doi: 10.1186/1471-2105-9-386

  • 65

    Muller, J., Szklarczyk, D., Julien, P., Letunic, I., Roth, A., Kuhn, M., et al. (2010). eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res. 38, D190–D195. doi: 10.1093/nar/gkp951

  • 66

    Najah, M., Calbrix, R., Mahendra-Wijaya, I. P., Beneyton, T., Griffiths, A. D., and Drevelle, A. (2014). Droplet-based microfluidics platform for ultra-high-throughput bioprospecting of cellulolytic microorganisms. Chem. Biol. 21, 1722–1732. doi: 10.1016/j.chembiol.2014.10.020

  • 67

    Nawy, T. (2013). Lab-On-A-Chip: receptive cells feel the squeeze. Nat. Methods 10, 198. doi: 10.1038/nmeth.2395

  • 68

    Northen, T. R., Lee, J.-C., Hoang, L., Raymond, J., Hwang, D.-R., Yannone, S. M., et al. (2008). A nanostructure-initiator mass spectrometry-based enzyme activity assay. Proc. Natl. Acad. Sci. U.S.A. 105, 3678–3683. doi: 10.1073/pnas.0712332105

  • 69

    Nyyssönen, M., Tran, H. M., Karaoz, U., Weihe, C., Hadi, M. Z., Martiny, J. B. H., et al. (2013). Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries. Front. Microbiol. 4:282. doi: 10.3389/fmicb.2013.00282

  • 70

    Ono, A., Miyazaki, R., Sota, M., Ohtsubo, Y., Nagata, Y., and Tsuda, M. (2007). Isolation and characterization of naphthalene-catabolic genes and plasmids from oil-contaminated soil by using two cultivation-independent approaches. Appl. Microbiol. Biotechnol. 74, 501–510. doi: 10.1007/s00253-006-0671-4

  • 71

    Park, B. H., Karpinets, T. V., Syed, M. H., Leuze, M. R., and Uberbacher, E. C. (2010). CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database. Glycobiology 20, 1574–1584. doi: 10.1093/glycob/cwq106

  • 72

    Pivetal, J., Toru, S., Frenea-Robin, M., Haddour, N., Cecillon, S., Dempsey, N. M., et al. (2014). Selective isolation of bacterial cells within a microfluidic device using magnetic probe-based cell fishing. Sens. Actuators B Chem. 195, 581–589. doi: 10.1016/j.snb.2014.01.004

  • 73

    Pleiss, J., Fischer, M., Peiker, M., Thiele, C., and Schmid, R. D. (2000). Lipase engineering database. J. Mol. Catal. B Enzym. 10, 491–508. doi: 10.1016/S1381-1177(00)00092-8

  • 74

    Poretsky, R. S., Bano, N., Buchan, A., LeCleir, G., Kleikemper, J., Pickering, M., et al. (2005). Analysis of microbial gene transcripts in environmental samples. Appl. Environ. Microbiol. 71, 4121–4126. doi: 10.1128/AEM.71.7.4121-4126.2005

  • 75

    Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K. S., Manichanh, C., et al. (2010). A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59–65. doi: 10.1038/nature08821

  • 76

    Ram, R. J., Verberkmoes, N. C., Thelen, M. P., Tyson, G. W., Baker, B. J., Blake, R. C., et al. (2005). Community proteomics of a natural microbial biofilm. Science 308, 1915–1920. doi: 10.1126/science.1109070

  • 77

    Rawlings, N. D., Barrett, A. J., and Bateman, A. (2012). MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 40, D343–D350. doi: 10.1093/nar/gkr987

  • 78

    Robertson, D. E., and Steer, B. A. (2004). Recent progress in biocatalyst discovery and optimization. Curr. Opin. Chem. Biol. 8, 141–149. doi: 10.1016/j.cbpa.2004.02.010

  • 79

    Saleh-Lakha, S., Miller, M., Campbell, R. G., Schneider, K., Elahimanesh, P., Hart, M. M., et al. (2005). Microbial gene expression in soil: methods, applications and challenges. J. Microbiol. Methods 63, 1–19. doi: 10.1016/j.mimet.2005.03.007

  • 80

    Schmidt, O., Drake, H. L., and Horn, M. A. (2010). Hitherto unknown [Fe-Fe]-hydrogenase gene diversity in anaerobes and anoxic enrichments from a moderately acidic fen. Appl. Environ. Microbiol. 76, 2027–2031. doi: 10.1128/AEM.02895-09

  • 81

    Schmieder, R., Lim, Y. W., and Edwards, R. (2012). Identification and removal of ribosomal RNA sequences from metatranscriptomes. Bioinformatics 28, 433–435. doi: 10.1093/bioinformatics/btr669

  • 82

    Selengut, J. D., Haft, D. H., Davidsen, T., Ganapathy, A., Gwinn-Giglio, M., Nelson, W. C., et al. (2007). TIGRFAMs and genome properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res. 35, D260–D264. doi: 10.1093/nar/gkl1043

  • 83

    Sethi, A., Slack, J. M., Kovaleva, E. S., Buchman, G. W., and Scharf, M. E. (2013). Lignin-associated metagene expression in a lignocellulose-digesting termite. Insect Biochem. Mol. Biol. 43, 91–101. doi: 10.1016/j.ibmb.2012.10.001

  • 84

    Sharma, V. K., Kumar, N., Prakash, T., and Taylor, T. D. (2010). MetaBioME: a database to explore commercially useful enzymes in metagenomic datasets. Nucleic Acids Res. 38, D468–D472. doi: 10.1093/nar/gkp1001

  • 85

    Sigrist, C. J. A., Cerutti, L., de Castro, E., Langendijk-Genevaux, P. S., Bulliard, V., Bairoch, A., et al. (2010). PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res. 38, D161–D166. doi: 10.1093/nar/gkp885

  • 86

    Simon, C., and Daniel, R. (2009). Achievements and new knowledge unraveled by metagenomic approaches. Appl. Microbiol. Biotechnol. 85, 265–276. doi: 10.1007/s00253-009-2233-z

  • 87

    Simon, C., Herath, J., Rockstroh, S., and Daniel, R. (2009). Rapid identification of genes encoding DNA polymerases by function-based screening of metagenomic libraries derived from glacial ice. Appl. Environ. Microbiol. 75, 2964–2968. doi: 10.1128/AEM.02644-08

  • 88

    Sirim, D., Wagner, F., Wang, L., Schmid, R. D., and Pleiss, J. (2011). The Laccase Engineering Database: a classification and analysis system for laccases and related multicopper oxidases. Database 2011, bar006. doi: 10.1093/database/bar006

  • 89

    Sjostrom, S. L., Bai, Y., Huang, M., Liu, Z., Nielsen, J., Joensson, H. N., et al. (2014). High-throughput screening for industrial enzyme production hosts by droplet microfluidics. Lab Chip 14, 806–813. doi: 10.1039/C3LC51202A

  • 90

    Söding, J. (2005). Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951–960. doi: 10.1093/bioinformatics/bti125

  • 91

    Suenaga, H., Ohnuki, T., and Miyazaki, K. (2007). Functional screening of a metagenomic library for genes involved in microbial degradation of aromatic compounds. Environ. Microbiol. 9, 2289–2297. doi: 10.1111/j.1462-2920.2007.01342.x

  • 92

    Steele, H. L., Jaeger, K.-E., Daniel, R., and Streit, W. R. (2009). Advances in recovery of novel biocatalysts from metagenomes. J. Mol. Microbiol. Biotechnol. 16, 25–37. doi: 10.1159/000142892

  • 93

    Tartar, A., Wheeler, M. M., Zhou, X., Coy, M. R., Boucias, D. G., and Scharf, M. E. (2009). Parallel metatranscriptome analyses of host and symbiont gene expression in the gut of the termite Reticulitermes flavipes. Biotechnol. Biofuels 2, 25. doi: 10.1186/1754-6834-2-25

  • 94

    Tasse, L., Bercovici, J., Pizzut-Serin, S., Robe, P., Tap, J., Klopp, C., et al. (2010). Functional metagenomics to mine the human gut microbiome for dietary fiber catabolic enzymes. Genome Res. 20, 1605–1612. doi: 10.1101/gr.108332.110

  • 95

    Tatusov, R. L., Fedorova, N. D., Jackson, J. D., Jacobs, A. R., Kiryutin, B., Koonin, E. V., et al. (2003). The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41. doi: 10.1186/1471-2105-4-41

  • 96

    Taupp, M., Mewis, K., and Hallam, S. J. (2011). The art and design of functional metagenomic screens. Curr. Opin. Biotechnol. 22, 465–472. doi: 10.1016/j.copbio.2011.02.010

  • 97

    Thomas, T., Gilbert, J., and Meyer, F. (2012). Metagenomics—a guide from sampling to data analysis. Microb. Inform. Exp. 2, 3. doi: 10.1186/2042-5783-2-3

  • 98

    Tirawongsaroj, P., Sriprang, R., Harnpicharnchai, P., Thongaram, T., Champreda, V., Tanapongpipat, S., et al. (2008). Novel thermophilic and thermostable lipolytic enzymes from a Thailand hot spring metagenomic library. J. Biotechnol. 133, 42–49. doi: 10.1016/j.jbiotec.2007.08.046

  • 99

    Uchiyama, T., Abe, T., Ikemura, T., and Watanabe, K. (2004). Substrate-induced gene-expression screening of environmental metagenome libraries for isolation of catabolic genes. Nat. Biotechnol. 23, 88–93. doi: 10.1038/nbt1048

  • 100

    Uchiyama, T., and Miyazaki, K. (2009). Functional metagenomics for enzyme discovery: challenges to efficient screening. Curr. Opin. Biotechnol. 20, 616–622. doi: 10.1016/j.copbio.2009.09.010

  • 101

    Uchiyama, T., and Watanabe, K. (2008). Substrate-induced gene expression (SIGEX) screening of metagenome libraries. Nat. Protoc. 3, 1202–1212. doi: 10.1038/nprot.2008.96

  • 102

    van Elsas, J. D., Costa, R., Jansson, J., Sjöling, S., Bailey, M., Nalin, R., et al. (2008). The metagenomics of disease-suppressive soils—experiences from the METACONTROL project. Trends Biotechnol. 26, 591–601. doi: 10.1016/j.tibtech.2008.07.004

  • 103

    Van Hellemond, E. W., Janssen, D. B., and Fraaije, M. W. (2007). Discovery of a novel styrene monooxygenase originating from the metagenome. Appl. Environ. Microbiol. 73, 5832–5839. doi: 10.1128/AEM.02708-06

  • 104

    Vogel, T. M., Simonet, P., Jansson, J. K., Hirsch, P. R., Tiedje, J. M., van Elsas, J. D., et al. (2009). TerraGenome: a consortium for the sequencing of a soil metagenome. Nat. Rev. Microbiol. 7, 252. doi: 10.1038/nrmicro2119

  • 105

    Warnecke, F., and Hess, M. (2009). A perspective: metatranscriptomics as a tool for the discovery of novel biocatalysts. J. Biotechnol. 142, 91–95. doi: 10.1016/j.jbiotec.2009.03.022

  • 106

    Warnecke, F., Luginbühl, P., Ivanova, N., Ghassemian, M., Richardson, T. H., Stege, J. T., et al. (2007). Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite. Nature 450, 560–565. doi: 10.1038/nature06269

  • 107

    Waschkowitz, T., Rockstroh, S., and Daniel, R. (2009). Isolation and characterization of metalloproteases with a novel domain structure by construction and screening of metagenomic libraries. Appl. Environ. Microbiol. 75, 2506–2516. doi: 10.1128/AEM.02136-08

  • 108

    Weckx, S., Van der Meulen, R., Allemeersch, J., Huys, G., Vandamme, P., Van Hummelen, P., et al. (2010). Community dynamics of bacteria in sourdough fermentations as revealed by their metatranscriptome. Appl. Environ. Microbiol. 76, 5402–5408. doi: 10.1128/AEM.00570-10

  • 109

    Yooseph, S., Sutton, G., Rusch, D. B., Halpern, A. L., Williamson, S. J., Remington, K., et al. (2007). The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol. 5:e16. doi: 10.1371/journal.pbio.0050016

  • 110

    Zanaroli, G., Balloi, A., Negroni, A., Daffonchio, D., Young, L. Y., and Fava, F. (2010). Characterization of the microbial community from the marine sediment of the Venice lagoon capable of reductive dechlorination of coplanar polychlorinated biphenyls (PCBs). J. Hazard. Mater. 178, 417–426. doi: 10.1016/j.jhazmat.2010.01.097

  • 111

    Zaprasis, A., Liu, Y.-J., Liu, S.-J., Drake, H. L., and Horn, M. A. (2009). Abundance of novel and diverse tfdA-like genes, encoding putative phenoxyalkanoic acid herbicide-degrading dioxygenases, in soil. Appl. Environ. Microbiol. 76, 119–128. doi: 10.1128/AEM.01727-09

Summary

Keywords

metagenomics, discovery of new functions, proteins, high throughput screening, microbial ecosystems, microbial ecology, biotechnologies

Citation

Ufarté L, Potocki-Veronese G and Laville É (2015) Discovery of new protein families and functions: new challenges in functional metagenomics for biotechnologies and microbial ecology. Front. Microbiol. 6:563. doi: 10.3389/fmicb.2015.00563

Received

17 April 2015

Accepted

21 May 2015

Published

05 June 2015

Volume

6 - 2015

Edited by

Eamonn P. Culligan, University College Cork, Ireland

Reviewed by

Marc Strous, University of Calgary, Canada; Lukasz Jaroszewski, Sanford-Burnham Institute for Medical Research, USA

Copyright

*Correspondence: Élisabeth Laville, Equipe de Catalyse et Ingénierie Moléculaire Enzymatiques, Laboratoire d’Ingénierie des Systèmes Biologiques et des Procédés, INSA - UMR INRA 792 - UMR CNRS 5504, 135 Avenue de Rangueil, 31077 Toulouse cedex 4, France,

This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Microbiology.

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
