Microbial communities account for more than half of all the cells in our bodies. Not surprisingly, the taxonomic and genetic makeup of microbiomes is closely linked to the health of humans, animals, and plants.
Yet the functional genetic composition of microbiomes is especially hard to establish, and current metagenomics tools struggle to correctly predict functional composition or changes in function between microbiome samples [Lindgreen et al. 2015].
What if you could access tools that assemble metagenome data de novo, reliably predict functional elements, and identify statistically significant changes in function between samples? And what if these tools were fully integrated into the industry standard for scientist-friendly NGS data analysis and came with a toolbox optimized for microbiologists?
Accuracy of results
Detecting gene function in microbial communities based on metagenomic data is hard. Correctly measuring changes in the functional makeup between different metagenome samples is even harder.
Lindgreen et al. showed that most of the benchmarked open-source tools failed to predict such changes at statistically significant levels.
With our solution for microbial genomics, you can detect and quantify functional elements in a sample more accurately. The included statistical tools let you measure statistically significant changes in function between samples with confidence.
Multi-sample comparison detects functional changes between samples and identifies samples with similar or diverging functional genomic elements. Data can be grouped and analyzed in the context of your sample metadata.
As shown in Figure 2, our tools were able to reliably separate samples from the two different communities based on the relative abundance of their predicted functional content.
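To make the idea of a "statistically significant change in function" concrete, here is a minimal Python sketch. It is not the method used by the tools described here, which this text does not specify. A generic two-sided permutation test is swapped in as an illustration, and the function name, sample values, and parameters are all hypothetical.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Illustrative only: a generic test, not the statistic used by any
    particular product.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign samples to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical relative abundances of one functional term in two communities.
community_1 = [0.031, 0.029, 0.034, 0.030, 0.033]
community_2 = [0.012, 0.015, 0.011, 0.014, 0.013]

p_value = permutation_test(community_1, community_2)
print(f"p = {p_value:.4f}")
```

In practice a test like this would be run once per functional term, so the resulting p-values would also need a multiple-testing correction before any term is called significant.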
Quality of metagenome assembly
Our accurate assignment of gene function depends on a novel metagenome assembler that produces higher-quality assemblies than leading alternatives. Table 1 shows how our metagenome assembler compares favorably on misassemblies, InDels, mismatch errors, and other quality metrics.
Run time and compute resource efficiency
Run time and compute resource requirements are important when sample volume is high.
We benchmarked the metagenome assembler included in our microbial genomics solution against leading metagenome assemblers using a dataset by Shakya et al. 2013. Our assembler consistently showed shorter run times and greater compute resource efficiency than the other leading assemblers.
Increase walk-away time
To increase walk-away time, users can use the Workflow feature in CLC Genomics Workbench to combine analysis steps 2 through 7 listed below into a preconfigured one-click workflow. Workflows can also batch-process many samples, which increases walk-away time further.
Analysis steps in functional metagenomics workflow:
1 → Import multiple whole metagenome sample read datasets and associate metadata with each sample.
2 → QC and trimming of whole metagenome reads.
3 → De novo assembly of each sample read dataset into high-quality contigs using the new De Novo Assemble Metagenome tool.
4 → Locate coding sequences (CDS) in the resulting contigs using the third-party MetaGeneMark gene finder plugin for the CLC workbenches.
5 → Annotate CDSs with either Gene Ontology (GO) terms and Pfam protein families or Best BLAST Hits, using one of the two new tools: Annotate CDS with Pfam or Annotate CDS with Best BLAST Hit, respectively.
6 → Map the input reads back to the annotated contigs using the built-in Map Reads to Reference tool in the CLC workbenches.
7 → Build a functional abundance profile of each sample using the Build Functional Profile tool.
8 → Merge the functional abundance profiles for all samples into one profile using the Merge Functional Profile tool.
9 → Visualize the individual and merged functional abundance profiles, perform filtering based on abundance, and apply the different options for showing the abundance profiles in the context of metadata.
10 → Perform hierarchical clustering and statistical analysis based on the relative abundance of functional elements in the samples.
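The data flow behind steps 8 through 10 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation inside the CLC tools: the per-sample profiles are hypothetical read counts per functional term, the sample names are invented, and Bray-Curtis dissimilarity with naive average-linkage clustering is one common choice that may differ from what the workbench uses.

```python
from itertools import combinations

def merge_profiles(profiles):
    """Merge per-sample {term: count} profiles into {term: {sample: count}}."""
    terms = sorted({t for p in profiles.values() for t in p})
    return {t: {s: p.get(t, 0) for s, p in profiles.items()} for t in terms}

def relative_abundance(merged):
    """Convert counts to per-sample fractions: {sample: {term: fraction}}."""
    samples = list(next(iter(merged.values())))
    totals = {s: sum(row[s] for row in merged.values()) for s in samples}
    return {s: {t: row[s] / totals[s] for t, row in merged.items()}
            for s in samples}

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    num = sum(abs(a[t] - b[t]) for t in a)
    den = sum(a[t] + b[t] for t in a)
    return num / den

def average_linkage(rel):
    """Naive agglomerative clustering; returns merges in the order made."""
    def dist(c1, c2):
        # Average pairwise dissimilarity between members of two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(bray_curtis(rel[x], rel[y]) for x, y in pairs) / len(pairs)

    clusters = [(s,) for s in rel]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, round(dist(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical per-sample functional profiles (read counts per term).
profiles = {
    "gut_A":  {"GO:0006096": 120, "GO:0015979": 5,   "PF00005": 75},
    "gut_B":  {"GO:0006096": 110, "GO:0015979": 8,   "PF00005": 82},
    "soil_A": {"GO:0006096": 30,  "GO:0015979": 140, "PF00005": 30},
    "soil_B": {"GO:0006096": 20,  "GO:0015979": 160, "PF00005": 20},
}

merged = merge_profiles(profiles)
rel = relative_abundance(merged)
merges = average_linkage(rel)
for left, right, distance in merges:
    print(left, "+", right, "at", distance)
```

Working with relative rather than raw abundance keeps samples with different sequencing depths comparable, which is why the clustering here runs on per-sample fractions.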
Find out more about our microbial genomics solution