Overview¶

nbis-meta is a snakemake workflow for metagenomics projects

Installing¶

From GitHub¶

To start using the workflow either clone the latest version of the repo, run:

git clone https://github.com/NBISweden/nbis-meta.git

or download the latest release from the release page and extract the archive.

Then change directory into the nbis-meta folder and create the core conda environment:

cd nbis-meta
conda env create -f environment.yml

Note

mamba instead of conda

mamba is a faster replacement for conda. Give it a try by installing it from the conda-forge channel: conda install -c conda-forge mamba. You can then run mamba env create -f environment.yml.

From DockerHub¶

You may also pull the latest Docker image:

docker pull nbisweden/nbis-meta:latest

What’s next?¶

You are now ready to start using the workflow!

for information on how to prepare necessary files see Getting-started
then check out the How-to page for more info on how to run the workflow

Workflow overview¶

Preprocessing¶

This workflow can perform preprocessing of paired- and/or single-end whole-genome shotgun metagenomic data (in fastq-format) using e.g.:

Trimmomatic (adapter/quality trimming)
Cutadapt (adapter trimming)
SortMeRNA (rRNA filtering)
Fastuniq (de-duplication)
FastQC and MultiQC (read QC and report generation)

Downstream analysis¶

Read-based classification¶

Preprocessed reads can be used for taxonomic classification and profiling using tools such as:

Kraken2
Centrifuge
MetaPhlAn3

producing taxonomic profiles of the samples, as well as interactive krona plots.

Assembly-based analysis¶

Preprocessed reads can also be assembled and analyzed further using tools such as:

Megahit/MetaSPADES (for metagenomic assembly)
prodigal (gene calling)
pfam_scan, eggnog-mapper, Resistance Gene Identifier (protein level annotations)
bowtie2 (mapping reads to contigs)
featureCounts (assigning and counting mapped reads)
edgeR + metagenomeSEQ (normalization of read counts for genes/features)
contigtax + sourmash (taxonomic assignments)
metabat2, CONCOCT, MaxBin2 (metagenomic binning)
checkm (genome bin QC)
GTDB-TK (genome bin phylogenetic assignments)
fastANI (genome bin clustering)