Assembly

Assembly can be run as a single part of the workflow using the command:

snakemake --configfile <yourconfigfile> assembly

Megahit and Metaspades are currently the only assemblers supported by this workflow. Megahit is used by default, so to use Metaspades you have to explicitly configure the workflow to do so (see below). Settings for these assemblers are:

assembly: Set to True to assemble samples based on the assemblyGroup column in your sample list.

assembly_threads: Number of threads to use for the assembly software.

megahit_keep_intermediate: If set to True, intermediate contigs produced using different k-mer lengths, as part of the megahit assembly procedure, are stored. Note that this can take up a large chunk of disk space. By default this is set to False.

megahit_additional_settings: The megahit assembler is run with default settings. Use this parameter to pass additional command-line options to megahit, such as '--presets meta-large', or any other option you want to change.

metaspades: Set to True to use the metaspades assembler instead of megahit.

metaspades_keep_intermediate: As for megahit, setting this to True will store intermediate contigs created during assembly.

metaspades_keep_corrected: Set to True to keep corrected reads created during metaspades assembly.

metaspades_additional_settings: This is where you put any additional settings for metaspades.
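
As an illustration, the settings above could be combined in a configuration file along the following lines. This is a minimal sketch that assumes the configuration file is written in YAML and that these parameters sit at the top level of the file. The thread count and the megahit option are example values only, not recommendations.

```yaml
# Hypothetical excerpt of a configuration file (layout is an assumption).
# Assemble samples according to the assemblyGroup column of the sample list.
assembly: True
# Example value; adjust to the cores available on your system.
assembly_threads: 8

# Megahit is the default assembler.
megahit_keep_intermediate: False
# Extra command-line options handed to megahit (example value).
megahit_additional_settings: "--presets meta-large"

# Set to True to assemble with metaspades instead of megahit.
metaspades: False
metaspades_keep_intermediate: False
metaspades_keep_corrected: False
# Left empty here, so metaspades would run with its default settings.
metaspades_additional_settings: ""
```

With a file like this saved as, for example, config.yaml (a hypothetical name), the assembly step would be started with the command shown at the top of this section, passing that file to --configfile.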

Reports

The workflow produces a set of report files for the assemblies it creates. These files are saved in the directory specified by the report_path setting in your configuration file.

assembly_stats.pdf

A multi-panel plot showing the number of contigs, the total assembly size, and various length statistics for each assembly created.

The figure below shows the plot for the example dataset.

[Figure: Assembly report plot]