nextflow.config

Relevant Nextflow components
  • Use the manifest scope to define metadata for publishing the pipeline.
  • Use the nextflowVersion setting to set the minimum version of Nextflow.
  • Use the params scope to define parameters accessible to the pipeline.
  • Use the shell directive to define a custom shell command.
  • Use the dag config scope to generate a workflow diagram.
  • Use the report config scope to generate an HTML summary report.
  • Use the timeline config scope to generate an HTML runtime report.
  • Use the trace config scope to generate a task execution trace file.
  • Use the profiles scope to import profile configs from the config/ directory.
  • Use the process scope and its selectors to specify default directives for processes.

This is the configuration script that Nextflow looks for when you run a workflow. It contains a number of property definitions that are used by the pipeline. A key feature of Nextflow is the ability to separate workflow implementation from the underlying execution platform using this configuration file.

Because you can add further configuration files for different run environments (e.g. different job schedulers, or Singularity versus Bioconda for software), settings in different files can conflict. Parameters defined in this file can also be overridden on the command line when you launch the workflow.

See here for details on the hierarchy of configuration files.

What’s in nextflow.config?

This file contains:

  • A manifest for defining workflow metadata
  • The minimum version of Nextflow required to run the pipeline
  • Default workflow parameter definitions
  • Shell behaviour settings for the workflow
  • Execution reports
  • Configuration profiles
  • Default resource definitions for processes

Manifest

The manifest scope allows you to define metadata required when publishing or running your pipeline. In this template, we’ve applied the following information to the workflow:

manifest {
    author = 'Georgie Samaha'
    name = 'Nextflow_DSL2_template-nf'
    description = 'Template for creating Nextflow workflows'
    homePage = 'https://github.com/Sydney-Informatics-Hub/Nextflow_DSL2_template'
}

Nextflow version

The nextflowVersion setting allows you to mandate a minimum version of Nextflow that can be used to run your pipeline. In this template we have specified the minimum version of Nextflow that can handle DSL2 syntax:

nextflowVersion = '!>=20.07.1'
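In Nextflow's configuration syntax, nextflowVersion is a setting of the manifest scope. The sketch below shows one way it can sit alongside the other metadata; the exact placement in the template's own file may differ.

manifest {
    name = 'Nextflow_DSL2_template-nf'
    // The leading '!' makes Nextflow stop if the requirement is not met;
    // without it, Nextflow only prints a warning and carries on
    nextflowVersion = '!>=20.07.1'
}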

Parameters

The params scope allows you to define all parameters accessible in the pipeline script. Here, you can define default values for parameters, which can then be overridden using parameter flags (--) when the workflow is run. These parameters should correspond to the values passed to the process modules. For example:

In nextflow.config we define default parameters:

params.foo = 'Hello'
params.bar = 'world!'
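The same defaults can also be written in block form, which may be easier to read as the number of parameters grows. This is an equivalent sketch of the two lines above, not a different configuration:

params {
    foo = 'Hello'
    bar = 'world!'
}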

In main.nf we define the workflow and specify our parameters as inputs to the process:

// Include the process file 
include { PROCESS } from './modules/process.nf'

// Run the workflow, calling the process called PROCESS 
workflow {
    PROCESS(params.foo, params.bar).view()
}

In modules/process.nf we declare two value inputs to receive those parameters and define the output channel as standard output:

process PROCESS {
input:
    // Local names for the two values passed in from main.nf
    val(foo)
    val(bar)

output:
    stdout

script:
    """
    echo "$foo $bar"
    """
}

Running these scripts as:

nextflow run main.nf

Gives the following output:

N E X T F L O W  ~  version 22.10.3
Launching `main.nf` [angry_montalcini] DSL2 - revision: 33ed2edb33
executor >  local (1)
[eb/25f43a] process > PROCESS [100%] 1 of 1 ✔
Hello world!

You can override these parameters when you run the workflow, for example:

nextflow run main.nf --foo goodbye 

Gives the following output:

N E X T F L O W  ~  version 22.10.3
Launching `main.nf` [maniac_hopper] DSL2 - revision: c7986b5056
executor >  local (1)
[5c/33c2f0] process > PROCESS [100%] 1 of 1 ✔
goodbye world!

Shell behaviour settings

In recent versions of Nextflow, process scripts are run by default with /bin/bash -ue. These flags are options of set, the Bash builtin that controls the behaviour of the shell environment. The -e flag makes the script exit as soon as a command returns a non-zero exit status, and the -u flag treats any reference to an unset variable as an error. We have provided additional recommended shell settings to enhance error handling and pipeline reliability:

  • '/bin/bash' tells Nextflow to use Bash as the shell to run process scripts
  • '-euo', 'pipefail' adds the pipefail option to -e and -u, so that a chain of piped commands is treated as failed if any command within it fails, not only the last one.

shell = ['/bin/bash', '-euo', 'pipefail']

You can add further Bash options to control how scripts are executed and how errors are handled.
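Because shell is a process directive, one common way to apply it from a configuration file is through the process scope. The following is a minimal sketch under that assumption; the template's own file may place the setting differently.

process {
    // Run every process script with Bash, exiting on errors (-e),
    // on unset variables (-u) and on failures inside pipes (-o pipefail)
    shell = ['/bin/bash', '-euo', 'pipefail']
}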

Execution reports

Nextflow provides a number of features for tracing and reporting on process execution status. In this template, we’ve asked Nextflow to generate info reports with dag{}, report{}, timeline{}, and trace{} scopes.

DAG visualisation with dag{}

A Nextflow pipeline’s directed acyclic graph (DAG) can be rendered by specifying the dag{} scope in the nextflow.config. In this template, we have:

  • Enabled DAG creation upon workflow completion by default with enabled = true
  • Enabled the overwriting of existing DAGs with overwrite = true
  • Specified a file name and location to save the DAG with file = 'runInfo/dag.svg'

To render the DAG in an image format such as SVG, you must have Graphviz installed on your system.
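Put together, the three settings listed above correspond to a dag{} scope along these lines. Note that the option is spelled enabled. This is a sketch assembled from the values described in this section:

dag {
    enabled = true              // create the DAG when the workflow completes
    overwrite = true            // replace any DAG left by a previous run
    file = 'runInfo/dag.svg'    // output location and format
}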

Execution report with report{}

Nextflow can create an HTML execution report that contains useful metrics about your workflow execution. See the Nextflow documentation for more information on its contents. We have used the same settings as above to output the execution report. Unlike a DAG, however, you may not wish to overwrite existing execution reports. To keep them, specify overwrite = false and include a timestamp in the report's file name, as was demonstrated here for a trace report:

// Define timestamp, to avoid overwriting existing report 
def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss')

report {
    enabled = true
    overwrite = false
    // Double quotes are required for ${timestamp} to be interpolated
    file = "runInfo/report-${timestamp}.html"
}

Timeline summary with timeline{}

Nextflow can create an HTML timeline report for all tasks executed by your workflow. See the Nextflow documentation for more information on its contents. As with the dag and report features, you may wish to adjust how the timeline is output in the template.
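As an illustration, a timeline{} scope that follows the same pattern as the execution report might look like the sketch below. The file name is an assumption chosen to match the runInfo/ directory used elsewhere in this template, and the timestamp definition is repeated only so the snippet stands alone.

// Define timestamp, to avoid overwriting an existing timeline
def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss')

timeline {
    enabled = true
    overwrite = false
    file = "runInfo/timeline-${timestamp}.html"
}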

Process resource trace with trace{}

This template also creates a simple resource trace report. A trace report can be customised to include any combination of available fields, as demonstrated here. Consider customising the trace file generated by the workflow by adding the fields option, which accepts a comma-separated list of any of the trace fields provided by Nextflow.
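For example, a trace{} scope restricted to a handful of fields could be sketched as follows. The file name and the particular selection of fields are assumptions for illustration; the field names themselves are standard Nextflow trace fields.

trace {
    enabled = true
    overwrite = true
    file = 'runInfo/trace.txt'
    // Comma-separated list of trace fields to record for each task
    fields = 'task_id,name,status,exit,realtime,%cpu,peak_rss'
}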

Configuration profiles

We have provided configuration files for specific infrastructures inside the config/ directory in this template. These are registered inside the nextflow.config file using the profiles{} scope. To keep the workflow modular and flexible, we’ve assigned a profile name to each of these config files using includeConfig. A profile can then be activated when launching your pipeline, for example with nextflow run main.nf -profile standard. These configuration files can be used to override defaults specified inside the nextflow.config. To add your own custom configuration, place it in the config/ directory and update the profiles scope within the nextflow.config as follows:

profiles {
    standard   { includeConfig "config/standard.config" }
    configName { includeConfig "config/my.config" }
}

See here for more details on what can be included in a profile.
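To make this concrete, the sketch below shows what a custom config/my.config might contain for a cluster that uses the SLURM scheduler and Singularity containers. The scheduler choice and the queue name are hypothetical and should be replaced with values for your own infrastructure.

// config/my.config (hypothetical example)
process {
    executor = 'slurm'
    queue = 'normal'    // hypothetical queue name
}

singularity {
    enabled = true
    autoMounts = true
}

With the profiles scope shown above, this file would be activated with nextflow run main.nf -profile configName.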

Default resource definitions

The process{} configuration scope can be used to provide default configurations for the processes in your workflow. In the template, we have set some default CPU and memory directives to be applied to all processes run by the workflow. There are many process directives you can apply here to control resource availability, software execution, and interaction with job schedulers.

You can take a more fine-grained approach to process configuration by using Nextflow’s withName process selector, as demonstrated here. For example, for a process called PROCESS, we could change the default settings applied at the workflow level with:

process {
    cpus = 1
    memory = 5.Gb

    withName: 'PROCESS' {
        cpus = 2
        memory = 10.Gb
    }
}
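Nextflow also offers a withLabel selector, which applies settings to every process that declares a given label rather than to a single named process. The sketch below assumes a hypothetical label called big_mem that is not defined in this template.

process {
    // Applies to any process containing the directive: label 'big_mem'
    withLabel: 'big_mem' {
        memory = 20.Gb
    }
}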