WDL workflow inputs

It is easy to hard-code file paths and parameter values for workflow input variables directly in your WDL scripts. However, this practice undermines the integrity and reproducibility that a production workflow is meant to provide, because every run with different inputs requires editing the scripts. Instead, we recommend defining all file and parameter input values in a JSON file, which you can customize for each run without modifying your WDL workflow scripts. The Cromwell execution engine then uses the JSON file to fill in the input values of your task commands wherever applicable.

You can use the Cromwell open source package to simplify the process of generating the JSON skeleton file of workflow inputs. The package comes with a set of command line utilities (WOMtool) for interacting with the Workflow Object Model (WOM), including a utility that parses your WDL workflow scripts and generates the input JSON skeleton file.

Generate input JSON skeleton file

In a terminal, run the WOMtool program on your main WDL workflow script to generate the input JSON skeleton file.

java -jar womtool.jar inputs MyMainWorkflow.wdl > MyMainWorkflowInputs.json 

The command creates a MyMainWorkflowInputs.json file that lists every input of the given WDL workflow, each mapped to its expected value type, which you can customize from run to run. The file uses the following content structure:

{
  "WorkflowName.TaskName.variableName": "Value_type",
  "WorkflowName.SubworkflowName.TaskName.variableName": "Value_type"
}
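
The dotted key structure above can be taken apart programmatically, for example to group inputs by task before editing them. The following is a minimal Python sketch, not part of WOMtool or Cromwell: `split_input_key` is a hypothetical helper, and it assumes that workflow, call, and variable names contain no dots themselves.

```python
from typing import List, Tuple


def split_input_key(key: str) -> Tuple[str, List[str], str]:
    """Split a fully qualified input key into (workflow, call path, variable).

    Hypothetical helper: assumes the key follows the dotted layout shown
    above and that individual names contain no dots.
    """
    parts = key.split(".")
    if len(parts) < 2:
        raise ValueError(f"not a qualified input key: {key!r}")
    # First element is the workflow, last is the variable, and anything
    # in between is the chain of subworkflow and task names.
    return parts[0], parts[1:-1], parts[-1]


workflow, call_path, variable = split_input_key(
    "WorkflowName.SubworkflowName.TaskName.variableName"
)
print(workflow, call_path, variable)
```

An input declared at the workflow level (a key with only two parts) yields an empty call path.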

All that remains is to replace each value type with the actual value of that input variable for a particular workflow run. For example, the generated skeleton for a workflow that calls a single Bwa task looks like this:

{
  "MainWorkflow.Bwa.readR1": "File",
  "MainWorkflow.Bwa.readR2": "File",
  "MainWorkflow.Bwa.reference": "File",
  "MainWorkflow.Bwa.sampleName": "String"
}

Replace the value types with the file paths and sample name, either directly in the generated JSON file or in a copy of it.

{
  "MainWorkflow.Bwa.readR1": "/path/to/r1.fastq.gz",
  "MainWorkflow.Bwa.readR2": "/path/to/r2.fastq.gz",
  "MainWorkflow.Bwa.reference": "/path/to/ref.fa",
  "MainWorkflow.Bwa.sampleName": "HG002"
}
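
When you prepare inputs for many runs, filling in the skeleton by hand becomes error-prone. The sketch below shows one possible way to script the step in Python. It is an illustration only: `fill_inputs` is a hypothetical helper, and it treats every input listed in the skeleton as required, which is an assumption that may not hold for workflows with optional inputs.

```python
import json


def fill_inputs(skeleton: dict, values: dict) -> dict:
    """Return a new dict with each skeleton entry replaced by its run value.

    Hypothetical helper: rejects values for inputs the workflow does not
    declare and assumes every input in the skeleton must be supplied.
    """
    unknown = sorted(set(values) - set(skeleton))
    if unknown:
        raise KeyError(f"not declared by the workflow: {unknown}")
    missing = sorted(set(skeleton) - set(values))
    if missing:
        raise KeyError(f"no value supplied for: {missing}")
    # Keep the key order of the skeleton so the files are easy to compare.
    return {key: values[key] for key in skeleton}


# Skeleton as generated for the example workflow above.
skeleton = {
    "MainWorkflow.Bwa.readR1": "File",
    "MainWorkflow.Bwa.readR2": "File",
    "MainWorkflow.Bwa.reference": "File",
    "MainWorkflow.Bwa.sampleName": "String",
}

# Values for one particular run.
run_values = {
    "MainWorkflow.Bwa.readR1": "/path/to/r1.fastq.gz",
    "MainWorkflow.Bwa.readR2": "/path/to/r2.fastq.gz",
    "MainWorkflow.Bwa.reference": "/path/to/ref.fa",
    "MainWorkflow.Bwa.sampleName": "HG002",
}

filled = fill_inputs(skeleton, run_values)
print(json.dumps(filled, indent=2))
```

In practice you would load the skeleton with `json.load` from the generated file and write the result to a per-run copy, leaving the skeleton itself untouched.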

Validate your workflow scripts

The WOMtool program provides a utility to perform a full validation of your WDL scripts, including syntax and semantic checking.

java -jar womtool.jar validate MyMainWorkflow.wdl

Once the scripts pass validation, you can also run and test your WDL workflow locally by passing the updated input JSON file along with your main workflow script to the Cromwell run command.

java -jar cromwell.jar run --inputs MyMainWorkflowInputs.json MyMainWorkflow.wdl
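
If you repeat the validate and run steps often, they can be chained in a small script. The Python sketch below only assembles the two commands shown above and runs them in order. The function names and the default jar paths are assumptions for illustration, and the sketch assumes that WOMtool exits with a non-zero status when validation fails.

```python
import subprocess
from typing import List


def womtool_validate_cmd(wdl: str, jar: str = "womtool.jar") -> List[str]:
    """Build the WOMtool validate command as an argument list."""
    return ["java", "-jar", jar, "validate", wdl]


def cromwell_run_cmd(wdl: str, inputs: str, jar: str = "cromwell.jar") -> List[str]:
    """Build the Cromwell run command as an argument list."""
    return ["java", "-jar", jar, "run", "--inputs", inputs, wdl]


def validate_then_run(wdl: str, inputs: str) -> None:
    """Validate the workflow first and start the local run only on success.

    check=True raises CalledProcessError on a non-zero exit status, so the
    run is skipped if validation fails (assumed WOMtool behavior).
    """
    subprocess.run(womtool_validate_cmd(wdl), check=True)
    subprocess.run(cromwell_run_cmd(wdl, inputs), check=True)


# Example call (requires Java and both jar files in the working directory):
# validate_then_run("MyMainWorkflow.wdl", "MyMainWorkflowInputs.json")
```

Passing the commands as argument lists rather than a single shell string avoids quoting problems when file paths contain spaces.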

The SeqsLab platform provides a fully managed WDL workflow execution environment that is agnostic to the underlying computing and storage infrastructure. When you are ready to run your workflow in production on the SeqsLab platform, you simply provide the inputs JSON file along with the WDL workflow scripts, and SeqsLab Jobs streamlines the process of connecting input/output datasets, parallelizing workload processing, and provisioning optimal computing resources.

Published workflows by Atgenomix