Workflow description

12/23/2021

What is WDL?

The Workflow Description Language (WDL) is a way to specify data processing workflows in a syntax that is easy for humans to read and write. WDL makes it straightforward to define analysis tasks, chain them together in workflows, and parallelize their execution. Investigators, bioinformaticians, and operators of the SeqsLab platform can write their own WDL workflows and execute the scripts directly on the platform's fully managed, high-performance, data-parallel computing and storage infrastructure. The SeqsLab platform supports version 1.0 of the OpenWDL language specification.
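As a rough illustration of those three ideas (defining tasks, chaining them, and parallelizing execution), here is a minimal WDL 1.0 sketch. The task and workflow names (count_lines, sum_counts, count_all) are hypothetical and do not come from the SeqsLab documentation; the sketch assumes a POSIX shell with wc available in the execution environment.

```wdl
version 1.0

# Hypothetical task: count the lines in one input file.
task count_lines {
  input {
    File infile
  }
  command <<<
    wc -l < ~{infile}
  >>>
  output {
    Int n = read_int(stdout())
  }
}

# Hypothetical task: add up the per-file counts.
task sum_counts {
  input {
    Array[Int] counts
  }
  command <<<
    echo $(( ~{sep=" + " counts} ))
  >>>
  output {
    Int total = read_int(stdout())
  }
}

workflow count_all {
  input {
    Array[File] infiles
  }

  # Each iteration is independent, so the engine may run them in parallel.
  scatter (f in infiles) {
    call count_lines { input: infile = f }
  }

  # Outside the scatter block, count_lines.n is gathered into an Array[Int],
  # which chains the first task's outputs into the second task.
  call sum_counts { input: counts = count_lines.n }

  output {
    Int total = sum_counts.total
  }
}
```

The scatter block expresses parallelism declaratively: the script states that the iterations are independent, and the execution engine decides how to schedule them.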

Tip

If you are new to WDL, we recommend reading the educational materials for learning WDL. We also recommend following the BioWDL style guidelines to make your WDL workflow scripts easier to read.

How does SeqsLab use WDL?

The SeqsLab workflow execution service implements Cromwell, a full-featured execution engine for WDL, and extends its functionality so that existing WDL workflows run seamlessly in a cloud high-performance computing environment (e.g., Azure Batch). It also securely retrieves and stores workflow input and output datasets specified as uniform resource identifiers (URIs), e.g., https://hostname/ga4gh/drs/v1/objects/123456, drs://hostname/123456, or file:///localdir/localfile.
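To make the URI-based inputs concrete, the following WDL 1.0 sketch declares an ordinary File input. The names head_file and preview are hypothetical, and the URIs in the comments are the placeholder examples from the paragraph above. How each URI scheme is resolved is up to the execution service and is an assumption here, not something the script controls.

```wdl
version 1.0

# Hypothetical task: print the first lines of a dataset.
task head_file {
  input {
    File dataset
  }
  command <<<
    head -n 5 ~{dataset}
  >>>
  output {
    Array[String] first_lines = read_lines(stdout())
  }
}

workflow preview {
  input {
    # The script only declares a File. The caller supplies its value at
    # submission time, for example "drs://hostname/123456" or
    # "file:///localdir/localfile" (placeholders, not real objects).
    File dataset
  }

  call head_file { input: dataset = dataset }

  output {
    Array[String] first_lines = head_file.first_lines
  }
}
```

Because the script never hard-codes a storage location, the same workflow can in principle be validated against local files and later run against remote datasets by changing only the input values.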

SeqsLab simplifies the workflow development life cycle, from building and validation through to production. By following a set of WDL design guidelines and best practices, you can focus on writing and validating your WDL workflows locally with Cromwell, without worrying about performance or the complexity of managing the underlying computing environment. When you are ready for production, you can use the same verified WDL workflow scripts to run jobs on the SeqsLab platform, which automatically and transparently optimizes your workflows for performance, scalability, and integrity. With the SeqsLab workflow execution service, you can streamline WDL script development and avoid writing error-prone scatter-gather logic, data sharding code, and complex shell command tasks.

What are the benefits of using WDL?

Using WDL on SeqsLab enables you to:

  • Map task input files to cloud datasets on the SeqsLab Data Hub, with optimized workload parallelization.

  • Specify the cluster computing runtime resources for each workflow.

  • Choose the cloud resource provider and the zone or region where the workflow job executes.
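As background for the resource-related items above: in standard WDL 1.0, a task can request compute resources in a runtime section. The sketch below uses the commonly supported docker, cpu, and memory attributes. The task name and container image are hypothetical, and this section does not describe how SeqsLab maps such requests onto cluster resources or where the provider and region are selected, so treat this only as an illustration of plain WDL.

```wdl
version 1.0

# Hypothetical task with explicit resource requests.
task process_reads {
  input {
    File reads
  }
  command <<<
    echo "processing ~{reads}"
  >>>
  runtime {
    docker: "ubuntu:20.04"  # assumed image, for illustration only
    cpu: 4
    memory: "8 GB"
  }
  output {
    String log = read_string(stdout())
  }
}
```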