Develop a tool

The tool development process can be broken down into three main steps:

1. Choose a standardized workflow language.

WDL is a mainstream standardized workflow language that SeqsLab supports natively. Follow the OpenWDL development guide when drafting your WDL workflows. For more information, see WDL workflows on SeqsLab and WDL best practice guidelines.
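As a rough sketch of what a first WDL draft might look like, the snippet below writes a small WDL 1.0 file from the shell. The workflow name `hello`, the task `greet`, and the input `name` are placeholders invented for this illustration, and the `runtime` section simply points at the SeqsLab base image mentioned in this guide. It is not taken from the SeqsLab or OpenWDL documentation.

```shell
#!/bin/sh
# Write a minimal WDL 1.0 draft to hello.wdl.
# "hello", "greet", and "name" are hypothetical names used only in this sketch.
cat > hello.wdl <<'EOF'
version 1.0

workflow hello {
  input {
    String name
  }
  call greet { input: name = name }
  output {
    String message = greet.message
  }
}

task greet {
  input {
    String name
  }
  command <<<
    echo "Hello, ~{name}"
  >>>
  output {
    String message = read_string(stdout())
  }
  runtime {
    docker: "ghcr.io/atgenomix/runtime/base:1.5_20.04"
  }
}
EOF

# Optional syntax check with womtool, if you have the jar locally
# (the jar path is an assumption):
# java -jar womtool.jar validate hello.wdl
```

Keeping the draft this small makes it easy to confirm that the toolchain works before adding real tasks.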

2. Containerize the runtime environment.

To ensure consistency, containerize the runtime environment for your tools as described in the Cromwell guide for running WDL container workflows. SeqsLab also provides its own base Docker runtime image, built and tested by Atgenomix, in the GitHub Packages container registry. We recommend building your Docker runtime image from this base to ensure that the image runs on SeqsLab.

To pull the image, use the following command:

docker pull ghcr.io/atgenomix/runtime/base:1.5_20.04

Or create a Dockerfile that builds from the image:

FROM ghcr.io/atgenomix/runtime/base:1.5_20.04
RUN ...
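A slightly fuller sketch of this step might look like the following. It writes a Dockerfile that extends the base image and defines a helper for building it. The directory `./my-tool`, the helper names, and the installed package (`samtools`) are assumptions chosen for illustration. The sketch also assumes that the base image allows `apt-get` to run during the build, which this guide does not state.

```shell
#!/bin/sh
# Write a small Dockerfile that extends the SeqsLab base image.
# The installed package (samtools) is only an illustration.
write_dockerfile() {
    # $1 = directory to write the Dockerfile into
    mkdir -p "$1"
    cat > "$1/Dockerfile" <<'EOF'
FROM ghcr.io/atgenomix/runtime/base:1.5_20.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends samtools \
    && rm -rf /var/lib/apt/lists/*
EOF
}

# Build the image from that directory (requires a running Docker daemon).
build_image() {
    # $1 = build context directory, $2 = image reference to tag the build with
    docker build -t "$2" "$1"
}

write_dockerfile ./my-tool
# build_image ./my-tool my-tool:1.0.0_20.04_2022-03-01
```

The build call is left commented out so that the snippet can be run on a machine without Docker to inspect the generated Dockerfile first.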

When you build application container images on our runtime base images, follow a standardized naming convention. The convention keeps each image clearly tied to its underlying Ubuntu OS version and lets users identify that version and the release date of your application container image at a glance.

Please adhere to the following format when naming your application container images:

# convention
<application_name>:<app_version>_<ubuntu_version>_<release_date>

# example
germline-gatk4-snpindel:1.0.0_20.04_2022-03-01-01-03

<application_name>: Replace this with the name of your application. It can be named after a tool, a WDL task, or a WDL sub-workflow.
<app_version>: Specify the version of your application container image.
<ubuntu_version>: Indicate the specific version of Ubuntu used in the base container image (e.g., 20.04 or 22.04).
<release_date>: Give the release date of your application container image, starting with YYYY-MM-DD. Note that the example above carries two additional two-digit fields after the date.
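The convention above can be sketched as two small shell helpers, one that composes an image reference and one that checks a reference against the pattern. Both function names are hypothetical, and the regular expression is an interpretation of the convention rather than an official SeqsLab rule: it assumes the application version contains no underscores and accepts optional extra two-digit fields after the date, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: compose an image reference following
# <application_name>:<app_version>_<ubuntu_version>_<release_date>.
image_ref() {
    # $1 = application name, $2 = app version, $3 = Ubuntu version, $4 = release date
    printf '%s:%s_%s_%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical check: succeed only if the tag has three underscore-separated
# fields, an Ubuntu version such as 20.04, and a release date that starts
# with YYYY-MM-DD (optionally followed by more two-digit fields).
is_valid_image_ref() {
    printf '%s\n' "$1" | grep -Eq \
        '^[a-z0-9][a-z0-9._/-]*:[^_:]+_[0-9]{2}\.[0-9]{2}_[0-9]{4}-[0-9]{2}-[0-9]{2}(-[0-9]{2})*$'
}

image_ref germline-gatk4-snpindel 1.0.0 20.04 2022-03-01
```

For instance, `is_valid_image_ref "germline-gatk4-snpindel:1.0.0_20.04_2022-03-01-01-03"` succeeds, while a reference tagged `latest` is rejected because it lacks the version and date fields.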

SeqsLab supports tools that use one or more Docker runtime images, depending on how finely you divide the runtime environment. Atgenomix recommends setting the Docker runtime granularity at the sub-workflow level as a balanced choice. For example, several WDL tasks in a sub-workflow can run on one Docker runtime image, and a main workflow can use one or more such images.

Please register your Docker runtime images in a container registry that is registered on the SeqsLab platform. See SeqsLab Container Registry for more information.
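Getting an image into a registry usually means tagging it with the registry host and pushing it. The sketch below shows one way to do that with the standard `docker tag` and `docker push` commands. The host `myregistry.azurecr.io` and both function names are made up for illustration, and registering the registry itself on SeqsLab is a separate step that is not covered here.

```shell
#!/bin/sh
# Prefix an image reference with a registry host.
# "myregistry.azurecr.io" below is a made-up host used only for illustration.
registry_ref() {
    # $1 = registry host, $2 = image reference (name:tag)
    printf '%s/%s\n' "$1" "$2"
}

# Tag a locally built image for the registry and push it.
# Requires Docker and a prior `docker login` to that registry.
push_image() {
    # $1 = registry host, $2 = local image reference
    target=$(registry_ref "$1" "$2")
    docker tag "$2" "$target" && docker push "$target"
}

registry_ref myregistry.azurecr.io germline-gatk4-snpindel:1.0.0_20.04_2022-03-01
# push_image myregistry.azurecr.io germline-gatk4-snpindel:1.0.0_20.04_2022-03-01
```

The push call is commented out so that the snippet only prints the registry-qualified reference until you are ready to upload.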

3. Choose an execution engine.

When the workflows and Docker runtime images are ready, the final step is to test them using an execution engine. The SeqsLab platform extends the Cromwell execution engine with an Azure backend, which provides a container-based Spark cluster using Azure Batch and the AKS computation infrastructure.

Typical tool development

The following diagram shows a typical development flow: writing the WDL draft, preparing the runtime images, and testing with a local Cromwell backend. The expected outputs include a standardized WDL workflow, one or more Docker runtime images, and an inputs.json file.

[Diagram: TRS-overview]
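The last stage of this flow, testing with a local Cromwell backend, could be sketched as follows. The snippet writes an inputs.json file for a hypothetical workflow named `hello` with a single `String` input called `name`, and defines a helper that runs the workflow through Cromwell's `run` command. The workflow, the file names, and the location of the Cromwell jar are all assumptions.

```shell
#!/bin/sh
# Write an inputs.json for a hypothetical workflow "hello" with one input "name".
# Cromwell expects input keys in the form <workflow_name>.<input_name>.
write_inputs() {
    # $1 = path of the JSON file to create
    cat > "$1" <<'EOF'
{
  "hello.name": "SeqsLab"
}
EOF
}

# Run the workflow with a local Cromwell jar (requires Java).
run_local() {
    # $1 = path to the Cromwell jar, $2 = WDL file, $3 = inputs JSON
    java -jar "$1" run "$2" --inputs "$3"
}

write_inputs inputs.json
# run_local cromwell.jar hello.wdl inputs.json
```

Once the local run succeeds, the same WDL file and inputs.json are the artifacts you carry forward, together with the runtime images.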