Run your first job

In this example, we use the WGS-Germline-Snps-Indels workflow and provide you with all the resources you will need to run your first job on SeqsLab.

This quickstart guide gives you a hands-on example of how to use SeqsLab to run your workflows. For a more in-depth explanation of the test drive process, see Test drive SeqsLab.


This guide assumes that you have some familiarity with genomic data analysis. Some experience with command-line interface (CLI) tools is also helpful.

Step 1. Test drive SeqsLab

SeqsLab is available as a managed application on Azure Marketplace. You can test drive SeqsLab and try out its features for free.

  1. Go to the Azure Marketplace listing.

  2. Click Test Drive.
    A login window displays.

  3. Sign in to Azure Marketplace with a Microsoft email address.
    A confirmation message displays.

  4. Enable the checkbox to grant access to your basic profile information and then click Continue.

Step 2. Pull and run SeqsLab CLI

The SeqsLab CLI is deployed as a Docker container. Before you can use the CLI, you must first complete the following steps:

  1. In a command-line application, run docker pull

  2. Find and define the subdomain name of the SeqsLab API as PRIVATE_NAME. For example, the domain uses atgenomix as its PRIVATE_NAME.

  3. Run export PRIVATE_NAME="yourCompanyName".

  4. Run the SeqsLab CLI:

    docker run --rm --name cli \
        -e PRIVATE_NAME="testdrive" \
        -e WORKSPACE="tdwestus2" \
        --privileged \
  5. Set up the environment keyring and install the required packages inside the SeqsLab CLI:

    dbus-run-session -- bash
    echo $RANDOM | gnome-keyring-daemon --unlock
    apt install zip wget -y
  6. Sign in with your Microsoft Entra ID account and password.

    Go to the Access information section of the marketplace test drive page to get the login credentials.

    seqslab auth signin -i
  7. Optional: Set up multi-factor authentication (MFA) for the given Microsoft Entra ID account. You can skip this step because the test drive expires in two days.
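
The later steps rely on environment variables such as PRIVATE_NAME and WORKSPACE being set in the current shell. The following is a minimal sketch of a pre-flight check, assuming a POSIX-compatible shell; missing_vars and require_vars are hypothetical helper names and are not part of the SeqsLab CLI.

```shell
# Hypothetical helper (not part of the SeqsLab CLI): print the name of
# every listed environment variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup through eval keeps this POSIX-sh compatible.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Return 0 only when every named variable is set and non-empty.
require_vars() {
    missing=$(missing_vars "$@")
    if [ -n "$missing" ]; then
        printf 'Missing variables: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
        return 1
    fi
}
```

For example, running require_vars PRIVATE_NAME WORKSPACE before the registration steps prints the names of any unset variables to standard error and returns a non-zero status.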

Step 3. Register the sample and reference files

  1. Register the FASTQ sample files:

    cat sample.json | seqslab datahub register-blob \
        --stdin file-blob --workspace "${WORKSPACE}"
  2. Register the reference files:

    cat static.json | seqslab datahub register-blob \
        --stdin file-blob --workspace "${WORKSPACE}"
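
Both registration commands read a JSON file from standard input, so a malformed file may only surface as an error at registration time. As a rough sketch, the files could be checked locally first. This assumes python3 is available on the PATH, which this guide does not state; is_valid_json and check_json_files are hypothetical helper names.

```shell
# Hypothetical helper (not a SeqsLab command): succeed only if the file is
# non-empty and parses as JSON. Assumes python3 is available on PATH.
is_valid_json() {
    [ -s "$1" ] && python3 -m json.tool "$1" >/dev/null 2>&1
}

# Check every file given as an argument and report each failure on stderr.
# Returns 0 only when all files pass.
check_json_files() {
    status=0
    for file in "$@"; do
        if ! is_valid_json "$file"; then
            printf 'Not valid JSON: %s\n' "$file" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, check_json_files sample.json static.json returns a non-zero status if either file is missing, empty, or not well-formed JSON. It does not verify that the content matches what register-blob expects.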

Step 4. Register a tool

  1. Get the workflow from Dockstore:

    mkdir working_dir && \
        wget \
        -O working_dir/
    cd working_dir && unzip
  2. Register the workflow as a new tool:

    export TOOL_ID="wgs_germline_gatk4_snp_indel_$(date '+%Y%m%d%H%M%S')"
    seqslab tools tool \
        --id "${TOOL_ID}" \
        --name "testdrive-${TOOL_ID}" \
        --description "test drive workflow"
  3. Register a new tool version:

    seqslab tools version \
        --descriptor-type WDL \
        --tool-id "${TOOL_ID}" \
        --id "1.0" \
        --workspace "${WORKSPACE}" \
        --images '[{"image_type": "docker", "image_name": "atgenomix/seqslab_runtime-1.5_ubuntu-20.04_preprocessgatk4-", "registry_host": "", "size": 3349109245, "checksum": "sha256:020be4ed428c7dfec67d6c640aaec07d239406c1222295b22a8da6be2ec38ad1"}]'
  4. Upload the tool files to the newly created tool version:

    seqslab tools file \
        --descriptor-type WDL \
        --tool-id "${TOOL_ID}" \
        --version-id "1.0" \
        --working-dir "$(pwd)/GATK-Germline-Snps-Indels/" \
        --file-info execs/gatk.parallel.hg19.0713.execs.json
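
The TOOL_ID used in this step ends with a timestamp, so repeating the registration at a later second yields a different ID. The sketch below isolates that naming pattern in a small function; make_tool_id is a hypothetical helper name, not a SeqsLab command.

```shell
# Hypothetical helper: append a second-resolution timestamp to a base name,
# mirroring how TOOL_ID is built in this step.
make_tool_id() {
    printf '%s_%s\n' "$1" "$(date '+%Y%m%d%H%M%S')"
}

TOOL_ID="$(make_tool_id wgs_germline_gatk4_snp_indel)"
export TOOL_ID
echo "${TOOL_ID}"
```

Two calls within the same second produce the same ID, so this pattern only guards against collisions between registrations made at different times.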

Step 5. Run a job

  1. Create a run request file:

    seqslab jobs request \
        --run-name testdrive \
        --working-dir "$(pwd)" \
        --execs GATK-Germline-Snps-Indels/execs/gatk.parallel.hg19.0713.execs.json \
        --workflow-url \
            "${TOOL_ID}/versions/1.0/WDL/files/" \
        --runtimes \
  2. Submit a workflow run:

    seqslab jobs run \
        --workspace "${WORKSPACE}" \
        --working-dir "$(pwd)" \
        --response-path run.json

Step 6. Check the job status

  1. Check the status of the workflow run:

    seqslab jobs run-state --run-id "yourRunID"
  2. When the job reaches the COMPLETE state, get detailed information about your workflow run:

    seqslab jobs get --run-id "yourRunID"
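
Rather than rerunning the status command by hand, the check can be repeated in a loop. The following is a minimal sketch of a generic polling helper. wait_for_state is a hypothetical name, and the sketch assumes that the status command prints the state name (for example, COMPLETE) somewhere in its output, which this guide does not specify.

```shell
# Hypothetical helper: run a status command repeatedly until its output
# contains the wanted state, or give up after MAX_TRIES attempts.
# Usage: wait_for_state STATE MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_state() {
    wanted=$1
    max_tries=$2
    delay=$3
    shift 3
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Capture both stdout and stderr of the status command.
        output=$("$@" 2>&1)
        case "$output" in
            *"$wanted"*)
                # Assumption: a substring match is enough to detect the state.
                printf '%s\n' "$output"
                return 0
                ;;
        esac
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_state COMPLETE 60 30 seqslab jobs run-state --run-id "yourRunID" would check every 30 seconds for up to 60 attempts. The helper does not stop early on a failed run and would also match any longer state name that contains the wanted text, so treat it as a starting point only.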

Step 7. Download the results

Download the workflow output files to the local machine:

    seqslab datahub download \
        --workspace seqslabwus2 \
        --dst ~/Downloads/ \
        --self-uri drs:// drs://