Run a job#
Prerequisites#
Before you begin, you will need the following:
A running instance of the SeqsLab CLI tool. For details, see Pull and run the SeqsLab CLI.
A registered tool. For details, see Register a TRS tool to the SeqsLab platform.
Execute a job with WES#
The SeqsLab CLI provides a jobs request command to create a run-request.json file. When the run-request.json files are ready, you can use the jobs run command to launch all WES runs based on the run-request.json files in the working directory.
The jobs run command then responds with a JSON file indicating the run_id and run_name of each submitted run.
The following example creates a run request and then launches the run, saving the response to result.json:
seqslab jobs request \
--working-dir /home/ubuntu/src/ \
--workflow-url https://dev-api.seqslab.net/trs/v2/tools/trs_wgs_snp_indel/versions/1.0/WDL/files/ \
--workspace seqslabwus2 \
--execs execs/germline-gatk4-snpindel.json \
--name demo-run
seqslab jobs run \
--workspace seqslabwus2 \
--working-dir /home/ubuntu/src/ \
--response-path result.json
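The run IDs in the response file are what the monitoring commands need. The layout of that file is not shown here, so the following Python sketch assumes only that the file contains run_id keys somewhere, and walks the whole JSON document to collect them. The helper names iter_run_ids and load_run_ids are illustrative and are not part of the SeqsLab CLI.

```python
import json
from pathlib import Path
from typing import Any, Iterator, List


def iter_run_ids(node: Any) -> Iterator[str]:
    """Yield every string stored under a "run_id" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "run_id" and isinstance(value, str):
                yield value
            else:
                # Keep descending: the real layout of the file is an unknown here.
                yield from iter_run_ids(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_run_ids(item)


def load_run_ids(response_path: str) -> List[str]:
    """Read the file passed to --response-path and return the run IDs it holds."""
    data = json.loads(Path(response_path).read_text(encoding="utf-8"))
    return list(iter_run_ids(data))
```

Calling load_run_ids with the path of the saved response file returns the IDs in document order, ready to be passed to the monitoring commands one at a time.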
Monitor a job run#
You can use the jobs run-state command to check the status of a run. Repeat the command for each run until it reaches the COMPLETE state.
seqslab jobs run-state --run-id run_DdtSfRfOr2AVTSe
{"run_id": "run_DdtSfRfOr2AVTSe", "state": "COMPLETE"}
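Checking the state repeatedly can be scripted. The following Python sketch polls jobs run-state until the run stops. It rests on several assumptions: that the CLI prints only the JSON object shown above on standard output, and that failed runs use the GA4GH WES state names EXECUTOR_ERROR, SYSTEM_ERROR, and CANCELED (only COMPLETE is confirmed by this page). The function names, polling interval, and timeout are illustrative choices.

```python
import json
import subprocess
import time
from typing import Callable

# Assumption: failure states follow the GA4GH WES naming; adjust as needed.
FAILED_STATES = {"EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def fetch_state(run_id: str) -> str:
    """Run `seqslab jobs run-state` and return the "state" field of its output."""
    result = subprocess.run(
        ["seqslab", "jobs", "run-state", "--run-id", run_id],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    # Assumption: stdout holds exactly one JSON object, as in the example.
    return json.loads(result.stdout)["state"]


def wait_for_run(
    run_id: str,
    get_state: Callable[[str], str] = fetch_state,
    interval: float = 30.0,
    timeout: float = 6 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the run is COMPLETE or failed, and return the final state."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state(run_id)
        if state == "COMPLETE" or state in FAILED_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{run_id} still in state {state} after {timeout} s")
        sleep(interval)
```

Passing get_state and sleep as parameters keeps the loop testable without a live workspace; in normal use, wait_for_run("run_DdtSfRfOr2AVTSe") is enough.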
To get the full run information, use the jobs get command, which returns the detailed WES run information in JSON format. The response includes basic attributes for run monitoring, such as the run ID (id), run name (name), state, start_time, and end_time. It also includes a logs section that lists detailed execution information for each WDL task, such as the rendered command (cmd), start_time, end_time, exit_code, storage_url, and outputs. Lastly, it includes an outputs section that lists the outputs of the main WDL workflow, mapping each fully qualified name (fqn) to its DRS self-URI (cloud) and local file name (local).
seqslab jobs get --run-id run_DdtSfRfOr2AVTSe
{
"id": "run_DdtSfRfOr2AVTSe",
"name": "2022_02_11_WGS_22010402",
"outputs": [
{
"fqn": "WGS.sampleMutect2Vcf",
"cloud": [
"drs://api.seqslab.net/drs_FxjCfOIBJ8mm89L"
],
"local": [
"22010402_Mutect2_tumor.vcf.gz"
]
},
...
],
"logs": [
{
"id": 1558,
"name": "bwa-x-4643c-run-ddtsfrfor2avtse",
"cmd": "set -e -o pipefail\n\n/home/tools/bwa-0.7.17/bwa \\\n mem -M -t 14 ${refFa} \\\n -R \"@RG\\tID:NextSeq550_${day}\\tSM:${sampleName}\\tPL:NextSeq\\tPI:550\" \\\n ${inFileFastq} > \\\n ${outPathSam} 2>> ${outPathLog}\n\n/home/tools/samtools-1.9/samtools \\\n view -bS \\\n ${outPathSam} \\\n -o tmp.bam\n\n/home/tools/samtools-1.9/samtools \\\n sort tmp.bam \\\n -o ${outPathBam}",
"start_time": "2022-02-11T10:41:43Z",
"end_time": "2022-02-11T11:10:47Z",
"stdout": "stdout",
"stderr": "stderr",
"activity": "../audit.log",
"storage_url": "abfss://seqslab@seqslabapi32b21storage.dfs.core.windows.net/outputs/wes/run_DdtSfRfOr2AVTSe/WGS.NIPT.Bwa_x/",
"exit_code": 0,
"outputs": [
{
"fqn": "WGS.NIPT.Bwa.outFileBam",
"cloud": [
"drs://api.seqslab.net/drs_DjKkaETD7x7gZBA"
],
"local": [
"22010402.bam"
]
},
{
"fqn": "WGS.NIPT.Bwa.outFileLog",
"cloud": [
"drs://api.seqslab.net/drs_9Er0LWEMDbbCokV"
],
"local": [
"22010402_Bwa.log"
]
}
]
},
...
],
"state": "COMPLETE",
"request": {
"id": 283,
"name": "2022_01_18_2_WGS_22010402",
"description": null,
"workflow_type": "WDL",
"workflow_type_version": "1.0",
"workflow_params": { ...
},
"workflow_backend_params": { ...
},
"workflow_url": "https://api.seqslab.net/trs/v2/tools/trs_wgs_snp_indel/versions/1.0/WDL/files/",
"tags": []
},
"start_time": "2022-02-11T10:41:28Z",
"end_time": "2022-02-11T13:47:20Z"
}
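The response above can be summarized with a few lines of Python. The following sketch assumes the unabridged jobs get output has already been parsed into a dictionary (the listing above is elided with ... and is not valid JSON as printed), and that timestamps always use the YYYY-MM-DDTHH:MM:SSZ form seen in the example. The helper names are illustrative.

```python
from datetime import datetime
from typing import Dict, List


def output_uris(run: dict) -> Dict[str, List[str]]:
    """Map each workflow-level output FQN to its DRS self-URIs ("cloud")."""
    return {out["fqn"]: list(out.get("cloud", [])) for out in run.get("outputs", [])}


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2022-02-11T10:41:43Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def task_summary(run: dict) -> List[dict]:
    """Return name, exit code, and wall-clock seconds for each task in "logs"."""
    rows = []
    for task in run.get("logs", []):
        elapsed = parse_utc(task["end_time"]) - parse_utc(task["start_time"])
        rows.append(
            {
                "name": task["name"],
                "exit_code": task["exit_code"],
                "seconds": elapsed.total_seconds(),
            }
        )
    return rows
```

For the Bwa task shown in the example, task_summary reports 1744 seconds (10:41:43 to 11:10:47), and a non-zero exit_code in any row points to the task whose storage_url is worth inspecting first.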
Retrieve results#
Once the run reaches the COMPLETE state, you can use the datahub download command to download the pipeline's output files to your local machine. The command accepts either multiple DRS self-URIs or multiple DRS IDs and downloads the corresponding files into a destination directory. The DRS self-URIs of a run's outputs are listed in the cloud field of each entry in the outputs section of the jobs get response.
seqslab datahub download \
--workspace seqslabwus2 \
--dst ~/Downloads/ \
--self-uri drs://api.seqslab.net/drs_ODlEMzEKxhxwc43 drs://api.seqslab.net/drs_Otr1u9pIYAe2JLr
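When the list of self-URIs comes from a script, the same command can be assembled programmatically. The following Python sketch builds the argument list using only the flags shown above (--workspace, --dst, --self-uri). The function names and the drs:// prefix check are illustrative additions, not part of the SeqsLab CLI.

```python
import os
import subprocess
from typing import List, Sequence


def build_download_cmd(workspace: str, dst: str, self_uris: Sequence[str]) -> List[str]:
    """Assemble the `seqslab datahub download` argument list."""
    if not self_uris:
        raise ValueError("at least one DRS self-URI is required")
    invalid = [uri for uri in self_uris if not uri.startswith("drs://")]
    if invalid:
        raise ValueError(f"not DRS self-URIs: {invalid}")
    return [
        "seqslab", "datahub", "download",
        "--workspace", workspace,
        # No shell is involved, so expand "~" here instead.
        "--dst", os.path.expanduser(dst),
        "--self-uri", *self_uris,
    ]


def download(workspace: str, dst: str, self_uris: Sequence[str]) -> None:
    """Run the download and raise if the CLI exits with a non-zero status."""
    subprocess.run(build_download_cmd(workspace, dst, self_uris), check=True)
```

Keeping command construction separate from execution makes it possible to log or inspect the exact command before anything is downloaded.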