SeqsLab operators are plugins and open-source Cromwell extensions that allow you to scale complex workflow tasks using distributed computing clusters and GPU accelerators.
Capabilities of operators
Using operators to create operator pipelines simplifies the process of reusing and scaling your workflows when working with different datasets. Each operator type has a specific function. Collectively, however, operators help you achieve the following benefits:
Localization and delocalization
You can use operators to load each task's input files and store its output files through standard or custom access methods that are agnostic to the storage infrastructure. For each file or dataset, SeqsLab moves input data from the data lake to the cluster (localization) and output data from the cluster back to the data lake (delocalization). Because these transfers are optimized on the fly, you can adapt to different I/O requirements and turn single-node processing into multi-node, multi-core computing.
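The localize, run, delocalize cycle described above can be sketched in plain Python. This is only an illustration of the pattern, not the SeqsLab operator API: a local directory stands in for the data lake, a temporary directory stands in for the cluster node's working storage, and the names `run_with_localization` and `uppercase_task` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_localization(
    data_lake: Path,
    input_name: str,
    output_name: str,
    task: Callable[[Path, Path], None],
) -> Path:
    """Copy one input into scratch space, run the task, and copy the output back.

    Assumption: `data_lake` is an ordinary directory standing in for remote storage.
    """
    with tempfile.TemporaryDirectory() as scratch:
        workdir = Path(scratch)

        # Localization: data lake -> node-local working directory.
        local_input = workdir / input_name
        shutil.copy(data_lake / input_name, local_input)

        # The task only ever sees local paths.
        local_output = workdir / output_name
        task(local_input, local_output)

        # Delocalization: working directory -> data lake.
        remote_output = data_lake / output_name
        shutil.copy(local_output, remote_output)
    return remote_output


def uppercase_task(src: Path, dst: Path) -> None:
    """Hypothetical task: write an upper-cased copy of the input file."""
    dst.write_text(src.read_text().upper())
```

Because the task receives only local paths, the same task code can run unchanged whichever storage back end holds the data, which is the sense in which the access methods are storage agnostic.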
SeqsLab operators allow you to program clusters and build accelerated workflow task life cycles even if you are not a cluster computing expert. Using operators with the SeqsLab platform, you can load diverse data types and process them at scale, because the data is partitioned and formatted in real time. SeqsLab performs data parallelization implicitly and dynamically, drawing on the combined capacity of spot and dedicated clusters to optimize your workloads.
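As a rough illustration of data parallelization, the sketch below splits a dataset into partitions and applies the same task to each partition concurrently. It is not how SeqsLab implements this: a thread pool stands in for cluster workers, and `partition` and `parallel_map` are hypothetical helper names introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def partition(records: Sequence[T], num_partitions: int) -> List[Sequence[T]]:
    """Split records into at most num_partitions contiguous, near-equal chunks."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    size, extra = divmod(len(records), num_partitions)
    chunks: List[Sequence[T]] = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few records
            chunks.append(records[start:end])
        start = end
    return chunks


def parallel_map(
    task: Callable[[T], R], records: Sequence[T], num_partitions: int
) -> List[R]:
    """Apply task to every record, one worker per partition, preserving order."""
    chunks = partition(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        # Assumption: threads stand in for cluster workers in this sketch.
        results = list(pool.map(lambda chunk: [task(r) for r in chunk], chunks))
    return [item for chunk in results for item in chunk]
```

For example, `parallel_map(len, ["ACGT", "AC", "A"], 2)` processes two partitions concurrently and returns the per-record results in their original order. The task itself is written as if for a single record, which mirrors the idea that parallelization happens implicitly around unchanged task logic.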