SeqsLab management console
The SeqsLab management console is a secure, easy-to-access, web-based portal that brings the SeqsLab biomedical data processing infrastructure right to your computer. From the console, you can build, manage, and scale your analysis production, and learn how to do more with SeqsLab solutions.
You can access the management console using Chrome 84 or later, Firefox 79 or later, or Safari 13.1.2 or later.
Home / Landing Page
The console home page serves as the default entry page for all users. It provides the portal directory functions for both publicly available and subscription-enabled platform services. From the landing page, you can follow links to the SeqsLab documentation and to relevant genomic tools and workflows, and learn about the latest changes to the SeqsLab platform.
The Data Hub is a central repository for all your SeqsLab data. Powered by Hadoop-compatible data lake storage, the data hub stores datasets from multiple data repositories and organizes them for analysis workloads, distribution, subsetting, and sharing.
The SeqsLab workflow execution system works with the SeqsLab data hub to automate your biomedical data analysis production. However, you can only specify input command tool variables, distributed datasets, and datasets of regular files to the workflow system from registered and permitted data repositories.
Supported file types
The SeqsLab data hub supports genome sequencing data and related biomedical data files as input datasets for the workflow execution system, including FASTA, FASTQ, SAM, BAM, BAI, VCF, GVCF, IDX, GFF3, GTF, SEG, and BED. A registered dataset on the SeqsLab data hub may consist of a single file or multiple files. A dataset may also contain one or more other datasets.
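The nesting described above (a dataset holding files and, optionally, other datasets) can be pictured with a small Python sketch. This is only an illustration, not the SeqsLab data model: the `Dataset` class, the `is_supported` helper, and the handling of `.gz`-compressed files are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Extensions taken from the file types listed above. Treating ".gz"
# compressed variants as supported is an assumption of this sketch.
SUPPORTED_EXTENSIONS = {
    "fasta", "fastq", "sam", "bam", "bai", "vcf", "gvcf",
    "idx", "gff3", "gtf", "seg", "bed",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's last meaningful extension is in the list."""
    suffixes = [s.lstrip(".").lower() for s in PurePosixPath(filename).suffixes]
    if suffixes and suffixes[-1] == "gz":
        suffixes = suffixes[:-1]  # look past the compression suffix
    return bool(suffixes) and suffixes[-1] in SUPPORTED_EXTENSIONS


@dataclass
class Dataset:
    """Hypothetical dataset: its own files plus any nested datasets."""
    name: str
    files: list[str] = field(default_factory=list)
    children: list["Dataset"] = field(default_factory=list)

    def all_files(self) -> list[str]:
        """Collect files from this dataset and, recursively, its children."""
        found = list(self.files)
        for child in self.children:
            found.extend(child.all_files())
        return found


trio = Dataset(
    "trio",
    children=[
        Dataset("child", files=["child.bam", "child.bam.bai"]),
        Dataset("parents", files=["mother.vcf.gz"]),
    ],
)
unsupported = [f for f in trio.all_files() if not is_supported(f)]
```

Here `unsupported` is empty because every file in the nested example carries one of the listed extensions.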
You can import GA4GH DRS objects and upload datasets from physical files or readable URIs through a web browser or command-line tools. Input variables for workflow command tools should be provided as a JSON file.
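As a rough illustration of a JSON inputs file, the sketch below builds one in Python. The workflow name `GermlineCalling` and its parameters are hypothetical, and qualifying each key with the workflow name follows the common WDL inputs convention; whether SeqsLab expects exactly this layout is an assumption.

```python
import json


def build_inputs(workflow: str, values: dict) -> str:
    """Serialize inputs as JSON, prefixing each key with the workflow name."""
    qualified = {f"{workflow}.{key}": value for key, value in values.items()}
    return json.dumps(qualified, indent=2, sort_keys=True)


# Hypothetical workflow name and input variables, for illustration only.
inputs_json = build_inputs(
    "GermlineCalling",
    {"sample_id": "NA12878", "threads": 8},
)
print(inputs_json)
```

The resulting string can be saved to a file such as `inputs.json` and supplied wherever the workflow inputs are requested.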
The SeqsLab data hub provides an interactive genome visualization browser, which is an implementation of the popular IGV browser. You can visualize your existing genomics datasets on the data hub interactively in a web browser, without leaving the SeqsLab management console.
Life cycle management
The SeqsLab data hub provides life cycle management for datasets stored on the SeqsLab data lake storage. It can automatically move datasets down through the storage tiers (hot, cool, archive, and finally delete) based on when each file was last accessed.
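A last-accessed tiering rule of this kind can be sketched as follows. The day thresholds and the `select_tier` function are hypothetical and chosen only for illustration; the actual SeqsLab policy settings are not described here.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: minimum idle days before a dataset moves to a tier.
# Ordered from the coldest tier to the warmest so the first match wins.
TIER_RULES = [
    (365, "delete"),
    (180, "archive"),
    (30, "cool"),
]


def select_tier(last_accessed: datetime, now: datetime) -> str:
    """Pick a tier from the number of whole days since the last access."""
    idle_days = (now - last_accessed).days
    for min_days, tier in TIER_RULES:
        if idle_days >= min_days:
            return tier
    return "hot"  # accessed recently enough to stay in the hot tier


now = datetime(2024, 1, 1)
tier = select_tier(now - timedelta(days=45), now)
```

With these assumed thresholds, a dataset last accessed 45 days ago falls into the cool tier.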
Compliance and traceability
All defined workflow jobs and generated output datasets are stored on SeqsLab. You can track and manage your workflows and datasets from the data hub.
SeqsLab provides a user-friendly, web-based interface for configuring and managing workflow jobs.
SeqsLab may automatically recommend and provision compute resources and optimize each individual task execution based on the dataset volume, turnaround time, and cost requirements. However, you can adjust the runtime options for individual pipelines, including managed cluster computing and environment variables, as needed.
Types of jobs
SeqsLab enables you to create two types of jobs:
Automated - A SeqsLab workflow job that consists of one main workflow and, optionally, one or more sub-workflows. Each workflow is defined using OpenWDL draft-2, draft-3, or version 1, and should specify the software container image to be used. SeqsLab supports container images generated using Docker or Singularity.
Interactive - A SeqsLab workflow job that is run from a Jupyter Notebook using a programming language, such as Python. Using the online and interactive scripting environment, you can define custom functions to load, process, and analyze datasets, and to generate graphical charts.
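For the interactive job type, a notebook cell might define a custom function along the following lines. This is a minimal sketch using only the Python standard library; the `count_variants_by_chrom` function is hypothetical, and the three VCF records are made-up toy data rather than real results.

```python
import io
from collections import Counter


def count_variants_by_chrom(vcf_lines) -> Counter:
    """Count VCF data records per chromosome, skipping header lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line, meta-information, or column header
        chrom = line.split("\t", 1)[0]
        counts[chrom] += 1
    return counts


# Toy records for illustration only; a real notebook would read a dataset file.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "chr1\t250\t.\tC\tT\t60\tPASS\t.\n"
    "chr2\t300\t.\tG\tA\t45\tPASS\t.\n"
)

counts = count_variants_by_chrom(io.StringIO(SAMPLE_VCF))
```

The resulting counter can then be passed to a plotting library of your choice to produce a chart.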
You can choose to run jobs immediately or at a specified future time, either once or repeatedly.
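The difference between a one-time and a repeated schedule can be sketched as a generator of run times. The `run_times` function and its parameters are hypothetical; in SeqsLab, scheduling is configured through the console rather than with code like this.

```python
from datetime import datetime, timedelta
from itertools import islice
from typing import Iterator, Optional


def run_times(start: datetime, every: Optional[timedelta] = None) -> Iterator[datetime]:
    """Yield the start time once, then keep repeating if an interval is given."""
    if every is not None and every <= timedelta(0):
        raise ValueError("repeat interval must be positive")
    current = start
    yield current
    while every is not None:
        current += every
        yield current


start = datetime(2024, 1, 1, 9, 0)
once = list(run_times(start))
# A repeating schedule is unbounded, so take only the first three run times.
daily = list(islice(run_times(start, every=timedelta(days=1)), 3))
```

Here `once` holds a single run time, while `daily` holds the first three runs of a job that repeats every day at 09:00.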