Operator pipelines
Atgenomix currently provides the following operator pipelines on SeqsLab.
| File type | Partitions | Operator pipeline |
|---|---|---|
| All | None | Automatic workload pipeline for localizing either a file or a directory in a single-node cluster |
|  | Depends on the data | File-based FASTQ workload parallelization pipeline with 1,048,576 read records per partition |
|  | 1 | File-based BAM workload pipeline with all BAM records in a single partition |
|  | 1 | File-based BAM workload pipeline with unmapped BAM records in a single partition |
|  | 23 | File-based BAM workload pipeline with reads on HG19 primary chromosomes parallelized into 23 partitions (one autosome per partition, with chrX, chrY, and chrM merged into a single partition) |
|  | 77 | File-based BAM workload pipeline with reads on HG19 primary chromosomes parallelized into 77 contiguous unmasked regions |
|  | 155 | File-based BAM workload pipeline with reads on HG19 primary chromosomes parallelized into 155 contiguous unmasked regions |
|  | 3,109 | File-based BAM workload pipeline with reads on HG19 primary chromosomes parallelized into 3,109 contiguous unmasked regions |
|  | 155 | File-based BAM workload pipeline with reads on HG19 primary chromosomes parallelized into 155 contiguous unmasked regions, where both reads of a read pair are present in the same partition for analyses (e.g., read consensus) |
|  | 45 | File-based BAM workload pipeline with HG19 reference genome chr20 parallelized into 45 contiguous unmasked regions |
|  | 23 | File-based BAM workload pipeline with reads on GRCh38 primary chromosomes parallelized into 23 partitions (one autosome per partition, with chrX, chrY, and chrM merged into a single partition) |
|  | 50 | File-based BAM workload pipeline with reads on GRCh38 primary chromosomes parallelized into 50 contiguous unmasked regions |
|  | 50 | File-based BAM workload pipeline with reads on GRCh38 primary chromosomes parallelized into 50 contiguous unmasked regions, where both reads of a read pair are present in the same partition for analyses (e.g., read consensus) |
|  | 155 | File-based BAM workload pipeline with GRCh38 reference genome parallelized into 155 contiguous unmasked regions |
|  | 3,101 | File-based BAM workload pipeline with HG19 reference genome parallelized into 3,101 contiguous unmasked regions |
|  | None | File-based unmapped BAM workload pipeline without data parallelization |
|  | None | File-based BAM workload pipeline without data parallelization |
|  | 3,109 | File-based VCF workload pipeline with HG19 reference genome parallelized into 3,109 contiguous unmasked regions |
|  | 3,101 | File-based VCF workload pipeline with GRCh38 reference genome parallelized into 3,101 contiguous unmasked regions |
|  | None | File-based VCF workload pipeline using Glow and Delta Lake |
| delta lake | None | File-based Delta Lake workload pipeline |
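
To make the partition counts above more concrete, the following Python sketch illustrates the arithmetic behind two of the rows: the FASTQ pipeline, whose partition count depends on the data because each partition holds 1,048,576 read records, and the 23-partition BAM layout, where each autosome gets its own partition and chrX, chrY, and chrM share one. The function names and the 0-based partition numbering are hypothetical and chosen only for illustration; they are not part of any SeqsLab API, and the actual operator pipelines may order or assign partitions differently.

```python
READS_PER_PARTITION = 1_048_576  # read records per FASTQ partition, from the table


def fastq_partition_count(total_reads: int) -> int:
    """Return how many partitions are needed to hold total_reads read records.

    Hypothetical helper: ceiling division by READS_PER_PARTITION.
    """
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    return (total_reads + READS_PER_PARTITION - 1) // READS_PER_PARTITION


def chromosome_partition(contig: str) -> int:
    """Map a primary contig name to one of 23 partitions.

    Assumed numbering (not taken from SeqsLab): chr1..chr22 map to 0..21,
    and chrX, chrY, and chrM all map to partition 22.
    """
    name = contig[3:] if contig.startswith("chr") else contig
    if name in {"X", "Y", "M", "MT"}:
        return 22
    if name.isdigit() and 1 <= int(name) <= 22:
        return int(name) - 1
    raise ValueError(f"not a primary chromosome: {contig!r}")


if __name__ == "__main__":
    # A FASTQ file with 5,000,000 reads needs five partitions of up to 1,048,576 reads.
    print(fastq_partition_count(5_000_000))
    # chrX, chrY, and chrM land in the same partition.
    print(chromosome_partition("chrX") == chromosome_partition("chrM"))
```

Integer ceiling division is used instead of floating-point `math.ceil` so that the count stays exact for very large read totals.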