Pipeline Parameters
The following table describes the input files and optional parameters that can be set when running the Sequence Analysis Pipeline. These parameters apply whether the pipeline is run on Seven Bridges (where they are set in the graphical user interface) or on a local server (where they are set in the input specification YML file).
Required and optional inputs and parameters
Input field | Description | Required? |
---|---|---|
AbSeq_Reference | File input: FASTA AbSeq reference file as described in the Input files section. Ensure that the AbSeq reference file contains only the BD AbSeq Ab-Oligos that were used in the experiment. | Optional |
Cell_Calling_ATAC_Algorithm | Default: Basic. Specify the putative cell calling algorithm for ATAC-Seq: Basic, Refined. | Optional |
Cell_Calling_Bioproduct_Algorithm | Default: Basic. Specify the putative cell calling algorithm for bioproducts: Basic, Refined. | Optional |
Cell_Calling_Data | Default: mRNA. Specify the data to be used for putative cell calling: mRNA, AbSeq, ATAC, mRNA_and_ATAC. | Optional |
Custom_STAR_Params | Default: pipeline defaults. Advanced. Modify STAR alignment parameters. Set this parameter to fully override the default STAR mapping parameters used in the pipeline. This applies to FASTQ files provided in the Reads user input. | Optional |
Custom_bwa_mem2_Params | Default: pipeline defaults. Advanced. Modify bwa-mem2 alignment parameters. Set this parameter to fully override the default bwa-mem2 mapping parameters used in the pipeline. This applies to FASTQ files provided in the Reads_ATAC user input. | Optional |
Exact_Cell_Count | Set a specific number (>=1) of cells as putative, based on those with the highest error-corrected read count. | Optional |
Exclude_Intronic_Reads | Default: False. By default, reads aligned to exons and introns are considered and represented in molecule counts. Including intronic reads may increase sensitivity, resulting in an increase in molecule counts and the number of genes per cell for both cellular and nuclei samples. Intronic reads may indicate unspliced mRNAs and are also useful, for example, in the study of nuclei and RNA velocity. When set to True, intronic reads will be excluded. | Optional |
Expected_Cell_Count | Guide the basic putative cell calling algorithm by providing an estimate of the number of cells expected. Usually this can be the number of cells loaded into the Rhapsody cartridge. | Optional |
Generate_Bam | Default: False. A BAM read alignment file contains reads from all the input libraries, but creating it can consume substantial compute and disk resources. Set this field to True to create the BAM file. | Optional |
Long_Reads | Default: Auto. Specify if the STARlong aligner should be used instead of STAR. By default, when this parameter is not set, the pipeline will attempt to autodetect long reads. Set to True to force use of STARlong. Set to False to force use of STAR. | Optional |
Predefined_ATAC_Peaks | File input: An optional BED file (such as the ATAC-Seq peaks file output by the Rhapsody pipeline) containing pre-established chromatin accessibility peak regions for generating the ATAC-Seq cell-by-peak matrix. Useful if a direct comparison of chromatin accessibility between two or more ATAC-Seq samples is desired. | Optional for ATAC-Seq assay |
Reads | File input: R1 reads and R2 reads. Include all FASTQ sequencing data from the experiment, including R1 and R2 files for the targeted or WTA RNA library and, if applicable, the Sample Tag, TCR, BCR, and BD® AbSeq libraries. | Required for applicable libraries |
Reads_ATAC | File input: R1, R2, and I2 reads. Include all FASTQ sequencing data from the experiment, including R1, R2, and I2 files for the ATAC-Seq library. | Required for ATAC-Seq libraries |
Reference_Archive (WTA or WTA+ATAC-Seq) | File input: A TAR.GZ file that includes a STAR (and possibly a bwa-mem2) indexed reference genome file, along with a GTF gene annotation file. | Yes |
Run_Name | Specify a run name to be used as the base output filename. Use only letters, numbers, hyphens, or underscores. Any other characters will be replaced with hyphens. | Optional |
Sample_Tags_Version | For a multiplexed samples run only. Specify the Sample Tag kit used: human (hs), mouse (mm), flex, nuclei_includes_mrna, or nuclei_atac_only. | Required for multiplexed samples |
Supplemental_Reference | File input: This is a FASTA file that contains additional transgene sequences. | Optional |
Tag_Names | For a multiplexed samples run only. Associate a name with each Sample Tag, which will appear in the output files. Within square brackets, enter a comma-separated list of Sample Tag numbers and associated names. Write each entry as the Sample Tag number and the sample name joined by a hyphen, with no spaces or forward slashes: Sample Tag number-sample name. Example: Tag_Names: [3-Ramos, 4-BT549] | Optional for multiplexed samples |
Targeted_Reference (Targeted only) | File input: FASTA file containing the sequences amplified by the primers of the Targeted assay. This can be a pre-designed, supplemental, or custom panel. Ensure that the reference matches the species and panel used for the experiment. Otherwise, reads will not be mapped correctly. | Yes |
VDJ_Version | For experiments with VDJ libraries. Specify the species and/or chain types. A species-only selection includes both BCR and TCR. Options: human, mouse, humanBCR, humanTCR, mouseBCR, mouseTCR. | Required for TCR/BCR assay |
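
Several constraints in the table (the allowed option lists, the Run_Name character rule, the Tag_Names entry format, and the Exact_Cell_Count minimum) can be checked before a run is submitted. The following is a minimal Python sketch of such a pre-flight check. It is not part of the pipeline: the function names and the `ALLOWED_VALUES` dictionary are hypothetical helpers written for illustration, and the option lists are copied from the table above rather than read from the pipeline itself.

```python
import re

# Option lists copied from the table above. This dictionary is a
# hypothetical helper for illustration, not a pipeline data structure.
ALLOWED_VALUES = {
    "Cell_Calling_ATAC_Algorithm": {"Basic", "Refined"},
    "Cell_Calling_Bioproduct_Algorithm": {"Basic", "Refined"},
    "Cell_Calling_Data": {"mRNA", "AbSeq", "ATAC", "mRNA_and_ATAC"},
    "VDJ_Version": {"human", "mouse", "humanBCR", "humanTCR",
                    "mouseBCR", "mouseTCR"},
}

# One Tag_Names entry: Sample Tag number, a hyphen, then a sample name
# that contains no spaces or forward slashes.
TAG_ENTRY = re.compile(r"(\d+)-([^\s/]+)")


def sanitize_run_name(name):
    """Mimic the documented rule: any character other than a letter,
    number, hyphen, or underscore becomes a hyphen."""
    return re.sub(r"[^A-Za-z0-9_-]", "-", name)


def parse_tag_names(entries):
    """Turn entries such as '3-Ramos' into {3: 'Ramos'}.
    Raises ValueError for entries that break the documented format."""
    parsed = {}
    for entry in entries:
        match = TAG_ENTRY.fullmatch(entry)
        if match is None:
            raise ValueError(f"Invalid Tag_Names entry: {entry!r}")
        parsed[int(match.group(1))] = match.group(2)
    return parsed


def check_parameters(params):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        if key in params and params[key] not in allowed:
            problems.append(
                f"{key}: {params[key]!r} is not one of {sorted(allowed)}"
            )
    count = params.get("Exact_Cell_Count")
    if count is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(count, int) and not isinstance(count, bool)
        if not is_int or count < 1:
            problems.append("Exact_Cell_Count must be an integer >= 1")
    return problems


if __name__ == "__main__":
    print(sanitize_run_name("my run.v2"))
    print(parse_tag_names(["3-Ramos", "4-BT549"]))
    print(check_parameters({"Cell_Calling_Data": "protein"}))
```

The checks cover only what the table states explicitly. Parameters whose accepted spellings are ambiguous in the table, such as Sample_Tags_Version, are deliberately left out rather than guessed.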