High-level description
This file defines constants and default parameters used in the Cassiopeia preprocessing pipeline. These constants include BAM tag names, quality scores, a DNA substitution matrix, and default parameters for various pipeline stages.
Code Structure
The code defines several dictionaries: BAM_CONSTANTS, SINGLE_CELL_BAM_TAGS, SPATIAL_BAM_TAGS, CHEMISTRY_BAM_TAGS, DNA_SUBSTITUTION_MATRIX, and DEFAULT_PIPELINE_PARAMETERS. The first four dictionaries define BAM tag names for different sequencing chemistries. DNA_SUBSTITUTION_MATRIX defines a substitution matrix for DNA alignment. DEFAULT_PIPELINE_PARAMETERS defines default parameters for each stage of the preprocessing pipeline.
Symbols
Symbol Name: BAM_CONSTANTS
Description:
This dictionary stores constants related to BAM file tags used in the preprocessing pipeline.
Inputs:
N/A - This is a constant dictionary.
Outputs:
N/A - This is a constant dictionary.
Internal Logic:
The dictionary maps descriptive names to their corresponding BAM tag strings. For example, RAW_CELL_BC_TAG maps to "CR", which represents the tag for the raw cell barcode sequence.
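A minimal sketch of how a name-to-tag mapping like this might be read. Only the RAW_CELL_BC_TAG entry comes from the description above. The RAW_CELL_BC_QUALITY_TAG key, its value, and the tag_for helper are assumptions added for illustration (the value follows the 10x Genomics tag convention) and may not match the actual file.

```python
# Sketch only: RAW_CELL_BC_TAG -> "CR" is taken from the description above.
# RAW_CELL_BC_QUALITY_TAG is a hypothetical entry added for illustration.
BAM_CONSTANTS = {
    "RAW_CELL_BC_TAG": "CR",
    "RAW_CELL_BC_QUALITY_TAG": "CY",  # assumed, 10x-style quality tag
}


def tag_for(name):
    """Return the BAM tag string for a descriptive constant name.

    Hypothetical helper: raises a KeyError with a readable message
    when the name is not defined in BAM_CONSTANTS.
    """
    try:
        return BAM_CONSTANTS[name]
    except KeyError:
        raise KeyError(f"Unknown BAM constant: {name!r}") from None


raw_barcode_tag = tag_for("RAW_CELL_BC_TAG")
```

Referring to tags by descriptive name keeps the two-letter codes in one place, so a change to a tag only touches the dictionary.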
Symbol Name: SINGLE_CELL_BAM_TAGS
Description:
This dictionary defines BAM tag names for single-cell sequencing chemistries.
Inputs:
N/A - This is a constant dictionary.
Outputs:
N/A - This is a constant dictionary.
Internal Logic:
The dictionary maps data types (umi, cell_barcode) to tuples of BAM tag names. Each tuple contains two tags: one for the sequence and one for the quality scores.
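The (sequence tag, quality tag) layout can be sketched as follows. The tag values, the extract_field helper, and the example read are all assumptions made for illustration. A plain dict stands in for the tags of one aligned read, and the real dictionary may use different tag strings.

```python
# Hypothetical tag values; the real dictionary may differ.
# Each value is a (sequence_tag, quality_tag) pair.
SINGLE_CELL_BAM_TAGS = {
    "umi": ("UR", "UY"),
    "cell_barcode": ("CR", "CY"),
}


def extract_field(read_tags, field):
    """Return (sequence, quality) for a data type such as "umi".

    read_tags is a plain dict of BAM tag -> value, standing in for
    the tags attached to a single aligned read.
    """
    sequence_tag, quality_tag = SINGLE_CELL_BAM_TAGS[field]
    return read_tags[sequence_tag], read_tags[quality_tag]


# Made-up read tags, for illustration only.
example_read_tags = {
    "UR": "ACGTACGTAC",
    "UY": "FFFFFFFFFF",
    "CR": "TTGCATTGCA",
    "CY": "FFFFFFFFFF",
}

umi_sequence, umi_quality = extract_field(example_read_tags, "umi")
```

Keeping the sequence and quality tags in one tuple means a caller can unpack both in a single step and cannot pair a sequence with the wrong quality string.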
Symbol Name: SPATIAL_BAM_TAGS
Description:
This dictionary defines BAM tag names for spatial sequencing chemistries.
Inputs:
N/A - This is a constant dictionary.
Outputs:
N/A - This is a constant dictionary.
Internal Logic:
Similar to SINGLE_CELL_BAM_TAGS, this dictionary maps data types (umi, spot_barcode) to tuples of BAM tag names.
Symbol Name: CHEMISTRY_BAM_TAGS
Description:
This dictionary maps specific sequencing chemistries to their corresponding BAM tag dictionaries.
Inputs:
N/A - This is a constant dictionary.
Outputs:
N/A - This is a constant dictionary.
Internal Logic:
The dictionary maps chemistry names (e.g., "dropseq", "10xv2") to either SINGLE_CELL_BAM_TAGS or SPATIAL_BAM_TAGS based on the chemistry type.
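A rough sketch of the chemistry lookup, under stated assumptions. The chemistry names "dropseq" and "10xv2" appear in the description above, but their assignment to the single-cell dictionary is assumed here because both are single-cell chemistries. The "example_spatial" name, all tag values, and the barcode_field helper are hypothetical.

```python
# Hypothetical tag values; each entry is (sequence_tag, quality_tag).
SINGLE_CELL_BAM_TAGS = {
    "umi": ("UR", "UY"),
    "cell_barcode": ("CR", "CY"),
}
SPATIAL_BAM_TAGS = {
    "umi": ("UR", "UY"),
    "spot_barcode": ("CR", "CY"),
}

# Several chemistries share one tag dictionary rather than copying it.
CHEMISTRY_BAM_TAGS = {
    "dropseq": SINGLE_CELL_BAM_TAGS,      # assumed assignment
    "10xv2": SINGLE_CELL_BAM_TAGS,        # assumed assignment
    "example_spatial": SPATIAL_BAM_TAGS,  # hypothetical chemistry name
}


def barcode_field(chemistry):
    """Return the barcode data type used by a chemistry.

    Spatial chemistries key their barcode as "spot_barcode",
    single-cell chemistries as "cell_barcode".
    """
    tags = CHEMISTRY_BAM_TAGS[chemistry]
    return "spot_barcode" if "spot_barcode" in tags else "cell_barcode"


single_cell_field = barcode_field("10xv2")
spatial_field = barcode_field("example_spatial")
```

Because the values are references to the shared dictionaries, adding a new chemistry is a one-line change as long as it follows one of the two existing tag layouts.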