cassiopeia_preprocess.py script. This script orchestrates the execution of the preprocessing steps based on a user-provided configuration file.
The pipeline.py file contains the core functions for each preprocessing step, which are called by the main script.
The setup_utilities.py file provides functions for parsing the configuration file and setting up the pipeline.
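
The division of labor described above (a driver script that runs step functions in an order taken from the configuration) can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the stage functions, the STAGES registry, and run_pipeline are hypothetical names, and plain strings stand in for the real molecule data.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for step functions that would live in pipeline.py.
def convert(data: List[str]) -> List[str]:
    """Toy 'convert' step: normalise sequences to upper case."""
    return [seq.upper() for seq in data]

def filter_short(data: List[str], min_length: int = 3) -> List[str]:
    """Toy 'filter' step: drop sequences shorter than min_length."""
    return [seq for seq in data if len(seq) >= min_length]

# Registry mapping stage names (as they might appear in a config) to functions.
STAGES: Dict[str, Callable] = {"convert": convert, "filter": filter_short}

def run_pipeline(data, stage_order, params):
    """Run each named stage in order, passing stage-specific parameters."""
    for name in stage_order:
        stage = STAGES[name]
        data = stage(data, **params.get(name, {}))
    return data

result = run_pipeline(
    ["acgt", "ac", "ttga"],
    stage_order=["convert", "filter"],
    params={"filter": {"min_length": 3}},
)
```

Keeping the step functions separate from the driver means each step can also be called directly, outside the configured pipeline.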
UMI_utils.py: Contains functions for collapsing and preprocessing Unique Molecular Identifiers (UMIs).
alignment_utilities.py: Provides utilities for sequence alignment and CIGAR string parsing.
constants.py: Defines constants and default parameters used throughout the preprocessing pipeline.
doublet_utils.py: Contains functions for identifying and filtering potential cell doublets.
lineage_utils.py: Provides functions for calling lineage groups and processing lineage-related data.
map_utils.py: Contains functions for resolving allele ambiguity in molecule tables.
utilities.py: Offers various utility functions for data filtering, conversion, and manipulation.

The DEFAULT_PIPELINE_PARAMETERS dictionary in constants.py defines the default values for these parameters.
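
One common way to combine a defaults dictionary with user-supplied settings is sketched below. It assumes an INI-style configuration file read with Python's configparser; the EXAMPLE_DEFAULTS contents, the section and parameter names, and the merge_with_defaults helper are all illustrative and are not taken from constants.py or setup_utilities.py.

```python
import configparser
import copy

# Illustrative defaults only; the real values live in constants.py.
EXAMPLE_DEFAULTS = {
    "general": {"n_threads": 1},
    "filter": {"min_reads": 2},
}

def merge_with_defaults(config_text: str, defaults: dict) -> dict:
    """Overlay user settings from an INI-style string onto a copy of defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for section in parser.sections():
        merged.setdefault(section, {})
        for key, raw_value in parser[section].items():
            default = merged[section].get(key)
            # Cast to int when the default is an int (but not a bool);
            # anything else is kept as the raw string.
            if isinstance(default, int) and not isinstance(default, bool):
                merged[section][key] = int(raw_value)
            else:
                merged[section][key] = raw_value
    return merged

user_config = """
[general]
n_threads = 4
"""
merged = merge_with_defaults(user_config, EXAMPLE_DEFAULTS)
```

In this sketch, any parameter the user omits keeps its default value, and the defaults dictionary itself is left unmodified.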
Key configurable parameters include: