collapse_umi_test.py
Here’s a detailed documentation of the target file test/preprocess_tests/collapse_umi_test.py
:
High-level description
This file contains unit tests for the UMI Collapsing module in the Cassiopeia preprocessing pipeline. It tests various aspects of UMI collapsing, including sorting BAM files, forming collapsed clusters, and converting BAM files to DataFrames.
Code Structure
The main class TestCollapseUMIs
inherits from unittest.TestCase
and contains several test methods. The setUp
method prepares the test environment by creating test files and running UMI collapsing operations. The test methods then verify the correctness of these operations.
Symbols
TestCollapseUMIs
Description
A test class that contains unit tests for the UMI Collapsing functionality.
Internal Logic
- Sets up test files and performs UMI collapsing operations in
setUp
. - Tests various aspects of UMI collapsing in individual test methods.
setUp
Description
Prepares the test environment by creating test files and running UMI collapsing operations.
Internal Logic
- Creates test directories and files.
- Sorts BAM files.
- Performs UMI collapsing using different methods and parameters.
test_sort_bam
Description
Tests the sorting of BAM files.
Internal Logic
- Opens the sorted BAM file.
- Extracts cell barcodes and UMIs.
- Verifies the correctness of specific entries.
test_sort_bam_uncorrected
Description
Tests the sorting of uncorrected BAM files.
Internal Logic
Similar to test_sort_bam
, but uses uncorrected test files.
test_collapse_bam
Description
Tests the collapsing of BAM files.
Internal Logic
- Opens the collapsed BAM file.
- Extracts various fields (cell barcodes, UMIs, read counts, cluster IDs, qualities).
- Verifies the correctness of specific entries and counts.
test_collapse_bam_bayesian
Description
Tests the Bayesian collapsing of BAM files.
Internal Logic
Similar to test_collapse_bam
, but uses Bayesian collapsed files.
test_collapse_bam_uncorrected
Description
Tests the collapsing of uncorrected BAM files.
Internal Logic
Similar to test_collapse_bam
, but uses uncorrected collapsed files.
test_bam2DF
Description
Tests the conversion of BAM files to DataFrames.
Internal Logic
- Converts BAM to DataFrame.
- Verifies specific entries in the resulting DataFrame.
- Compares with a separately loaded DataFrame from a text file.
test_collapsing_passes_header
Description
Tests that the collapsing process correctly passes header information.
Internal Logic
- Opens the collapsed BAM file.
- Verifies that the alignment has non-None query sequence and qualities.
Dependencies
Dependency | Purpose |
---|---|
os | File and directory operations |
unittest | Unit testing framework |
pandas | Data manipulation and analysis |
pathlib | Object-oriented filesystem paths |
pysam | Reading/writing SAM/BAM/VCF/BCF files |
cassiopeia.preprocess | UMI collapsing functionality |
Error Handling
The tests use assertions to verify the correctness of operations. If any assertion fails, an AssertionError
will be raised, indicating a test failure.
This file is crucial for ensuring the correctness of the UMI collapsing functionality in the Cassiopeia preprocessing pipeline. It covers various scenarios and edge cases, helping to maintain the reliability of the UMI collapsing process.