resolve_umi_sequence_test.py
Detailed documentation of the target file test/preprocess_tests/resolve_umi_sequence_test.py.
High-level description
This file contains unit tests for the UMI Resolution module in the Cassiopeia preprocessing pipeline. It tests the functionality of resolving UMI sequences and filtering cells based on UMI and read count thresholds.
Code Structure
The main class, TestResolveUMISequence, inherits from unittest.TestCase and contains several test methods. It uses a sample collapsed UMI table to test the resolve_umi_sequence function from the pipeline module.
Symbols
TestResolveUMISequence
Description
A test case class that contains methods to test the UMI resolution functionality.
Internal Logic
- Sets up a sample collapsed UMI table in the setUp method.
- Contains test methods to verify different aspects of UMI resolution.
setUp
Description
Initializes the test environment by creating a sample collapsed UMI table and setting up a temporary directory.
test_resolve_umi
Description
Tests the basic functionality of the resolve_umi_sequence function.
Internal Logic
- Calls resolve_umi_sequence with the sample data.
- Checks that the correct sequence was selected for cell1-UMIA.
- Verifies that cell2 was filtered out.
- Ensures cell3 retained both UMIs.
- Checks the expected read counts for each cell.
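The checks above can be illustrated with a simplified stand-in for UMI resolution. This is a minimal sketch, not Cassiopeia's implementation: the column names (cellBC, UMI, seq, readCount), the helper functions, the min_umi_per_cell parameter, and the toy data are all assumptions made for illustration. It keeps the most abundant sequence for each cell-UMI pair, then drops cells with too few UMIs.

```python
import pandas as pd


def pick_most_abundant(table: pd.DataFrame) -> pd.DataFrame:
    """Keep, for each (cellBC, UMI) pair, the row with the highest readCount.

    Hypothetical helper; column names are assumed, not taken from Cassiopeia.
    """
    # idxmax yields the index label of the top readCount within each group.
    best_rows = table.groupby(["cellBC", "UMI"])["readCount"].idxmax()
    return table.loc[best_rows].reset_index(drop=True)


def drop_small_cells(table: pd.DataFrame, min_umi_per_cell: int) -> pd.DataFrame:
    """Remove cells that have fewer than min_umi_per_cell distinct UMIs."""
    umi_counts = table.groupby("cellBC")["UMI"].nunique()
    keep = umi_counts[umi_counts >= min_umi_per_cell].index
    return table[table["cellBC"].isin(keep)].reset_index(drop=True)


# Toy collapsed UMI table (invented values): cell1-UMIA has two competing
# sequences, cell2 has a single UMI, and cell3 has two UMIs.
toy = pd.DataFrame(
    {
        "cellBC": ["cell1", "cell1", "cell1", "cell2", "cell3", "cell3"],
        "UMI": ["UMIA", "UMIA", "UMIB", "UMIA", "UMIA", "UMIB"],
        "seq": ["AACC", "AACG", "TTGG", "CCAA", "GGTT", "GGTA"],
        "readCount": [20, 5, 10, 3, 30, 25],
    }
)

resolved = drop_small_cells(pick_most_abundant(toy), min_umi_per_cell=2)
print(resolved)
```

With this toy table, cell1-UMIA resolves to the sequence with 20 reads, cell2 is dropped for having a single UMI, and cell3 keeps both of its UMIs, which mirrors the kind of assertions the test makes.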
test_filter_by_reads
Description
Tests the filtering of cells based on the average number of reads per UMI.
Internal Logic
- Calls resolve_umi_sequence with a higher min_avg_reads_per_umi threshold.
- Verifies that only the expected cells (cell3) remain after filtering.
- Checks that the expected removed cells (cell1, cell2) are not in the result.
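The filtering behaviour described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the library's code: the helper filter_by_avg_reads, the column names, the toy values, and the choice of an inclusive (>=) threshold are all hypothetical.

```python
import pandas as pd


def filter_by_avg_reads(table: pd.DataFrame, min_avg_reads_per_umi: float) -> pd.DataFrame:
    """Keep only cells whose average reads per UMI meets the threshold.

    Hypothetical helper; whether the real comparison is inclusive is an assumption.
    """
    per_cell = table.groupby("cellBC").agg(
        reads=("readCount", "sum"),
        umis=("UMI", "nunique"),
    )
    avg_reads = per_cell["reads"] / per_cell["umis"]
    keep = avg_reads[avg_reads >= min_avg_reads_per_umi].index
    return table[table["cellBC"].isin(keep)].reset_index(drop=True)


# Toy data (invented): average reads per UMI are 15 for cell1,
# 3 for cell2, and 30 for cell3.
toy = pd.DataFrame(
    {
        "cellBC": ["cell1", "cell1", "cell2", "cell3", "cell3"],
        "UMI": ["UMIA", "UMIB", "UMIA", "UMIA", "UMIB"],
        "readCount": [20, 10, 3, 30, 30],
    }
)

filtered = filter_by_avg_reads(toy, min_avg_reads_per_umi=20)
print(filtered)
```

Raising the threshold to 20 leaves only cell3 in this toy table, while cell1 and cell2 are removed.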
tearDown
Description
Cleans up the temporary directory after each test.
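The setUp and tearDown pairing can be sketched with the standard library alone. The class name, attribute names, table contents, and the example test below are hypothetical; only the use of tempfile, shutil, os, pandas, and unittest comes from the dependencies listed in this document.

```python
import os
import shutil
import tempfile
import unittest

import pandas as pd


class TempDirFixtureExample(unittest.TestCase):
    """Hypothetical test case showing the fixture lifecycle only."""

    def setUp(self):
        # Runs before every test: fresh directory and fresh sample table.
        self.temp_dir = tempfile.mkdtemp()
        self.table = pd.DataFrame(
            {"cellBC": ["cell1"], "UMI": ["UMIA"], "readCount": [20]}
        )

    def tearDown(self):
        # Runs after every test, even a failing one: remove all outputs.
        shutil.rmtree(self.temp_dir)

    def test_output_lands_in_temp_dir(self):
        out_path = os.path.join(self.temp_dir, "table.csv")
        self.table.to_csv(out_path, index=False)
        self.assertTrue(os.path.exists(out_path))


# Run the example programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TempDirFixtureExample)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because setUp and tearDown wrap each test method individually, no test can see files left behind by another.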
Dependencies
- unittest: Python’s built-in unit testing framework
- os: For file path operations
- shutil: For directory removal
- tempfile: For creating temporary directories
- pandas: For data manipulation
- cassiopeia.preprocess.pipeline: The module being tested
Configuration
The tests use a predefined collapsed UMI table and various parameters for the resolve_umi_sequence function.
Error Handling
The tests use assertions to verify the expected outcomes. If any assertion fails, it raises an AssertionError, which unittest reports as a test failure.
Notes
- The tests use a small, predefined dataset to verify the functionality of the UMI resolution process.
- The plot parameter is set to False in test_resolve_umi and True in test_filter_by_reads, demonstrating testing with and without plot generation.
- The temporary directory is used to store any output files generated during the tests and is cleaned up after each test.
This test file ensures that the UMI resolution functionality in the Cassiopeia preprocessing pipeline works as expected, particularly in handling different cell and UMI filtering scenarios.