align_sequence_test.py file:
High-level description
This file contains unit tests for the sequence alignment functionality in the Cassiopeia preprocessing pipeline. It tests several aspects of the align_sequences function: the structure of the output, the behavior under different alignment parameters, and the correctness of alignments for specific input sequences.
Code Structure
The main class, TestAlignSequence, inherits from unittest.TestCase and contains several test methods. Each method tests a different aspect of the sequence alignment functionality.
Symbols
TestAlignSequence
Description
This class contains all the unit tests for the sequence alignment functionality.
Internal Logic
- Sets up test data in the setUp method.
- Defines several test methods, each focusing on a specific aspect of the alignment function.
setUp
Description
Initializes the test data used across all test methods.
Internal Logic
- Creates a DataFrame, self.queries, with sample sequence data.
- Sets a reference sequence, self.reference.
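A fixture of this kind might look like the sketch below. The column names, sequences, and class name are hypothetical placeholders chosen for illustration; the actual test file defines its own data.

```python
import unittest

import pandas as pd


class TestAlignSequenceSketch(unittest.TestCase):
    """Illustrative fixture only; not the real TestAlignSequence class."""

    def setUp(self):
        # Hypothetical query table: one row per read, with a cell barcode,
        # a UMI, a read count, and the sequence to align.
        self.queries = pd.DataFrame(
            {
                "cellBC": ["A", "A", "B"],
                "UMI": ["1", "2", "1"],
                "readCount": [20, 30, 10],
                "seq": ["AACCTTGG", "AACCTTG", "AACCGGTT"],
            }
        )
        # Hypothetical reference sequence shared by every test method.
        self.reference = "AACCTTGG"

    def test_fixture_shape(self):
        # unittest calls setUp before each test, so the data is always fresh.
        self.assertEqual(self.queries.shape[0], 3)
        self.assertIn("seq", self.queries.columns)
```

Because unittest runs setUp before every test method, each test receives its own copy of the data and cannot be affected by mutations made in another test.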
test_alignment_dataframe_structure
Description
Tests the structure of the output DataFrame from the align_sequences function.
Internal Logic
- Calls align_sequences with test data.
- Checks that the output DataFrame has the correct number of rows and the expected columns.
- Verifies that all cell barcodes from the input are present in the output.
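The three structural checks above can be sketched as follows. To keep the example self-contained it does not call Cassiopeia: fake_align is a stand-in that returns one row per query, and the column names in EXPECTED_COLUMNS are assumptions, not the columns the real function is documented to return.

```python
import pandas as pd

# Hypothetical output columns; the real test file defines its own expected list.
EXPECTED_COLUMNS = ["cellBC", "UMI", "readCount", "CIGAR", "AlignmentScore", "seq"]


def fake_align(queries: pd.DataFrame) -> pd.DataFrame:
    """Stand-in aligner: one output row per query with an all-match CIGAR."""
    out = queries.copy()
    out["CIGAR"] = out["seq"].str.len().astype(str) + "M"
    out["AlignmentScore"] = out["seq"].str.len()
    return out[EXPECTED_COLUMNS]


def check_structure(queries, aligned, expected_columns):
    """Return True if row count, columns, and cell barcodes all line up."""
    same_rows = aligned.shape[0] == queries.shape[0]
    has_columns = all(col in aligned.columns for col in expected_columns)
    # Every input barcode must appear somewhere in the output.
    barcodes_kept = set(queries["cellBC"]).issubset(set(aligned["cellBC"]))
    return same_rows and has_columns and barcodes_kept
```

In a real test, each of the three conditions would typically be its own assertion so that a failure message points at the specific property that broke.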
test_extremely_large_gap_open_penalty
Description
Tests the alignment behavior when using an extremely large gap open penalty.
Internal Logic
- Calls align_sequences with a very high gap open penalty (255).
- Checks that no gaps (insertions or deletions) are present in the resulting alignments.
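One way to express the no-gaps check is to parse each CIGAR string and look for insertion (I) or deletion (D) operations. The helpers below are a minimal sketch with hypothetical names (parse_cigar, has_gaps); the actual test may inspect the strings differently.

```python
import re

# One CIGAR operation: a run length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '10M2D5M' into (length, op) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings containing characters the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def has_gaps(cigar):
    """Return True if the alignment contains an insertion or a deletion."""
    return any(op in ("I", "D") for _, op in parse_cigar(cigar))
```

With a prohibitively large gap open penalty, a test in this style would assert that has_gaps returns False for every CIGAR string in the output.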
test_default_alignment_works
Description
Tests the correctness of alignments using default parameters.
Internal Logic
- Calls align_sequences with default parameters.
- Compares the resulting CIGAR strings and alignment scores with expected values for each input sequence.
test_global_alignment
Description
Tests the global alignment mode of the align_sequences function.
Internal Logic
- Calls align_sequences with the method parameter set to "global".
- Compares the resulting CIGAR strings and alignment scores with expected values for global alignment.
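Why global alignment needs its own expected values can be seen in a small, self-contained sketch. It does not use Cassiopeia's aligner: it computes Needleman-Wunsch (global) and Smith-Waterman (local) scores with a simplified linear gap penalty rather than a gap open penalty, and the scoring values are arbitrary assumptions.

```python
def global_score(query, ref, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(ref) + 1)]
    for i, q in enumerate(query, start=1):
        curr = [i * gap]  # aligning a query prefix against nothing costs gaps
        for j, r in enumerate(ref, start=1):
            diag = prev[j - 1] + (match if q == r else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(query, ref, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(ref) + 1)
    for q in query:
        curr = [0]
        for j, r in enumerate(ref, start=1):
            diag = prev[j - 1] + (match if q == r else mismatch)
            # Clamping at zero lets an alignment restart anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

For the query ACGT against the reference TTACGTTT, the local score under these parameters is 4 (the exact substring match), while the global score is -4 because the four unmatched reference bases must be paid for as gaps. The same inputs therefore yield different scores and CIGAR strings in the two modes.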
Dependencies
The test file depends on the following modules:
- unittest: For creating and running unit tests.
- numpy: For numerical operations.
- pandas: For handling DataFrames.
- cassiopeia: The main package being tested.
