High-level description

This file contains unit tests for allele calling in the Cassiopeia library, targeting the alignment_utilities.py and pipeline.py modules. The tests cover CIGAR string parsing, indel detection, and allele calling, ranging from match-only alignments to long alignments with deletions that span multiple cutsites.

Code Structure

The code is structured as a single test class TestCallAlleles that inherits from unittest.TestCase. It contains multiple test methods, each focusing on different aspects of allele calling and CIGAR string parsing. The setUp method initializes test data used across multiple test cases.




TestCallAlleles

A test class that groups the unit tests for the allele calling functionality in Cassiopeia.

Internal Logic

  1. Sets up test data in the setUp method.
  2. Implements various test methods to cover different scenarios of CIGAR string parsing and allele calling.
  3. Uses assertions to verify the correctness of the results.
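
The three steps above follow the usual unittest pattern. The sketch below illustrates that pattern only; `count_matches`, `ExampleCase`, and `run_example` are hypothetical names and are not part of Cassiopeia or of the test file described here.

```python
import io
import unittest


def count_matches(cigar_ops):
    """Hypothetical helper: total length of all 'M' operations."""
    return sum(length for length, op in cigar_ops if op == "M")


class ExampleCase(unittest.TestCase):
    def setUp(self):
        # Shared fixture, rebuilt before every test method.
        self.ops = [(10, "M"), (2, "D"), (5, "M")]

    def test_count_matches(self):
        # An assertion compares the result with the expected value.
        self.assertEqual(count_matches(self.ops), 15)


def run_example():
    """Run ExampleCase programmatically and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleCase)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()
```

Because setUp runs before each test method, every test starts from a fresh copy of the fixture.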



setUp

Initializes test data used across multiple test cases.

Internal Logic

  1. Sets up basic and long reference sequences.
  2. Defines barcode intervals and cutsite locations.
  3. Creates test alignment dataframes.



Tests parsing of a basic CIGAR string with only matches.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a simple match case.
  2. Asserts the correctness of the returned intBC and indels.
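
For readers unfamiliar with CIGAR strings, here is a minimal sketch of how one can be split into operations. It is an independent illustration built on the standard library; `tokenize_cigar` is a hypothetical name and does not reproduce the internals of alignment_utilities.parse_cigar.

```python
import re

# One CIGAR token is a positive length followed by a single operation code.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def tokenize_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    tokens = [(int(n), op) for n, op in CIGAR_TOKEN.findall(cigar)]
    # Guard against characters that the pattern silently skipped.
    if "".join(f"{n}{op}" for n, op in tokens) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return tokens
```

A match-only alignment such as "100M" yields a single token, which is why no indels are expected in this case.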



Tests parsing of a CIGAR string with a deletion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a deletion case.
  2. Asserts the correctness of the returned intBC and indels.
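
To make the deletion case concrete, the sketch below locates deletions in reference coordinates by walking the CIGAR operations. It is an assumption-laden illustration: `find_deletions` is a hypothetical helper, and the (position, length) return shape is chosen for clarity rather than taken from Cassiopeia.

```python
import re


def find_deletions(cigar, ref_start=0):
    """Return (ref_position, length) for each 'D' operation in a CIGAR string."""
    deletions = []
    ref_pos = ref_start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "D":
            deletions.append((ref_pos, length))
        # M, D, N, = and X consume reference bases; I, S, H and P do not.
        if op in "MDN=X":
            ref_pos += length
    return deletions
```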



Tests parsing of a CIGAR string with a deletion, including context information.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a deletion case and context enabled.
  2. Asserts the correctness of the returned intBC and indels, including context.
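
Context here means the reference bases flanking an indel. The sketch below shows one way such flanks could be attached to a deletion; the function name `deletion_with_context` and the bracketed output format are illustrative assumptions and may differ from what parse_cigar actually returns.

```python
def deletion_with_context(ref, pos, length, context_size=5):
    """Describe a deletion together with its flanking reference bases.

    The 'LEFT[pos:lengthD]RIGHT' format is hypothetical, used only to
    show where the context comes from.
    """
    # Up to context_size bases immediately before the deletion.
    left = ref[max(0, pos - context_size):pos]
    # Up to context_size bases immediately after the deleted span.
    right = ref[pos + length:pos + length + context_size]
    return f"{left}[{pos}:{length}D]{right}"
```

Near the ends of the reference the flanks are simply shorter, since slicing truncates rather than raising an error.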



Tests parsing of a CIGAR string with an insertion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with an insertion case.
  2. Asserts the correctness of the returned intBC and indels.
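
Insertions differ from deletions in that the inserted bases come from the read, so the query position must be tracked alongside the reference position. A minimal sketch, assuming a hypothetical helper `find_insertions` that is not part of Cassiopeia:

```python
import re


def find_insertions(cigar, seq, ref_start=0):
    """Return (ref_position, inserted_bases) for each 'I' operation."""
    insertions = []
    ref_pos = ref_start
    query_pos = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "I":
            insertions.append((ref_pos, seq[query_pos:query_pos + length]))
        # M, I, S, = and X consume query bases.
        if op in "MIS=X":
            query_pos += length
        # M, D, N, = and X consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return insertions
```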



Tests parsing of a CIGAR string with an insertion, including context information.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with an insertion case and context enabled.
  2. Asserts the correctness of the returned intBC and indels, including context.



Tests parsing of a long CIGAR string without context.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a long CIGAR string.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.



Tests parsing of a long CIGAR string with context.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a long CIGAR string and context enabled.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites, including context.



Tests parsing of a CIGAR string with an intersite deletion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a CIGAR string containing an intersite deletion.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.
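
An intersite deletion is one that covers more than one cutsite, so the same indel is reported at several sites. The sketch below shows one possible overlap rule; both the rule and the name `cutsites_hit` are assumptions made for illustration, not Cassiopeia's documented behaviour.

```python
def cutsites_hit(del_start, del_length, cutsites, window=0):
    """Return indices of cutsites whose window overlaps a deletion.

    The deletion covers reference positions [del_start, del_start + del_length).
    Each cutsite c is treated as the closed interval [c - window, c + window].
    """
    del_end = del_start + del_length  # exclusive end
    return [
        i
        for i, c in enumerate(cutsites)
        if del_start <= c + window and c - window < del_end
    ]
```

With cutsites at 20, 50, and 80, a 40-base deletion starting at 15 covers the first two sites, which is the situation an intersite test needs to exercise.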



Tests parsing of a complex CIGAR string with multiple intersite deletions.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a complex CIGAR string containing multiple intersite deletions.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.



Tests the call_alleles function from the Cassiopeia preprocessing module.

Internal Logic

  1. Calls cassiopeia.pp.call_alleles with test alignment data.
  2. Verifies the structure of the returned molecule table.
  3. Asserts the correctness of the called alleles and intBCs for each read.



Tests that a warning is raised when there is missing data in the allele calling process.

Internal Logic

  1. Calls cassiopeia.pp.call_alleles with alignment data containing missing information.
  2. Asserts that a PreprocessWarning is raised.
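
Warning checks of this kind are typically written with unittest's assertWarns context manager. The sketch below is self-contained and uses stand-ins: `StandInPreprocessWarning` and `call_alleles_stub` are hypothetical and only mimic the idea of warning on missing data.

```python
import io
import unittest
import warnings


class StandInPreprocessWarning(UserWarning):
    """Hypothetical stand-in for the library's PreprocessWarning."""


def call_alleles_stub(rows):
    """Hypothetical stub: warn about rows without a CIGAR string, then drop them."""
    if any(row.get("CIGAR") is None for row in rows):
        warnings.warn("rows with missing data were skipped", StandInPreprocessWarning)
    return [row for row in rows if row.get("CIGAR") is not None]


class MissingDataCase(unittest.TestCase):
    def test_warns_on_missing_data(self):
        rows = [{"CIGAR": "10M"}, {"CIGAR": None}]
        # The block passes only if the warning is emitted inside it.
        with self.assertWarns(StandInPreprocessWarning):
            kept = call_alleles_stub(rows)
        self.assertEqual(kept, [{"CIGAR": "10M"}])


def run_missing_data_case():
    """Run MissingDataCase programmatically and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MissingDataCase)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()
```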


Dependencies

unittest: Provides the testing framework.
numpy: Used for numerical operations.
pandas: Used for handling dataframes.
cassiopeia: The main library being tested.

Error Handling

The tests rely on unittest assertions to verify results. One test also checks that a PreprocessWarning is raised when the alignment data passed to call_alleles contains missing information.

This test suite covers CIGAR string parsing, indel detection, and allele calling in Cassiopeia. The cases include match-only alignments, insertions, deletions, long alignments over multiple cutsites, and intersite deletions, with and without context.