High-level description

This file contains unit tests for the allele calling functionality in the Cassiopeia library. It exercises the alignment_utilities.py and pipeline.py modules. The tests cover CIGAR string parsing, indel detection, and allele calling, in both basic and complex cases.

Code Structure

The code is structured as a single test class TestCallAlleles that inherits from unittest.TestCase. It contains multiple test methods, each focusing on different aspects of allele calling and CIGAR string parsing. The setUp method initializes test data used across multiple test cases.

Symbols

TestCallAlleles

Description

A test class that contains multiple unit tests for the allele calling functionality in Cassiopeia.

Internal Logic

  1. Sets up test data in the setUp method.
  2. Implements various test methods to cover different scenarios of CIGAR string parsing and allele calling.
  3. Uses assertions to verify the correctness of the results.
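
The three steps above follow the standard unittest pattern. The sketch below illustrates that pattern only: the class name, the reference sequence, and the cut-site position are invented for illustration, and no Cassiopeia function is called.

```python
import unittest


class TestToyAlleleCalling(unittest.TestCase):
    """Hypothetical stand-in for TestCallAlleles; makes no Cassiopeia calls."""

    def setUp(self):
        # Runs before every test method, so each test starts from fresh data.
        self.reference = "ACGTACGTAC"  # made-up reference sequence
        self.cutsites = [5]            # made-up cut-site position

    def test_reference_covers_cutsites(self):
        # An assertion states the expected result; failures are reported per test.
        for site in self.cutsites:
            self.assertLess(site, len(self.reference))


def run_suite():
    """Load and run the toy test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestToyAlleleCalling)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() returns a result object whose wasSuccessful() method reports whether every assertion held.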

setUp

Description

Initializes test data used across multiple test cases.

Internal Logic

  1. Sets up basic and long reference sequences.
  2. Defines barcode intervals and cutsite locations.
  3. Creates test alignment dataframes.

test_basic_cigar_string_match

Description

Tests parsing of a basic CIGAR string with only matches.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a simple match case.
  2. Asserts the correctness of the returned intBC and indels.
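
To make the match-only case concrete, here is a minimal CIGAR tokenizer. It is an illustrative sketch, not the implementation of alignment_utilities.parse_cigar; the names tokenize_cigar and has_indel are assumptions introduced for this example.

```python
import re

# One CIGAR token is a run length followed by a single operation code.
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def tokenize_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    tokens = [(int(n), op) for n, op in CIGAR_TOKEN.findall(cigar)]
    # Rebuild the string so that malformed input is rejected, not skipped.
    if "".join(f"{n}{op}" for n, op in tokens) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return tokens


def has_indel(cigar):
    """Return True if the CIGAR string contains an insertion or a deletion."""
    return any(op in "ID" for _, op in tokenize_cigar(cigar))
```

A string such as "10M" tokenizes to a single match run, so has_indel reports no indels for it.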

test_basic_cigar_string_deletion

Description

Tests parsing of a CIGAR string with a deletion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a deletion case.
  2. Asserts the correctness of the returned intBC and indels.
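
A deletion can be located by walking the CIGAR string and tracking the reference position. The sketch below assumes 0-based reference coordinates and a hypothetical helper name, find_deletions; the coordinate conventions of the real parse_cigar may differ.

```python
import re


def find_deletions(cigar, ref_start=0):
    """Return (start, length) for each deletion, in reference coordinates."""
    deletions = []
    ref_pos = ref_start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "D":
            deletions.append((ref_pos, length))
        # Only these operations consume reference bases (SAM specification).
        if op in "MDN=X":
            ref_pos += length
    return deletions
```

For example, find_deletions("5M2D5M") reports one deletion of length 2 starting at reference position 5.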

test_basic_cigar_string_deletion_with_context

Description

Tests parsing of a CIGAR string with a deletion, including context information.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a deletion case and context enabled.
  2. Asserts the correctness of the returned intBC and indels, including context.

test_basic_cigar_string_insertion

Description

Tests parsing of a CIGAR string with an insertion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with an insertion case.
  2. Asserts the correctness of the returned intBC and indels.
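
Unlike a deletion, an insertion adds bases that exist only in the query, so recovering them requires tracking the query position. This is a hedged sketch with a hypothetical helper, inserted_bases; it does not reproduce how parse_cigar reports insertions.

```python
import re


def inserted_bases(cigar, query):
    """Return the query bases introduced by each insertion, in order."""
    inserted = []
    query_pos = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "I":
            inserted.append(query[query_pos:query_pos + length])
        # Only these operations consume query bases (SAM specification).
        if op in "MIS=X":
            query_pos += length
    return inserted
```

For the CIGAR string "5M3I5M" and the query "AAAAAGGGCCCCC", the function returns the three inserted bases "GGG".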

test_basic_cigar_string_insertion_with_context

Description

Tests parsing of a CIGAR string with an insertion, including context information.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with an insertion case and context enabled.
  2. Asserts the correctness of the returned intBC and indels, including context.

test_long_cigar_parsing_no_context

Description

Tests parsing of a long CIGAR string without context.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a long CIGAR string.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.
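
With several cutsites, each indel has to be attributed to the cutsite it affects. The sketch below uses a simple overlap rule with a symmetric window around each cutsite; the rule, the window size, and the function name are assumptions for illustration, and Cassiopeia's actual assignment logic may differ.

```python
def assign_deletions_to_cutsites(deletions, cutsites, window=2):
    """Return, per cutsite, the deletions overlapping its window, or None.

    deletions: list of (start, length) in 0-based reference coordinates.
    cutsites: list of 0-based cut positions.
    """
    calls = []
    for site in cutsites:
        lo, hi = site - window, site + window
        # A deletion covers [start, start + length); test interval overlap.
        hits = [(s, n) for s, n in deletions if s <= hi and s + n > lo]
        calls.append(hits if hits else None)
    return calls
```

A cutsite with no overlapping deletion is reported as None, which stands in here for an unedited site.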

test_long_cigar_parsing_with_context

Description

Tests parsing of a long CIGAR string with context.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a long CIGAR string and context enabled.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites, including context.

test_intersite_deletion_parsing

Description

Tests parsing of a CIGAR string with an intersite deletion.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a CIGAR string containing an intersite deletion.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.
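
An intersite deletion is one that spans more than one cutsite. A minimal way to detect this, assuming 0-based coordinates and a hypothetical helper name, is to list the cutsites that fall inside the deleted interval:

```python
def cutsites_spanned(deletion, cutsites):
    """Return the cutsites that lie inside a (start, length) deletion."""
    start, length = deletion
    return [site for site in cutsites if start <= site < start + length]


def is_intersite(deletion, cutsites):
    """Return True if the deletion removes two or more cutsites."""
    return len(cutsites_spanned(deletion, cutsites)) >= 2
```

For example, a 40-base deletion starting at position 10 spans the cutsites at 11 and 40 but not the one at 70.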

test_complex_cigar_parsing_intersite_deletion

Description

Tests parsing of a complex CIGAR string with multiple intersite deletions.

Internal Logic

  1. Calls alignment_utilities.parse_cigar with a complex CIGAR string containing multiple intersite deletions.
  2. Asserts the correctness of the returned intBC and indels for multiple cutsites.

test_call_alleles_function

Description

Tests the call_alleles function from the Cassiopeia preprocessing module.

Internal Logic

  1. Calls cassiopeia.pp.call_alleles with test alignment data.
  2. Verifies the structure of the returned molecule table.
  3. Asserts the correctness of the called alleles and intBCs for each read.

test_missing_data_in_allele_throws_warning

Description

Tests that a warning is raised when there is missing data in the allele calling process.

Internal Logic

  1. Calls cassiopeia.pp.call_alleles with alignment data containing missing information.
  2. Asserts that a PreprocessWarning is raised.
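
Asserting that a warning is raised is typically done with unittest's assertWarns context manager. The sketch below is self-contained: ToyPreprocessWarning and call_toy_alleles are hypothetical stand-ins for PreprocessWarning and cassiopeia.pp.call_alleles, whose real signatures are not reproduced here.

```python
import unittest
import warnings


class ToyPreprocessWarning(UserWarning):
    """Hypothetical stand-in for the PreprocessWarning named above."""


def call_toy_alleles(reads):
    """Hypothetical function that warns when any read lacks an allele."""
    if any(read.get("allele") is None for read in reads):
        warnings.warn("some reads have missing alleles", ToyPreprocessWarning)
    return [read.get("allele") for read in reads]


class TestToyWarning(unittest.TestCase):
    def test_missing_allele_warns(self):
        reads = [{"allele": "5:2D"}, {"allele": None}]
        # The test fails if no matching warning is raised inside this block.
        with self.assertWarns(ToyPreprocessWarning):
            call_toy_alleles(reads)
```

The same check can be written with warnings.catch_warnings(record=True) when the test needs to inspect the warning message as well.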

Dependencies

| Dependency | Purpose |
|---|---|
| unittest | Provides the testing framework |
| numpy | Used for numerical operations |
| pandas | Used for handling dataframes |
| cassiopeia | The main library being tested |

Error Handling

The tests use assertions to verify the results. One test also checks that a PreprocessWarning is raised when the alignment data contains missing information.

Together, the tests cover CIGAR string parsing, indel detection, and allele calling in Cassiopeia, from match-only alignments to CIGAR strings with multiple intersite deletions.