High-level description
This directory contains unit tests for branch length estimators in the Cassiopeia library, specifically focusing on two classes: IIDExponentialBayesian and IIDExponentialMLE. These tests are designed to validate the functionality and accuracy of these estimators under various scenarios, including edge cases, hand-solvable problems, and simulated data.
What does it do?
The tests in this directory serve several purposes:
- Validate the correctness of the IIDExponentialBayesian estimator by comparing its output to closed-form solutions and numerical approximations for log likelihood, log joints, and posterior time distributions.
- Verify the behavior of the IIDExponentialMLE estimator in various scenarios, including:
  - Degenerate cases (no mutations or saturation)
  - Hand-solvable problems with simple tree topologies
  - Regression tests on small trees
  - Performance on simulated data
  - Handling of subtree collapses when no mutations are present
  - Enforcement of minimum branch length constraints
  - Incorporation of site-specific mutation rates
- Ensure proper error handling for invalid inputs or edge cases in both estimators.
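To make "hand-solvable" and "degenerate" concrete, here is a minimal sketch of the kind of closed-form answer such tests can check against. It assumes a single branch on which each character mutates independently with probability 1 - exp(-x), where x is the mutation rate times the branch length. The function name single_branch_mle is hypothetical and is not part of Cassiopeia, and returning infinity under saturation is a choice of this sketch, not a statement about what the library returns.

```python
import math


def single_branch_mle(num_mutated: int, num_characters: int) -> float:
    """Closed-form MLE of x = (mutation rate * branch length) for one branch.

    Illustrative assumption: the likelihood is
        (1 - exp(-x)) ** m * exp(-x) ** (k - m)
    with m mutated characters out of k. Setting the derivative of the
    log-likelihood to zero gives x = -log(1 - m / k).
    """
    if num_characters <= 0 or not 0 <= num_mutated <= num_characters:
        raise ValueError("need 0 <= num_mutated <= num_characters and num_characters > 0")
    if num_mutated == 0:
        # Degenerate case: no mutations, so the likelihood is maximized at x = 0.
        return 0.0
    if num_mutated == num_characters:
        # Degenerate case: saturation, the likelihood keeps increasing with x.
        return math.inf
    return -math.log(1 - num_mutated / num_characters)


# One mutated and one unmutated character: the maximizer is log(2).
print(single_branch_mle(1, 2))
```

A test built on this idea would compare an estimator's output on the same one-branch tree against the closed-form value, within a numerical tolerance.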
Key Files
- iid_exponential_bayesian_test.py: This file contains tests for the IIDExponentialBayesian class. It includes:
  - Comparisons against closed-form solutions for small and medium-sized trees
  - Validation of error handling for invalid inputs
  - Numerical calculations of log likelihood, log joints, and posterior time distributions
- iid_exponential_mle_test.py: This file contains tests for the IIDExponentialMLE class. It includes:
  - Tests for degenerate cases (no mutations and saturation)
  - Verification of results for hand-solvable problems
  - Regression tests on small trees
  - Tests using simulated data
  - Validation of subtree collapse behavior
  - Tests for minimum branch length enforcement
  - Verification of site-specific mutation rate handling
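The last two MLE items (minimum branch length and site-specific rates) can be illustrated on a single branch. The sketch below is a simplified stand-in, not the library's method: it assumes character i mutates with probability 1 - exp(-rho_i * x), applies the lower bound directly to x, and maximizes the concave log-likelihood with a ternary search rather than a convex solver. The names log_likelihood and constrained_mle are hypothetical.

```python
import math
from typing import Sequence


def log_likelihood(x: float, mutated: Sequence[bool], relative_rates: Sequence[float]) -> float:
    """Single-branch log-likelihood with per-site relative rates (illustrative model)."""
    total = 0.0
    for is_mutated, rho in zip(mutated, relative_rates):
        if is_mutated:
            # log(1 - exp(-rho * x)), using expm1 for accuracy near zero.
            total += math.log(-math.expm1(-rho * x))
        else:
            # log(exp(-rho * x))
            total += -rho * x
    return total


def constrained_mle(
    mutated: Sequence[bool],
    relative_rates: Sequence[float],
    minimum: float = 1e-9,
    upper: float = 50.0,
    iterations: int = 200,
) -> float:
    """Maximize the log-likelihood over [minimum, upper] by ternary search.

    The log-likelihood is concave in x, so shrinking the interval toward the
    better of two interior points converges to the constrained maximizer.
    """
    lo, hi = minimum, upper
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, mutated, relative_rates) < log_likelihood(m2, mutated, relative_rates):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Equal rates: matches the closed form log(2) for one mutated, one unmutated site.
print(constrained_mle([True, False], [1.0, 1.0]))
# A lower bound above the unconstrained optimum pins the estimate to the bound.
print(constrained_mle([True, False], [1.0, 1.0], minimum=1.0))
```

With rates 2.0 (mutated site) and 0.5 (unmutated site), setting the derivative to zero gives x = log(5) / 2, which is the kind of hand-derived value a site-specific rate test can assert against.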
Dependencies
The tests rely on the following libraries and modules:
- unittest: The standard Python unit testing framework.
- parameterized: Used for creating parameterized tests, allowing the same test to be run with different inputs.
- numpy: Used for numerical operations and array manipulations.
- scipy: Specifically, scipy.integrate is used for numerical integration in the Bayesian tests.
- networkx: Used for working with graph structures, particularly in creating and manipulating tree topologies.
- cvxpy: A Python-embedded modeling language for convex optimization problems, used in the MLE estimator.
- cassiopeia.data.CassiopeiaTree: Represents the phylogenetic tree structure.
- cassiopeia.simulator.Cas9LineageTracingDataSimulator: Used for generating simulated data.
- cassiopeia.tools.IIDExponentialBayesian and cassiopeia.tools.IIDExponentialMLE: The classes being tested.
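The general pattern of checking a numerical integral against a closed form across several inputs can be sketched with the standard library alone. This is an illustration of the testing style, not code from the test files: unittest's subTest stands in for the parameterized package, a hand-written midpoint rule stands in for scipy.integrate, and the class and function names are hypothetical. The quantity checked is the probability that a character with exponential mutation rate r has mutated by time t, which is 1 - exp(-r * t).

```python
import math
import unittest
from typing import Callable


def midpoint_integral(f: Callable[[float], float], a: float, b: float, steps: int) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


class ClosedFormComparisonTest(unittest.TestCase):
    def test_numerical_matches_closed_form(self):
        # Each (rate, time) pair plays the role of one parameterized case.
        cases = [(0.5, 1.0), (1.0, 1.0), (2.0, 0.3)]
        for rate, time in cases:
            with self.subTest(rate=rate, time=time):
                # Integrate the exponential density rate * exp(-rate * s) on [0, time].
                numerical = midpoint_integral(
                    lambda s: rate * math.exp(-rate * s), 0.0, time, steps=1000
                )
                closed_form = 1.0 - math.exp(-rate * time)
                self.assertAlmostEqual(numerical, closed_form, places=6)
```

Saved in a test module, this can be run with python -m unittest. Increasing steps tightens the agreement, which mirrors the trade-off that a time discretization introduces.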
Configuration
The tests do not rely on external configuration files or environment variables. However, they do use various parameters to configure the estimators and test scenarios:
- For IIDExponentialBayesian:
  - mutation_rate: The mutation rate of the model.
  - birth_rate: The birth rate of the model.
  - sampling_probability: The sampling probability of the model.
  - discretization_level: The number of timesteps used to discretize time.
- For IIDExponentialMLE:
  - solver: The optimization solver to use (ECOS or SCS).
  - minimum_branch_length: A constraint on the minimum allowed branch length.
  - relative_mutation_rates: Site-specific mutation rates.
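As a rough sketch of how these parameters might be grouped in a test, the snippet below collects them as keyword arguments. The numeric values are illustrative placeholders, not values taken from the test suite, and the helper build_estimators is hypothetical. It assumes both constructors accept the parameter names listed above as keywords, and it needs Cassiopeia installed only when the helper is actually called.

```python
# Parameter names follow the list above; the values are placeholders chosen
# for illustration only.
bayesian_kwargs = {
    "mutation_rate": 1.0,
    "birth_rate": 1.0,
    "sampling_probability": 0.8,
    "discretization_level": 100,
}

mle_kwargs = {
    "solver": "ECOS",  # the other solver named above is "SCS"
    "minimum_branch_length": 0.01,
    "relative_mutation_rates": [1.0, 2.0, 0.5],  # one entry per character site
}


def build_estimators():
    """Construct both estimators from the keyword arguments above.

    Assumption: the constructors take these names as keyword arguments.
    The import is deferred so the module loads without Cassiopeia installed.
    """
    from cassiopeia.tools import IIDExponentialBayesian, IIDExponentialMLE

    return IIDExponentialBayesian(**bayesian_kwargs), IIDExponentialMLE(**mle_kwargs)
```

Keeping the arguments in plain dictionaries makes it easy to vary one parameter at a time across test cases.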
