The cassiopeia/critique directory contains modules for comparing and analyzing phylogenetic trees, particularly in the context of lineage tracing experiments. It implements tree similarity metrics, such as the Robinson-Foulds distance and triplets correct accuracy, along with utility functions for tree manipulation and analysis.
The package's public interface is the __init__.py file, which exposes two key functions:
- robinson_foulds: Computes the Robinson-Foulds distance between two phylogenetic trees.
- triplets_correct: Calculates the triplets correct accuracy between two phylogenetic trees.

Both come from the compare.py module, which contains their implementations. The critique_utilities.py file provides supporting functions used by the main comparison algorithms.
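To make the Robinson-Foulds distance concrete, here is a minimal sketch of a rooted variant that counts the clades present in one tree but not in the other. It is not the code in compare.py, which this document describes as a wrapper around ete3. The nested-tuple tree encoding and the names clades and rooted_rf are assumptions made purely for illustration.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding for this sketch only: a leaf is a string,
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of leaves found below every internal node."""
    found: Set[FrozenSet[str]] = set()

    def leaves_below(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


def rooted_rf(tree1: Tree, tree2: Tree) -> Tuple[int, int]:
    """Return (distance, largest possible distance) for two rooted trees."""
    c1, c2 = clades(tree1), clades(tree2)
    # The clade holding every leaf carries no information, so drop it.
    root = frozenset().union(*c1, *c2)
    c1.discard(root)
    c2.discard(root)
    # Distance is the number of clades that appear in exactly one tree.
    return len(c1 ^ c2), len(c1) + len(c2)


tree_a: Tree = ((("a", "b"), "c"), ("d", "e"))
tree_b: Tree = ((("a", "c"), "b"), ("d", "e"))
print(rooted_rf(tree_a, tree_b))  # (2, 6)
```

The two example trees differ only in whether a pairs with b or with c, so exactly two clades are unshared. Returning the maximum alongside the distance makes it easy to normalize the result to the range 0 to 1.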
The typical workflow would involve importing the comparison functions from cassiopeia.critique and passing them the CassiopeiaTree objects to compare.

compare.py: This file contains the main implementations of the tree comparison algorithms:
- triplets_correct: A detailed comparison function that samples triplets at different depths and compares their topology between two trees.
- robinson_foulds: A wrapper around the ete3 library's implementation of the Robinson-Foulds distance calculation.

critique_utilities.py: This file provides utility functions used in tree analysis and comparison:
- nCr: Calculates binomial coefficients.
- annotate_tree_depths: Annotates each node in a tree with its depth and the number of triplets rooted at that node.
- get_outgroup: Infers the outgroup of a given triplet of leaves in a tree.
- sample_triplet_at_depth: Samples a triplet of leaves from a tree with a specified most recent common ancestor depth.

The compare.py file relies on the utility functions in critique_utilities.py to perform its calculations efficiently.
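The triplet-based ideas above can be sketched in plain Python. This is an illustrative approximation, not the Cassiopeia implementation: the nested-tuple tree encoding, every function name, and the convention that the root has depth 0 are assumptions. It uses math.comb for binomial coefficients and scores every triplet exhaustively instead of sampling triplets by depth.

```python
import itertools
import math
from typing import Dict, FrozenSet, Optional, Tuple, Union

# Hypothetical encoding for this sketch only: a leaf is a string,
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(node: Tree) -> FrozenSet[str]:
    """Return the names of all leaves below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def triplets_rooted_at(node: Tree) -> int:
    """Count leaf triplets whose most recent common ancestor is this node."""
    if isinstance(node, str):
        return 0
    sizes = [len(leaves(child)) for child in node]
    # All triplets below the node, minus those lying inside a single child.
    return math.comb(sum(sizes), 3) - sum(math.comb(s, 3) for s in sizes)


def triplets_by_depth(tree: Tree) -> Dict[int, int]:
    """Total the triplet counts of internal nodes at each depth (root = 0)."""
    counts: Dict[int, int] = {}
    stack = [(tree, 0)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, str):
            continue
        counts[depth] = counts.get(depth, 0) + triplets_rooted_at(node)
        stack.extend((child, depth + 1) for child in node)
    return counts


def outgroup(tree: Tree, triplet: Tuple[str, str, str]) -> Optional[str]:
    """Return the leaf that splits off first, or None if unresolved.

    Assumes the triplet holds three distinct leaves present in the tree.
    """
    node = tree
    while True:
        # Group the triplet's leaves by the child subtree that contains them.
        groups = [
            (child, [leaf for leaf in triplet if leaf in leaves(child)])
            for child in node
        ]
        groups = [(child, members) for child, members in groups if members]
        if len(groups) == 1:
            # All three leaves sit under one child, so the split is deeper.
            node = groups[0][0]
            continue
        if len(groups) == 2:
            # One child holds a pair; the leaf on its own is the outgroup.
            return next(m[0] for _, m in groups if len(m) == 1)
        # Each leaf is under a different child: no outgroup.
        return None


def triplets_correct_fraction(tree1: Tree, tree2: Tree) -> float:
    """Fraction of all leaf triplets with the same outgroup in both trees."""
    triplets = list(itertools.combinations(sorted(leaves(tree1)), 3))
    agree = sum(outgroup(tree1, t) == outgroup(tree2, t) for t in triplets)
    return agree / len(triplets)


tree_a: Tree = ((("a", "b"), "c"), ("d", "e"))
tree_b: Tree = ((("a", "c"), "b"), ("d", "e"))
print(triplets_by_depth(tree_a))                  # {0: 9, 1: 1, 2: 0}
print(triplets_correct_fraction(tree_a, tree_b))  # 0.9
```

Of the ten triplets over five leaves, only (a, b, c) is resolved differently in the two example trees, which gives 0.9. Exhaustive enumeration grows cubically with the number of leaves, which is one reason a sampling approach stratified by depth is attractive for large trees.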
The critique module relies on several external libraries and internal Cassiopeia components:
- collections: Used for the defaultdict data structure.
- copy: Used for deep copying trees.
- ete3: Used for Robinson-Foulds distance calculation.
- networkx: Likely used in the CassiopeiaTree implementation.
- numpy: Used for numerical operations.
- typing: Used for type hinting.
- math: Used for factorial calculations in combinatorial functions.
- cassiopeia.data.CassiopeiaTree: The main data structure representing phylogenetic trees.

These dependencies cover data structures (defaultdict), numerical operations (numpy), and specialized tree operations (ete3). The use of type hinting (typing) suggests a focus on code clarity and potential use of static type checking tools.
In conclusion, the cassiopeia/critique directory provides tools for comparing and analyzing phylogenetic trees. It combines standard Python libraries with specialized scientific computing packages to deliver tree comparison functionality, which is needed to validate and interpret results in lineage tracing experiments and other phylogenetic studies.