The cassiopeia/tools/fitness_estimator directory contains implementations of fitness estimation algorithms for phylogenetic trees, designed for use with the Cassiopeia library. The main components are an abstract base class, FitnessEstimator, and a concrete implementation, LBIJungle, that uses the Local Branching Index (LBI) method.
This module provides tools for estimating the fitness of nodes in phylogenetic trees. The main functionality includes:
An abstract interface for fitness estimators, defined by the FitnessEstimator class.
An LBI implementation built on the jungle package, which calculates fitness based on the branching patterns of the tree.
The fitness estimation process helps in understanding the evolutionary dynamics of the sequences represented in the phylogenetic tree, with higher fitness values indicating potentially more successful or rapidly evolving lineages.
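To make "fitness based on branching patterns" concrete, here is a minimal sketch of an LBI-style score in the sense of Neher et al.: the total branch length around a node, discounted exponentially with distance at a scale tau. It is an illustration under stated assumptions, not the jungle package's code; the function name, the dictionary-based tree encoding, and the choice of tau are all hypothetical.

```python
import math
from typing import Dict, List


def local_branching_index(
    children: Dict[str, List[str]],
    branch_length: Dict[str, float],
    root: str,
    tau: float,
) -> Dict[str, float]:
    """Sketch of an LBI-style score via two message-passing sweeps.

    children maps a node to its child nodes; branch_length maps every
    non-root node to the length of the branch leading to its parent.
    """
    # Pre-order traversal: every parent appears before its children.
    order: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    # Upward sweep: discounted tree length in the subtree below each node,
    # measured from the parent end of the node's own branch.
    up: Dict[str, float] = {}
    for node in reversed(order):
        below = sum(up[c] for c in children.get(node, []))
        decay = math.exp(-branch_length.get(node, 0.0) / tau)
        up[node] = tau * (1.0 - decay) + decay * below

    # Downward sweep: discounted tree length outside each node's subtree.
    down: Dict[str, float] = {root: 0.0}
    for node in order:
        kids = children.get(node, [])
        total_up = sum(up[c] for c in kids)
        for child in kids:
            decay = math.exp(-branch_length[child] / tau)
            outside = down[node] + total_up - up[child]
            down[child] = tau * (1.0 - decay) + decay * outside

    # The score of a node combines everything above and below it.
    return {
        n: down[n] + sum(up[c] for c in children.get(n, []))
        for n in order
    }


# Toy tree: clade "a" branches further, clade "b" is a single leaf.
toy_children = {"root": ["a", "b"], "a": ["a1", "a2", "a3"]}
toy_lengths = {"a": 1.0, "b": 1.0, "a1": 0.2, "a2": 0.2, "a3": 0.2}
lbi = local_branching_index(toy_children, toy_lengths, "root", tau=1.0)
print(lbi["a"] > lbi["b"])  # True: the bushier clade scores higher
```

In this sketch a node surrounded by many short branches accumulates more discounted tree length, which is why rapidly branching clades receive higher scores.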
The main entry points for using this module are:
FitnessEstimator: An abstract base class that defines the interface for all fitness estimation algorithms in Cassiopeia.
LBIJungle: A concrete implementation of the FitnessEstimator class that uses the Local Branching Index method for fitness estimation.
Developers can use these classes to estimate fitness for CassiopeiaTree objects, which represent phylogenetic trees in the Cassiopeia library.
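The relationship between an abstract base class and a concrete estimator can be sketched as follows. This is an illustrative pattern only, not Cassiopeia's source: ToyTree is a hypothetical stand-in for CassiopeiaTree, LeafCountEstimator is an invented toy estimator, and the class bodies are assumptions. Only the names FitnessEstimator, FitnessEstimatorError, and estimate_fitness come from the description here.

```python
import abc
from typing import Dict, List


class FitnessEstimatorError(Exception):
    """Raised when a tree cannot be processed."""


class FitnessEstimator(abc.ABC):
    """Interface sketch: subclasses annotate a tree with fitness values."""

    @abc.abstractmethod
    def estimate_fitness(self, tree: "ToyTree") -> None:
        """Store a 'fitness' attribute on every node of ``tree``."""


class ToyTree:
    """Hypothetical stand-in for CassiopeiaTree: child lists plus attributes."""

    def __init__(self, children: Dict[str, List[str]], root: str) -> None:
        self.children = children
        self.root = root
        self.attributes: Dict[str, Dict[str, float]] = {n: {} for n in children}


class LeafCountEstimator(FitnessEstimator):
    """Toy estimator (not LBI): fitness is the number of leaves below a node."""

    def estimate_fitness(self, tree: ToyTree) -> None:
        if tree.root not in tree.children:
            raise FitnessEstimatorError("root is not a node of the tree")
        self._count(tree, tree.root)

    def _count(self, tree: ToyTree, node: str) -> int:
        kids = tree.children[node]
        # A leaf counts as one; an internal node sums over its children.
        leaves = 1 if not kids else sum(self._count(tree, k) for k in kids)
        tree.attributes[node]["fitness"] = float(leaves)
        return leaves


tree = ToyTree(
    {"root": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []},
    root="root",
)
LeafCountEstimator().estimate_fitness(tree)
print(tree.attributes["root"]["fitness"])  # 3.0
```

Because estimate_fitness is declared abstract, the base class cannot be instantiated directly, and any new estimator has to provide the same method before it can be used in place of another.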
_FitnessEstimator.py: Defines the abstract FitnessEstimator class and the FitnessEstimatorError exception.
_lbi_jungle.py: Implements the LBIJungle class, which uses the jungle package to estimate fitness using the LBI method.
__init__.py: Serves as the top-level entry point for the module, exposing the main components.
The module relies on several external libraries:
jungle: A wrapper around Neher et al.’s original code for LBI calculations.
networkx: Used for representing and manipulating tree topologies.
numpy: Used for random number generation and array manipulation.
ete3: For phylogenetic tree manipulation and visualization (used in the _jungle subdirectory).
Bio.Phylo: For interfacing with Biopython’s phylogenetic tree representation (used in the _jungle subdirectory).
scipy: For various scientific computing tasks and statistical functions (used in the _jungle subdirectory).
pandas: For data manipulation and analysis (used in the _jungle subdirectory).
matplotlib: For visualization of results and trees (used in the _jungle subdirectory).
The main classes use constructor parameters and method arguments for configuration:
LBIJungle:
random_seed: Optional integer that sets the random seed for reproducibility.
estimate_fitness method: Takes a CassiopeiaTree object as input and modifies it in place by adding a ‘fitness’ attribute to each node.
Users can adjust these parameters to tailor the fitness estimation process to their needs in evolutionary studies and population genetics research.
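The two behaviors described above, an optional seed for reproducibility and in-place annotation of the tree, can be illustrated with a small self-contained sketch. Everything here is hypothetical: ToyTree and its set_attribute method stand in for CassiopeiaTree, and SeededToyEstimator uses a made-up score (out-degree plus seeded jitter) rather than the LBI method. It only shows the calling pattern, not how LBIJungle uses its seed internally.

```python
import random
from typing import Dict, List, Optional


class ToyTree:
    """Hypothetical stand-in for CassiopeiaTree: child lists plus attributes."""

    def __init__(self, children: Dict[str, List[str]]) -> None:
        self.children = children
        self.attributes: Dict[str, Dict[str, float]] = {n: {} for n in children}

    def set_attribute(self, node: str, name: str, value: float) -> None:
        self.attributes[node][name] = value


class SeededToyEstimator:
    """Toy estimator: out-degree plus seeded jitter (NOT the LBI method)."""

    def __init__(self, random_seed: Optional[int] = None) -> None:
        self.random_seed = random_seed

    def estimate_fitness(self, tree: ToyTree) -> None:
        # A fresh generator per call keeps repeated runs reproducible.
        rng = random.Random(self.random_seed)
        for node, kids in tree.children.items():
            jitter = rng.uniform(0.0, 1e-3)  # breaks ties between equal scores
            tree.set_attribute(node, "fitness", len(kids) + jitter)


def make_tree() -> ToyTree:
    return ToyTree({"root": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []})


first, second = make_tree(), make_tree()
result = SeededToyEstimator(random_seed=42).estimate_fitness(first)
SeededToyEstimator(random_seed=42).estimate_fitness(second)

print(result)                                  # None: the tree is modified in place
print(first.attributes == second.attributes)   # True: same seed, same values
```

Since the method returns nothing, callers read results back from the tree itself after the call, one 'fitness' value per node.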
The _jungle subdirectory contains additional classes and functions for more advanced phylogenetic analysis, including:
Forest: For managing collections of phylogenetic trees.
Tree: For analyzing individual phylogenetic trees.
SFS: For calculating and analyzing Site Frequency Spectra.
SizeMatchedModel: For statistical modeling based on data size.
These components provide a comprehensive toolkit for in-depth analysis of evolutionary fitness and phylogenetic relationships in biological sequences.