BirthDeathFitnessSimulator.py
High-level description
The BirthDeathFitnessSimulator
class simulates phylogenetic trees using a forward birth-death process, incorporating fitness variations among lineages. It allows users to define custom birth and death waiting time distributions, mutation rates, and fitness effects, enabling the simulation of diverse evolutionary scenarios. The simulation can be stopped based on either the number of extant lineages or a predefined experiment time.
Code Structure
The BirthDeathFitnessSimulator
class inherits from the TreeSimulator
abstract class. It implements the simulate_tree
method, which orchestrates the simulation process. The simulation involves initializing a tree, maintaining a priority queue of lineages, and iteratively sampling events (birth or death) for each lineage. The sample_lineage_event
method handles the event sampling and updates the tree accordingly. The update_fitness
method calculates the fitness of a lineage based on its mutations and a user-defined fitness distribution.
References
This class references the CassiopeiaTree
class for representing the simulated tree and the TreeSimulatorError
exception for handling errors during simulation.
Symbols
BirthDeathFitnessSimulator
Description
This class simulates phylogenetic trees using a forward birth-death process with fitness variations. It allows users to define custom birth and death waiting time distributions, mutation rates, and fitness effects.
Inputs
Name | Type | Description |
---|---|---|
birth_waiting_distribution | Callable[[float], float] | A function that samples waiting times from the birth distribution. |
initial_birth_scale | float | The initial scale parameter for the birth distribution. |
death_waiting_distribution | Optional[Callable[[], float]] | A function that samples waiting times from the death distribution. Defaults to no death (infinite waiting time). |
mutation_distribution | Optional[Callable[[], int]] | A function that samples the number of mutations occurring at a division event. Defaults to no mutations. |
fitness_distribution | Optional[Callable[[], float]] | A function that samples the exponential that the fitness base is raised by, determining the strength of fitness mutations. Required if mutation_distribution is provided. |
fitness_base | float | The base that is raised by the value given by the fitness distribution, determining the base strength of fitness mutations. Defaults to Euler’s constant (e). |
num_extant | Optional[int] | The number of extant lineages as a stopping condition. |
experiment_time | Optional[float] | The total experiment time as a stopping condition. |
collapse_unifurcations | bool | Whether to collapse unifurcations in the resulting tree. Defaults to True. |
random_seed | int | A seed for reproducibility. |
initial_tree | Optional[CassiopeiaTree] | A tree used for initializing the simulation. |
Outputs
The class itself does not have outputs. Its main functionality is to simulate a tree, which is returned by the simulate_tree
method.
Internal Logic
The simulation process involves the following steps:
- Initialization: A tree is initialized with a single root node.
- Lineage Queue: A priority queue is created to store active lineages, sorted by their next event time.
- Event Sampling: For each lineage in the queue, the waiting times for birth and death events are sampled.
- Lineage Update: The lineage with the earliest event is processed. If it’s a birth event, a new lineage is created and added to the queue. If it’s a death event, the lineage is marked as inactive.
- Fitness Update: At each birth event, the fitness of the new lineage is updated based on the
mutation_distribution
andfitness_distribution
. - Stopping Condition: The simulation continues until either the desired number of extant lineages (
num_extant
) is reached or the experiment time (experiment_time
) is exceeded. - Tree Population: The simulated tree is populated with metadata, including branch lengths, node times, and character states.
Side Effects
The simulate_tree
method modifies the internal state of the BirthDeathFitnessSimulator
object by creating and populating a CassiopeiaTree
object.
simulate_tree
Description
This method simulates a phylogenetic tree using the defined birth-death process with fitness variations.
Inputs
This method does not take any explicit inputs.
Outputs
Name | Type | Description |
---|---|---|
tree | CassiopeiaTree | A CassiopeiaTree object representing the simulated tree. |
Internal Logic
The method follows the internal logic described for the BirthDeathFitnessSimulator
class. It uses helper functions like initialize_tree
, make_initial_lineage_dict
, make_lineage_dict
, and sample_lineage_event
to manage the simulation process.
Error Handling
The method raises a TreeSimulatorError
if all lineages die before a stopping condition is met.
sample_lineage_event
Description
This helper function samples a birth or death event for a given lineage and updates the tree accordingly.
Inputs
Name | Type | Description |
---|---|---|
lineage | Dict[str, Union[int, float]] | A dictionary representing the current lineage. |
current_lineages | PriorityQueue | The priority queue of active lineages. |
tree | nx.DiGraph | The NetworkX DiGraph representing the tree. |
names | Generator | A generator for unique node names. |
observed_nodes | List[str] | A list of observed nodes. |
Outputs
This method does not have any explicit outputs. It modifies the tree
and current_lineages
objects in place.
Internal Logic
The method samples birth and death waiting times for the lineage. If a birth event occurs first, a new lineage is created, its fitness is updated, and it’s added to the queue. If a death event occurs first, the lineage is marked as inactive. The method also handles cases where the lineage would live past the experiment time, cutting off the branch length and marking the lineage as inactive.
Error Handling
The method raises a TreeSimulatorError
if a negative waiting time is sampled or a non-active lineage is passed.
update_fitness
Description
This method updates the fitness of a lineage by scaling its birth scale parameter based on the number and strength of mutations.
Inputs
Name | Type | Description |
---|---|---|
birth_scale | float | The current birth scale parameter of the lineage. |
Outputs
Name | Type | Description |
---|---|---|
updated_birth_scale | float | The updated birth scale parameter. |
Internal Logic
The method samples the number of mutations using the mutation_distribution
. For each mutation, it calculates a multiplicative factor by exponentiating the fitness_base
by a value sampled from the fitness_distribution
. The birth scale parameter is then multiplied by the product of these factors.
Error Handling
The method raises a TreeSimulatorError
if a negative number of mutations is sampled.
populate_tree_from_simulation
Description
This method populates the simulated tree with metadata, including branch lengths, node times, and character states.
Inputs
Name | Type | Description |
---|---|---|
tree | nx.DiGraph | The NetworkX DiGraph representing the simulated tree. |
observed_nodes | List[str] | A list of observed nodes. |
Outputs
Name | Type | Description |
---|---|---|
cassiopeia_tree | CassiopeiaTree | A CassiopeiaTree object representing the populated tree. |
Internal Logic
The method creates a CassiopeiaTree
object from the simulated tree. It then iterates through the nodes, setting the time, birth scale, and character states for each node. Finally, it prunes dead lineages and collapses unifurcations if specified.
Error Handling
The method raises a TreeSimulatorError
if all lineages died before a stopping condition was met.
Dependencies
This class depends on the following external libraries:
Dependency | Purpose |
---|---|
networkx | Representing and manipulating the tree structure. |
numpy | Random number generation and array operations. |
queue | Maintaining the priority queue of lineages. |
Error Handling
The BirthDeathFitnessSimulator
class uses the TreeSimulatorError
exception to handle various error conditions, such as invalid input parameters, negative waiting times, and all lineages dying before a stopping condition is met.