High-level description

The BirthDeathFitnessSimulator class simulates phylogenetic trees using a forward birth-death process, incorporating fitness variations among lineages. It allows users to define custom birth and death waiting time distributions, mutation rates, and fitness effects, enabling the simulation of diverse evolutionary scenarios. The simulation can be stopped based on either the number of extant lineages or a predefined experiment time.

Code Structure

The BirthDeathFitnessSimulator class inherits from the TreeSimulator abstract class. It implements the simulate_tree method, which orchestrates the simulation process. The simulation involves initializing a tree, maintaining a priority queue of lineages, and iteratively sampling events (birth or death) for each lineage. The sample_lineage_event method handles the event sampling and updates the tree accordingly. The update_fitness method calculates the fitness of a lineage based on its mutations and a user-defined fitness distribution.

References

This class references the CassiopeiaTree class for representing the simulated tree and the TreeSimulatorError exception for handling errors during simulation.

Symbols

BirthDeathFitnessSimulator

Description

This class simulates phylogenetic trees using a forward birth-death process with fitness variations. It allows users to define custom birth and death waiting time distributions, mutation rates, and fitness effects.

Inputs

NameTypeDescription
birth_waiting_distributionCallable[[float], float]A function that samples waiting times from the birth distribution.
initial_birth_scalefloatThe initial scale parameter for the birth distribution.
death_waiting_distributionOptional[Callable[[], float]]A function that samples waiting times from the death distribution. Defaults to no death (infinite waiting time).
mutation_distributionOptional[Callable[[], int]]A function that samples the number of mutations occurring at a division event. Defaults to no mutations.
fitness_distributionOptional[Callable[[], float]]A function that samples the exponential that the fitness base is raised by, determining the strength of fitness mutations. Required if mutation_distribution is provided.
fitness_basefloatThe base that is raised by the value given by the fitness distribution, determining the base strength of fitness mutations. Defaults to Euler’s constant (e).
num_extantOptional[int]The number of extant lineages as a stopping condition.
experiment_timeOptional[float]The total experiment time as a stopping condition.
collapse_unifurcationsboolWhether to collapse unifurcations in the resulting tree. Defaults to True.
random_seedintA seed for reproducibility.
initial_treeOptional[CassiopeiaTree]A tree used for initializing the simulation.

Outputs

The class itself does not have outputs. Its main functionality is to simulate a tree, which is returned by the simulate_tree method.

Internal Logic

The simulation process involves the following steps:

  1. Initialization: A tree is initialized with a single root node.
  2. Lineage Queue: A priority queue is created to store active lineages, sorted by their next event time.
  3. Event Sampling: For each lineage in the queue, the waiting times for birth and death events are sampled.
  4. Lineage Update: The lineage with the earliest event is processed. If it’s a birth event, a new lineage is created and added to the queue. If it’s a death event, the lineage is marked as inactive.
  5. Fitness Update: At each birth event, the fitness of the new lineage is updated based on the mutation_distribution and fitness_distribution.
  6. Stopping Condition: The simulation continues until either the desired number of extant lineages (num_extant) is reached or the experiment time (experiment_time) is exceeded.
  7. Tree Population: The simulated tree is populated with metadata, including branch lengths, node times, and character states.

Side Effects

The simulate_tree method modifies the internal state of the BirthDeathFitnessSimulator object by creating and populating a CassiopeiaTree object.

simulate_tree

Description

This method simulates a phylogenetic tree using the defined birth-death process with fitness variations.

Inputs

This method does not take any explicit inputs.

Outputs

NameTypeDescription
treeCassiopeiaTreeA CassiopeiaTree object representing the simulated tree.

Internal Logic

The method follows the internal logic described for the BirthDeathFitnessSimulator class. It uses helper functions like initialize_tree, make_initial_lineage_dict, make_lineage_dict, and sample_lineage_event to manage the simulation process.

Error Handling

The method raises a TreeSimulatorError if all lineages die before a stopping condition is met.

sample_lineage_event

Description

This helper function samples a birth or death event for a given lineage and updates the tree accordingly.

Inputs

NameTypeDescription
lineageDict[str, Union[int, float]]A dictionary representing the current lineage.
current_lineagesPriorityQueueThe priority queue of active lineages.
treenx.DiGraphThe NetworkX DiGraph representing the tree.
namesGeneratorA generator for unique node names.
observed_nodesList[str]A list of observed nodes.

Outputs

This method does not have any explicit outputs. It modifies the tree and current_lineages objects in place.

Internal Logic

The method samples birth and death waiting times for the lineage. If a birth event occurs first, a new lineage is created, its fitness is updated, and it’s added to the queue. If a death event occurs first, the lineage is marked as inactive. The method also handles cases where the lineage would live past the experiment time, cutting off the branch length and marking the lineage as inactive.

Error Handling

The method raises a TreeSimulatorError if a negative waiting time is sampled or a non-active lineage is passed.

update_fitness

Description

This method updates the fitness of a lineage by scaling its birth scale parameter based on the number and strength of mutations.

Inputs

NameTypeDescription
birth_scalefloatThe current birth scale parameter of the lineage.

Outputs

NameTypeDescription
updated_birth_scalefloatThe updated birth scale parameter.

Internal Logic

The method samples the number of mutations using the mutation_distribution. For each mutation, it calculates a multiplicative factor by exponentiating the fitness_base by a value sampled from the fitness_distribution. The birth scale parameter is then multiplied by the product of these factors.

Error Handling

The method raises a TreeSimulatorError if a negative number of mutations is sampled.

populate_tree_from_simulation

Description

This method populates the simulated tree with metadata, including branch lengths, node times, and character states.

Inputs

NameTypeDescription
treenx.DiGraphThe NetworkX DiGraph representing the simulated tree.
observed_nodesList[str]A list of observed nodes.

Outputs

NameTypeDescription
cassiopeia_treeCassiopeiaTreeA CassiopeiaTree object representing the populated tree.

Internal Logic

The method creates a CassiopeiaTree object from the simulated tree. It then iterates through the nodes, setting the time, birth scale, and character states for each node. Finally, it prunes dead lineages and collapses unifurcations if specified.

Error Handling

The method raises a TreeSimulatorError if all lineages died before a stopping condition was met.

Dependencies

This class depends on the following external libraries:

DependencyPurpose
networkxRepresenting and manipulating the tree structure.
numpyRandom number generation and array operations.
queueMaintaining the priority queue of lineages.

Error Handling

The BirthDeathFitnessSimulator class uses the TreeSimulatorError exception to handle various error conditions, such as invalid input parameters, negative waiting times, and all lineages dying before a stopping condition is met.