High-level description
The cassiopeia/simulator directory contains a collection of classes and modules for simulating various aspects of single-cell lineage tracing experiments. These simulators can generate synthetic phylogenetic trees, spatial data, and lineage tracing data, and can subsample the leaves of a tree. The main components include:
- Tree simulators (e.g., BirthDeathFitnessSimulator, CompleteBinarySimulator)
- Data simulators (e.g., Cas9LineageTracingDataSimulator, BrownianSpatialDataSimulator)
- Leaf subsamplers (e.g., UniformLeafSubsampler, SpatialLeafSubsampler)
- Specialized simulators (e.g., ecDNABirthDeathSimulator)
What does it do?
The cassiopeia/simulator directory provides tools to:
- Generate synthetic phylogenetic trees with various growth models and fitness effects.
- Simulate lineage tracing data, including Cas9-based editing and sequential recording.
- Create spatial data for cells in a lineage tree.
- Subsample leaves from a tree to mimic experimental sampling or create supercellular states.
- Simulate specialized scenarios like extrachromosomal DNA (ecDNA) evolution.
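As a rough illustration of the first two capabilities, the sketch below builds a complete binary tree and overlays irreversible, Cas9-style edits on it using only the Python standard library. It does not use the Cassiopeia API: the function names (complete_binary_tree, overlay_edits) and parameters (depth, n_sites, mutation_prob, n_states) are hypothetical, and a fixed per-edge mutation probability is a simplification assumed for this example.

```python
import random


def complete_binary_tree(depth):
    """Return {node: [children]} for a complete binary tree rooted at "0"."""
    tree = {}
    frontier = ["0"]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            children = [node + "0", node + "1"]
            tree[node] = children
            next_frontier.extend(children)
        frontier = next_frontier
    for leaf in frontier:
        tree[leaf] = []  # leaves have no children
    return tree


def overlay_edits(tree, root, n_sites, mutation_prob, n_states, seed=None):
    """Assign a character vector to every node; 0 means "unedited".

    Edits are irreversible: once a site is nonzero, all descendants inherit it.
    """
    rng = random.Random(seed)
    states = {root: [0] * n_sites}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in tree[parent]:
            vec = list(states[parent])
            for i, state in enumerate(vec):
                # Only unedited sites can acquire a new edit on this edge.
                if state == 0 and rng.random() < mutation_prob:
                    vec[i] = rng.randint(1, n_states)
            states[child] = vec
            stack.append(child)
    return states


tree = complete_binary_tree(depth=3)
states = overlay_edits(tree, "0", n_sites=5, mutation_prob=0.3, n_states=4, seed=0)
leaves = [node for node, kids in tree.items() if not kids]
# Rows of a toy character matrix: one edit vector per observed (leaf) cell.
character_matrix = {leaf: states[leaf] for leaf in leaves}
```

Only the leaf rows are kept at the end because, in an experiment, only the sampled cells are observed; the internal states exist here solely to propagate edits.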
Entry points
The main entry points for using the simulators are:
- TreeSimulator abstract base class: This is the starting point for implementing new tree simulation models.
- DataSimulator abstract base class: This is the base class for implementing new data simulation models.
- LeafSubsampler abstract base class: This is the base class for implementing new leaf subsampling strategies.
Concrete implementations, such as BirthDeathFitnessSimulator, Cas9LineageTracingDataSimulator, and UniformLeafSubsampler, provide specific simulation and subsampling functionality.
The __init__.py file in this directory serves as the main interface for importing and using the various simulator classes.
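A minimal sketch of the pattern these base classes imply: an abstract class fixes the interface, and a concrete subclass supplies the strategy. The classes below are stand-ins, not Cassiopeia's own; the method name subsample_leaves, the ratio parameter, and the plain list of leaf names as input are assumptions made for illustration (the real classes operate on CassiopeiaTree objects).

```python
import abc
import random


class ToyLeafSubsampler(abc.ABC):
    """Hypothetical stand-in for an abstract leaf subsampler."""

    @abc.abstractmethod
    def subsample_leaves(self, leaves):
        """Return the subset of `leaves` to keep."""


class ToyUniformSubsampler(ToyLeafSubsampler):
    """Keeps a uniformly random fraction of the leaves."""

    def __init__(self, ratio, seed=None):
        if not 0 < ratio <= 1:
            raise ValueError("ratio must be in (0, 1]")
        self.ratio = ratio
        self._rng = random.Random(seed)

    def subsample_leaves(self, leaves):
        # Keep at least one leaf so the result is never empty.
        n_keep = max(1, int(len(leaves) * self.ratio))
        return sorted(self._rng.sample(list(leaves), n_keep))


sampler = ToyUniformSubsampler(ratio=0.5, seed=1)
kept = sampler.subsample_leaves(["a", "b", "c", "d", "e", "f"])
```

Because the abstract method is declared with abc.abstractmethod, instantiating ToyLeafSubsampler directly raises TypeError, which is what forces each new strategy to implement the method.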
Key Files
- BirthDeathFitnessSimulator.py: Implements a birth-death process with fitness variations for tree simulation.
- Cas9LineageTracingDataSimulator.py: Simulates Cas9-based lineage tracing data.
- BrownianSpatialDataSimulator.py: Generates spatial data for cells using a Brownian motion model.
- UniformLeafSubsampler.py: Implements uniform random subsampling of leaves from a tree.
- ecDNABirthDeathSimulator.py: Simulates the evolution of cell populations with extrachromosomal DNA.
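To make the Brownian motion model concrete, here is a small sketch of how coordinates could be propagated down a tree: each child starts at its parent's position and is displaced by Gaussian noise whose variance grows with branch length. This is an illustration under stated assumptions, not the code in BrownianSpatialDataSimulator.py; the function brownian_positions, the dict-based tree representation, and the per-dimension variance of 2 * D * t are choices made for this example.

```python
import math
import random


def brownian_positions(children, branch_length, root, diffusion_coefficient,
                       dim=2, seed=None):
    """Return {node: [x_1, ..., x_dim]} with the root placed at the origin."""
    rng = random.Random(seed)
    positions = {root: [0.0] * dim}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children.get(parent, []):
            t = branch_length[(parent, child)]
            # Assumed model: displacement per dimension ~ Normal(0, 2 * D * t).
            sigma = math.sqrt(2.0 * diffusion_coefficient * t)
            positions[child] = [x + rng.gauss(0.0, sigma) for x in positions[parent]]
            stack.append(child)
    return positions


# Toy tree: parent -> children, plus a length for every edge.
children = {"root": ["a", "b"], "a": ["a1", "a2"]}
branch_length = {("root", "a"): 1.0, ("root", "b"): 2.0,
                 ("a", "a1"): 0.5, ("a", "a2"): 0.5}
positions = brownian_positions(children, branch_length, "root",
                               diffusion_coefficient=0.1, seed=42)
```

Under this model, closely related cells (such as a1 and a2) tend to end up near each other, since they share most of their displacement history.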
Dependencies
The simulator modules rely on several external libraries:
- networkx: Used for graph operations and tree manipulations.
- numpy: Used for numerical computations and random number generation.
- pandas: Used for handling data structures like character matrices.
- scipy: Used for various scientific computing tasks, including spatial algorithms.
- sklearn: Used for nearest neighbor searches in spatial simulations.
The simulators also rely on the cassiopeia.data.CassiopeiaTree class for representing and manipulating phylogenetic trees.
Configuration
Most simulator classes accept various parameters during initialization to configure their behavior. Common configuration options include:
- Simulation duration or stopping conditions (e.g., experiment_time, num_extant)
- Mutation rates and distributions
- Fitness parameters
- Spatial simulation parameters (e.g., diffusion_coefficient)
- Subsampling ratios or target numbers of leaves
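The two stopping conditions can be illustrated with a stripped-down birth-death process. The sketch below is not BirthDeathFitnessSimulator and omits fitness entirely; the function simulate_population and its birth_rate and death_rate parameters are hypothetical, while experiment_time and num_extant are borrowed from the option names above. It tracks only the number of living lineages, using a Gillespie-style event loop.

```python
import random


def simulate_population(birth_rate, death_rate, experiment_time=None,
                        num_extant=None, seed=None):
    """Run a birth-death process from one cell; return (time, living cells)."""
    if experiment_time is None and num_extant is None:
        raise ValueError("set experiment_time and/or num_extant")
    rng = random.Random(seed)
    t, n = 0.0, 1
    while n > 0:
        # Stop as soon as the target number of living cells is reached.
        if num_extant is not None and n >= num_extant:
            break
        # Waiting time until the next event among all n living lineages.
        wait = rng.expovariate(n * (birth_rate + death_rate))
        # Stop at the time limit if the next event would fall beyond it.
        if experiment_time is not None and t + wait > experiment_time:
            t = experiment_time
            break
        t += wait
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1  # division
        else:
            n -= 1  # death
    return t, n


# Stop on population size (pure-birth process, so 16 is always reached).
t_end, n_end = simulate_population(birth_rate=1.0, death_rate=0.0,
                                   num_extant=16, seed=0)
# Stop on elapsed time (the population may also go extinct earlier).
t_fix, n_fix = simulate_population(birth_rate=1.0, death_rate=0.1,
                                   experiment_time=2.0, seed=0)
```

When both options are supplied, this sketch stops at whichever condition is met first; whether the library's simulators behave the same way should be checked against their documentation.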
The cassiopeia/simulator directory provides a set of tools for generating synthetic single-cell lineage tracing data, enabling researchers to test and validate analysis methods, explore experimental designs, and gain insights into the underlying biological processes.