High-level description

The SequentialLineageTracingDataSimulator class simulates lineage tracing data generated by sequential Cas9-based technologies, such as the DNA Typewriter. It overlays simulated edits onto a CassiopeiaTree, mimicking the sequential recording process of these technologies. The simulator considers factors like initiation and continuation rates of Cas9 recording, cassette architecture, state distributions, and silencing rates to generate realistic data.

Code Structure

The SequentialLineageTracingDataSimulator class inherits from the LineageTracingDataSimulator class. It primarily implements the overlay_data method, which simulates the sequential editing process on the provided CassiopeiaTree. The class also includes helper functions like edit_site and silence_cassettes to manage individual editing events and cassette silencing, respectively.

References

  • cassiopeia.data.CassiopeiaTree: The simulator operates on a CassiopeiaTree object, modifying its character states to represent the simulated lineage tracing data.
  • cassiopeia.simulator.LineageTracingDataSimulator: The base class that SequentialLineageTracingDataSimulator inherits from. It provides the framework for simulating lineage tracing data.

Symbols

SequentialLineageTracingDataSimulator

Description

This class simulates sequential Cas9-based lineage tracing data and overlays it onto a CassiopeiaTree. It models the sequential editing process on a “DNA tape” or “cassette” where only one site can be edited at a time.

Inputs

| Name | Type | Description |
| --- | --- | --- |
| number_of_cassettes | int | Number of cassettes in the system. |
| size_of_cassette | int | Number of editable target sites per cassette. |
| initiation_rate | float | Exponential parameter for the Cas9 initiation rate. |
| continuation_rate | float | Exponential parameter for the Cas9 continuation rate. |
| state_priors | Dict[int, float] | Dictionary mapping states to their prior probabilities. |
| heritable_silencing_rate | float | Silencing rate for the cassettes, simulating heritable missing data events. |
| stochastic_silencing_rate | float | Rate at which to randomly drop out cassettes, simulating dropout due to low sensitivity of assays. |
| heritable_missing_data_state | int | Integer representing data that has gone missing due to a heritable event. |
| stochastic_missing_data_state | int | Integer representing data that has gone missing due to stochastic dropout. |
| random_seed | Optional[int] | Numpy random seed for deterministic simulations. |
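
To make number_of_cassettes and size_of_cassette concrete, the following is a minimal sketch of how the two values could map onto a flat character array. It assumes each cassette occupies a contiguous block of indices and that 0 marks an unedited site; the helper name cassette_ranges is hypothetical and is not part of the class.

```python
from typing import List


def cassette_ranges(number_of_cassettes: int, size_of_cassette: int) -> List[range]:
    """Return the character indices covered by each cassette.

    Assumption: cassettes are laid out back to back in one flat array.
    """
    total = number_of_cassettes * size_of_cassette
    return [
        range(start, start + size_of_cassette)
        for start in range(0, total, size_of_cassette)
    ]


# Three cassettes of four sites each give twelve characters in total.
layout = cassette_ranges(number_of_cassettes=3, size_of_cassette=4)
unedited_array = [0] * (3 * 4)  # 0 is an assumed code for "unedited"
```

Under this layout, the second cassette covers indices 4 through 7.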

Outputs

The class itself returns no output. Its overlay_data method modifies the provided CassiopeiaTree in place.

Internal Logic

The simulator initializes a character matrix representing the cassettes and their states. It then iterates through each node in the tree, simulating the editing process based on the node’s lineage and lifetime. For each cassette, it determines if it’s initiated and simulates edits based on the continuation rate. It also applies heritable and stochastic silencing to the cassettes, mimicking real-world data variability. Finally, it updates the CassiopeiaTree with the simulated character matrix.
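
The per-cassette editing step described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes the first site of a cassette waits an exponentially distributed time with rate initiation_rate, every later site waits with rate continuation_rate, sites fill strictly left to right, and 0 marks an unedited site. The function name record_on_cassette is hypothetical.

```python
import random
from typing import Dict, List

UNEDITED = 0  # assumed code for an unedited site


def record_on_cassette(
    cassette: List[int],
    lifetime: float,
    initiation_rate: float,
    continuation_rate: float,
    state_priors: Dict[int, float],
    rng: random.Random,
) -> List[int]:
    """Fill unedited sites left to right while time remains on the branch."""
    result = list(cassette)
    states = list(state_priors)
    weights = [state_priors[s] for s in states]
    elapsed = 0.0
    for site, value in enumerate(result):
        if value != UNEDITED:
            continue  # already written on an ancestral branch
        # Assumption: site 0 uses the initiation rate, later sites the continuation rate.
        rate = initiation_rate if site == 0 else continuation_rate
        elapsed += rng.expovariate(rate)
        if elapsed > lifetime:
            break  # the branch ended before the next edit could happen
        result[site] = rng.choices(states, weights=weights, k=1)[0]
    return result


example = record_on_cassette(
    [0, 0, 0, 0], lifetime=1.0, initiation_rate=2.0,
    continuation_rate=3.0, state_priors={1: 0.5, 2: 0.5},
    rng=random.Random(0),
)
```

Because editing stops at the first site whose waiting time overruns the branch, edited sites always form a prefix of the cassette, which is the property that distinguishes sequential recorders from independently edited sites.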

overlay_data

Description

This method overlays the simulated Cas9-based lineage tracing data onto the provided CassiopeiaTree.

Inputs

| Name | Type | Description |
| --- | --- | --- |
| tree | CassiopeiaTree | The CassiopeiaTree object to overlay the simulated data onto. |

Outputs

This method doesn’t return any output. It modifies the input CassiopeiaTree in place.

Internal Logic

The method first initializes a character matrix with all sites set to an unedited state. It then traverses the tree in depth-first order. For each node, it simulates the Cas9 editing process based on the node’s lifetime and the parent’s character state. It then applies heritable and stochastic silencing to the character array. Finally, it updates the CassiopeiaTree with the simulated character matrix.
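
The traversal order matters because each node starts from its parent's character array. The sketch below illustrates that dependency on a toy tree stored as plain dictionaries rather than a CassiopeiaTree. The names overlay_depth_first and toy_evolve are hypothetical, the per-branch editing is replaced by a deterministic stand-in, and silencing is left out to keep the example short.

```python
from typing import Callable, Dict, List


def overlay_depth_first(
    children: Dict[str, List[str]],
    lifetimes: Dict[str, float],
    root: str,
    n_characters: int,
    evolve: Callable[[List[int], float], List[int]],
) -> Dict[str, List[int]]:
    """Assign a character array to every node, always visiting parents first."""
    states = {root: [0] * n_characters}  # root starts fully unedited
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            # The child inherits the parent's array, then evolves along its branch.
            states[child] = evolve(states[node], lifetimes[child])
            stack.append(child)
    return states


def toy_evolve(parent_array: List[int], lifetime: float) -> List[int]:
    """Deterministic stand-in: write one edit per whole unit of lifetime."""
    child = list(parent_array)
    budget = int(lifetime)
    for i, value in enumerate(child):
        if budget == 0:
            break
        if value == 0:
            child[i] = 1
            budget -= 1
    return child


children = {"root": ["a", "b"], "a": ["c"]}
lifetimes = {"a": 1, "b": 2, "c": 1}
states = overlay_depth_first(children, lifetimes, "root", 4, toy_evolve)
```

In this toy run, node c keeps the edit it inherited from a and adds one more along its own branch.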

edit_site

Description

This helper function edits a specific site in the character array based on the provided state priors.

Inputs

| Name | Type | Description |
| --- | --- | --- |
| character_array | List[int] | The character array representing the cassette states. |
| site | int | The index of the site to edit. |
| state_priors | Dict[int, float] | Dictionary mapping states to their prior probabilities. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| character_array | List[int] | The updated character array with the edited site. |

Internal Logic

The function randomly samples a state from the state priors based on their probabilities and updates the character array at the specified site with the chosen state.
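
A minimal sketch of this sampling step, using only the Python standard library, is shown below. The name edit_site_sketch and the optional rng argument are assumptions made for the example; the sketch also returns a copy instead of mutating its input, which may differ from the actual helper.

```python
import random
from typing import Dict, List, Optional


def edit_site_sketch(
    character_array: List[int],
    site: int,
    state_priors: Dict[int, float],
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return a copy of the array with `site` set to a state drawn from the priors."""
    if rng is None:
        rng = random.Random()
    states = list(state_priors)
    weights = [state_priors[s] for s in states]
    updated = list(character_array)
    # random.choices draws with probability proportional to the weights.
    updated[site] = rng.choices(states, weights=weights, k=1)[0]
    return updated


edited = edit_site_sketch([0, 0, 0], site=1, state_priors={1: 0.7, 2: 0.3},
                          rng=random.Random(42))
```

Only the chosen site changes; every other position keeps its previous value.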

silence_cassettes

Description

This helper function simulates the silencing of cassettes in the character array based on the provided silencing rate.

Inputs

| Name | Type | Description |
| --- | --- | --- |
| character_array | List[int] | The character array representing the cassette states. |
| silencing_rate | float | The probability of silencing a cassette. |
| missing_state | int | The state to use for representing silenced cassettes. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| updated_character_array | List[int] | The updated character array with silenced cassettes. |

Internal Logic

The function iterates through each cassette and, based on the silencing rate, randomly determines whether to silence it. If a cassette is silenced, all its sites in the character array are set to the specified missing state.
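
The all-or-nothing behavior per cassette can be sketched as follows. This is an illustration, not the library's code: the name silence_cassettes_sketch is hypothetical, size_of_cassette is passed explicitly here (the real helper presumably reads it from the simulator instance), and cassettes are assumed to be contiguous blocks of the character array.

```python
import random
from typing import List, Optional


def silence_cassettes_sketch(
    character_array: List[int],
    silencing_rate: float,
    missing_state: int,
    size_of_cassette: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return a copy in which each cassette is independently silenced with probability `silencing_rate`."""
    if rng is None:
        rng = random.Random()
    updated = list(character_array)
    for start in range(0, len(updated), size_of_cassette):
        # One Bernoulli draw per cassette, not per site.
        if rng.random() < silencing_rate:
            end = min(start + size_of_cassette, len(updated))
            for i in range(start, end):
                updated[i] = missing_state
    return updated


silenced = silence_cassettes_sketch(
    [1, 2, 0, 3, 0, 0], silencing_rate=0.5, missing_state=-1,
    size_of_cassette=3, rng=random.Random(7),
)
```

Calling the sketch once with the heritable rate and state during tree traversal, and once with the stochastic rate and state, would produce two distinguishable kinds of missing data, matching the two pairs of parameters the class accepts.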