High-level description

The UniformLeafSubsampler class is a subclass of LeafSubsampler that performs uniform random sampling of leaves in a CassiopeiaTree. It creates a new CassiopeiaTree containing only the lineages of the sampled leaves, preserving character states, metadata, and dissimilarity maps for the sampled cells.

References

This code references the following symbols:

  • cassiopeia.data.CassiopeiaTree
  • cassiopeia.simulator.LeafSubsampler.LeafSubsampler
  • cassiopeia.simulator.LeafSubsampler.LeafSubsamplerError

Symbols

UniformLeafSubsampler

Description

This class implements the logic for uniformly subsampling leaves from a CassiopeiaTree. It provides options to specify the sample size either as a ratio of the total number of leaves or as an explicit number.

Inputs

Name              Type             Description
ratio             Optional[float]  The proportion of leaves to sample.
number_of_leaves  Optional[int]    The exact number of leaves to sample.

Outputs

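The constructor returns nothing beyond the configured instance. The subsample_leaves method returns a new CassiopeiaTree restricted to the lineages of the sampled leaves, consistent with the high-level description.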

Internal Logic

The __init__ method initializes the UniformLeafSubsampler object and checks that exactly one of ratio and number_of_leaves is provided. Supplying both, or neither, raises a LeafSubsamplerError.
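The "exactly one of two options" check can be sketched in a few lines. This is a minimal illustration, not the library's source: UniformSubsamplerSketch and SubsamplerArgumentError are hypothetical stand-ins for UniformLeafSubsampler and LeafSubsamplerError, and the error message is an assumption.

```python
from typing import Optional


class SubsamplerArgumentError(Exception):
    """Hypothetical stand-in for LeafSubsamplerError."""


class UniformSubsamplerSketch:
    """Hypothetical stand-in that only reproduces the argument check."""

    def __init__(
        self,
        ratio: Optional[float] = None,
        number_of_leaves: Optional[int] = None,
    ):
        # The two "is None" tests are equal when both arguments are given
        # or when both are missing; either case is rejected.
        if (ratio is None) == (number_of_leaves is None):
            raise SubsamplerArgumentError(
                "Exactly one of 'ratio' and 'number_of_leaves' "
                "must be specified."
            )
        self.ratio = ratio
        self.number_of_leaves = number_of_leaves
```

Comparing the two None tests handles both invalid cases with a single condition, which keeps the constructor short.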

The subsample_leaves method performs the actual subsampling. It first determines the desired sample size based on the provided ratio or number_of_leaves. Then, it randomly selects leaves to remove and prunes the tree accordingly. Finally, it optionally collapses any remaining unifurcations (nodes with a single child) to maintain a valid tree structure.
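The steps above can be sketched on a toy tree stored as a plain dictionary. This is an illustration under stated assumptions rather than the CassiopeiaTree implementation: the dictionary representation, the function names, the use of ValueError, flooring the ratio with int(), and leaving a single-child root uncollapsed are all choices made for this example.

```python
import random
from typing import Dict, List, Optional

Tree = Dict[str, List[str]]  # node -> children; leaves map to []


def leaves_of(tree: Tree) -> List[str]:
    return [node for node, kids in tree.items() if not kids]


def subsample_uniform(
    tree: Tree,
    root: str,
    ratio: Optional[float] = None,
    number_of_leaves: Optional[int] = None,
    seed: Optional[int] = None,
) -> Tree:
    """Return a pruned copy; assumes exactly one of ratio/number_of_leaves."""
    leaves = leaves_of(tree)
    # Step 1: sample size (assumption: a ratio is floored to an integer).
    n_keep = (
        number_of_leaves
        if number_of_leaves is not None
        else int(len(leaves) * ratio)
    )
    if not 0 < n_keep <= len(leaves):
        raise ValueError("sample size must be in 1..number of leaves")

    # Step 2: choose the leaves to remove uniformly, without replacement.
    rng = random.Random(seed)
    to_remove = rng.sample(leaves, len(leaves) - n_keep)

    # Work on a copy so the input tree is left untouched.
    pruned = {node: list(kids) for node, kids in tree.items()}
    parent = {c: p for p, kids in pruned.items() for c in kids}

    # Step 3: drop each leaf, then any ancestor left without children.
    for leaf in to_remove:
        node = leaf
        while node != root and not pruned[node]:
            p = parent[node]
            pruned[p].remove(node)
            del pruned[node]
            node = p

    # Step 4: collapse non-root unifurcations by attaching the only child
    # directly to the grandparent.
    for node in list(pruned):
        if node != root and len(pruned[node]) == 1:
            child, p = pruned[node][0], parent[node]
            pruned[p][pruned[p].index(node)] = child
            parent[child] = p
            del pruned[node]
    return pruned
```

For example, calling subsample_uniform on a four-leaf tree with ratio=0.5 keeps two leaves and returns a tree in which every non-root internal node has at least two children. Passing a seed makes the selection repeatable for testing.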

Side Effects

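  • The input CassiopeiaTree is not expected to change: as stated in the high-level description, subsampling produces a new tree. Random leaf selection does consume random numbers, so results vary between calls unless the random seed is fixed.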

Performance Considerations

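Runtime depends on the size of the input tree and on how many leaves are removed. Random leaf selection is at most linear in the number of leaves, while pruning lineages and collapsing unifurcations scale with the number of nodes and edges in the tree. For large trees, producing a separate output tree also adds memory and time proportional to the tree size.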