SpatialLeafSubsampler.py
High-level description
The SpatialLeafSubsampler class is a subclass of LeafSubsampler that subsamples leaves within a specified region of interest in a CassiopeiaTree with spatial information. It supports subsampling using a bounding box or a numpy mask, and allows for downsampling by ratio or a fixed number of leaves.
Code Structure
The SpatialLeafSubsampler class has a single method, subsample_leaves, which takes a CassiopeiaTree as input and returns a new CassiopeiaTree containing the subsampled leaves. The class constructor initializes the region of interest and downsampling parameters.
References
- CassiopeiaTree: The SpatialLeafSubsampler operates on CassiopeiaTree objects.
- LeafSubsampler: The SpatialLeafSubsampler is a subclass of LeafSubsampler.
Symbols
SpatialLeafSubsampler
Description
This class subsamples leaves within a region of interest of a CassiopeiaTree with spatial information and produces a new CassiopeiaTree containing the sampled leaves.
Inputs
Name | Type | Description |
---|---|---|
bounding_box | Optional[List[tuple]] | A list of tuples of the form (min, max) for each dimension of the bounding box. |
space | Optional[np.ndarray] | Numpy array mask representing the space that cells will be sampled from. |
ratio | Optional[float] | Specifies the number of leaves to be sampled as a ratio of the total number of leaves in the region of interest. |
number_of_leaves | Optional[int] | Explicitly specifies the number of leaves to be sampled. |
attribute_key | Optional[str] | The key in the CassiopeiaTree’s node attributes that contains the spatial coordinates of the leaves. Defaults to “spatial”. |
merge_cells | Optional[bool] | Whether or not to merge cells contained within the same pixel in the space. Defaults to False. |
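The two ways of describing the region of interest can be illustrated with a small, dependency-free sketch. It is not the library's implementation: the helper names in_bounding_box and in_mask are hypothetical, a nested list of booleans stands in for the numpy mask, and both the half-open interval convention and the truncation of coordinates to pixel indices are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def in_bounding_box(
    coords: Sequence[float], bounding_box: List[Tuple[float, float]]
) -> bool:
    """Return True if every coordinate lies inside its (min, max) interval."""
    # Assumption: each interval is half-open, min <= x < max.
    return all(lo <= x < hi for x, (lo, hi) in zip(coords, bounding_box))


def in_mask(coords: Sequence[float], mask: Sequence[Sequence[bool]]) -> bool:
    """Return True if the coordinates fall on a True pixel of a 2D mask."""
    # Assumption: coordinates are truncated to integer pixel indices.
    row, col = int(coords[0]), int(coords[1])
    if not (0 <= row < len(mask) and 0 <= col < len(mask[0])):
        return False
    return bool(mask[row][col])


# Hypothetical leaf locations, keyed by leaf name.
locations: Dict[str, Tuple[float, float]] = {
    "a": (1.0, 1.5),
    "b": (4.2, 0.3),
    "c": (2.9, 2.1),
    "d": (0.5, 0.5),
}

# One (min, max) tuple per spatial dimension.
box = [(0, 3), (0, 3)]

# Nested-list stand-in for a boolean numpy mask over a 3 x 3 grid.
mask = [
    [False, False, False],
    [False, True, True],
    [False, True, True],
]

leaves_in_box = sorted(
    leaf for leaf, xy in locations.items() if in_bounding_box(xy, box)
)
leaves_in_mask = sorted(leaf for leaf, xy in locations.items() if in_mask(xy, mask))
```

In this toy data the bounding box keeps leaves a, c and d, while the mask additionally excludes d because its pixel is switched off.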
Outputs
None
Internal Logic
The constructor checks for valid input parameters and stores them as instance variables.
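The exact checks are not listed here, so the following is only a plausible sketch of constructor-style validation. The function validate_args is hypothetical, the LeafSubsamplerError class below is a local stand-in for the library's exception, and every rule marked "assumed" is a guess consistent with the parameter descriptions rather than confirmed behavior.

```python
from typing import List, Optional, Tuple


class LeafSubsamplerError(Exception):
    """Local stand-in for the library's exception of the same name."""


def validate_args(
    bounding_box: Optional[List[Tuple[float, float]]] = None,
    space: Optional[object] = None,
    ratio: Optional[float] = None,
    number_of_leaves: Optional[int] = None,
    merge_cells: bool = False,
) -> None:
    """Raise LeafSubsamplerError if the argument combination is unusable."""
    # Assumed rule: exactly one way of defining the region of interest.
    if (bounding_box is None) == (space is None):
        raise LeafSubsamplerError("Specify exactly one of bounding_box or space.")
    # Assumed rule: at most one way of setting the sample size.
    if ratio is not None and number_of_leaves is not None:
        raise LeafSubsamplerError(
            "Specify at most one of ratio or number_of_leaves."
        )
    # Assumed rule: a ratio is a fraction in (0, 1].
    if ratio is not None and not 0 < ratio <= 1:
        raise LeafSubsamplerError("ratio must be in (0, 1].")
    # Assumed rule: an explicit count must be positive.
    if number_of_leaves is not None and number_of_leaves <= 0:
        raise LeafSubsamplerError("number_of_leaves must be positive.")
    # Assumed rule: merging by pixel requires a mask that defines the pixels.
    if merge_cells and space is None:
        raise LeafSubsamplerError("merge_cells requires space.")
```

Failing early in the constructor means a misconfigured subsampler is rejected before any tree is touched.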
subsample_leaves
Description
This method subsamples the leaves of a CassiopeiaTree to those within a region of interest and returns a tree pruned to contain lineages relevant to only leaves in the sample (the “induced subtree” on the sample).
Inputs
Name | Type | Description |
---|---|---|
tree | CassiopeiaTree | The CassiopeiaTree for which to subsample leaves. |
keep_singular_root_edge | Optional[bool] | Whether or not to collapse the single edge leading from the root in the subsample, if it exists. Defaults to True. |
collapse_duplicates | bool | Whether or not to collapse duplicated character states, so that only unique character states are present in each ambiguous state. Defaults to True. |
Outputs
Name | Type | Description |
---|---|---|
subsampled_tree | CassiopeiaTree | A new CassiopeiaTree that is the induced subtree on a sample of the leaves in the given tree. |
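To make the notion of an induced subtree concrete, here is a minimal sketch on a plain child-to-parent dictionary rather than a CassiopeiaTree. The function induced_subtree is hypothetical; it keeps every node on a path from a sampled leaf to the root and then removes internal nodes left with a single child. Keeping the root even when it has one child is an assumption that mirrors keeping the singular root edge, and branch lengths are ignored.

```python
from typing import Dict, Iterable, Optional


def induced_subtree(
    parent: Dict[str, Optional[str]], sample: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Return the child -> parent map of the subtree induced by `sample`."""
    # 1. Keep every node on a path from a sampled leaf up to the root.
    keep = set()
    for leaf in sample:
        node: Optional[str] = leaf
        while node is not None and node not in keep:
            keep.add(node)
            node = parent[node]

    # 2. Count the surviving children of each kept node.
    n_children = {node: 0 for node in keep}
    for node in keep:
        if parent[node] is not None:
            n_children[parent[node]] += 1

    # 3. Drop non-root nodes with exactly one surviving child and reattach
    #    their descendants to the nearest surviving ancestor.
    pruned: Dict[str, Optional[str]] = {}
    for node in keep:
        if parent[node] is not None and n_children[node] == 1:
            continue  # internal unifurcation, collapsed away
        ancestor = parent[node]
        while (
            ancestor is not None
            and parent[ancestor] is not None
            and n_children[ancestor] == 1
        ):
            ancestor = parent[ancestor]
        pruned[node] = ancestor
    return pruned


# Toy tree: root -> n1 -> (n2 -> a, b) and (n3 -> c, d).
toy_parent = {
    "root": None,
    "n1": "root",
    "n2": "n1",
    "n3": "n1",
    "a": "n2",
    "b": "n2",
    "c": "n3",
    "d": "n3",
}
toy_result = induced_subtree(toy_parent, ["a", "c", "d"])
```

With leaf b left out of the sample, node n2 has a single remaining child and is collapsed, so a attaches directly to n1.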
Internal Logic
- Check for spatial information: Ensures the tree has spatial information associated with the leaves.
- Subset leaves:
  - If bounding_box is provided, keeps leaves within the specified bounding box.
  - If space is provided, keeps leaves within the specified mask.
- Determine number of leaves to sample:
  - If ratio is provided, calculates the number of leaves to sample based on the ratio and the number of leaves in the region of interest.
  - If number_of_leaves is provided, uses the specified number.
  - If neither is provided, keeps all leaves in the region of interest.
- Sample leaves: Randomly samples the specified number of leaves from the subset of leaves within the region of interest.
- Prune lineages: Removes leaves not in the sample and prunes lineages no longer relevant to the sampled leaves.
- Merge cells (optional): If merge_cells is True, merges cells within the same pixel in the space, creating a new cell with ambiguous character states.
- Collapse unifurcations: Collapses unifurcations in the tree, preserving node times.
- Copy and annotate branch lengths and times: Copies branch lengths and times from the original tree to the subsampled tree.
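The sample-size, sampling, and merging steps above can be sketched with the standard library alone. Both helpers, choose_sample and merge_by_pixel, are hypothetical names; truncating the ratio product, truncating coordinates to pixels, and representing an ambiguous state as a tuple are assumptions made for illustration, not the library's confirmed behavior.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def choose_sample(
    candidates: Sequence[str],
    ratio: Optional[float] = None,
    number_of_leaves: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[str]:
    """Pick leaves from the region of interest without replacement."""
    if ratio is not None:
        # Assumption: the product is truncated toward zero.
        n = int(len(candidates) * ratio)
    elif number_of_leaves is not None:
        n = number_of_leaves
    else:
        # Neither option given: keep every leaf in the region.
        return list(candidates)
    if n > len(candidates):
        raise ValueError("Cannot sample more leaves than the region contains.")
    return random.Random(seed).sample(list(candidates), n)


def merge_by_pixel(
    locations: Dict[str, Tuple[float, float]],
    states: Dict[str, List[int]],
    collapse_duplicates: bool = True,
) -> Dict[Tuple[int, ...], list]:
    """Combine the character states of leaves that share a pixel."""
    groups = defaultdict(list)
    for leaf, coords in locations.items():
        # Assumption: a pixel is the coordinate truncated to integers.
        groups[tuple(int(x) for x in coords)].append(leaf)

    merged = {}
    for pixel, leaves in groups.items():
        combined = []
        # Walk the characters position by position across the grouped leaves.
        for values in zip(*(states[leaf] for leaf in leaves)):
            if collapse_duplicates:
                values = tuple(sorted(set(values)))
            # A single remaining state stays a scalar; several become a tuple.
            combined.append(values[0] if len(values) == 1 else tuple(values))
        merged[pixel] = combined
    return merged


# Hypothetical data: leaves a and b share pixel (1, 1).
locations = {"a": (1.2, 1.7), "b": (1.9, 1.1), "c": (3.0, 0.5)}
states = {"a": [1, 0, 2], "b": [1, 3, 2], "c": [0, 0, 1]}
merged = merge_by_pixel(locations, states)
```

In this toy data, leaves a and b agree on the first and third characters and differ on the second, so the merged cell carries a single ambiguous state there.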
Side Effects
The subsample_leaves method modifies the input CassiopeiaTree object in place if copy is set to False.
Error Handling
The constructor and subsample_leaves method raise LeafSubsamplerError for invalid input parameters or if the tree does not have the required spatial information.
Dependencies
- numpy: Used for array operations and random sampling.
- pandas: Used for character matrix manipulation.
- networkx: Used for tree manipulation.
- collections: Used for creating a defaultdict.
- copy: Used for deep copying the tree.
- warnings: Used for issuing warnings.
Configuration
The SpatialLeafSubsampler class is configured using the parameters passed to its constructor.
Logging
The code does not implement any specific logging mechanisms.
API/Interface Reference
The SpatialLeafSubsampler class exposes a single public method, subsample_leaves, which can be used to subsample the leaves of a CassiopeiaTree object.