Overview
High-level description
This directory contains the implementation of a beta coalescent tree simulator and a Site Frequency Spectrum (SFS) calculator. The code is designed to model and analyze the evolutionary relationships and genetic diversity in populations using coalescent theory and beta coalescent processes.
What does it do?
The code in this directory simulates the genealogical history of a population sample under coalescent models, in particular the beta coalescent. It generates genealogical trees that represent the ancestral relationships between the sampled individuals and uses these trees to calculate the Site Frequency Spectrum (SFS), a summary statistic of genetic variation. For a sample of n sequences, the SFS records, for each k from 1 to n-1, how many variable sites carry the derived allele in exactly k of the sampled sequences; its shape provides insights into the population’s genetic diversity and evolutionary history.
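To make the definition of the SFS concrete, here is a minimal sketch that counts it directly from a small 0/1 haplotype matrix. It does not use the classes in this directory; the function name `site_frequency_spectrum` and the toy data are illustrative assumptions.

```python
import numpy as np

def site_frequency_spectrum(haplotypes):
    """Unfolded SFS from an (n_samples, n_sites) 0/1 array (1 = derived allele).

    Hypothetical helper for illustration only, not part of betatree.py or sfs.py.
    """
    haplotypes = np.asarray(haplotypes).astype(int)
    n = haplotypes.shape[0]
    # Number of derived copies at each site.
    derived_counts = haplotypes.sum(axis=0)
    # counts[k] = number of sites with exactly k derived copies, k = 0..n.
    counts = np.bincount(derived_counts, minlength=n + 1)
    # Keep only the segregating classes k = 1..n-1.
    return counts[1:n]

# Toy sample: 4 sequences (rows), 4 variable sites (columns).
toy = np.array([
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
])
toy_sfs = site_frequency_spectrum(toy)  # entry k-1 holds the count for class k
```

In this toy sample two sites are singletons, one is a doubleton, and one is a tripleton, so the spectrum has three entries.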
Key Files
- betatree.py: This file contains the betatree class, which is responsible for simulating beta coalescent trees. It implements methods for tree initialization, coalescence events, and tree structure manipulation.
- sfs.py and sfs_py3.py: These files define the SFS class, which extends the functionality of betatree to calculate the Site Frequency Spectrum. The SFS class generates multiple trees, accumulates allele frequency information, and computes the SFS. It also provides methods for binning the SFS and saving/loading SFS data.
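The idea of generating many trees and accumulating frequency information can be sketched for the Kingman case only (alpha = 2), where every merger is pairwise. This is a standalone illustration, not the implementation in sfs.py: the function name `kingman_branch_sfs` is hypothetical, and the sketch assumes that a branch with k descendant leaves contributes its length to frequency class k, with time measured so that each pair of lineages merges at rate 1.

```python
import numpy as np

def kingman_branch_sfs(sample_size, ntrees, seed=0):
    """Average branch length per frequency class over many Kingman trees.

    Hypothetical sketch; it does not call the betatree or SFS classes.
    """
    rng = np.random.default_rng(seed)
    sfs = np.zeros(sample_size - 1)
    for _ in range(ntrees):
        # Number of sampled leaves below each currently active lineage.
        clade_sizes = [1] * sample_size
        while len(clade_sizes) > 1:
            b = len(clade_sizes)
            # Waiting time to the next pairwise merger: total rate b*(b-1)/2.
            dt = rng.exponential(2.0 / (b * (b - 1)))
            # Every active lineage extends by dt; credit its frequency class.
            for size in clade_sizes:
                sfs[size - 1] += dt
            # Merge two lineages chosen uniformly at random.
            i, j = rng.choice(b, size=2, replace=False)
            merged = clade_sizes[i] + clade_sizes[j]
            clade_sizes = [s for idx, s in enumerate(clade_sizes)
                           if idx not in (i, j)]
            clade_sizes.append(merged)
    return sfs / ntrees

estimate = kingman_branch_sfs(sample_size=10, ntrees=2000, seed=1)
# Under these assumptions the Kingman expectation for class k is 2/k.
expected = 2.0 / np.arange(1, 10)
```

Comparing `estimate` with `expected` gives a quick sanity check: the two should agree up to sampling noise, which shrinks as `ntrees` grows.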
Dependencies
The code relies on several external libraries:
- NumPy: Used for numerical computations and array operations.
- SciPy: Specifically, scipy.special is used for special mathematical functions like the gamma function.
- BioPython: The Bio.Phylo module is used for working with phylogenetic trees.
- Matplotlib: Used for plotting the SFS (in example usage).
Configuration
The code does not use explicit configuration files. Instead, key parameters are passed as arguments to the class constructors:
- sample_size: The number of individuals in the sample.
- alpha: The alpha parameter of the beta coalescent model (default is 2, which corresponds to the Kingman coalescent).
These parameters can be adjusted when initializing the betatree or SFS objects to simulate different evolutionary scenarios.
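To show how alpha changes the kind of merger events, the sketch below evaluates merger rates under the standard Beta(2 - alpha, alpha) coalescent, in which any particular group of k out of b lineages merges at rate B(k - alpha, b - k + alpha) / B(2 - alpha, alpha). Whether betatree.py uses exactly this parameterization and time scaling is an assumption, and `beta_merger_rates` is a hypothetical name.

```python
import numpy as np
from scipy.special import betaln, comb

def beta_merger_rates(b, alpha):
    """Total rate of k-lineage mergers among b lineages, for k = 2..b.

    Hypothetical helper based on the textbook Beta(2 - alpha, alpha)
    coalescent; not taken from betatree.py.
    """
    if not 0 < alpha <= 2:
        raise ValueError("alpha must lie in (0, 2]")
    k = np.arange(2, b + 1)
    if alpha == 2:
        # Kingman limit: only pairwise mergers, each pair at rate 1.
        rates = np.zeros(len(k))
        rates[0] = b * (b - 1) / 2
        return k, rates
    # Log of the rate for one specific group of k lineages.
    log_rate = betaln(k - alpha, b - k + alpha) - betaln(2 - alpha, alpha)
    # Multiply by the number of ways to choose the k merging lineages.
    return k, comb(b, k) * np.exp(log_rate)

k_kingman, r_kingman = beta_merger_rates(5, 2.0)   # only k = 2 is nonzero
k_beta, r_beta = beta_merger_rates(5, 1.5)         # multiple mergers possible
```

For alpha below 2 the entries for k greater than 2 are positive, which is what lets several lineages coalesce in a single event; as alpha approaches 2 those entries shrink and the pairwise rate approaches the Kingman value.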
The SFS class also allows for configuration of the SFS calculation and binning process through method parameters:
- ntrees: The number of trees to generate for SFS calculation.
- mode: The binning mode for the SFS (linear, log, or logit).
- bins: The number of bins or custom bin edges for SFS binning.
These configurations allow users to fine-tune the SFS calculation and analysis based on their specific research needs.
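One plausible reading of the three binning modes is sketched below: bin edges are spaced evenly in frequency (linear), in log frequency (log), or in log(x / (1 - x)) (logit), and the spectrum is summed within each bin and divided by the bin width. The exact definitions used by the SFS class may differ; the function names, the half-step padding of the outer edges, and the normalization by bin width are all assumptions made for illustration.

```python
import numpy as np

def make_bin_edges(sample_size, mode="logit", bins=10):
    """Bin edges over allele frequency; hypothetical helper, not from sfs.py."""
    if not np.isscalar(bins):
        # Custom edges supplied by the caller are used as-is.
        return np.asarray(bins, dtype=float)
    # Pad by half a frequency step so that 1/n and (n-1)/n fall inside.
    lo = 0.5 / sample_size
    hi = 1.0 - 0.5 / sample_size
    if mode == "linear":
        return np.linspace(lo, hi, bins + 1)
    if mode == "log":
        return np.geomspace(lo, hi, bins + 1)
    if mode == "logit":
        z = np.linspace(np.log(lo / (1 - lo)), np.log(hi / (1 - hi)), bins + 1)
        return 1.0 / (1.0 + np.exp(-z))  # map back from logit scale
    raise ValueError("mode must be 'linear', 'log', or 'logit'")

def bin_sfs(sfs, mode="logit", bins=10):
    """Sum an unbinned SFS within frequency bins and divide by bin width."""
    sfs = np.asarray(sfs, dtype=float)
    n = len(sfs) + 1
    freqs = np.arange(1, n) / n          # frequency of class k is k/n
    edges = make_bin_edges(n, mode, bins)
    totals, _ = np.histogram(freqs, bins=edges, weights=sfs)
    return edges, totals / np.diff(edges)

# Example: a 1/k spectrum for a sample of 100, binned on the logit scale.
raw = 1.0 / np.arange(1, 100)
edges, density = bin_sfs(raw, mode="logit", bins=8)
```

Logit spacing places narrow bins near both frequency 0 and frequency 1, which is useful when the rare and the nearly fixed classes are both of interest; log spacing resolves only the rare end.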