> ## Documentation Index
> Fetch the complete documentation index at: https://demo.agenticlabs.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Overview

## High-level description

This script generates and annotates a collection of phylogenetic trees (a "forest") using the `jungle` library. It allows specifying the number of trees, leaves per tree, and a shape parameter controlling the tree structure. The resulting annotated forest is then saved to a compressed pickle file.

## References

This script references the `jungle` library (`jg`) for phylogenetic tree generation and manipulation.

## Symbols

### `generate_annotate_forest.py`

#### Description

This script generates a collection of phylogenetic trees, annotates them with various features, and saves the resulting forest data structure to a file.

#### Inputs

This script takes its input from command line arguments:

| Name         | Type  | Description                                                                                   |
| :----------- | :---- | :-------------------------------------------------------------------------------------------- |
| `n_leaves`   | int   | Number of leaves (tips) in each generated tree.                                               |
| `n_trees`    | int   | Number of trees to generate for the forest.                                                   |
| `alpha`      | float | Shape parameter influencing the tree structure (2.0 for neutral, 1.0 for positive selection). |
| `output_dir` | str   | Path to the directory where the output file will be saved.                                    |

#### Outputs

The script generates a gzipped pickle file containing the annotated forest data structure. The file name is based on the input parameters and a unique identifier.

#### Internal Logic

1. **Parameter Parsing:** Reads input parameters from command line arguments.
2. **Output File Naming:** Constructs the output file name based on input parameters and a UUID.
3. **Forest Generation:** Uses the `jungle` library to generate a forest of `n_trees` trees, each with `n_leaves` leaves, using the specified `alpha` shape parameter.
4. **Forest Annotation:**
   * Resolves potential multifurcations (polytomies) in the generated trees.
   * Annotates the trees with standard node features (e.g., distance to root).
   * Calculates and annotates the Colless index for each tree, a measure of tree balance.
5. **Forest Serialization:** Saves the annotated forest object to a gzipped pickle file.

#### Side Effects

* Creates a file in the specified output directory.
* Prints status messages and timing information to the console if `verbose` is True.

## Dependencies

| Dependency | Purpose                                                                              |
| :--------- | :----------------------------------------------------------------------------------- |
| `jungle`   | Provides functionality for phylogenetic tree generation, manipulation, and analysis. |
| `sys`      | Used for accessing command line arguments.                                           |
| `time`     | Used for tracking the script's execution time.                                       |
| `uuid`     | Used for generating a unique identifier for the output file.                         |
| `pickle`   | Used for serializing and saving the forest object.                                   |
| `gzip`     | Used for compressing the output pickle file.                                         |

## Configuration

The script's behavior is controlled by command line arguments, as described in the "Inputs" section.

## Logging

The script prints informative messages to the console if the `verbose` variable is set to `True`. This includes parameter values, progress updates, and timing information.
