Commit b62a863

Merge pull request #116 from MannLabs/update_developers_readme
update developer readme
2 parents 38b5bed + acf0847 commit b62a863

2 files changed

Lines changed: 29 additions & 28 deletions

DEVELOPERS.md

Lines changed: 28 additions & 28 deletions
@@ -9,57 +9,57 @@ AlphaQuant is designed with modularity in mind to allow practitioners to introdu

 ## 1. Ion-Level Statistical Testing

-**Where to modify:** `alphaquant/diffquant/diff_analysis.py`
+**Where to modify:** [`alphaquant/diffquant/diff_analysis.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py)

 **How it works:** Each ion (fragment, peptide, etc.) is tested independently for differential expression. The test produces three key outputs: `p_val` (p-value), `fc` (log2 fold change), and `z_val` (z-score for aggregation).

 **Main class:**
-- **`DifferentialIon`** - The default method that uses intensity-dependent empirical background distributions to compute p-values and z-scores. It accounts for technical variation by comparing observed fold changes against distributions derived from similarly abundant ions in the dataset. The core statistical logic is in the `_calc_diffreg_peptide()` method.
+- [**`DifferentialIon`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L10) - The default method that uses intensity-dependent empirical background distributions to compute p-values and z-scores. It accounts for technical variation by comparing observed fold changes against distributions derived from similarly abundant ions in the dataset. The core statistical logic is in the [`_calc_diffreg_peptide()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L46) method.

-**How to extend:** We've included `DifferentialIonTTest` in the same file as example code demonstrating how to implement alternative tests. This variant uses Welch's t-test with robust variance estimation. Note that this example has not been extensively benchmarked and is included for educational purposes to demonstrate the interface.
+**How to extend:** We've included [`DifferentialIonTTest`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L99) in the same file as example code demonstrating how to implement alternative tests. This variant uses Welch's t-test with robust variance estimation. Note that this example has not been extensively benchmarked and is included for educational purposes to demonstrate the interface.

 1. Create a new class (e.g., `DifferentialIonMyMethod`) with the same interface:
 - `__init__()` should accept `(noNanvals_from, noNanvals_to, ...)` and any method-specific parameters
 - Set attributes: `name`, `p_val`, `fc`, `z_val`, `usable`
 2. Implement your statistical test in a method (e.g., `_calc_mymethod()`)
-3. Modify `alphaquant/diffquant/condpair_analysis.py` (lines 67-70) to instantiate your class
+3. Modify [`alphaquant/diffquant/condpair_analysis.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/condpair_analysis.py#L67-L70) (lines 67-70) to instantiate your class
 4. Optionally, add a parameter to `run_pipeline()` to select between methods

 The key requirement is that your class must output `p_val`, `fc`, and `z_val` for each ion—these are used by the tree aggregation framework.
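
The interface described in the steps above could be sketched roughly as follows. This is not the code in `diff_analysis.py`: the `name` and `min_reps` parameters, the sign convention of `fc` (taken here as "from" minus "to"), and the way the p-value is turned into a signed z-score are all assumptions made for illustration. Only the attribute names `name`, `p_val`, `fc`, `z_val`, `usable` and the positional arguments `noNanvals_from`, `noNanvals_to` come from the text.

```python
import numpy as np
from scipy import stats


class DifferentialIonMyMethod:
    """Hypothetical ion-level test exposing name, p_val, fc, z_val and usable."""

    def __init__(self, noNanvals_from, noNanvals_to, name="my_ion", min_reps=2):
        # `name` and `min_reps` are illustrative parameters, not AlphaQuant's.
        self.name = name
        self.p_val = None
        self.fc = None
        self.z_val = None
        self.usable = False

        vals_from = np.asarray(noNanvals_from, dtype=float)
        vals_to = np.asarray(noNanvals_to, dtype=float)
        if len(vals_from) >= min_reps and len(vals_to) >= min_reps:
            self._calc_mymethod(vals_from, vals_to)

    def _calc_mymethod(self, vals_from, vals_to):
        # Assumption: intensities are already log2-transformed, so the
        # difference of means is a log2 fold change.
        fc = float(np.mean(vals_from) - np.mean(vals_to))
        # Welch's t-test (unequal variances), two-sided.
        _, p_val = stats.ttest_ind(vals_from, vals_to, equal_var=False)
        if np.isnan(p_val):
            return  # e.g. zero variance in both groups; leave usable=False
        self.fc = fc
        self.p_val = float(p_val)
        # Signed z-score matching the two-sided p-value.
        self.z_val = float(np.sign(fc) * stats.norm.isf(p_val / 2))
        self.usable = True


ion = DifferentialIonMyMethod([10.0, 10.2, 9.9], [12.0, 12.1, 11.8])
too_few = DifferentialIonMyMethod([10.0], [12.0, 12.1])
print(ion.usable, ion.fc, ion.p_val, ion.z_val)
print(too_few.usable)
```

Leaving `usable` as `False` when the test cannot be computed lets downstream code skip the ion without special-casing missing statistics.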

 ## 2. Tree-Based Ion Propagation

-**Where to modify:** `alphaquant/cluster/cluster_utils.py` and `alphaquant/cluster/cluster_ions.py`
+**Where to modify:** [`alphaquant/cluster/cluster_utils.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py) and [`alphaquant/cluster/cluster_ions.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_ions.py)

 **How it works:** Statistics from child nodes (e.g., fragments) are aggregated to parent nodes (e.g., peptides → proteins) in a hierarchical tree. Z-values are combined using Stouffer's method, and fold changes are summarized using medians.

 **Key functions:**
-- **`aggregate_node_properties()`** - The core function that propagates statistics up the tree. It combines z-values, fold changes, and quality metrics from children to parents.
-- **`sum_and_re_scale_zvalues()`** - Implements Stouffer's Z-score method: sums z-values and divides by sqrt(n), then rescales to maintain standard normal distribution.
-- **`transform_znormed_to_pval()`** - Converts aggregated z-scores back to two-sided p-values.
+- [**`aggregate_node_properties()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L22) - The core function that propagates statistics up the tree. It combines z-values, fold changes, and quality metrics from children to parents.
+- [**`sum_and_re_scale_zvalues()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L266) - Implements Stouffer's Z-score method: sums z-values and divides by sqrt(n), then rescales to maintain standard normal distribution.
+- [**`transform_znormed_to_pval()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L289) - Converts aggregated z-scores back to two-sided p-values.
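
As a minimal sketch of the arithmetic described above, Stouffer's method and the conversion back to a two-sided p-value might look like this. The function names are hypothetical and are not the ones in `cluster_utils.py`; the additional rescaling step mentioned for `sum_and_re_scale_zvalues()` is not reproduced here because its details are not given in this text.

```python
import numpy as np
from scipy import stats


def combine_zvalues_stouffer(z_values):
    """Unweighted Stouffer combination: sum of z-values divided by sqrt(n)."""
    z = np.asarray(z_values, dtype=float)
    return float(z.sum() / np.sqrt(len(z)))


def zvalue_to_two_sided_pval(z):
    """Two-sided p-value of a z-score under a standard normal null."""
    return float(2 * stats.norm.sf(abs(z)))


# Four child z-values (e.g. fragments) combined into one parent value.
z_parent = combine_zvalues_stouffer([1.2, 2.0, 1.5, 0.9])
p_parent = zvalue_to_two_sided_pval(z_parent)
print(z_parent, p_parent)
```

Dividing by sqrt(n) keeps the combined statistic standard normal only if the child z-values are independent, which is one reason a further correction may be needed when children are correlated.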

 **How to extend:** If you want to use different aggregation methods:
-1. Modify `sum_and_re_scale_zvalues()` to implement your preferred meta-analysis method (e.g., Fisher's method, weighted Z-scores, etc.)
-2. If your method changes the distribution, update `transform_znormed_to_pval()` accordingly
-3. For fold-change aggregation, modify line 67 in `aggregate_node_properties()` where `node.fc = np.median(fcs)` is set
+1. Modify [`sum_and_re_scale_zvalues()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L266) to implement your preferred meta-analysis method (e.g., Fisher's method, weighted Z-scores, etc.)
+2. If your method changes the distribution, update [`transform_znormed_to_pval()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L289) accordingly
+3. For fold-change aggregation, modify [line 67 in `aggregate_node_properties()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L67) where `node.fc = np.median(fcs)` is set
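
To illustrate why step 2 matters, the sketch below uses SciPy's `scipy.stats.combine_pvalues` to try two of the alternatives named in step 1. It is a standalone experiment with made-up p-values and weights, not a drop-in replacement for the AlphaQuant functions. Fisher's statistic follows a chi-square distribution with 2k degrees of freedom rather than a standard normal, so a z-to-p conversion would no longer apply to it.

```python
from scipy import stats

# Illustrative child p-values and weights (assumed values, not real data).
child_pvals = [0.01, 0.04, 0.20]
weights = [3, 2, 1]

# Fisher's method: statistic = -2 * sum(log(p)), chi-square with 2k dof.
fisher_stat, fisher_p = stats.combine_pvalues(child_pvals, method="fisher")

# Weighted Stouffer: the statistic is a z-score, so a normal-based
# conversion to a p-value remains valid.
stouffer_stat, stouffer_p = stats.combine_pvalues(
    child_pvals, method="stouffer", weights=weights
)

print(fisher_stat, fisher_p)
print(stouffer_stat, stouffer_p)
```

Note that combining p-values this way discards the direction of regulation, whereas combining signed z-values retains it.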

-The tree traversal itself is in `cluster_ions.py`:
-- **`cluster_along_specified_levels()`** - Iterates through tree levels bottom-to-top
-- **`get_scored_clusterselected_ions()`** - Entry point for the hierarchical workflow
+The tree traversal itself is in [`cluster_ions.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_ions.py):
+- [**`cluster_along_specified_levels()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_ions.py#L117) - Iterates through tree levels bottom-to-top
+- [**`get_scored_clusterselected_ions()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_ions.py#L31) - Entry point for the hierarchical workflow

 ## 3. Multiple Testing Correction

-**Where to modify:** `alphaquant/tables/diffquant_table.py` and `alphaquant/tables/proteoformtable.py`
+**Where to modify:** [`alphaquant/tables/diffquant_table.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/diffquant_table.py) and [`alphaquant/tables/proteoformtable.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/proteoformtable.py)

 **How it works:** FDR correction is applied separately to different result tables during output generation. The method outputs p-values in all tables, so you can always recalculate q-values from the output files.
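
Recalculating q-values from exported p-values can be done outside the pipeline. The following is a generic Benjamini-Hochberg sketch in NumPy, not the implementation used in `diffquant_table.py`; the function name is hypothetical and the input is a plain list of p-values rather than a specific output column.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity: each q-value is at most the next larger one.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


q_values = benjamini_hochberg([0.001, 0.02, 0.03, 0.5])
print(q_values)
```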

 **Key functions:**
-- **Protein results** (`alphaquant/tables/diffquant_table.py`):
-- `_add_fdr_fc_based_set()` - Applies Benjamini-Hochberg to intensity-based proteins
-- `_add_fdr_counting_based_set()` - Applies adjusted Benjamini-Hochberg to proteins detected only via missing values
+- **Protein results** ([`alphaquant/tables/diffquant_table.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/diffquant_table.py)):
+- [`_add_fdr_fc_based_set()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/diffquant_table.py#L107) - Applies Benjamini-Hochberg to intensity-based proteins
+- [`_add_fdr_counting_based_set()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/diffquant_table.py#L124) - Applies adjusted Benjamini-Hochberg to proteins detected only via missing values

-- **Proteoform results** (`alphaquant/tables/proteoformtable.py`):
-- `_annotate_fdr_column()` - Applies Benjamini-Hochberg to test if alternative proteoforms differ from the reference
+- **Proteoform results** ([`alphaquant/tables/proteoformtable.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/proteoformtable.py)):
+- [`_annotate_fdr_column()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/tables/proteoformtable.py#L59) - Applies Benjamini-Hochberg to test if alternative proteoforms differ from the reference

 **How to extend:**
 1. Modify the relevant function to use a different method (e.g., Bonferroni, Storey's q-value, etc.)
@@ -68,30 +68,30 @@ The tree traversal itself is in `cluster_ions.py`:

 ## 4. Outlier Robustness

-**Where to modify:** `alphaquant/diffquant/diff_analysis.py` and `alphaquant/cluster/cluster_utils.py`
+**Where to modify:** [`alphaquant/diffquant/diff_analysis.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py) and [`alphaquant/cluster/cluster_utils.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py)

 **How it works:** AlphaQuant applies outlier correction at two levels to make results robust to technical variation and biological heterogeneity.

 **Key functions:**
-- **`calc_outlier_scaling_factor()`** (in `diff_analysis.py`) - Compares between-replicate variance to expected technical variance and inflates estimates when replicates show unusual variability
-- **`remove_outlier_fragion_childs()`** (in `cluster_utils.py`) - Filters extreme fragments before aggregating to peptides (keeps the 5 most central fragments when >4 are available)
+- [**`calc_outlier_scaling_factor()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L202) (in [`diff_analysis.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py)) - Compares between-replicate variance to expected technical variance and inflates estimates when replicates show unusual variability
+- [**`remove_outlier_fragion_childs()`**](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L222) (in [`cluster_utils.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py)) - Filters extreme fragments before aggregating to peptides (keeps the 5 most central fragments when >4 are available)

 **How to extend:**
-1. Modify the scaling logic in `calc_outlier_scaling_factor()` to use different robust estimators
-2. Adjust `remove_outlier_fragion_childs()` to change how many fragments are retained or which criteria are used for selection
+1. Modify the scaling logic in [`calc_outlier_scaling_factor()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L202) to use different robust estimators
+2. Adjust [`remove_outlier_fragion_childs()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/cluster/cluster_utils.py#L222) to change how many fragments are retained or which criteria are used for selection
 3. Set `outlier_correction=False` in `run_pipeline()` to disable this feature entirely
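
One way to experiment with the fragment-selection criterion from step 2 is sketched below. It assumes that "most central" means closest to the median of the fragments' z-values, which this text does not confirm. The function name and its parameters are hypothetical and do not mirror `remove_outlier_fragion_childs()`.

```python
import numpy as np


def keep_most_central(z_values, n_keep=5, min_required=5):
    """Return sorted indices of the n_keep values closest to the median.

    If fewer than min_required values are available, all are kept.
    """
    z = np.asarray(z_values, dtype=float)
    if len(z) < min_required:
        return np.arange(len(z))
    distance = np.abs(z - np.median(z))
    # Stable sort so ties are resolved by original order.
    keep = np.argsort(distance, kind="stable")[:n_keep]
    return np.sort(keep)


fragment_z = [0.1, 0.3, -0.2, 5.0, 0.2, -4.0, 0.0]
kept = keep_most_central(fragment_z)
print(kept)  # indices of the retained fragments
```

Changing `n_keep` or swapping the median for another robust location estimate is then a small, local change to the selection rule.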

 ## 5. Main Workflow Orchestration

-**Where to modify:** `alphaquant/diffquant/condpair_analysis.py`
+**Where to modify:** [`alphaquant/diffquant/condpair_analysis.py`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/condpair_analysis.py)

-**How it works:** The `analyze_condpair()` function coordinates the complete pipeline for comparing two conditions.
+**How it works:** The [`analyze_condpair()`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/condpair_analysis.py#L27) function coordinates the complete pipeline for comparing two conditions.

 **Pipeline steps:**
 1. Load and filter data for the two conditions
 2. Perform normalization (within and between conditions)
 3. Create empirical background distributions
-4. Compute ion-level differential statistics (`DifferentialIon` or `DifferentialIonTTest`)
+4. Compute ion-level differential statistics ([`DifferentialIon`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L10) or [`DifferentialIonTTest`](https://github.com/MannLabs/alphaquant/blob/main/alphaquant/diffquant/diff_analysis.py#L99))
 5. Build hierarchical trees and perform clustering to identify proteoforms
 6. Apply machine learning quality scoring (if enabled)
 7. Filter outlier peptides (if enabled)

README.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ AlphaQuant is designed for proteomics researchers analyzing DDA or DIA experimen
 * [**Python and jupyter notebooks**](#python-and-jupyter-notebooks)
 * [**Troubleshooting**](#troubleshooting)
 * [**Citations**](#citations)
+* [**For Developers: Modifying AlphaQuant**](#for-developers-modifying-alphaquant)
 * [**How to contribute**](#how-to-contribute)
 * [**License**](#license)
 * [**Changelog**](#changelog)
