Commit c1cd69a

Merge pull request #1 from Sinha-CompBio-Lab/update-readme
Improve README.md formatting, documentation, and updated abstract
2 parents d4e0172 + 62c5259 commit c1cd69a

1 file changed

Lines changed: 8 additions & 8 deletions

README.md

@@ -1,7 +1,7 @@
 # TLPath
-## Framework for predicting Telomere Length form Whole Slide Image
+## Framework for predicting Telomere Length from a Whole Slide Image
 
-**Abstract:** Telomeres are protective nucleoprotein complexes at chromosome ends and their shortening in aged tissues is one of the aging hallmarks. This shortening is associated with various age-related diseases and increased mortality risk. However, we lack high-throughput methods to measure telomere length. Prior studies pointed to underlying connection between telomere length and cellular morphology, but lacks a systematic assessment. We developed TLPath, a computational framework that predicts the length of the telomere from routinely available tissue histopathology (H&E) images. TLPath was trained on a paired dataset with both H&E and telomere length labels via Luminex assay from public GTex cohort, comprising XX patch images, 5,285 whole-slide images, from 926 non-disease individuals spanning 18 tissue. TLPath comprises four-steps: 1) preprocess tissue slides 2) extract morphological features from each patch using UNI, a pretrained foundation model, 3) pool these features to create a whole slide-level representation, and, 4) develop a morphological features-based model to predict tissue telomere length. When TLPath was tested on test GTEx data never seen before to the model, it predicts telomere length with an average correlation = 0.517 across 11 tissue types. In comparison, chronological age predict this with a correlation of 0.12. Beyond predicting telomere length in individuals of a wide age-range, we found that TLPath can also predict telomere length in age-matched samples across all tissues. TLPath’s most important features were nucleus-to-cytoplasmic ratio and variation in nucleus shape. TLPath is the first-in-class digital pathology tool to predict telomere length. enabling large-scale quantification.
+**Abstract:** Telomere dysfunction is a key hallmark of aging linked to numerous age-related diseases including cardiovascular disorders, pulmonary fibrosis, and metabolic syndromes. Despite decades of research yielding strong evidence linking telomere biology to aging processes, the field faces a critical bottleneck: current telomere measurement methods require specialized molecular techniques that prevent large-scale studies and clinical implementation. Here we present TLPath, a novel deep learning framework that extracts normal tissue architecture from routine histopathology (H&E) images to predict bulk-tissue telomere length. Trained on the Genotype-Tissue Expression cohort comprising >7.3 million patch images from >5,000 whole-slide images across 919 individuals, TLPath makes a remarkable discovery: the extracted morphological features spontaneously separate young, middle-aged, and elderly individuals within most tissue types—demonstrating for the first time that aging causes substantial architectural changes in tissues detectable without explicit age supervision. These extracted features can predict bulk-telomere length with significant accuracy (>0.51 in well-represented tissues), outperforming chronological age as a predictor (correlation = 0.20) and identifying age-discordant cases – detecting both accelerated telomeres shortening in young individuals and preserved telomeres in older individuals. Mechanistic interpretation reveals that TLPath leverages established senescence morphological markers, including nuclear-to-cytoplasmic ratio and nuclear shape variation, for its predictions. We applied TLPath in ~2,800 new GTEx biopsies where concordant with known association, the predicted telomere length is shorter across most tissues from individuals with Type 1/2 diabetes. Overall, we demonstrate that aging substantially alters tissue morphology, which TLPath captures and uses to predict telomere length, enabling large-scale telomere biology studies using existing tissue archives.
 
 ![alt text](docs/pics/Title_Photo.png)
 
@@ -15,18 +15,18 @@ conda env create -f env.yaml
 conda activate TLPath
 ```
 ### 1. Get Access
-To preprocess and get the UNI features from the H&e slides you need access to UNI model weight. Please follow the instructions [here](https://github.com/mahmoodlab/UNI) to get access to UNI weights. For ease of reproducibility we have provided the whole slide level features which are mean aggregation of patch level features with this code. You may find it at `{ZENODO_PLACEHOLDER}`
+To pre-process and get the UNI features from the H&E slides you need access to UNI model weights. Please follow the instructions [here](https://github.com/mahmoodlab/UNI) to get access to UNI weights. For ease of reproducibility we have provided the whole slide level features which are the mean aggregated patch level features. You may find it at `{ZENODO_PLACEHOLDER}`
 
 ### 2. Running Inference
-To run an inference on the UNI features please follow the guide in the notebook `run_inference.ipynb`
+To run an inference on the UNI features please follow the guide in the notebook: `run_inference.ipynb`
 
 ### 3. Training TLPath
-To train TLPath please follow the notebook `train_TLPath.ipynb`. TLPath can also be trained using CLI.
-Telomere data file can be downloaded from : https://gtexportal.org/home/downloads/egtex/telomeres
+To train TLPath please follow the notebook: `train_TLPath.ipynb`. TLPath can also be trained via CLI.
+GTEx telomere data is publicly available and can be downloaded from: https://gtexportal.org/home/downloads/egtex/telomeres
 `python /tlpath/model.py --telomere-file /path/to/telomere.csv --features_dir /path/to/features --output-dir /path/to/output --config /path/to/config.yaml --tissues-to-skip Tissue1 Tissue2`
 
 - `--telomere-file` → Path to the telomere length data CSV file.
 - `--features_dir` → Directory containing patch features.
-- `--output-dir (optional)` → Directory to save results and models (default: results/TLPath).
+- `--output-dir (optional)` → Directory to save results and models. (default: results/TLPath)
 - `--config (optional)` → Path to a YAML configuration file.
-- `--tissues-to-skip (optional)` → List of tissues to exclude from analysis.(default: None)
+- `--tissues-to-skip (optional)` → List of tissues to exclude from analysis. (default: None)
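
The options documented in the last hunk map naturally onto an `argparse` parser. The sketch below shows how such a command line could be parsed. It is not taken from `tlpath/model.py`. The function name `build_parser`, the use of `nargs="*"`, and treating the two path arguments as required are assumptions made for illustration. Only the flag spellings and the stated defaults come from the README text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in the README."""
    parser = argparse.ArgumentParser(description="Train TLPath (illustrative sketch)")
    # Flag spellings follow the README, including the underscore in --features_dir.
    parser.add_argument("--telomere-file", required=True,
                        help="Path to the telomere length data CSV file.")
    parser.add_argument("--features_dir", required=True,
                        help="Directory containing patch features.")
    parser.add_argument("--output-dir", default="results/TLPath",
                        help="Directory to save results and models.")
    parser.add_argument("--config", default=None,
                        help="Path to a YAML configuration file.")
    # Assumption: zero or more tissue names may follow the flag.
    parser.add_argument("--tissues-to-skip", nargs="*", default=None,
                        help="List of tissues to exclude from analysis.")
    return parser


# Parse an example command line equivalent to the one shown in the README,
# leaving --output-dir and --config at their defaults.
args = build_parser().parse_args([
    "--telomere-file", "/path/to/telomere.csv",
    "--features_dir", "/path/to/features",
    "--tissues-to-skip", "Tissue1", "Tissue2",
])
print(args.output_dir, args.tissues_to_skip)
```

Note that `argparse` converts hyphens in long option names to underscores when naming attributes, so `--telomere-file` is read back as `args.telomere_file`, while `--features_dir` keeps its underscore as written.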
