NUBagciLab/Prostate-Lesion-Segmentation

Align then Refine: Text-Guided 3D Prostate Lesion Segmentation

Our paper, Align then Refine: Text-Guided 3D Prostate Lesion Segmentation, has been accepted for presentation at the 48th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE EMBC 2026).

This README explains how to use the customized training and inference pipeline in this repository, which covers the multi-encoder design, text conditioning, and attention-based refinement.

1) What to run

Main scripts used in this repo:

  • tools/train/run_text_single.sh: trains the base text-guided model (Stage A)
  • tools/train/run_text_attn_best.sh: attention fine-tuning with tuned defaults, initialized from a pretrained checkpoint (Stage B)
  • tools/infer/run_text_test.sh: inference with optional evaluation
  • tools/eval/compute_segmentation_metrics.py: additional metric report

Tools are organized by purpose:

  • tools/train: training and finetuning launchers
  • tools/infer: prediction/inference launchers
  • tools/eval: postprocessing and metric utilities

2) Environment

cd <PROJECT_ROOT>

# 1) Create the base nnU-Net v2 environment (follow the official nnU-Net setup)
conda create -n nnunetv2_repro python=3.10 -y
conda activate nnunetv2_repro

# install this fork in editable mode
pip install -e .

# 2) Extra dependency used by this text-guided pipeline
# (install if not already present in your nnU-Net environment)
pip install open-clip-torch

3) Required paths

Set these before training or inference:

export nnUNet_raw=<NNUNET_RAW>
export nnUNet_preprocessed=<NNUNET_PREPROCESSED>
export nnUNet_results=<NNUNET_RESULTS>
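
A quick pre-flight check can catch a missing export before a long run starts. The following is a minimal Python sketch, assuming only that the three variable names above are the ones the scripts read; the helper name missing_nnunet_paths is hypothetical and not part of this repository.

```python
import os

# Variable names taken from the exports above.
REQUIRED_VARS = ("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results")


def missing_nnunet_paths(env=None):
    """Return the required nnU-Net variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_nnunet_paths()
    if missing:
        print("Please export: " + ", ".join(missing))
    else:
        print("All nnU-Net paths are set.")
```

The check only looks at whether the variables are set; it does not verify that the directories exist.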

4) Dataset and fold defaults

Examples below use:

  • Dataset: Dataset2203_picai_split
  • Config: 3d_fullres
  • Plans: nnUNetPlans
  • Fold: 1

Adjust as needed.

5) Training and Inference Workflow

Stage A: Base text model

# QUICK=0 is closer to full training
QUICK=0 bash tools/train/run_text_single.sh 0 1 Dataset2203_picai_split 3d_fullres nnUNetPlans tversky

Args:

  • <GPU_ID> [FOLD] [DATASET] [CONFIG] [PLANS] [LOSS]
  • [LOSS]: dice | tversky | focal_tversky (mapped internally to *_topk)

Common env vars:

  • QUICK (1: faster debug run; 0: fuller run)
  • NNUNET_TRAINER, NNUNET_PRETRAINED_WEIGHTS
  • NNUNET_RESULTS_TAG / NNUNET_RESULTS_DIR
  • NNUNET_ITERS_PER_EPOCH, NNUNET_VAL_ITERS
  • NNUNET_TEXT_PROMPTS, NNUNET_TEXT_MODEL, NNUNET_TEXT_EMBED_DIM
  • NNUNET_TEXT_MODULATION (none|film|gate)
  • NNUNET_USE_ALIGNMENT_HEAD, NNUNET_RETURN_HEATMAP
  • NNUNET_LAMBDA_ALIGN, NNUNET_LAMBDA_HEAT
  • NNUNET_AUX_WARMUP_EPOCHS, NNUNET_AUX_RAMP_EPOCHS
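
If you prefer to drive Stage A from Python (for example to sweep settings), the positional arguments and env vars above can be assembled as follows. This is a sketch under assumptions: stage_a_command and stage_a_env are hypothetical helpers, and the override values shown are illustrative rather than recommended settings.

```python
import os


def stage_a_command(gpu_id=0, fold=1, dataset="Dataset2203_picai_split",
                    config="3d_fullres", plans="nnUNetPlans", loss="tversky"):
    """Build the argument list for tools/train/run_text_single.sh."""
    if loss not in ("dice", "tversky", "focal_tversky"):
        raise ValueError(f"unsupported loss: {loss}")
    return ["bash", "tools/train/run_text_single.sh",
            str(gpu_id), str(fold), dataset, config, plans, loss]


def stage_a_env(overrides, base=None):
    """Copy the base environment and apply overrides as strings."""
    env = dict(os.environ if base is None else base)
    env.update({key: str(value) for key, value in overrides.items()})
    return env


cmd = stage_a_command(loss="focal_tversky")
# Illustrative overrides only: a fuller run with FiLM text modulation.
env = stage_a_env({"QUICK": 0, "NNUNET_TEXT_MODULATION": "film"})
# To launch from the project root:
#   import subprocess; subprocess.run(cmd, env=env, check=True)
```

Nothing is executed until subprocess.run is called, so the command and environment can be inspected first.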

Loss options for run_text_single.sh (6th argument):

  • dice
  • tversky
  • focal_tversky

Examples:

# Dice + TopKCE
QUICK=0 bash tools/train/run_text_single.sh 0 1 Dataset2203_picai_split 3d_fullres nnUNetPlans dice

# Tversky + TopKCE
QUICK=0 bash tools/train/run_text_single.sh 0 1 Dataset2203_picai_split 3d_fullres nnUNetPlans tversky

# Focal-Tversky + TopKCE
QUICK=0 bash tools/train/run_text_single.sh 0 1 Dataset2203_picai_split 3d_fullres nnUNetPlans focal_tversky

Notes:

  • tools/train/run_text_attn_best.sh launches with tversky.
  • You can still override the internal loss selection through an environment variable, e.g. set NNUNET_TEXT_LOSS=dice_topk before launch.
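
The loss selection can be pictured as a small name mapping. The sketch below assumes the internal name is the CLI name plus a _topk suffix, which matches the dice_topk example but is not confirmed here for the other two options, and that NNUNET_TEXT_LOSS takes precedence when set. resolve_text_loss is a hypothetical helper, not the launcher's actual code.

```python
import os

CLI_LOSSES = ("dice", "tversky", "focal_tversky")


def resolve_text_loss(cli_loss, env=None):
    """Return the internal loss name for a CLI loss argument."""
    env = os.environ if env is None else env
    override = env.get("NNUNET_TEXT_LOSS")
    if override:
        # An explicit env override wins over the 6th CLI argument.
        return override
    if cli_loss not in CLI_LOSSES:
        raise ValueError(f"unsupported loss: {cli_loss}")
    # Assumed naming rule: <cli name>_topk.
    return cli_loss + "_topk"
```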

Expected output model folder:

<NNUNET_RESULTS>/
  Dataset2203_picai_split/
    nnUNetTrainerMultiEncoderUNetText__nnUNetPlans__3d_fullres/
      fold_1/
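
The folder name follows the nnU-Net v2 pattern of trainer, plans, and configuration joined by double underscores. A small sketch for composing it in Python; fold_dir is a hypothetical helper and /data/results is a placeholder root, not a path from this repository.

```python
from pathlib import PurePosixPath


def fold_dir(results_root, dataset, trainer, plans, config, fold):
    """Compose <results>/<dataset>/<trainer>__<plans>__<config>/fold_<n>."""
    model_dir = f"{trainer}__{plans}__{config}"
    return PurePosixPath(results_root) / dataset / model_dir / f"fold_{fold}"


example = fold_dir(
    "/data/results",  # placeholder for your nnUNet_results directory
    "Dataset2203_picai_split",
    "nnUNetTrainerMultiEncoderUNetText",
    "nnUNetPlans",
    "3d_fullres",
    1,
)
```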

Stage B: Attention fine-tuning

# auto-loads the Stage A checkpoint_best.pth for this fold if present
QUICK=0 bash tools/train/run_text_attn_best.sh 0 1 Dataset2203_picai_split 3d_fullres nnUNetPlans

Args:

  • <GPU_ID> [FOLD] [DATASET] [CONFIG] [PLANS] [PRETRAINED_CKPT]

Key env vars:

  • NNUNET_CROSS_GAMMA_INIT (default 0.10)
  • NNUNET_CROSS_ALPHA (default 0.12)
  • NNUNET_CROSS_TAU (default 0.44)
  • ATTN_WARMUP_EPOCHS (default 0)
  • BASE_LR_REFINER (default 5e-4)
  • NNUNET_PRETRAINED_WEIGHTS (override preload checkpoint)
  • NNUNET_RESULTS_TAG (output experiment tag)
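
To see which values a Stage B run would pick up, the documented defaults can be combined with any overrides in the environment. This is a sketch only: stage_b_settings is a hypothetical helper, and how the launcher itself parses these variables is an assumption.

```python
import os

# Defaults as listed above.
STAGE_B_DEFAULTS = {
    "NNUNET_CROSS_GAMMA_INIT": 0.10,
    "NNUNET_CROSS_ALPHA": 0.12,
    "NNUNET_CROSS_TAU": 0.44,
    "ATTN_WARMUP_EPOCHS": 0,
    "BASE_LR_REFINER": 5e-4,
}


def stage_b_settings(env=None):
    """Merge environment overrides into the documented defaults."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in STAGE_B_DEFAULTS.items():
        raw = env.get(name)
        # Cast an override to the same type as its default (int or float).
        settings[name] = default if raw in (None, "") else type(default)(raw)
    return settings
```

For example, stage_b_settings({"NNUNET_CROSS_TAU": "0.5"}) keeps every default except the tau value.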

6) Inference

bash tools/infer/run_text_test.sh <INPUT_DIR> <OUTPUT_DIR> 0 <MODEL_PATH>
  • INPUT_DIR: nnUNet-style input images directory
  • OUTPUT_DIR: prediction output directory
  • 0: GPU id

Args:

  • <INPUT_DIR> <OUTPUT_DIR> [GPU_ID] [MODEL_PATH]
  • [MODEL_PATH]: optional model path. This can be a checkpoint file, a fold_* directory, or a model/results directory.
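
The three accepted forms of MODEL_PATH can be told apart by name alone. The sketch below assumes checkpoint files end in .pth (as checkpoint_best.pth does) and that fold directories are named fold_*; classify_model_path is a hypothetical helper, and the script's own resolution logic may differ.

```python
from pathlib import Path


def classify_model_path(model_path):
    """Classify a MODEL_PATH as 'checkpoint', 'fold_dir', or 'model_dir'."""
    path = Path(model_path)
    if path.suffix == ".pth":
        return "checkpoint"
    if path.name.startswith("fold_"):
        return "fold_dir"
    # Anything else is treated as a model/results directory.
    return "model_dir"
```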

Key env vars:

  • NNUNET_DATASET, NNUNET_CONFIG, NNUNET_PLANS
  • NNUNET_TEST_TRAINER, NNUNET_FOLD, NNUNET_CHECKPOINT_FILE
  • NNUNET_SKIP_EVAL, NNUNET_GT_DIR
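
An inference call can be assembled the same way as the training launchers. In this sketch inference_command is a hypothetical helper, the directory names are placeholders, and the env var values simply mirror the defaults from section 4; the exact value formats the script accepts are an assumption.

```python
def inference_command(input_dir, output_dir, gpu_id, model_path=None):
    """Build the argument list for tools/infer/run_text_test.sh."""
    cmd = ["bash", "tools/infer/run_text_test.sh",
           str(input_dir), str(output_dir), str(gpu_id)]
    if model_path is not None:
        # MODEL_PATH is the optional 4th positional argument.
        cmd.append(str(model_path))
    return cmd


# Placeholder directories; replace with your own paths.
cmd = inference_command("imagesTs", "predictions", 0)

# Assumed value formats, mirroring the dataset and fold defaults above.
infer_env = {
    "NNUNET_DATASET": "Dataset2203_picai_split",
    "NNUNET_CONFIG": "3d_fullres",
    "NNUNET_PLANS": "nnUNetPlans",
    "NNUNET_FOLD": "1",
}
# To launch: import os, subprocess
#   subprocess.run(cmd, env={**os.environ, **infer_env}, check=True)
```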

7) Optional: extra metric computation

python tools/eval/compute_segmentation_metrics.py \
  --pred-dir <OUTPUT_DIR> \
  --gt-dir <GT_DIR> \
  --output-dir <METRIC_OUT_DIR>

Acknowledgement

This project is built on top of the excellent nnU-Net framework.

We gratefully acknowledge and thank the nnU-Net authors and contributors for open-sourcing and maintaining this powerful toolkit.

Citation

If you find this work useful, please cite:

@misc{sun2026alignrefinetextguided3d,
      title={Align then Refine: Text-Guided 3D Prostate Lesion Segmentation}, 
      author={Cuiling Sun and Linkai Peng and Adam Murphy and Elif Keles and Hiten D. Patel and Ashley Ross and Frank Miller and Baris Turkbey and Andrea Mia Bejar and Halil Ertugrul Aktas and Gorkem Durak and Ulas Bagci},
      year={2026},
      eprint={2604.18713},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.18713}, 
}

If you use our codebase, please also cite:

@article{isensee2021nnu,
  title   = {nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation},
  author  = {Isensee, Fabian and Jaeger, Paul F. and Kohl, Simon A. A. and Petersen, Jens and Maier-Hein, Klaus H.},
  journal = {Nature Methods},
  volume  = {18},
  number  = {2},
  pages   = {203--211},
  year    = {2021},
  doi     = {10.1038/s41592-020-01008-z}
}
