Merge pull request #11 from ChristianHinge/dev/documentation

ChristianHinge · web-flow · commit fbcf8b20787f · 2026-04-08T12:06:08.000+02:00
Change hu_to_mu formula (slight bug)
diff --git a/.gitignore b/.gitignore
@@ -1,5 +1,6 @@
 .old/
 conversion/
+docs_scripts/
 CLAUDE.md
 outputs*/
 docs_scripts/
diff --git a/docs/data-background.md b/docs/data-background.md
@@ -1,3 +1,5 @@
+[WIP]
+
 # PET Imaging Background
 
 This guide is for participants who are familiar with medical imaging (MRI, CT) but have not worked extensively with PET. It covers the concepts you need to understand the challenge task, the data, and the evaluation metrics.
diff --git a/docs/reconstruction.md b/docs/reconstruction.md
@@ -1,25 +1,40 @@
-# PET Reconstruction Pipeline
+# PET Reconstruction
 
-The reconstruction pipeline (`src/recon/`) converts a pseudo-CT into an attenuation-corrected PET image using [STIR](http://stir.sourceforge.net/) (Software for Tomographic Image Reconstruction). You do **not** need to understand or modify the pipeline to participate — it is run by the challenge organisers on your submissions. This guide is for participants who want to run it locally for closed-loop training or debugging.
+You do **not** need to understand reconstruction or attenuation correction to participate. However, an intuition for the first reconstruction steps can be a **significant advantage** when designing your pseudo-CT algorithm and loss function.
+
+## Introduction
+In PET, the radioactive tracer (FDG, a glucose analogue) emits positrons that immediately annihilate with nearby electrons, releasing two 511 keV photons traveling in exactly opposite directions. The detector rings record both photons simultaneously — a *coincidence event* — telling us which straight line through the patient the emission occurred on, but not where along that line<sup>1</sup>. Tallying all such lines produces a **sinogram**, the raw projection data from which the activity image is reconstructed. However, some photons are absorbed (attenuated) by tissue before reaching the detectors — an effect that is more pronounced for deep structures and dense bone. **Attenuation correction** compensates for this, and it is what your pseudo-CT enables.
+
+The PET reconstruction algorithm is called Ordinary Poisson Ordered Subsets Expectation Maximization (OP-OSEM). OP-OSEM is a maximum likelihood expectation maximization algorithm in which the observed data (the sinograms) are partitioned and processed in chunks (subsets), which dramatically speeds up reconstruction. The reconstruction for BIC-MAC performs OP-OSEM for `5 subsets x 4 iterations = 20 subiterations`. OP-OSEM uses three sinograms, of which the last two must be attenuation corrected: `prompts_rd85[.s/.hs]`, `add_nac_rd85[.s/.hs]`, and `mult_nac_rd85[.s/.hs]`<sup>2</sup>.
+
+The reconstruction pipeline for BIC-MAC (`src/recon/`) uses [STIR](http://stir.sourceforge.net/) (Software for Tomographic Image Reconstruction) to perform reconstruction (see [Further Reading](#further-reading) for a primer).
+
+> <sup>1</sup> A simplification. Modern scanners like the Siemens Quadra used for BIC-MAC can detect the tiny time-of-flight (TOF) difference between the arriving photons and roughly infer where on the line of response (LOR) the emission occurred. Consequently, TOF-enabled sinograms have an extra dimension.
+> 
+> <sup>2</sup> RD85 means maximum ring difference of 85 and defines how oblique the LORs are allowed to be. The Siemens Quadra allows up to RD322, but RD85 was chosen to reduce the sinogram size.
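Reviewer sketch (not part of this PR): to make the subset idea concrete, here is a toy version of plain OSEM on a made-up problem with 4 sinogram bins and 2 voxels. It is not the STIR implementation. The system matrix `A`, the data `y`, the subset split, and the function names are all illustrative assumptions, and the multiplicative and additive sinograms of OP-OSEM are left out for brevity.

```python
def forward(A, x, rows):
    """Forward project image x onto the sinogram bins listed in rows."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in rows]


def osem(A, y, subsets, n_iterations):
    """Plain OSEM: one multiplicative EM update per subset of sinogram bins."""
    n_vox = len(A[0])
    x = [1.0] * n_vox  # uniform, strictly positive start image
    for _ in range(n_iterations):
        for rows in subsets:  # each pass over one subset is a "subiteration"
            expected = forward(A, x, rows)
            for j in range(n_vox):
                sens = sum(A[i][j] for i in rows)  # subset sensitivity of voxel j
                back = sum(A[i][j] * y[i] / e
                           for i, e in zip(rows, expected) if e > 0)
                if sens > 0:
                    x[j] *= back / sens
    return x


# Hypothetical toy system: 4 bins, 2 voxels, noiseless data.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [0.5, 0.5]]
true_x = [2.0, 5.0]
y = forward(A, true_x, range(len(A)))

subsets = [[0, 2], [1, 3]]  # 2 subsets of 2 bins each
x = osem(A, y, subsets, n_iterations=50)
print(x)  # approaches true_x for this consistent toy problem
```

Each pass over a subset updates the image using only part of the data, which is why 5 subsets x 4 iterations gives 20 image updates for roughly the cost of 4 full passes over the sinogram.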
 
 ---
 
-## Pipeline Steps
+## BIC-MAC reconstruction steps
+Given a CT (ground-truth or pseudo-CT) and the subject's sinogram data (`recon/`), the reconstruction pipeline executes the following steps to arrive at a reconstructed attenuation-corrected PET NIfTI image:
+
+1. **Superimpose bed and pixelated face** - The pseudo-CT face is replaced by a pre-saved pixelated face. Likewise, everything outside a ~1 cm rim (pillows, bed, hair, air) is replaced by the ground truth image (see `ct_face_and_bed.nii.gz` and `face_and_bed_mask.nii.gz`). Consequently, the pseudo-CT algorithm will not benefit from trying to predict these areas. The `prediction_mask.nii.gz` under `ct-label` is the *inverse* mask of `face_and_bed_mask.nii.gz` and may be used to restrict training to the relevant body region. You can inspect the face-swapped intermediate file (`intermediates/ct_face_swapped.nii.gz`).
+
+2. **HU → μ-map** — The pseudo-CT is a volume of Hounsfield units (HU), a relative X-ray density scale where air = −1000 and water = 0. PET reconstruction needs instead the linear attenuation coefficient μ (cm⁻¹) at the 511 keV photon energy of PET annihilation events. The conversion uses a bilinear model (Carney et al. 2006): a steep linear segment maps soft tissue (HU ≤ 47, dominated by water) and a flatter segment maps dense materials like bone (HU > 47). This means that errors in HU are not equally costly in soft tissue and in bone when converting to a mumap. You can inspect the mu_map intermediate file (`intermediates/mu_map.nii.gz`).
+
+3. **Smooth μ-map** — A 4 mm FWHM Gaussian blur is applied to the μ-map before any sinogram operations. This is standard clinical practice to reduce the effect of CT noise and slight patient movement. Consequently, very fine structural detail in your pseudo-CT may be blurred away before it ever influences the PET sinogram, and by extension, the PET-based metrics. You can inspect the smoothed mu_map intermediate file (`intermediates/mumap_smoothed.nii.gz`)
+
+4. **Resample to STIR** — Resamples the μ-map onto STIR's z-axis grid (ring spacing 3.29 mm), snapping the origin to the STIR coordinate system. A technical prerequisite for STIR's forward projection. The intermediate files are (`intermediates/mumap_stir[.hv/.ahv/.v]`)
+
+5. **Compute ACF sinogram** - The μ-map is *forward projected* along every line of response (LOR) in the PET scanner geometry, computing the total integrated attenuation each annihilation photon pair experiences along that path. The result is the **attenuation correction factor (ACF)** sinogram, which has the same shape as the other PET sinograms except that it lacks a TOF dimension. The intermediate files are `intermediates/acf[.hs/.v]`.
+
+6.–7. **Apply ACF to sinograms** — The ACF sinogram is multiplied into both the *multiplicative* sinogram and the *additive* sinogram. This encodes the predicted attenuation into the reconstruction inputs. The intermediate files are (`intermediates/add[.hs/.s]` and `intermediates/mult[.hs/.s]`)
 
-Given a CT (ground-truth or pseudo-CT) and the subject's sinogram data (`recon/`), the pipeline produces a reconstructed PET NIfTI:
+8. **OSEM reconstruction** — OP-OSEM is run for 20 subiterations (5 subsets, 4 iterations). The result is smoothed by a 4 mm FWHM Gaussian post-filter to reduce noise. The intermediate files are (`intermediates/pet_20[.ahv/.hv/.v]`).
 
-1. **Validate CT** — checks shape, affine, and HU range against the ground-truth CT
-2. **Swap face and bed** — replaces the face and scanner bed region with ground-truth CT values (so face/bed prediction is not penalised)
-3. **HU → μ-map** — converts Hounsfield units to linear attenuation coefficients at 511 keV using the Carney et al. (2006) bilinear model at 120 kVp
-4. **Smooth μ-map** — applies a 4 mm FWHM Gaussian to match scanner resolution
-5. **Resample to STIR** — resamples the μ-map onto the STIR z-axis grid (ring spacing 3.29114 mm)
-6. **Compute ACF sinogram** — forward-projects the μ-map to produce the attenuation correction factor (ACF) sinogram
-7. **Apply ACF to additive sinogram** — multiplies ACF into the scatter+randoms estimate
-8. **Apply ACF to multiplicative sinogram** — multiplies ACF into the detector normalisation sinogram
-9. **OSEM reconstruction** — reconstructs PET via ordered-subsets expectation maximisation with a 4 mm post-filter
-10. **Convert to NIfTI** — writes the reconstructed PET with the correct bed/gantry offset origin
+9. **Convert to NIfTI** — Writes the reconstructed PET volume as a NIfTI file with the correct origin, accounting for the scanner bed position and gantry offset stored in `recon/offset.json`. The final output is `pet.nii.gz`
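Reviewer sketch (not part of this PR): the bilinear conversion in step 2 can be tried on single HU values to see why the same HU error costs more in soft tissue than in bone. The constants are the 120 kVp values used by `hu_to_mu` in this PR; the scalar helper `hu_to_mu_scalar` and the chosen test values are hypothetical and only for illustration.

```python
def hu_to_mu_scalar(hu):
    """Bilinear HU -> mu (cm^-1) at 511 keV, using the 120 kVp constants from this PR."""
    a, b, bp = 5.10e-5, 4.71e-2, 1047  # bone slope, intercept, breakpoint in HU + 1000
    hu1000 = hu + 1000
    mu = 9.6e-5 * hu1000 if hu1000 < bp else a * hu1000 + b
    return max(mu, 0.0)  # attenuation cannot be negative


for hu in (-1000, 0, 47, 1000):
    print(hu, round(hu_to_mu_scalar(hu), 4))

# The same 50 HU error on each side of the breakpoint:
soft_err = hu_to_mu_scalar(0) - hu_to_mu_scalar(-50)    # soft-tissue segment
bone_err = hu_to_mu_scalar(550) - hu_to_mu_scalar(500)  # bone segment
print(soft_err / bone_err)  # ratio of the two slopes, 9.6e-5 / 5.10e-5
```

A 50 HU error therefore shifts μ almost twice as much below the breakpoint as above it, which may be worth reflecting in a loss function.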
 
-Intermediate outputs (μ-map, ACF sinogram, STIR-format files) are written to `output_dir/intermediates/`. The pipeline skips steps whose outputs already exist, so it resumes automatically from a partial run.
+Intermediate outputs are written to `output_dir/intermediates/`. The pipeline skips steps whose outputs already exist, so it resumes automatically from a partial run.
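Reviewer sketch (not part of this PR): step 5 reduces, for a single LOR, to exponentiating the line integral of μ. The snippet below assumes μ has already been sampled at equal steps along one hypothetical LOR; `acf_along_lor` is an illustrative name, and STIR's actual forward projector handles the scanner geometry, which is ignored here.

```python
import math


def acf_along_lor(mu_samples, step_cm):
    """ACF = exp(sum(mu * dl)) for mu (cm^-1) sampled at equal steps along one LOR."""
    return math.exp(sum(mu_samples) * step_cm)


# Hypothetical LOR through 20 cm of water (mu = 0.096 cm^-1 at 511 keV).
water = [0.096] * 200                      # 200 samples x 0.1 cm
acf_true = acf_along_lor(water, 0.1)

# Same LOR with mu underestimated by 5% everywhere.
acf_pred = acf_along_lor([0.95 * m for m in water], 0.1)

print(acf_true, acf_pred / acf_true)
```

Because the correction is exponential in the path integral, a 5% underestimate of μ over this 20 cm path lowers the correction factor by roughly 9%, and the effect grows with path length.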
 
 ---
 
diff --git a/docs/rules.md b/docs/rules.md
@@ -8,7 +8,7 @@ This document describes the rules governing participation in the Big Cross-Modal
 
 **Additional training data is allowed.** as long as it was released, publicly available and accessible to all participants without restrictions prior the start of the challenge (April 1st). Private datasets are NOT allowed. The use of public datasets must be disclosed in the submitted methodology paper. Please see [tips-and-faq.md](tips-and-faq.md) for suggested public datasets. If you are unsure whether a particular dataset fulfills the above criteria, please send an email to bic-mac-challenge@outlook.com. 
 
-**Pretrained networks are allowed** if they were publicly available and accessible to all participants without restrictions (e.g. on GitHub, Huggingface, Zenodo, or a comparable platform) *prior to the start of the challenge* (April 1st) . You may use these as initialization, feature extractors or preprocessing, but the fine-tuning data must be limited to the provided dataset.
+**Pretrained networks are allowed** if they were publicly available and accessible to all participants without restrictions (e.g. on GitHub, Huggingface, Zenodo, or a comparable platform) *prior to the start of the challenge* (April 1st). You may use these as initialization, feature extractors or preprocessing, but the fine-tuning data must be limited to the provided dataset.
 
 **Any preprocessing, manual labelling or augmentation of the BIC-MAC dataset and public datasets is allowed**, as long as it does not conflict with the other rules or the BIC-MAC Data User Agreement.
 
diff --git a/docs/tips-and-faq.md b/docs/tips-and-faq.md
@@ -1,6 +1,11 @@
-## Suggested public datasets
 
+# Tips
 
+We strongly suggest you start by reading [data-background.md](data-background.md) and [reconstruction.md](reconstruction.md).
+
+
+### Suggested public datasets
+We recommend the following public datasets if you wish to perform pretraining etc. 
 ### PET/CT
 > Note that all PET images in the following datasets are attenuation corrected (usually by the accompanying CT), which means that the PET may encode some of the CT information.
 - [Vienna QUADRA_HC](https://zenodo.org/records/16588733) 96 whole-body 18F-FDG PET/CT studies from 48 participants. Like BIC-MAC, the PET/CT is acquired on a Siemens Biograph Vision Quadra and the participants are healthy controls. [citation](https://www.nature.com/articles/s41597-025-05997-4)
@@ -47,3 +52,90 @@
 
 ### Chest X-ray (Topogram-like)
 - [CheXpert](https://stanfordmlgroup.github.io/competitions/chexpert/) 224,316 chest radiographs of 65,240 patients with 14 pathology labels. [citation](https://arxiv.org/abs/1901.07031)
+
+---
+
+
+# FAQ
+
+
+**Do I need to understand PET reconstruction to participate?**
+
+No. You only need to predict a pseudo-CT from the input features. The reconstruction pipeline is provided and run for you. See [reconstruction.md](reconstruction.md) if you want to understand what the reconstruction does and why CT quality matters for PET accuracy.
+
+
+**What data can my pseudo-CT model use as input?**
+
+Any and all files under the `features/` folder. The baseline uses just the `nacpet.nii.gz`, but you are free to combine modalities and demographic features in any way you see fit. 
+
+**Do I need to resample or register any of the images?**
+
+All images under `features/` have been resampled to the dimensions of `ct.nii.gz` (`512x512x531`). The topogram is 2D and is therefore resampled to `512x1x531`. Prior to resampling, the MR images were rigidly translated to PET/CT space. The MRI aligns only crudely with the PET/CT/topogram, and it is up to you to decide whether your model should incorporate registration as a preprocessing step.
+
+**Can I use other data than the BIC-MAC dataset?**
+Additional training data is allowed, as long as it was released, publicly available, and accessible to all participants without restrictions prior to the start of the challenge (April 1st). Private datasets are NOT allowed.
+
+**Can I use pretrained models?**
+
+Yes, you are allowed to use and fine-tune pretrained models, as long as they were publicly available and accessible to all participants without restrictions (e.g. on GitHub, Huggingface, Zenodo, or a comparable platform) *prior to the start of the challenge* (April 1st).
+
+**What subjects have sinogram data for local reconstruction?**
+
+Sinogram data is provided for the following subjects:
+- `train/`: `sub-000`, `sub-001`, `sub-002`, `sub-005`, `sub-006`, `sub-008`, `sub-013`, `sub-014`
+- `val/`: `sub-004`, `sub-009`, `sub-010`, `sub-018` (all of them)
+
+The remaining 67 training subjects have `features/` and `ct-label/` only. You can train and evaluate CT metrics on them, but cannot run closed-loop PET reconstruction locally. We chose to provide sinogram data for only 8 training subjects to keep the dataset size manageable.
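Reviewer sketch (not part of this PR): rather than hard-coding the subject list, a script could detect which subjects support local reconstruction. This assumes each subject is a folder under the split directory and that sinogram data lives in a `recon/` subfolder, as the docs suggest; the function name and the temporary demo layout are hypothetical.

```python
import tempfile
from pathlib import Path


def subjects_with_sinograms(split_dir):
    """Return sorted names of subject folders under split_dir that contain recon/."""
    return sorted(p.name for p in Path(split_dir).iterdir()
                  if (p / "recon").is_dir())


# Demo on a throwaway directory tree (assumed layout, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub-000" / "recon").mkdir(parents=True)
    (Path(tmp) / "sub-003" / "features").mkdir(parents=True)
    found = subjects_with_sinograms(tmp)

print(found)  # ['sub-000']
```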
+
+**Why is `prediction_mask.nii.gz` in `ct-label/`?**
+
+It marks the voxels your model is responsible for predicting (body minus face and scanner bed). During training you may want to restrict your loss to this mask so the model is not penalised for face/bed regions that are overwritten anyway during reconstruction.
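Reviewer sketch (not part of this PR): restricting a loss to the prediction mask amounts to averaging the error over masked voxels only. The example uses flat Python lists to stay dependency-free; in practice the volumes would be loaded with nibabel and the same idea applied to arrays or tensors. `masked_mae` and the sample values are hypothetical.

```python
def masked_mae(pred, target, mask):
    """Mean absolute error over voxels where mask is non-zero."""
    pairs = [(p, t) for p, t, m in zip(pred, target, mask) if m]
    if not pairs:
        raise ValueError("mask selects no voxels")
    return sum(abs(p - t) for p, t in pairs) / len(pairs)


# Four voxels in HU; the third lies in the face/bed region (mask = 0).
pred = [0.0, 100.0, -900.0, 40.0]
target = [10.0, 60.0, -1000.0, 40.0]
mask = [1, 1, 0, 1]

mae = masked_mae(pred, target, mask)
print(mae)  # the 100 HU error in the masked-out voxel is ignored
```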
+
+**The MRI comes in chunks — do I need to stitch them?**
+
+Pre-stitched versions (`mri_combined_in_phase.nii.gz`, `mri_combined_out_phase.nii.gz`) are provided if you want a single whole-body volume. The individual chunks (`mri_chunk_{0-3}_{in/out}_phase.nii.gz`) are available if you prefer to work per bed position. 
+
+
+**What format does my pseudo-CT need to be in?**
+
+A NIfTI file (`.nii.gz`) in Hounsfield units, with the same shape and affine as `features/nacpet.nii.gz`. Copying the header directly from the NAC-PET when saving is the safest approach:
+
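+```python
+import nibabel as nib
+# pred_hu: your predicted HU volume as an array with the same shape as the NAC-PET
+ref = nib.load("features/nacpet.nii.gz")
+nib.save(nib.Nifti1Image(pred_hu, ref.affine, ref.header), "ct.nii.gz")
+```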
+
+**Do I need to install STIR locally?**
+
+No. You do not need to run reconstruction locally at all, unless you want to validate using the PET-based challenge metrics. If you do wish to run reconstruction, we recommend using the Docker image, which includes STIR and all dependencies. The image wraps the Python code in `src/recon` (see [reconstruction.md](reconstruction.md) for details). Alternatively, you can run the reconstruction locally if you have a local STIR build. Please see the [STIR User Guide](https://stir.sourceforge.net/documentation/STIR-UsersGuide.pdf) for installation instructions. IMPORTANT: Make sure to install STIR from source and not a prepackaged version, since critical reconstruction bugs related to Quadra sinograms remain present in version 6.3.
+
+**Reconstruction is slow — how long should I expect it to take?**
+
+Roughly 20–120 minutes per subject on a modern CPU, dominated by the OSEM reconstruction step (step 9). Intermediate outputs are cached, so re-runs resume from where they left off unless `OVERWRITE=1` is set.
+
+**How do I debug a failed reconstruction?**
+
+Check `output_dir/intermediates/recon.log` for the full STIR log. Rerun with `VERBOSE=1` (Docker) or `-v` (Python) to stream STIR output to the terminal in real time.
+
+**Which metrics are reported on the validation leaderboard?**
+
+Four metrics in total: Whole-body SUV MAE, Brain Outlier Score, Organ Bias, and CT μ-MAE. See the evaluation section of the main README for descriptions. The Brain Outlier Score is a dataset-level metric — it cannot be computed for a single subject. The fifth and final metric "TAC Bias" is only computed for the final test set. The metric calculation requires reconstruction using dynamic sinograms, which are unfortunately too large to share. 
+
+**Can I evaluate without running reconstruction?**
+
+Yes — CT μ-MAE only requires your pseudo-CT and the ground-truth CT. Pass `--pred_ct` without `--pred_pet` to `eval_subject.py`:
+
+```bash
+python src/evaluation/eval_subject.py \
+  --subject_dir data/sub-000 \
+  --pred_ct outputs/sub-000/ct.nii.gz
+```
+
+**Do I need to submit both a pseudo-CT and a PET?**
+For the validation phase, you can submit CT-only, PET-only, or both to CodaBench in a zip file. Please see the [Submission Guide](submission-guide.md) for instructions. Submitting both PET and CT unlocks all four metrics. Note that to submit a PET image, you have to run reconstruction locally. For the Final Test phase, the organizers will run reconstruction, so you only submit the Docker image with your pseudo-CT model.
+
+**How can I make sure that my submitted Docker image will work?**
+Once the validation phase starts, you can submit your pseudo-CT container for a "Dry-Run". The organizers will then run your container on the hardware used for final evaluation and report back the CT-based metrics for the validation set. This way you can check that the container runs successfully and within the 5-minute time limit.
+
+**Does my final container need to be the same as the dry-run container?**
+No. But we recommend doing a dry-run for the container you intend to submit for the final validation to ensure that it will not crash or run out of memory. 
diff --git a/pyproject.toml b/pyproject.toml
@@ -11,3 +11,8 @@ dependencies = [
 
 [tool.setuptools.packages.find]
 where = ["src"]
+
+[dependency-groups]
+dev = [
+    "matplotlib>=3.10.8",
+]
diff --git a/src/evaluation/metrics/ct_whole_body_mae.py b/src/evaluation/metrics/ct_whole_body_mae.py
@@ -8,16 +8,27 @@
 import numpy as np
 import nibabel as nib
 
-def hu_to_mu(ct_path, kvp=120):
-    """Carney et al. 2006 (Med Phys 33:976-983) bilinear HU to mu at 511 keV."""
-    bone_slope = {80: 3.84e-5, 100: 4.56e-5, 120: 5.10e-5, 140: 5.64e-5}
+def hu_to_mu(ct_path):
+    """
+    Carney et al. 2006 (Med Phys 33:976-983) 
+    bilinear HU to mu at 511 keV for 120 kVp.
+    """
+    # Carney parameters for KVP 120: (slope 'a', intercept 'b', breakpoint 'bp' in HU+1000)
+    a, b, bp = (5.10e-5, 4.71e-2, 1047) # 1047 corresponds to 47 HU
 
+    # Load NIfTI CT image
     ct = nib.load(ct_path)
     hu = ct.get_fdata(dtype=np.float32)
 
-    mu = np.where(hu <= 0,
-                  9.6e-5 * (hu + 1000),
-                  9.6e-5 * 1000 + bone_slope[kvp] * hu)
+    # Pre-calculate (HU + 1000) to optimize the np.where evaluations
+    hu1000 = hu + 1000
+
+    # Apply the Carney bilinear scaling
+    mu = np.where(hu1000 < bp,
+                  9.6e-5 * hu1000,
+                  a * hu1000 + b)
+
+    # Ensure no negative attenuation values from extreme CT noise
     mu = np.clip(mu, 0, None)
 
     return nib.Nifti1Image(mu, ct.affine, ct.header)
diff --git a/src/recon/ct_to_acf.py b/src/recon/ct_to_acf.py
@@ -100,16 +100,27 @@ def swap_face_from_gt(pred_ct_path, ct_face_and_bed_path, face_mask_path, output
 
     return result_img
 
-def hu_to_mu(ct_path, kvp=120):
-    """Carney et al. 2006 (Med Phys 33:976-983) bilinear HU to mu at 511 keV."""
-    bone_slope = {80: 3.84e-5, 100: 4.56e-5, 120: 5.10e-5, 140: 5.64e-5}
-
+def hu_to_mu(ct_path):
+    """
+    Carney et al. 2006 (Med Phys 33:976-983) 
+    bilinear HU to mu at 511 keV for 120 kVp.
+    """
+    # Carney parameters for KVP 120: (slope 'a', intercept 'b', breakpoint 'bp' in HU+1000)
+    a, b, bp = (5.10e-5, 4.71e-2, 1047) # 1047 corresponds to 47 HU
+    
+    # Load NIfTI CT image
     ct = nib.load(ct_path)
     hu = ct.get_fdata(dtype=np.float32)
 
-    mu = np.where(hu <= 0,
-                  9.6e-5 * (hu + 1000),
-                  9.6e-5 * 1000 + bone_slope[kvp] * hu)
+    # Pre-calculate (HU + 1000) to optimize the np.where evaluations
+    hu1000 = hu + 1000
+
+    # Apply the Carney bilinear scaling
+    mu = np.where(hu1000 < bp,
+                  9.6e-5 * hu1000,
+                  a * hu1000 + b)
+
+    # Ensure no negative attenuation values from extreme CT noise
     mu = np.clip(mu, 0, None)
 
     return nib.Nifti1Image(mu, ct.affine, ct.header)
diff --git a/src/recon/main.py b/src/recon/main.py
@@ -79,7 +79,7 @@ def reconstruction_pipeline(
     if not os.path.exists(mumap_nifti_path) or overwrite:
         t = time.perf_counter()
         log.info("[3/10] Converting CT to mu-map...")
-        mu = hu_to_mu(ct_path, kvp=120)
+        mu = hu_to_mu(ct_path)
         mu.to_filename(mumap_nifti_path)
         log.info(f"      done ({time.perf_counter()-t:.1f}s)")
     else:
diff --git a/uv.lock b/uv.lock
