Classification of prostate histological images using Vision Transformers: an analysis with stain normalization and ensemble learning

This repository contains the source code for the paper "Classification of prostate histological images using Vision Transformers: an analysis with stain normalization and ensemble learning", available at https://doi.org/10.5753/sibgrapi.est.2025.38333.

Structure

process_dataset.py: Preprocesses the dataset. Each image is resized and then normalized with the mean and standard deviation of the dataset used to pretrain the ViT model. The processed images are saved as tensors so that the main training pipeline (train.py) can load them efficiently.
train.py: This is the main training script. It implements the complete experimental pipeline, including loading the preprocessed dataset, fine-tuning the Vision Transformer google/vit-large-patch16-224, extracting deep features from the fine-tuned ViT, training classical machine learning classifiers, building a majority voting ensemble model, and computing evaluation metrics.
dataset.py: Auxiliary module invoked by train.py. It reads the label table and maps the labels to binary values: label 0 for the non-cancerous class and label 1 for the cancerous classes.
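
Two of the steps listed above, the binary label mapping and the majority voting ensemble, can be illustrated in plain Python. This is only a minimal sketch, not the code in dataset.py or train.py: the label name "NC" and the function names to_binary and majority_vote are assumptions made for illustration, and the actual label tables may use different class names.

```python
from collections import Counter

# Hypothetical name of the non-cancerous class; the real label tables may differ.
NON_CANCEROUS = "NC"


def to_binary(label: str) -> int:
    """Return 0 for the non-cancerous class and 1 for any other (cancerous) class."""
    return 0 if label == NON_CANCEROUS else 1


def majority_vote(predictions: list[list[int]]) -> list[int]:
    """Combine predictions by hard voting.

    `predictions` holds one inner list per classifier, each with one
    predicted label per sample.
    """
    voted = []
    for sample_votes in zip(*predictions):
        # Pick the label predicted by the largest number of classifiers.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    print([to_binary(name) for name in ["NC", "G3", "G4"]])
    print(majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 0]]))
```

With an odd number of classifiers and two classes, a vote can never tie, which is one reason hard voting is often run with three or five base classifiers.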

Prerequisites

The CrowdGleason dataset, available at https://www.sciencedirect.com/science/article/pii/S0169260724004656, together with its stain-normalized versions generated using the SW-CCN and BKSVD methods.

Pipeline Usage

Run process_dataset.py, providing the path to the CrowdGleason dataset and the destination path where the processed images will be saved.
In train.py, set the path to the directory containing the processed images and the paths to the label tables for the training, validation, and test splits.
Execute train.py to start the training and evaluation process.
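
Because the paths are edited by hand, it can help to confirm that they exist before starting a long training run. The following is a minimal sketch of such a check, assuming one directory of processed images and one label table per split. The function name check_inputs and the example paths are hypothetical and are not part of train.py.

```python
from pathlib import Path


def check_inputs(image_dir, label_tables):
    """Return a list of problems found; the list is empty if every path exists.

    `label_tables` maps a split name (e.g. "train") to the path of its label table.
    """
    problems = []
    if not Path(image_dir).is_dir():
        problems.append(f"missing image directory: {image_dir}")
    for split, table in label_tables.items():
        if not Path(table).is_file():
            problems.append(f"missing {split} label table: {table}")
    return problems


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    issues = check_inputs(
        "processed_images",
        {"train": "train.csv", "validation": "val.csv", "test": "test.csv"},
    )
    for issue in issues:
        print(issue)
```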

Important Note

Before training, train.py loads the entire dataset into RAM to speed up training. If your system does not have enough memory, modify this behavior (for example, by reading samples from disk on demand) to avoid running out of memory.
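
One possible modification is to read each sample from disk only when it is requested. The sketch below is an assumption about how this could be done, not the approach used in train.py. The class name LazyTensorDataset and the loader parameter are hypothetical. The class implements only __len__ and __getitem__, which is the interface PyTorch's DataLoader expects from a map-style dataset.

```python
from typing import Callable, Sequence


class LazyTensorDataset:
    """Map-style dataset that loads one sample per access instead of preloading all."""

    def __init__(self, paths: Sequence[str], labels: Sequence[int], loader: Callable):
        if len(paths) != len(labels):
            raise ValueError("paths and labels must have the same length")
        self.paths = list(paths)
        self.labels = list(labels)
        # Function that turns a path into a sample. If the tensors were saved
        # with torch.save, torch.load could be passed here (assumption).
        self.loader = loader

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int):
        # Only the requested sample is read from disk.
        return self.loader(self.paths[index]), self.labels[index]
```

The trade-off is speed: every epoch reads the files again, so training is slower than with the fully preloaded dataset unless the data loader uses several worker processes or the operating system caches the files.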

Citation

For use and distribution, please cite:

@inproceedings{de2025classification,
  title={Classification of prostate histological images using Vision Transformers: an analysis with stain normalization and ensemble learning},
  author={de Albuquerque, Bet{\^a}nia Caroline Silva and Neves, Leandro Alves and do Nascimento, Marcelo Zanchetta and Tosta, Tha{\'\i}na Aparecida Azevedo and others},
  booktitle={Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages={377--381},
  year={2025},
  organization={SBC}
}
