This project is a Python/Snakemake reimplementation of the PRISM pipeline, originally developed in R by:
Ghaddar B, Blaser M, De S
Rutgers, The State University of New Jersey
https://github.com/sjdlabgroup/PRISM
The algorithm and pipeline logic are derived from the original R implementation. The pre-trained XGBoost model (resources/prismxg_exported.bin) was exported from the original R format (.RDS) to Python-compatible binary format; model parameters are unmodified.
This pipeline requires kreport2mpa.py from KrakenTools, which is licensed under GPL v3:
Copyright (C) 2017-2020 Jennifer Lu
https://github.com/jenniferlu717/KrakenTools
kreport2mpa.py is not bundled in this repository due to license incompatibility with the PRISM Non-commercial Research License. Install it separately via:
    conda install -c bioconda krakentools

This Python/Snakemake reimplementation has been tested on only one colorectal cancer WGS sample. While every effort has been made to keep its behavior consistent with the original R version, the purpose of this project is to understand and improve the PRISM workflow, and we cannot guarantee that it reproduces the analysis conclusions presented in the original PRISM paper. Users should independently validate results for their specific data types and research questions.
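Because kreport2mpa.py is installed separately rather than bundled, it can help to confirm that it is reachable before launching the pipeline. The following is a minimal sketch, not part of this repository: the helper name `find_kreport2mpa` is hypothetical, and it assumes the bioconda package places an executable named `kreport2mpa.py` on `PATH` in the active environment.

```python
import shutil


def find_kreport2mpa(name: str = "kreport2mpa.py") -> str:
    """Return the full path of the KrakenTools script found on PATH.

    Hypothetical helper for illustration only. It assumes the script is
    installed as an executable with this exact file name.
    """
    path = shutil.which(name)
    if path is None:
        # Nothing executable with this name was found on PATH.
        raise FileNotFoundError(
            f"{name} was not found on PATH; install KrakenTools separately "
            "(for example: conda install -c bioconda krakentools)"
        )
    return path
```

Calling `find_kreport2mpa()` in the environment used to run the pipeline either returns the resolved script path or raises `FileNotFoundError` with an install hint, so a missing dependency surfaces before any samples are processed.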
This project contains components under two separate licenses:
- PRISM algorithm, pre-trained model, and feature definitions: PRISM Non-commercial Research License (RU-NCRL) from Rutgers, The State University of New Jersey.
- Python/Snakemake implementation code: MIT License, Copyright (c) 2025 Rui Li.
Any use of the complete pipeline is subject to the RU-NCRL non-commercial restriction. See LICENSE for the full text. For commercial use of the PRISM algorithm, contact Rutgers at innovate@research.rutgers.edu (Docket #2025-029).