A privacy-preserving image forgery detection framework using Federated Learning (FL).
This project combines the FedYogi optimizer with CNN and ResNet50 architectures to improve convergence and stability on non-IID forensic datasets while preserving data privacy.
The framework is trained and evaluated using the CASIA 1.0 Image Tampering Dataset for detecting manipulated and forged images.
- Federated Learning-based image forgery detection
- Privacy-preserving distributed training
- FedYogi optimizer integration
- CNN-based forgery detection model
- ResNet50-based transfer learning model
- Non-IID client data simulation
- CASIA 1.0 dataset support
- Deepfake and image tampering analysis
- Federated model aggregation
- Digital forensic image validation
- Performance evaluation metrics and visualizations
Digital forensics faces significant challenges in detecting manipulated and forged images while maintaining data privacy and security.
Traditional centralized machine learning approaches require sharing raw forensic images across servers, which introduces:
- privacy risks
- data leakage
- legal concerns
- centralized attack surfaces
Federated Learning (FL) addresses these risks by allowing multiple clients to collaboratively train a shared model without sharing raw images.
Only model updates are shared with the central server.
This project uses FedYogi, an adaptive federated optimization algorithm designed to improve:
- convergence stability
- performance under non-IID data
- distributed learning efficiency
Federated_Optimization_for_DF/
├── Model/ # CNN and ResNet50 model architectures
├── Model_Image/ # Sample images for testing
├── Results/ # Evaluation metrics and visualizations
├── 596.pdf # Research paper / documentation
└── README.md
- 800 images
- 8 categories
- 100 images per category
| Code | Category |
|---|---|
| ani | Animal |
| arc | Architecture |
| art | Art |
| cha | Characters |
| nat | Nature |
| pla | Plants |
| sec | Scene |
| txt | Texture |
Sp_D_CND_A_pla0005_pla0023_0281.jpg
Meaning:
- Sp → Splicing
- D → Different source image
- pla0005 → source image
- pla0023 → target image
- 0281 → tampered image ID
Sp_S_CND_A_pla0016_pla0016_0196.jpg
Meaning:
- S → Same image used for tampering
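The naming convention above can be decoded programmatically. The following is a minimal sketch, not the repository's own loader: `parse_tampered_name` and `TamperedName` are hypothetical names, and the third and fourth fields (`CND`, `A` in the examples) are matched but not interpreted because their meaning is not described here.

```python
import re
from typing import NamedTuple, Optional

# Matches tampered file names such as Sp_D_CND_A_pla0005_pla0023_0281.jpg.
# Fields 3 and 4 (e.g. CND, A) are matched but left uninterpreted.
_TAMPERED = re.compile(
    r"^Sp_(?P<mode>[DS])_[A-Za-z]+_[A-Za-z]+_"
    r"(?P<source>[a-z]{3}\d+)_(?P<target>[a-z]{3}\d+)_(?P<tamper_id>\d+)\.\w+$"
)


class TamperedName(NamedTuple):
    same_source: bool      # True for S (same image), False for D (different image)
    source: str            # e.g. "pla0005"
    target: str            # e.g. "pla0023"
    source_category: str   # three-letter category code of the source image
    target_category: str   # three-letter category code of the target image
    tamper_id: str         # e.g. "0281"


def parse_tampered_name(filename: str) -> Optional[TamperedName]:
    """Return the decoded fields, or None if the name does not match."""
    m = _TAMPERED.match(filename)
    if m is None:
        return None
    return TamperedName(
        same_source=(m["mode"] == "S"),
        source=m["source"],
        target=m["target"],
        source_category=m["source"][:3],
        target_category=m["target"][:3],
        tamper_id=m["tamper_id"],
    )
```

Names that do not follow the `Sp_...` pattern return `None`, so a caller can handle them separately.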
CASIA 1.0 dataset is distributed among multiple simulated clients to create a federated learning environment.
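How the images are split across clients is not specified here. One common way to simulate non-IID clients is a Dirichlet label-skew partition, sketched below with NumPy. The function name `dirichlet_partition`, the `alpha` parameter, and the choice of a Dirichlet split are all assumptions made for illustration.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew (assumed scheme).

    For every class, client shares are drawn from Dirichlet(alpha); each
    index is assigned to exactly one client.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut them by sampled shares.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

Smaller `alpha` values concentrate each class on fewer clients (stronger skew), while large values approach an even split.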
Each client independently trains:
- CNN model
- ResNet50 model
on local forensic data.
FedYogi optimizer aggregates model updates at the central server.
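Before the server optimizer is applied, the client updates have to be combined into a single aggregated gradient. The sketch below assumes each client reports `global_weights - local_weights` as a flat array and that clients are weighted by their local sample counts. Both choices and the name `aggregate_updates` are illustrative assumptions, not taken from the project code.

```python
import numpy as np


def aggregate_updates(client_updates, client_sizes):
    """Sample-count-weighted average of client updates (assumed weighting).

    client_updates: one array per client, all of the same shape, each
                    holding global_weights - local_weights.
    client_sizes:   number of local training samples per client.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()
    stacked = np.stack([np.asarray(u, dtype=float) for u in client_updates])
    # Contract the client axis: sum_k weights[k] * stacked[k].
    return np.tensordot(weights, stacked, axes=1)
```

For example, two clients with updates `[1, 1]` and `[3, 3]` and sizes 1 and 3 yield `[2.5, 2.5]`, because the larger client contributes three quarters of the average.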
Performance is evaluated using:
- Accuracy
- Precision
- Recall
- F1-Score
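As a reference for how these four metrics relate, here is a dependency-free sketch for the binary case. It assumes label 1 means tampered and label 0 means authentic; `binary_metrics` is a hypothetical helper, and `sklearn.metrics` provides equivalent functions.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = tampered, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Precision and recall fall back to 0.0 when their denominators are zero, which avoids a division error when a model predicts only one class.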
Raw forensic images never leave client systems.
Only model parameters or aggregated updates are exchanged, never the images themselves.
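$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t$$

$$v_t = v_{t-1} - (1-\beta_2)\, g_t^2 \,\operatorname{sign}\!\left(v_{t-1} - g_t^2\right)$$

$$w_{t+1} = w_t - \eta \, \frac{m_t}{\sqrt{v_t} + \epsilon}$$

Where:
- $g_t$ → aggregated gradient ($g_t^2$ is its element-wise square)
- $m_t$, $v_t$ → first- and second-moment estimates
- $\beta_1$, $\beta_2$ → momentum coefficients
- $\eta$ → learning rate
- $\epsilon$ → numerical stability constant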
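A single server-side step can be sketched in NumPy as follows. This follows the standard Yogi second-moment rule, in which the sign term is scaled by the squared gradient. The function name `fedyogi_step` and the default hyperparameter values are illustrative assumptions, not the settings used in this project.

```python
import numpy as np


def fedyogi_step(w, m, v, g, lr=0.01, beta1=0.9, beta2=0.99, eps=1e-3):
    """One FedYogi server update; returns the new (w, m, v).

    w: global weights, m: first moment, v: second moment (non-negative),
    g: aggregated gradient for this round. Defaults are placeholders.
    """
    w, m, v, g = (np.asarray(x, dtype=float) for x in (w, m, v, g))
    g2 = g * g
    # First moment: exponential moving average of the aggregated gradient.
    m = beta1 * m + (1.0 - beta1) * g
    # Second moment: additive Yogi update, moves v toward g^2 by a bounded amount.
    v = v - (1.0 - beta2) * g2 * np.sign(v - g2)
    # Adaptive step on the global weights.
    w = w - lr * m / (np.sqrt(v) + eps)
    return w, m, v
```

Unlike Adam's multiplicative decay, the change in `v` per round is at most `(1 - beta2) * g**2`, which is the property usually credited for Yogi's steadier behaviour when client gradients vary widely.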
- Python
- TensorFlow
- Keras
- CNN
- ResNet50
- Federated Learning
- NumPy
- Pandas
- OpenCV
- Matplotlib
- Scikit-learn
- FedYogi Optimizer
- Distributed Training
- Non-IID Data Handling
- Accuracy Metrics
- F1 Score
- Precision / Recall
- Training Visualization
Clone over HTTPS:
git clone https://github.com/code-with-nc/Federated-Optimization-for-DF-Image-Forgery-Detection-Using-FedYogi-with-CNN-and-ResNet50.git

Or clone over SSH:
git clone git@github.com:code-with-nc/Federated-Optimization-for-DF-Image-Forgery-Detection-Using-FedYogi-with-CNN-and-ResNet50.git

cd Federated-Optimization-for-DF-Image-Forgery-Detection-Using-FedYogi-with-CNN-and-ResNet50

Linux / macOS:
python3 -m venv venv
source venv/bin/activate

Windows:
python -m venv venv
venv\Scripts\activate

pip install -r requirements.txt

Download the CASIA 1.0 dataset and place it inside the project directory.
Example:
project/
├── dataset/
├── Model/
├── Results/
└── README.md
python train_federated.py
python cnn_model.py
python resnet50_model.py
python evaluate.py
python results_visualization.py

- Load CASIA 1.0 dataset
- Split dataset among clients
- Initialize CNN / ResNet50 models
- Train local client models
- Aggregate updates using FedYogi
- Evaluate global model
- Generate graphs and metrics
- Compare centralized vs federated performance
| Model | Accuracy |
|---|---|
| CNN | 57.27% |
| ResNet50 | 58.43% |
| Model | Peak Accuracy |
|---|---|
| CNN | ~53.49% |
| ResNet50 | ~55.52% |
- FedYogi improves training stability
- Better convergence under non-IID client data
- Privacy preserved throughout training
- Competitive accuracy achieved without centralized data sharing
- ResNet50 outperformed CNN in most experiments
This project demonstrates that:
- Federated Learning can be effectively applied in digital forensics
- Privacy-preserving training is possible without sharing raw forensic data
- FedYogi provides improved optimization stability for non-IID environments
- ResNet50 achieves better forgery detection performance compared to traditional CNN models
- Federated optimization can support secure AI-driven forensic investigation systems
The framework contributes toward:
- privacy-aware cyber forensic systems
- distributed AI security research
- secure deepfake and image forgery detection workflows
After completing this project, learners will be able to:
- Understand Federated Learning workflows
- Implement privacy-preserving AI systems
- Train CNN and ResNet50 models
- Simulate non-IID federated environments
- Use FedYogi optimizer
- Perform image forgery detection
- Work with forensic image datasets
- Evaluate distributed machine learning models
- Analyze convergence behavior in federated systems
- Raw images never leave client systems
- Reduces centralized data exposure
- Supports distributed forensic collaboration
- Improves privacy compliance
- Enhances secure AI training workflows
If you use this repository or dataset, please cite:
@inproceedings{Dong2013,
doi = {10.1109/chinasip.2013.6625374},
url = {https://doi.org/10.1109/chinasip.2013.6625374},
year = {2013},
month = jul,
publisher = {IEEE},
author = {Jing Dong and Wei Wang and Tieniu Tan},
title = {CASIA Image Tampering Detection Evaluation Database},
booktitle = {2013 IEEE China Summit and International Conference on Signal and Information Processing}
}

@article{pham2019hybrid,
title={Hybrid Image-Retrieval Method for Image-Splicing Validation},
author={Pham, Nam Thanh and Lee, Jong-Weon and Kwon, Goo-Rak and Park, Chun-Su},
journal={Symmetry},
volume={11},
number={1},
pages={83},
year={2019},
publisher={MDPI}
}

- Differential Privacy integration
- Secure aggregation
- Homomorphic encryption
- Real-time federated deployment
- Blockchain-based audit validation
- Multi-modal forgery detection
- Federated adversarial defense
- Explainable AI integration
This repository is developed strictly for:
- academic research
- educational purposes
- cyber forensic investigation
- privacy-preserving AI experimentation
Do not use the framework for malicious surveillance, unauthorized data analysis, or unethical AI activities.
MIT License
(or according to institutional/project requirements)
Narayani
GitHub: https://github.com/code-with-nc