A Vietnamese Sign Language recognition system optimized for CPU and small datasets.
- Put videos in `Dataset/Video/` and labels in `Dataset/Text/Label.csv` with columns `Video`, `Label`.
- Run processing: `python main.py --process-data` to generate `Data/keypoints.npy` and `Data/labels.npy`.
- Train: `python main.py --train` to train models and write results to `Models/`.
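
Before running the processing step, it can help to confirm that the label file and the video folder agree. The following is a minimal sketch, not part of the project: `check_dataset` is a hypothetical helper, and it assumes the `Video` column holds file names (including extension) relative to `Dataset/Video/`.

```python
import csv
from pathlib import Path


def check_dataset(video_dir, label_csv):
    """Compare Label.csv against the video folder.

    Returns (missing, unlabeled):
      missing   - names listed in the CSV with no matching file on disk
      unlabeled - files on disk that have no row in the CSV
    Assumption: the 'Video' column stores plain file names such as 'a.mp4'.
    """
    video_dir = Path(video_dir)
    with open(label_csv, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = set(reader.fieldnames or [])
        if not {"Video", "Label"} <= columns:
            raise ValueError("Label.csv must have 'Video' and 'Label' columns")
        labeled = {row["Video"].strip() for row in reader}
    on_disk = {p.name for p in video_dir.iterdir() if p.is_file()}
    return sorted(labeled - on_disk), sorted(on_disk - labeled)
```

Calling `check_dataset("Dataset/Video", "Dataset/Text/Label.csv")` and finding both lists empty suggests every labeled video is present and every video has a label.
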
- Python 3.10 (recommended)
- TensorFlow 2.15, MediaPipe 0.10.5, NumPy 1.26.x
- Missing libraries: run `pip install -r requirements.txt` or follow console logs for suggested packages.
- Cannot open camera: check `camera_index` in config or try indices 0/1/2.
- Low FPS: reduce `frame_width`/`frame_height`, increase `frame_skip` in config.
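
To see why raising `frame_skip` helps with low FPS, here is a rough sketch of one common interpretation: only one frame out of every `frame_skip` is passed on for processing. This is an assumption; the project may define the setting differently (for example, as the number of frames dropped between processed frames). Both function names are hypothetical.

```python
def frames_to_process(total_frames, frame_skip):
    """Indices of frames that are processed when one frame in every
    `frame_skip` is kept (assumed meaning of the setting)."""
    step = max(1, frame_skip)  # treat 0 or negative values as "no skipping"
    return list(range(0, total_frames, step))


def effective_rate(camera_fps, frame_skip):
    """Frames per second that actually reach keypoint extraction."""
    return camera_fps / max(1, frame_skip)
```

Under this reading, a 30 FPS camera with `frame_skip` set to 2 hands 15 frames per second to the model, halving the per-second workload.
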
logging:
  level: INFO
  file: Logs/app.log
training:
  camera_index: 0
  frame_width: 640
  frame_height: 480
  fps: 30
  flip_horizontal: true
  frame_skip: 2
  sequence_length: 30
  min_detection_confidence: 0.5
  min_tracking_confidence: 0.5
  prediction_threshold: 0.7

- Lightweight Deep Learning models optimized for CPU
- Real-time video processing (15–25 FPS)
- Modern and intuitive PyQt5 interface with vertical sidebar
- Smart data augmentation for limited datasets
- Supports traditional ML, Deep Learning, and Ensemble methods
- Modular architecture, detailed logging, and YAML configuration
- Core: Detection and training engines
- Data: Video processing, keypoint extraction, augmentation
- UI: User interface for real-time visualization
- Utils: Config manager, logging system
- Configs: YAML configuration files
- Models: Trained models storage
- Dataset: Raw video data and labels
- Logs: Application and training logs
Requirements: Python 3.8+, ≥8GB RAM, camera for real-time detection
pip install -r requirements.txt
# or
pip install tensorflow-cpu opencv-python mediapipe PyQt5 scikit-learn

- Run GUI application: `python main.py --gui`
- Train model: `python main.py --train --config custom.yaml`
- Process video dataset: `python main.py --process-data`
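
The command-line flags above could be wired up roughly as follows. This is an illustrative sketch using `argparse`, not the project's actual `main.py`; `build_parser` is a hypothetical name, and treating the three modes as mutually exclusive is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the usage commands."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Assumption: exactly one mode is chosen per invocation.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--gui", action="store_true",
                      help="launch the real-time detection interface")
    mode.add_argument("--train", action="store_true",
                      help="train models on processed data")
    mode.add_argument("--process-data", action="store_true",
                      help="extract keypoints from the video dataset")
    # Optional path to a YAML configuration file.
    parser.add_argument("--config", default=None,
                        help="path to a YAML config file")
    return parser
```

With this layout, `build_parser().parse_args(["--train", "--config", "custom.yaml"])` yields `train=True` and `config="custom.yaml"`, while the hyphenated flag is exposed as `process_data`.
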
- Prepare Dataset
  - Place videos in `Dataset/Video/`
  - Store labels in `Dataset/Text/Label.csv`
- Process Data
  - Run `python main.py --process-data`
  - Extract keypoints, augment data, save to `Data/`
- Train Model
  - Run `python main.py --train`
  - Perform cross-validation, select best model, save to `Models/`
- Run Real-Time Detection
  - Run `python main.py --gui`
  - Load trained model, open camera, recognize signs in real time
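
The real-time step depends on two config values, `sequence_length` and `prediction_threshold`. The sketch below shows one plausible way they interact: keypoints from recent frames are kept in a fixed-size window, and a prediction is reported only when the top class probability reaches the threshold. `SignBuffer` and its methods are hypothetical names, and the model call that produces the probabilities is left out; the project's detection engine may work differently.

```python
from collections import deque


class SignBuffer:
    """Sliding window of per-frame keypoints plus a confidence gate."""

    def __init__(self, sequence_length=30, prediction_threshold=0.7):
        # deque with maxlen drops the oldest frame automatically
        self.frames = deque(maxlen=sequence_length)
        self.threshold = prediction_threshold

    def push(self, keypoints):
        """Add one frame's keypoints; return True once the window is full."""
        self.frames.append(keypoints)
        return len(self.frames) == self.frames.maxlen

    def decide(self, probabilities, labels):
        """Return the top label if its probability meets the threshold,
        otherwise None (no sign is reported)."""
        best = max(range(len(probabilities)), key=probabilities.__getitem__)
        if probabilities[best] >= self.threshold:
            return labels[best]
        return None
```

A higher `prediction_threshold` trades missed detections for fewer false positives; returning `None` lets the interface show nothing instead of a low-confidence guess.
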