Machine Learning you can actually see.
- Current Release: v0.1.0
Overview
GlassBoxML is a theory-first machine learning library built from scratch using pure NumPy.
Unlike traditional libraries that prioritize abstraction and convenience, GlassBoxML emphasizes transparency and understanding. Every model exposes:
- what it learns
- how it learns
- where it fails
This project bridges the gap between mathematical learning theory and practical implementation.
Philosophy
Most ML libraries behave like black boxes:
model.fit(X, y)
# magic happens
GlassBoxML is different:
model.fit(X, y)
model.loss_history
model.gradients
model.assumptions
model.failure_modes
model.generalization_estimate
You don’t just train models — you inspect learning itself.
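To make the idea concrete, here is a minimal sketch of a model that records its own training trace. It is not GlassBoxML's implementation: the class name `TinyLinearRegression`, the `lr` and `n_iters` parameters, and the choice to store gradient norms in `gradients` are all assumptions made for illustration. It fits a linear model with plain batch gradient descent on mean squared error.

```python
import numpy as np


class TinyLinearRegression:
    """Hypothetical stand-in for illustration, not GlassBoxML's actual class."""

    def __init__(self, lr=0.1, n_iters=200):
        self.lr = lr
        self.n_iters = n_iters
        self.loss_history = []  # mean squared error before each update
        self.gradients = []     # norm of the weight gradient before each update

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n_samples, n_features = X.shape
        self.w = np.zeros(n_features)
        self.b = 0.0
        for _ in range(self.n_iters):
            residual = X @ self.w + self.b - y
            self.loss_history.append(float(np.mean(residual ** 2)))
            # Gradients of the mean squared error w.r.t. weights and bias.
            grad_w = 2.0 * X.T @ residual / n_samples
            grad_b = 2.0 * float(np.mean(residual))
            self.gradients.append(float(np.linalg.norm(grad_w)))
            self.w -= self.lr * grad_w
            self.b -= self.lr * grad_b
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.w + self.b


# Noise-free synthetic data, so the loss should shrink towards zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + 0.5

model = TinyLinearRegression().fit(X, y)
print(model.loss_history[0], model.loss_history[-1])
print(model.w, model.b)
```

Plotting `loss_history` or `gradients` against the iteration index is the simplest way to see whether the optimizer is converging, stalling, or diverging.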
Goals
- Implement core ML algorithms from first principles
- Expose optimization behavior during training
- Make model assumptions explicit
- Demonstrate overfitting and generalization
- Provide educational transparency without sacrificing code quality
Non-Goals
- Competing with high-performance libraries like scikit-learn
- GPU acceleration
- Massive algorithm coverage
- Production deployment pipelines
This is a learning and reasoning library, not a benchmarking tool.
- Linear Regression
- Logistic Regression
- k‑Nearest Neighbors
- Ridge Regression
- Decision Trees
- Random Forest
- SVM
- Batch Gradient Descent
- Stochastic Gradient Descent
- Momentum
- Loss curves
- Bias–variance indicators
- Overfitting detection
- Condition number warnings
- Generalization estimates
- Capacity indicators
- Noise sensitivity analysis
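Two of the diagnostics listed above, condition number warnings and overfitting detection, can be sketched in a few lines of NumPy. The function names, the `threshold` of 1e6, and the `tolerance` of 0.1 are hypothetical choices for illustration and do not describe GlassBoxML's actual API or defaults.

```python
import numpy as np


def condition_number_warning(X, threshold=1e6):
    """Return (condition number, warning flag) for a feature matrix.

    A large condition number means some columns are nearly linearly
    dependent, which makes least-squares solutions numerically unstable.
    The threshold is an arbitrary illustrative value.
    """
    cond = float(np.linalg.cond(np.asarray(X, dtype=float)))
    return cond, cond > threshold


def overfitting_gap(train_loss, val_loss, tolerance=0.1):
    """Return (gap, flag), flagging when validation loss exceeds
    training loss by more than `tolerance`."""
    gap = val_loss - train_loss
    return gap, gap > tolerance


rng = np.random.default_rng(0)
x = rng.normal(size=50)
well_conditioned = np.column_stack([x, rng.normal(size=50)])
# Second column is almost a copy of the first one.
nearly_collinear = np.column_stack([x, x + 1e-9 * rng.normal(size=50)])

print(condition_number_warning(well_conditioned))
print(condition_number_warning(nearly_collinear))
print(overfitting_gap(train_loss=0.02, val_loss=0.35))
```

Both checks are deliberately crude: they return the raw number alongside the flag so the reader can judge the threshold rather than trust it.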
from glassboxml import LinearRegression
model = LinearRegression()
model.fit(X, y)
print(model.loss_history)
print(model.explain())
print(model.diagnose()) # basic model insights (expanded in future versions)
---
glassboxml/
│
├── core/ # optimizers, model selection, base classes
├── models/ # ML algorithms
├── diagnostics/ # overfitting & model insights
├── datasets/ # synthetic data generators
├── metrics/ # evaluation metrics
├── preprocessing/ # scaling and transformations
├── tuning/ # hyperparameter search
└── examples/ # demos & experiments
For Contributors
git clone https://github.com/hogwarts-coder10/GlassBox-ML.git
cd GlassBox-ML
pip install -e .
For users
pip install glassboxml
GlassBoxML is designed to be lightweight:
- numpy
- matplotlib
from glassboxml.models import LogisticRegression
# Sample Data
X = [[1,2],[2,3],[3,4]]
y = [0,1,1]
model = LogisticRegression()
model.fit(X, y)
preds = model.predict(X)
print(preds)
Modern ML education often teaches usage before understanding.
This creates developers who can:
- train models ✅
- but not explain, debug, or trust them ❌
GlassBoxML reverses that:
Understand → Implement → Experiment → Trust
Check out the full documentation here: 👉🏻 https://github.com/Hogwarts-coder10/GlassBox-ML/wiki
This project values clarity over cleverness.
Contributions should:
- Prefer readable, math-aligned code
- Include explanation comments
- Demonstrate failure cases, not just success
GlassBoxML will expand to include core preprocessing and feature engineering tools with deep interpretability.
- TF-IDF Vectorizer
  - Transparent text feature extraction
  - Full explanation of term weighting and normalization
- Label Encoder
  - Clear mapping of categorical values
  - Diagnostics for unseen or imbalanced labels
All new components will integrate with:
- explain() → step-by-step breakdown of transformations
- diagnose() → insights into data quality and potential issues
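As a rough illustration of how a component such as the planned Label Encoder might expose `explain()` and `diagnose()`, consider the sketch below. Everything in it is an assumption: the class name `TinyLabelEncoder`, the attribute names, the return formats, and the `imbalance_ratio` default are invented here and say nothing about the eventual GlassBoxML design. It also assumes string labels.

```python
import numpy as np


class TinyLabelEncoder:
    """Hypothetical illustration, not GlassBoxML's planned implementation."""

    def fit(self, labels):
        labels = np.asarray(labels)
        # np.unique returns the sorted distinct labels and their counts.
        self.classes_, self.counts_ = np.unique(labels, return_counts=True)
        self.mapping_ = {str(c): i for i, c in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        unseen = sorted({str(label) for label in labels} - set(self.mapping_))
        if unseen:
            # Fail loudly instead of silently assigning a code.
            raise ValueError(f"unseen labels: {unseen}")
        return np.array([self.mapping_[str(label)] for label in labels])

    def explain(self):
        """One line per label showing the integer code it maps to."""
        return [f"{label!r} -> {code}" for label, code in self.mapping_.items()]

    def diagnose(self, imbalance_ratio=5.0):
        """Report class count and a simple max/min frequency ratio."""
        ratio = float(self.counts_.max() / self.counts_.min())
        return {
            "n_classes": len(self.classes_),
            "max_min_count_ratio": ratio,
            "imbalanced": ratio > imbalance_ratio,
        }


enc = TinyLabelEncoder().fit(["cat", "dog", "cat", "cat", "bird"])
print(enc.explain())
print(enc.transform(["dog", "bird"]))
print(enc.diagnose())
```

Raising on unseen labels is only one possible policy; a real component could instead report them through `diagnose()` and map them to a reserved code.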
- Cleaner and more intuitive public API
- Reduced exposure of internal components
- Better consistency across models and utilities
GlassBoxML continues to focus on making machine learning understandable, not just usable.