87 changes: 87 additions & 0 deletions .dockerignore
@@ -0,0 +1,87 @@
# Git
.git
.gitignore
.github

# Python (.dockerignore patterns are anchored at the context root; **/ matches at any depth)
**/__pycache__/
**/*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
pip-log.txt
pip-delete-this-directory.txt
.pytest_cache/
.coverage
htmlcov/

# Node.js (**/ so that frontend/node_modules is excluded as well)
**/node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnp
.pnp.js
frontend/build
frontend/.env.local
frontend/.env.development.local
frontend/.env.test.local
frontend/.env.production.local

# IDEs
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Environment
.env
.env.local
.env.*.local

# Uploads (will be handled by volumes)
backend/uploads/*
!backend/uploads/.gitkeep

# Database
*.db
*.sqlite
*.sqlite3

# Logs
*.log
logs/

# Documentation
docs/assets/
*.md
!README.md

# Testing
.pytest_cache/
coverage/
.coverage

# OS
Thumbs.db
.DS_Store
70 changes: 61 additions & 9 deletions Dockerfile.txt
@@ -1,18 +1,70 @@
# Use official Python base image
FROM python:3.11-slim
# ML Simulator - Multi-stage Dockerfile
# Author: Akshit
# Date: October 13, 2025
# Purpose: Containerize the ML Simulator application

# ================================
# Stage 1: Build Frontend
# ================================
FROM node:18-alpine AS frontend-build

# Set working directory
WORKDIR /app/frontend

# Copy frontend package files
COPY frontend/package*.json ./

# Install dependencies (dev dependencies included: the build step often needs them, and this stage is discarded)
RUN npm ci

# Copy frontend source code
COPY frontend/ ./

# Build the React app
RUN npm run build

# ================================
# Stage 2: Backend + Frontend
# ================================
FROM python:3.9-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    g++ \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /app

# Copy requirements file and install dependencies
# Copy backend requirements and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the app files
COPY . .
# Copy backend code
COPY backend/ ./backend/

# Copy built frontend from previous stage
COPY --from=frontend-build /app/frontend/build ./frontend/build

# Create uploads directory
RUN mkdir -p backend/uploads/datasets backend/uploads/resumes

# Expose port
EXPOSE 5000

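# Health check (python:3.9-slim does not ship curl, so probe with Python's urllib instead)
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/health', timeout=5)" || exit 1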

# Expose a port if your app runs a server (optional)
# EXPOSE 5000
# Set working directory to backend
WORKDIR /app/backend

# Default command to run the simulator
CMD ["python", "main.py"]
# Run the application
CMD ["python", "app.py"]
53 changes: 53 additions & 0 deletions Docs/Knn.md
@@ -0,0 +1,53 @@
# K-Nearest Neighbors (KNN) - Documentation

## 📋 Overview

KNN is a simple, instance-based learning algorithm. It stores the training data and classifies a new data point by majority vote among its k nearest neighbors.

**Key Characteristics:**
- **Type**: Instance-based Learning
- **Algorithm**: Distance-based classification
- **Output**: Class based on neighbor voting
- **Best For**: Small to medium datasets, pattern recognition
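
The neighbor-voting idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the simulator's implementation: the function name `knn_classify` and the toy data are assumptions made up for this example, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every stored training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and count how often each label occurs
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters in 2-D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify(points, labels, (1.1, 1.0), k=3))  # A
print(knn_classify(points, labels, (5.1, 5.0), k=3))  # B
```

There is no training step: all of the work happens at prediction time, which is why prediction cost grows with the size of the stored training set.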

## 🎯 Purpose and Use Cases

- **Recommendation Systems**: Similar user preferences
- **Pattern Recognition**: Handwriting, image recognition
- **Anomaly Detection**: Identifying outliers
- **Medical Diagnosis**: Similar patient cases
- **Text Classification**: Document similarity

## 📊 Key Parameters

| Parameter | Description | Default | Recommendation |
|-----------|-------------|---------|----------------|
| **n_neighbors (k)** | Number of neighbors | 5 | 3-15 (odd numbers) |
| **weights** | Vote weighting | uniform | uniform/distance |
| **metric** | Distance measure | euclidean | euclidean/manhattan |
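
To make the three parameters concrete, here is a small sketch of how they could interact. It assumes a hand-written helper (`knn_vote`, hypothetical) rather than any particular library, and the distance weighting uses a simple `1 / distance` rule with a tiny epsilon to avoid division by zero. That epsilon is a simplification chosen for this example.

```python
import math
from collections import defaultdict


def distance(a, b, metric="euclidean"):
    """Distance between two points under the chosen metric."""
    if metric == "euclidean":
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    if metric == "manhattan":
        return sum(abs(x - y) for x, y in zip(a, b))
    raise ValueError(f"unsupported metric: {metric}")


def knn_vote(train_points, train_labels, query,
             n_neighbors=5, weights="uniform", metric="euclidean"):
    """Return the winning label among the n_neighbors closest points."""
    scored = sorted(
        ((distance(p, query, metric), label)
         for p, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )[:n_neighbors]

    totals = defaultdict(float)
    for dist, label in scored:
        if weights == "uniform":
            totals[label] += 1.0                   # every neighbor counts equally
        elif weights == "distance":
            totals[label] += 1.0 / (dist + 1e-12)  # closer neighbors count more
        else:
            raise ValueError(f"unsupported weights: {weights}")
    return max(totals, key=totals.get)


# Hypothetical data: one very close "A" neighbor, two farther "B" neighbors
points = [(0.1, 0.0), (1.0, 0.0), (0.0, 1.0)]
labels = ["A", "B", "B"]
query = (0.0, 0.0)

print(knn_vote(points, labels, query, n_neighbors=3, weights="uniform"))   # B
print(knn_vote(points, labels, query, n_neighbors=3, weights="distance"))  # A
```

With uniform weights the two "B" neighbors outvote the single "A" neighbor. With distance weights the much closer "A" point wins.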

## 💡 Choosing K Value

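- **Small k (3-5)**: Follows the training data closely; produces complex boundaries and is more sensitive to noise
- **Large k (10-20)**: Produces smoother boundaries but may miss local patterns
- **Rule of thumb**: Start near k ≈ √n, where n is the number of training samples, then compare a few values
- **Use odd k**: Avoids tied votes in binary classification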
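
One way to compare candidate k values is leave-one-out validation: predict each point from all the others and count the hits. The sketch below assumes a tiny made-up 1-D dataset with one deliberately mislabeled point. The helper names (`predict`, `loo_accuracy`) are hypothetical and not part of the simulator.

```python
import math
from collections import Counter


def predict(train, query, k):
    """Majority vote among the k training items closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += predict(rest, point, k) == label
    return hits / len(data)


# Hypothetical 1-D data: class A near 0-4, class B near 10-14,
# plus one noisy "B" label sitting inside the A cluster at 2.5
data = [((x,), "A") for x in (0.0, 1.0, 2.0, 3.0, 4.0)]
data += [((2.5,), "B")]
data += [((x,), "B") for x in (10.0, 11.0, 12.0, 13.0, 14.0)]

sqrt_k = round(math.sqrt(len(data)))  # rule-of-thumb starting point
candidates = [1, 3, 5, 7]
scores = {k: loo_accuracy(data, k) for k in candidates}
best_k = max(scores, key=scores.get)  # first k with the highest accuracy

print(sqrt_k)   # 3
print(best_k)   # 3
```

On this toy data k = 1 is misled by the noisy point, and k = 3 already matches the accuracy of the larger values. Real datasets will behave differently, so treat √n as a starting point rather than an answer.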

## 🐛 Common Issues

### Slow Prediction
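- Reduce the training set size (each prediction compares the query against the stored training points)
- Use approximate nearest-neighbor search or a spatial index such as a KD-tree or ball tree
- Try other algorithms for large datasets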

### Poor Performance
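- Scale features (very important for KNN: otherwise features with large numeric ranges dominate the distance)
- Try different k values
- Check for irrelevant features, which add noise to the distance calculation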
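
The effect of feature scaling can be seen in a small sketch. It assumes made-up data in which the first feature separates the classes but the second feature has a far larger numeric range. The scaling here is plain z-score standardization using the training set's own mean and standard deviation, and the helper names are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(rows, means, stds):
    """Apply z-score scaling column by column."""
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def nearest_label(points, labels, query):
    """Label of the single closest training point (1-NN)."""
    best = min(range(len(points)), key=lambda i: math.dist(points[i], query))
    return labels[best]


# Hypothetical data: feature 1 determines the class, feature 2 is large-scale noise
points = [(1.0, 1000.0), (1.2, 3000.0), (3.0, 1200.0), (3.2, 2800.0)]
labels = ["A", "A", "B", "B"]
query = (2.9, 2950.0)  # feature 1 says "B"

raw_prediction = nearest_label(points, labels, query)

# Fit the scaling on the training data, then apply it to the query as well
columns = list(zip(*points))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]
scaled_points = standardize(points, means, stds)
scaled_query = standardize([query], means, stds)[0]
scaled_prediction = nearest_label(scaled_points, labels, scaled_query)

print(raw_prediction)     # A
print(scaled_prediction)  # B
```

Without scaling, the second feature's range of thousands swamps the first feature and the query is matched to the wrong class. After standardization both features contribute on comparable scales.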

---

**Last Updated**: October 13, 2025
**Version**: 1.0
**Author**: Akshit
**Hacktoberfest 2025 Contribution** 🎃
44 changes: 44 additions & 0 deletions Docs/Readme.md
@@ -0,0 +1,44 @@
# ML Simulator - Model Documentation

Welcome to the ML Simulator documentation! This directory contains comprehensive guides for each machine learning model available in the simulator.

## 📚 Available Models

| Model | Type | Approach | Use Case |
|-------|------|----------|----------|
| [Logistic Regression](logistic_regression.md) | Classification | Binary classification | Disease prediction, spam detection |
| [Linear Regression](linear_regression.md) | Regression | Continuous prediction | Price prediction, trend analysis |
| [Decision Tree](decision_tree.md) | Classification/Regression | Tree-based decisions | Credit scoring, diagnosis |
| [Random Forest](random_forest.md) | Ensemble | Multiple trees | Complex classification tasks |
| [K-Nearest Neighbors](Knn.md) | Classification/Regression | Instance-based | Pattern recognition |
| [Support Vector Machine](svm.md) | Classification | Maximum margin | Text classification, image recognition |

## 🚀 Quick Start

Each model's documentation includes:
- ✅ **Overview**: What the model does and when to use it
- ✅ **How to Run**: Step-by-step instructions
- ✅ **Parameter Explanations**: What each setting means
- ✅ **Plot Interpretations**: Understanding visualizations
- ✅ **Performance Metrics**: Evaluating model quality
- ✅ **Troubleshooting**: Common issues and solutions
- ✅ **Examples**: Real-world use cases

## 📖 How to Use This Documentation

1. Select the model you want to learn about from the table above
2. Click on the documentation link
3. Follow the step-by-step guide
4. Review the screenshot examples
5. Apply to your own dataset

## 🎯 Contributing

Found an error or want to improve the documentation? See our [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines.

---

**Last Updated**: October 13, 2025
**Version**: 1.0
**Author**: Akshit
**Hacktoberfest 2025 Contribution** 🎃