Feature/ai model training #17
Introduces a new training_config.yaml file specifying model, LoRA, training, dataset, and generation settings for the Lab68Dev AI model. This configuration will be used to control training and inference parameters.
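The actual keys of `training_config.yaml` are not shown in this PR, but a config with the sections described above (model, LoRA, training, dataset, generation) might be loaded as in this sketch. All field names and values here are illustrative assumptions, not the real file:

```python
# Sketch: loading a training_config.yaml-style file with PyYAML.
# Every key and value below is an assumed example.
import yaml

EXAMPLE_CONFIG = """
model:
  base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
lora:
  r: 16
  alpha: 32
  dropout: 0.05
training:
  epochs: 3
  learning_rate: 2.0e-4
dataset:
  train_file: data/train.jsonl
  val_file: data/val.jsonl
generation:
  max_new_tokens: 256
  temperature: 0.7
"""

config = yaml.safe_load(EXAMPLE_CONFIG)
print(config["lora"]["r"])                # 16
print(config["dataset"]["train_file"])    # data/train.jsonl
```

Centralizing hyperparameters this way lets `train.py` and the inference server share a single source of truth instead of hard-coding values.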
Introduces generate_dataset.py to create synthetic training and validation datasets for the Lab68Dev AI model. The script generates structured task and Q&A examples, formats them for TinyLlama chat, and saves them as JSONL files for model training.
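As a rough illustration of what the description above implies (not the script's actual code), wrapping a Q&A pair in the TinyLlama (Zephyr-style) chat template and writing it out as JSONL could look like this; the `text` field name and the template details are assumptions:

```python
# Sketch of the formatting step described for generate_dataset.py:
# wrap a prompt/response pair in the TinyLlama chat template and
# emit one JSON object per line (JSONL).
import json

def format_tinyllama(user: str, assistant: str,
                     system: str = "You are a helpful software development assistant.") -> str:
    # Zephyr-style template used by TinyLlama-1.1B-Chat (assumed here).
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n{assistant}</s>"
    )

examples = [
    {"text": format_tinyllama(
        "What does LoRA stand for?",
        "Low-Rank Adaptation, a parameter-efficient fine-tuning method.")},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

JSONL is convenient here because Hugging Face `datasets` can stream it line by line during training.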
Introduces generate_dataset_backup.py for creating synthetic training data for task creation and technical Q&A. The script generates structured prompts and responses for software development tasks and technical explanations, supporting AI model training.
Implements a FastAPI server to serve the Lab68Dev AI model with endpoints for health checks, text generation, and task creation. Loads model and tokenizer on startup, supports CORS, and provides structured request/response models.
Introduces documentation for setting up, training, and running the custom NLP model, including hardware requirements and model details.
Introduces a requirements.txt file specifying core machine learning, inference server, utility, and testing dependencies for the ai-model project.
Introduces ai-model/train.py, a script to fine-tune TinyLlama using LoRA and 4-bit quantization for task creation and tech Q&A. The script loads configuration from YAML, sets up model and tokenizer, loads datasets, configures training arguments, and saves the trained model and tokenizer.
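The LoRA technique the script applies can be illustrated numerically: instead of updating a d x k weight matrix W directly, LoRA trains two small matrices A (r x k) and B (d x r) and uses W + (alpha / r) * (B @ A). A toy pure-Python sketch of that arithmetic (not the training script itself, which delegates this to the `peft` library):

```python
# Toy illustration of the LoRA update: W_eff = W + (alpha/r) * B @ A.
# All numbers here are made up for demonstration.
def matmul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, k, r, alpha = 4, 4, 2, 4
W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(d)]  # frozen base weights
A = [[0.1] * k for _ in range(r)]                                   # trainable, r x k
B = [[0.5] * r for _ in range(d)]                                   # trainable, d x r

scale = alpha / r
delta = matmul(B, A)  # d x k matrix of rank <= r
W_eff = [[W[i][j] + scale * delta[i][j] for j in range(k)] for i in range(d)]

# Only A and B are trained: r*k + d*r parameters instead of d*k.
# For d = k = 4096 and r = 16 that is ~131K vs ~16.8M per matrix.
print(W_eff[0][0])  # 1.0 + 2 * (0.5*0.1 + 0.5*0.1) = 1.2
```

The 4-bit quantization mentioned in the summary is complementary: the frozen W is stored in 4-bit precision while only the small A and B matrices are kept in full precision for training.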
Eliminated the retrieval and injection of RAG context in the chat API endpoint. The endpoint now directly forwards user messages to the Ollama model without attempting to augment them with RAG context.
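The simplified flow amounts to building the Ollama chat payload directly from the user's messages. The actual route is TypeScript; this Python sketch (the model name and host are assumptions) shows the shape of the request after RAG removal:

```python
# Sketch: the chat endpoint now builds an Ollama /api/chat payload
# straight from the incoming messages, with no RAG-context injection.
import json

def build_ollama_payload(messages, model="llama3"):
    # Previously, retrieved RAG context would have been prepended to
    # the messages here; now they are forwarded as-is.
    return {"model": model, "messages": messages, "stream": False}

payload = build_ollama_payload([{"role": "user", "content": "Explain LoRA."}])
body = json.dumps(payload)  # POSTed to e.g. http://localhost:11434/api/chat
```

Dropping the retrieval step removes a network/embedding round-trip per request, at the cost of the model no longer seeing project-specific documents.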
Deleted lib/services/rag-service.ts, which contained the RAG (Retrieval-Augmented Generation) service for document embedding, storage, and retrieval. This removes all related logic for managing and searching knowledge base documents.
Deleted the 'index-knowledge' and 'index-knowledge:clear' scripts, and removed the '@xenova/transformers', 'ai', and 'chromadb' dependencies from package.json as they are no longer needed.
Deleted scripts/index-knowledge.js, which handled indexing documentation and platform features into the RAG system. This may indicate a change in how knowledge indexing is managed or a migration to a different approach.
Cleaned up the scripts section by removing an unnecessary trailing comma after the 'start:next' script.
Pull request overview
This pull request introduces a complete AI model training and inference pipeline while removing the existing RAG (Retrieval-Augmented Generation) system. The PR replaces browser/server-based RAG embeddings with a standalone Python-based training pipeline for fine-tuning TinyLlama for software development tasks.
Changes:
- Removed RAG-based knowledge base system including embeddings service, indexing scripts, and related dependencies
- Added Python-based AI model training pipeline with LoRA fine-tuning for TinyLlama
- Introduced FastAPI inference server for serving the fine-tuned model
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| scripts/index-knowledge.js | Removed knowledge base indexing script (RAG removal) |
| lib/services/rag-service.ts | Removed RAG embeddings and document search service |
| app/api/chat/route.ts | Removed RAG context retrieval from chat API |
| package.json | Removed RAG-related dependencies and indexing scripts; version bump to 0.1.1 |
| ai-model/train.py | New training script with LoRA configuration and 4-bit quantization |
| ai-model/requirements.txt | Python dependencies for training and inference |
| ai-model/inference/server.py | FastAPI server for model inference with generation endpoints |
| ai-model/data/generate_dataset.py | Synthetic dataset generator for training examples |
| ai-model/data/generate_dataset_backup.py | Incomplete backup dataset generator (truncated) |
| ai-model/config/training_config.yaml | Centralized training hyperparameters and model configuration |
| ai-model/README.md | Setup and usage documentation for the AI training pipeline |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@DongDuong2001 I've opened a new pull request, #18, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: DongDuong2001 <64120873+DongDuong2001@users.noreply.github.com>
…ssary lock
[WIP] Address feedback on AI model training feature implementation
Bump pnpm version from 8 to 10 across all jobs in the GitHub Actions CI workflow to ensure compatibility with the latest features and improvements.
Cleaned up pnpm-lock.yaml by removing several unused packages and their dependencies, including ai, chromadb, @xenova/transformers, and related libraries. This reduces lockfile size and helps maintain a leaner dependency tree.
This pull request introduces an end-to-end training and inference pipeline for a custom AI assistant model tailored for software development tasks and technical Q&A. It includes scripts for synthetic dataset generation, a configurable training setup, a FastAPI-based inference server, and all necessary dependencies. The most important changes are grouped below:
1. Dataset Generation and Training Pipeline
- `generate_dataset.py` and a more advanced `generate_dataset_backup.py` synthesize ~4,000 examples (task creation and tech Q&A) in TinyLlama chat format for model fine-tuning. These scripts use templates and randomization to create diverse, structured prompts and responses. [1] [2]
- `training_config.yaml` enables reproducible training runs, specifying model, LoRA, dataset, and generation hyperparameters.

2. Inference Server

- `inference/server.py` is a FastAPI app for serving the fine-tuned model with endpoints for text generation and structured task creation. The server loads LoRA adapters if available, applies chat formatting, and enables CORS for integration.

3. Documentation and Dependencies

- `README.md` covers setup, training, inference, and hardware requirements to guide users through the pipeline.
- `requirements.txt` lists all dependencies for training, inference, and utilities, ensuring reproducibility.