Overview
Fine-tune a LoRA adapter on Llama 3.1-8B using past tailored resumes to improve quality and speed.
Problem
After Tier A (40-50% improvement), further improvements require:
- Model-specific knowledge of resume tailoring patterns
- Consistent "Sidney voice" across all resumes
- Faster inference (2x speedup)
- Better handling of edge cases
Solution
Train a LoRA adapter using Supervised Fine-Tuning (SFT):
- Collect 100-200 SFT pairs from past tailored resumes
- Train LoRA adapter on Llama 3.1-8B
- Deploy via vLLM or Ollama
- A/B test against base model
Deliverables
1. Data Collection
- Source: Past tailored resumes + job descriptions
- Format: JSONL with instruction, input, output
- Target: 100-200 pairs across 5 task types:
- Technical role tailoring
- Leadership role tailoring
- Career transition tailoring
- Skill emphasis tailoring
- Industry-specific tailoring
- Quality: Manual review and curation
2. Training Configuration
- Model: Llama 3.1-8B
- Framework: Unsloth or Axolotl
- LoRA Config:
- Rank (r): 16
- Alpha: 32
- Target modules: q_proj, v_proj
- Training Params:
- Epochs: 2
- Learning rate: 2e-4
- Batch size: 4 (per GPU)
- Max seq length: 2048
- Precision: bf16
3. Training Pipeline
- File:
scripts/train_lora_adapter.py
- Features:
- Data loading and preprocessing
- Training loop with validation
- Checkpoint saving
- Loss tracking and logging
- Hardware detection (GPU/CPU)
4. Serving Setup
5. A/B Testing
- Test Set: 20 diverse job descriptions
- Metrics:
- Quality rating (manual eval)
- Latency (inference time)
- Consistency ("Sidney voice")
- Hallucination rate
- Success: LoRA model > base model on all metrics
6. Documentation
- File:
docs/TIER_B_LORA_TRAINING.md
- Contents:
- Data collection process
- Training procedure
- Serving setup
- A/B test results
- Deployment guide
Success Criteria
- ✅ 100-200 SFT pairs collected and curated
- ✅ LoRA adapter trained successfully
- ✅ Training loss converges
- ✅ Serving endpoint responds < 5s
- ✅ A/B test shows improvement on all metrics
- ✅ Quality rating: 4.5/5 or higher
- ✅ Latency: < 15s end-to-end
- ✅ Hallucination rate: < 0.5%
- ✅ Documentation complete
Demonstrable Improvements
- Quality: 20-30% additional improvement over Tier A
- Speed: 2x faster inference (< 15s vs < 30s)
- Consistency: Consistent "Sidney voice" across all resumes
- Reliability: Lower hallucination rate (< 0.5%)
- Customization: Model learns your specific tailoring style
Implementation Guide
See docs/TIER_B_LORA_TRAINING.md for detailed instructions.
Estimated Effort
- Time: 3-4 weeks
- Difficulty: High
- Dependencies: Tier A complete
- Hardware: Single GPU (24GB+) or cloud rental
Files to Create
data/sft_pairs.jsonl - Training data
scripts/train_lora_adapter.py - Training script
scripts/serve_lora_adapter.py - Serving script
docs/TIER_B_LORA_TRAINING.md - Documentation
Files to Modify
n8n/n8n/workflows/tailor.json - Update LLM endpoint (optional)
Related Issues
Acceptance Criteria
Notes
- This is optional and can be done after Tier A
- Requires GPU with 24GB+ VRAM or cloud rental
- Training time: 2-4 hours on single GPU
- Can be parallelized across multiple GPUs
Labels
- enhancement
- rag
- n8n
- tier-b
- fine-tuning
- lora
- optional
Overview
Fine-tune a LoRA adapter on Llama 3.1-8B using past tailored resumes to improve quality and speed.
Problem
After Tier A (40-50% improvement), further improvements require:
Solution
Train a LoRA adapter using Supervised Fine-Tuning (SFT):
Deliverables
1. Data Collection
2. Training Configuration
3. Training Pipeline
scripts/train_lora_adapter.py4. Serving Setup
Option A: vLLM with LoRA
http://localhost:8000/v1/chat/completionsOption B: Ollama with LoRA
http://localhost:11434/api/generate5. A/B Testing
6. Documentation
docs/TIER_B_LORA_TRAINING.mdSuccess Criteria
Demonstrable Improvements
Implementation Guide
See
docs/TIER_B_LORA_TRAINING.mdfor detailed instructions.Estimated Effort
Files to Create
data/sft_pairs.jsonl- Training datascripts/train_lora_adapter.py- Training scriptscripts/serve_lora_adapter.py- Serving scriptdocs/TIER_B_LORA_TRAINING.md- DocumentationFiles to Modify
n8n/n8n/workflows/tailor.json- Update LLM endpoint (optional)Related Issues
Acceptance Criteria
Notes
Labels