Tier A Phase 2: Schema Enforcement + Guardrails #62

## Overview

Add JSON schema validation and guardrails to prevent hallucination and ensure consistent, high-quality resume output.

## Problem

Current LLM output has issues:
- ~5% hallucination rate (fabricated employers/dates)
- ~70% schema compliance (missing fields, inconsistent format)
- No validation of output structure
- No guardrails to prevent fabrication

## Solution

Implement 3-layer validation:
1. **Schema Validation**: Enforce JSON structure and field types
2. **Guardrails Prompt**: Add system prompt rules to prevent hallucination
3. **Post-Processing**: Validate output, fill defaults, trace bullets to sources

## Deliverables

### 1. JSON Schema Definition
- **File**: `n8n/schemas/resume_output.json`
- **Fields**:
  - `professional_summary` (string, required)
  - `top_skills` (array of strings, required)
  - `tailored_bullets` (array of objects, each with `text` and `source_id`, required)
  - `ats_keywords` (array of strings, required)
  - `notes` (string, optional)
- **Validation**: Type checking, required fields, format validation
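
The field list above could be sketched as follows. This is a minimal illustration, not the final schema: it is written as a Python dict that mirrors standard JSON Schema keywords (`type`, `required`, `properties`, `items`), and it assumes `source_id` is a string, which the issue does not state. `RESUME_SCHEMA` and `check_schema` are hypothetical names; the small stdlib-only checker covers only the keywords used here and stands in for a full JSON Schema validator.

```python
# Hypothetical contents for n8n/schemas/resume_output.json, expressed as a
# Python dict. Field names come from the list above; the string type of
# source_id is an assumption.
RESUME_SCHEMA = {
    "type": "object",
    "required": ["professional_summary", "top_skills",
                 "tailored_bullets", "ats_keywords"],
    "properties": {
        "professional_summary": {"type": "string"},
        "top_skills": {"type": "array", "items": {"type": "string"}},
        "tailored_bullets": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["text", "source_id"],
                "properties": {
                    "text": {"type": "string"},
                    "source_id": {"type": "string"},
                },
            },
        },
        "ats_keywords": {"type": "array", "items": {"type": "string"}},
        "notes": {"type": "string"},  # optional: not listed in "required"
    },
}

_PY_TYPES = {"object": dict, "array": list, "string": str}


def check_schema(value, schema, path="$"):
    """Return a list of error strings (empty if valid).

    Supports only the keywords used in RESUME_SCHEMA:
    type, required, properties, items.
    """
    if not isinstance(value, _PY_TYPES[schema["type"]]):
        return [f"{path}: expected {schema['type']}"]
    errors = []
    if schema["type"] == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check_schema(value[key], sub, f"{path}.{key}"))
    elif schema["type"] == "array" and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(check_schema(item, schema["items"], f"{path}[{i}]"))
    return errors
```

Returning a list of path-prefixed errors, rather than raising on the first failure, makes it straightforward to include every problem in the validation report.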

### 2. Validator Script
- **File**: `n8n/scripts/validate_resume_output.py`
- **Functions**:
  - `validate_schema()` - Check JSON structure
  - `detect_hallucination()` - Check employers exist in `experiences.json`
  - `fill_defaults()` - Add missing fields with safe defaults
  - `trace_bullets()` - Verify bullets link to source IDs
- **Output**: Validated JSON + validation report
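
A rough sketch of the three post-processing functions named above, under several assumptions the issue does not confirm: `experiences.json` is taken to be a list of records with `id` and `employer` keys, the default values are placeholders, and the function signatures are hypothetical. Because the output schema has no employer field, `detect_hallucination()` here assumes the caller supplies the employer names claimed in the output.

```python
import copy

# Assumed safe defaults; the issue does not specify the actual values.
DEFAULTS = {
    "professional_summary": "",
    "top_skills": [],
    "tailored_bullets": [],
    "ats_keywords": [],
    "notes": "",
}


def fill_defaults(output):
    """Return a copy of output with every missing field set to its default."""
    filled = copy.deepcopy(output)
    for key, default in DEFAULTS.items():
        # deepcopy so callers never share the mutable default lists
        filled.setdefault(key, copy.deepcopy(default))
    return filled


def trace_bullets(output, experiences):
    """Return the bullets whose source_id matches no experience id."""
    known_ids = {exp["id"] for exp in experiences}
    return [bullet for bullet in output.get("tailored_bullets", [])
            if bullet.get("source_id") not in known_ids]


def detect_hallucination(claimed_employers, experiences):
    """Return claimed employer names that are absent from experiences.

    Comparison is case-insensitive and ignores surrounding whitespace.
    """
    known = {exp["employer"].strip().lower() for exp in experiences}
    return [name for name in claimed_employers
            if name.strip().lower() not in known]


# Usage with invented sample data (not taken from the real experiences.json).
experiences = [{"id": "exp-1", "employer": "Acme Corp"}]
output = {
    "professional_summary": "Backend engineer.",
    "tailored_bullets": [
        {"text": "Led database migration", "source_id": "exp-1"},
        {"text": "Built public API", "source_id": "exp-9"},
    ],
}
untraced = trace_bullets(output, experiences)
unknown = detect_hallucination(["acme corp", "Globex"], experiences)
```

In this example `untraced` holds only the `exp-9` bullet and `unknown` holds only `"Globex"`; both lists would feed the validation report.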

### 3. Guardrails Prompt
- **File**: `.agent/prompts/system.tailor.guardrails.md`
- **Rules**:
  - No fabrication of employers/dates
  - Every bullet must have a `source_id`
  - Action-first, past-tense language
  - No generic/template language
  - Specific metrics and achievements
- **Integration**: Update n8n tailor workflow system prompt

### 4. Testing
- Test on 20 sample LLM outputs:
  - Valid output → Should pass validation
  - Missing fields → Should fill with defaults
  - Hallucinated employer → Should be caught
  - Invalid JSON → Should be handled gracefully
  - Low-quality output → Should be flagged
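
For the invalid-JSON case, "handled gracefully" could mean returning an error value instead of raising. The helper below is a hypothetical sketch (`parse_llm_output` is not a name from the issue) that uses only the standard `json` module and treats any non-object top-level value as invalid, since the schema expects an object.

```python
import json


def parse_llm_output(raw):
    """Parse raw LLM text as a JSON object.

    Returns (data, None) on success and (None, error_message) on failure,
    so the caller never has to catch an exception.
    """
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError) as exc:
        # JSONDecodeError: malformed text; TypeError: raw is not str/bytes
        return None, f"invalid JSON: {exc}"
    if not isinstance(data, dict):
        return None, "invalid JSON: top-level value is not an object"
    return data, None
```

A test run over the 20 samples could call this first and record the error string in the validation report before any schema checks are attempted.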

## Success Criteria

- ✅ JSON schema created and validated
- ✅ Validator script created and tested
- ✅ Guardrails prompt created and integrated
- ✅ 100% schema compliance on test outputs
- ✅ Zero hallucinated employers in 20 test runs
- ✅ All missing fields filled with safe defaults
- ✅ All bullets traceable to source IDs
- ✅ Validation latency < 100ms per output
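
The latency criterion could be checked with a small timing helper along these lines. This is only a sketch: `measure_latency_ms` is a hypothetical name, and `validate` stands for whatever callable ends up wrapping the full validation pipeline.

```python
import time


def measure_latency_ms(validate, outputs):
    """Run validate on each output and return per-call wall time in ms."""
    timings = []
    for output in outputs:
        start = time.perf_counter()
        validate(output)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Usage with a stand-in validator; replace the lambda with the real pipeline.
sample_outputs = [{"professional_summary": "x"}] * 20
timings = measure_latency_ms(lambda output: output, sample_outputs)
worst_ms = max(timings)
```

Comparing `worst_ms` against the 100 ms budget checks every output individually, which is stricter than comparing the average.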

## Demonstrable Improvements

1. **Consistency**: All outputs follow same structure
2. **Reliability**: No hallucinated employers/dates
3. **Traceability**: Every bullet linked to source
4. **Completeness**: All required fields present
5. **Quality**: Guardrails enforce high-quality language

## Implementation Guide

See `n8n/docs/TIER_A_PHASE_2_SCHEMA.md` for detailed instructions.

## Estimated Effort

- **Time**: 2-3 hours
- **Difficulty**: Medium
- **Dependencies**: Phase 1 (FAISS integration)

## Files to Create

- `n8n/schemas/resume_output.json` (~50 lines)
- `n8n/scripts/validate_resume_output.py` (~150 lines)
- `.agent/prompts/system.tailor.guardrails.md` (~50 lines)

## Files to Modify

- `n8n/n8n/workflows/tailor.json` (add validation node + update prompt)

## Related Issues

- Parent: #60 (Main enhancement issue)
- Previous: #61 (Phase 1 - FAISS Integration)
- Next: #62 (Phase 3 - End-to-End Testing)

## Acceptance Criteria

- [ ] JSON schema created and validated
- [ ] Validator script created and tested
- [ ] Guardrails prompt created
- [ ] n8n workflow updated with validation
- [ ] All 20 test outputs pass validation
- [ ] Metrics documented in `test_results_phase_a2.md`
- [ ] Code reviewed and merged

## Labels

- enhancement
- rag
- n8n
- phase-2
- tier-a
- quality
