Tier A Phase 1: Connect n8n to FAISS Vector Store #61

@BPMSoftwareSolutions

Overview

Connect n8n workflows to the existing FAISS vector store to enable semantic search instead of keyword matching.

Problem

Current n8n workflows use simple keyword overlap scoring:

Job Description → Extract Keywords → Match Against Experiences → Score

This results in:

  • Low precision (many irrelevant results)
  • Missed semantic matches
  • Lower-quality resume tailoring

Solution

Replace keyword matching with FAISS semantic search:

Job Description → FAISS Vector Store → Top-K Relevant Chunks → Rerank → Score
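
To make the difference between the two pipelines concrete, here is a minimal, dependency-free sketch. `keyword_overlap` approximates the current scoring, and `top_k` shows the nearest-neighbour ranking that a vector store performs over embeddings. The function names and the toy vectors are illustrative assumptions, not code from this repository; a real setup would get its embeddings from a model and let FAISS do the search.

```python
import math

def keyword_overlap(job_text, bullet):
    """Keyword-style scoring: share of job words that literally appear in the bullet."""
    job_words = set(job_text.lower().split())
    bullet_words = set(bullet.lower().split())
    if not job_words:
        return 0.0
    return len(job_words & bullet_words) / len(job_words)

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=5):
    """Rank chunk IDs by similarity to the query vector and keep the best k."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]

# Different wording, same meaning: literal overlap finds nothing.
print(keyword_overlap("kubernetes deployment", "managed container orchestration"))

# Toy 2-D "embeddings" (hypothetical values) standing in for real model output.
chunks = {"exp-1": [1.0, 0.0], "exp-2": [0.0, 1.0], "exp-3": [1.0, 1.0]}
print(top_k([1.0, 0.0], chunks, k=2))
```

The first call scores zero even though the two phrases describe related work, which is the "missed semantic matches" problem listed above.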

Deliverables

1. Python Bridge Script

  • File: n8n/scripts/query_faiss_vector_store.py
  • Purpose: Wraps src/rag/retriever.py for n8n integration
  • Input: Job description
  • Output: Top-K chunks with similarity scores + source IDs
  • Latency: < 500ms per query
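
One possible shape for the bridge script is sketched below. The interface of src/rag/retriever.py is not shown in this issue, so the retriever is passed in as a callable; its assumed signature `(query, top_k) -> iterable of (text, score, source_id)` is hypothetical, as are the JSON field names and the stdin/stdout convention.

```python
import json
import sys
import time

LATENCY_BUDGET_MS = 500.0  # target from the deliverable above

def query_vector_store(job_description, retrieve, top_k=5):
    """Run one retrieval and shape the result for n8n.

    `retrieve` is a stand-in for whatever src/rag/retriever.py exposes.
    It is assumed to return (chunk_text, similarity_score, source_id) tuples.
    """
    started = time.perf_counter()
    hits = list(retrieve(job_description, top_k))
    latency_ms = (time.perf_counter() - started) * 1000.0
    return {
        "chunks": [
            {"text": text, "score": float(score), "source_id": source_id}
            for text, score, source_id in hits
        ],
        "latency_ms": round(latency_ms, 2),
        "within_budget": latency_ms < LATENCY_BUDGET_MS,
    }

def main(retrieve, stdin=sys.stdin, stdout=sys.stdout):
    """Read {"job_description": ..., "top_k": ...} as JSON and write the result as JSON."""
    payload = json.load(stdin)
    result = query_vector_store(
        payload["job_description"], retrieve, int(payload.get("top_k", 5))
    )
    json.dump(result, stdout)
```

Injecting the retriever keeps the wrapper testable with a stub and leaves the actual wiring to the existing module open.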

2. Verify FAISS Index

  • Check that data/rag/faiss_index.bin exists and is current
  • If it is stale, rebuild it using src/rag/rag_indexer.py
  • Verify that the index contains at least 120 documents
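
The checks above could be scripted roughly as follows. This sketch assumes that "stale" means the index file is older than the source data file, which the issue does not define, and both function names are illustrative. `faiss.read_index` and the `ntotal` attribute are part of the FAISS Python API; note that `ntotal` counts stored vectors, which matches the document count only if each document is indexed as a single vector.

```python
import os

def index_is_stale(index_path, source_path):
    """Return True if the index is missing or older than the source data.

    Assumption: staleness is judged by file modification time only.
    Raises FileNotFoundError if source_path itself does not exist.
    """
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(index_path) < os.path.getmtime(source_path)

def count_indexed_vectors(index_path):
    """Load the index and return how many vectors it holds."""
    # Imported lazily so the staleness check runs without FAISS installed.
    import faiss
    return faiss.read_index(index_path).ntotal
```

If `index_is_stale` returns True, the rebuild step with src/rag/rag_indexer.py applies; otherwise `count_indexed_vectors` can be compared against the expected minimum.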

3. Update n8n Ingest Workflow

  • File: n8n/n8n/workflows/ingest.json
  • Add Function node: "Query FAISS Vector Store"
  • Replace "Match Experiences" node
  • Pass retrieved chunks to next stage

4. Testing

  • Test on 5 diverse job descriptions:
    • DevOps role → Check for infrastructure/CI-CD bullets
    • Backend role → Check for API/database bullets
    • Leadership role → Check for mentorship/architecture bullets
    • Security role → Check for security/compliance bullets
    • Data role → Check for analytics/ML bullets

Success Criteria

  • ✅ FAISS index verified and loads successfully
  • ✅ Python bridge script created and tested
  • ✅ n8n ingest workflow updated
  • ✅ Retrieval latency < 500ms per query
  • ✅ Retrieval precision@5 > 0.8 (manual eval on 5 jobs)
  • ✅ All results have source_ids traceable to experiences.json
  • ✅ End-to-end ingest latency < 10s
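
For the precision criterion, a small helper such as the one below could turn the manual relevance judgments into a number. The function name and the ID format are assumptions. One detail worth noting: 4 relevant results out of 5 gives exactly 0.8, so a strict "> 0.8" threshold applied per job passes only when all five results are relevant; applied as a mean over the 5 jobs, it requires at least 21 relevant results out of 25.

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved IDs that were judged relevant.

    Divides by k, so returning fewer than k results lowers the score.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    top = list(retrieved_ids)[:k]
    return sum(1 for item in top if item in relevant) / k

# Hypothetical judgments for one job: four of the five results are relevant.
score = precision_at_k(["e1", "e2", "e3", "e4", "e5"], {"e1", "e2", "e3", "e4"})
print(score, score > 0.8)
```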

Demonstrable Improvements

  1. Semantic Relevance: Retrieved bullets match the meaning of the job description, not only its keywords
  2. Precision: Top results are highly relevant (no noise)
  3. Coverage: Captures relevant experience even with different terminology
  4. Traceability: All results linked to source experience IDs
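
The traceability point can be checked mechanically. The sketch below assumes that experiences.json parses to a list of objects with an `id` field and that each retrieved chunk carries a `source_id` field; neither schema is confirmed by this issue, and the function name is illustrative.

```python
def untraceable_source_ids(chunks, experiences):
    """Return source IDs from retrieved chunks that match no known experience.

    Assumed shapes: chunks are dicts with "source_id", experiences are
    dicts with "id". An empty result means every chunk is traceable.
    """
    known = {exp["id"] for exp in experiences}
    missing = {chunk["source_id"] for chunk in chunks if chunk["source_id"] not in known}
    return sorted(missing, key=str)

# Hypothetical data: the second chunk points at an ID that does not exist.
experiences = [{"id": "exp-1"}, {"id": "exp-2"}]
chunks = [{"source_id": "exp-1"}, {"source_id": "exp-9"}]
print(untraceable_source_ids(chunks, experiences))
```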

Implementation Guide

See n8n/docs/QUICK_START_TIER_A_PHASE_1.md for step-by-step instructions.

Estimated Effort

  • Time: 2-3 hours
  • Difficulty: Medium
  • Dependencies: None (FAISS already exists)

Files to Create

  • n8n/scripts/query_faiss_vector_store.py (~50 lines)

Files to Modify

  • n8n/n8n/workflows/ingest.json (add Function node)

Acceptance Criteria

  • FAISS index verified
  • Python bridge script created and tested
  • n8n workflow updated
  • All 5 test jobs pass quality checks
  • Metrics documented in test_results_phase_a1.md
  • Code reviewed and merged

Labels

  • enhancement
  • rag
  • n8n
  • phase-1
  • tier-a
