Skip to content

Latest commit

 

History

History
194 lines (140 loc) · 5.09 KB

File metadata and controls

194 lines (140 loc) · 5.09 KB

Job Listing Fetcher - Index Integration Complete ✅

Overview

The job listing fetcher has been fully integrated with the system's job listings index. When a job listing is fetched and saved, it is automatically added to data/job_listings/index.json.

Integration Details

What Happens When You Fetch a Job Listing

  1. Fetch HTML - Retrieve job listing from URL or local file
  2. Parse Content - Extract title, company, location, description
  3. Generate Markdown - Create formatted markdown file
  4. Save File - Write to data/job_listings/ directory
  5. Update Index - Automatically add entry to index.json ✨ NEW

Index Entry Structure

Each job listing in the index contains:

{
  "id": "644870ea-db70-49b3-9b1f-a1f4887c3b70",
  "title": "Senior Software Engineer",
  "company": "TechCorp Inc.",
  "location": "San Francisco, CA",
  "file": "test_senior_software_engineer.md",
  "created_at": "2025-10-26T15:24:10.294414Z",
  "description": "Senior Software Engineer at TechCorp Inc. in San Francisco, CA"
}

Key Fields

  • id: Unique UUID for the job listing
  • title: Job title
  • company: Company name
  • location: Job location
  • file: Markdown filename (relative to data/job_listings/)
  • created_at: ISO 8601 timestamp with UTC timezone
  • description: Human-readable description

Implementation

Modified Files

fetch_job_listing.py

  • Added update_job_listings_index() function
  • Integrated index update into fetch_job_listing()
  • Integrated index update into fetch_job_listing_selenium()
  • Automatic UUID generation for each listing
  • ISO 8601 timestamp generation

New Functions

update_job_listings_index(title, company, location, filepath, output_dir)

Updates the job listings index with a new entry.

Parameters:

  • title (str): Job title
  • company (str): Company name
  • location (str): Job location
  • filepath (str): Path to saved markdown file
  • output_dir (str): Output directory (default: "job_listings")

Returns:

  • dict: The created job entry

Example:

from fetch_job_listing import update_job_listings_index

job_entry = update_job_listings_index(
    title="Senior Software Engineer",
    company="TechCorp Inc.",
    location="San Francisco, CA",
    filepath="data/job_listings/senior_software_engineer.md"
)
print(f"Added job with ID: {job_entry['id']}")

Usage Examples

Example 1: Fetch and Auto-Index

from fetch_job_listing import fetch_job_listing

url = "https://example.com/job-listing"
filepath = fetch_job_listing(url)
# Job is automatically added to data/job_listings/index.json

Example 2: Verify Index

import json

with open("data/job_listings/index.json", "r") as f:
    index = json.load(f)

print(f"Total job listings: {len(index['job_listings'])}")
for job in index['job_listings'][-5:]:
    print(f"  - {job['title']} at {job['company']}")

Example 3: Query Index

import json

with open("data/job_listings/index.json", "r") as f:
    index = json.load(f)

# Find all jobs at a specific company
company_name = "TechCorp Inc."
jobs = [j for j in index['job_listings'] if j['company'] == company_name]
print(f"Found {len(jobs)} jobs at {company_name}")

Test Results

Test: Index Integration

Status: PASSED

Results:

  • Index file: data/job_listings/index.json
  • Entries before: 19
  • Entries after: 20
  • New entry added successfully
  • All fields populated correctly

Latest Entry:

Title: Senior Software Engineer
Company: TechCorp Inc.
Location: San Francisco, CA
File: test_senior_software_engineer.md
ID: 644870ea-db70-49b3-9b1f-a1f4887c3b70
Created: 2025-10-26T15:24:10.294414Z

Verification

Run the verification script to check the index:

python verify_index.py

This will display:

  • Total number of job listings
  • Latest 5 entries with all details

Integration with Agent

The AI agent can now:

  1. Query the index to find job listings by company or title
  2. Access job files using the filename from the index
  3. Track job listings with unique IDs and timestamps
  4. Automate workflows that depend on job listing metadata

Benefits

Automatic Tracking - No manual index updates needed ✅ Unique IDs - Each job has a UUID for reliable reference ✅ Timestamps - Know when each job was added ✅ Metadata - Store title, company, location for quick lookup ✅ File References - Link to markdown files for content ✅ Agent Integration - AI agent can query and use the index

Next Steps

  1. ✅ Use fetch_job_listing() to fetch and auto-index jobs
  2. ✅ Query the index with verify_index.py
  3. ✅ Integrate with agent for automated workflows
  4. ✅ Build dashboards using index data

Files

  • fetch_job_listing.py - Main script with index integration
  • test_fetch_with_index.py - Test script demonstrating integration
  • verify_index.py - Verification script to check index
  • data/job_listings/index.json - The job listings index

Status

COMPLETE AND TESTED

The job listing fetcher is fully integrated with the system's index!