
Consistent EU doc classification and AI analysis pipeline cleanup #1166

Merged
pethers merged 47 commits into main from
copilot/redesign-article-generator-architecture
Mar 16, 2026

Copilot AI commented Mar 13, 2026

  • Inspect latest PR review thread comments and identify required code/doc updates
  • Fix generators.ts leftover legacy references/redeclarations in deep-inspection sections
  • Fix Sankey EU/SKR duplicate node/flow handling and keep localized labels
  • Fix duplicate const filtered snippet in .github/workflows/news-evening-analysis.md
  • Run targeted TypeScript/lint/tests for changed areas
  • Run final automated code review + CodeQL scan
  • Reply to the new PR comment with commit hash and validation status
Original prompt

This section details the original issue you should resolve.

<issue_title>Redesign Article Generator Architecture for AI-First Multi-Iteration Analysis</issue_title>
<issue_description>## 📋 Issue Type
Feature / Architecture Redesign

🎯 Objective

Fundamentally redesign the news article generation architecture (scripts/generate-news-enhanced/generators.ts) so that all political analysis content is AI-generated through multiple LLM iterations rather than being produced by static TypeScript templates with hardcoded text. The current architecture uses template-based generators that produce static SWOT entries, dashboard text, and analysis paragraphs. This must shift to an AI-first pipeline where templates provide consistent HTML structure/styling, but all analytical content (SWOT analysis, political assessments, stakeholder perspectives, policy implications) is generated by the AI agent with iterative quality refinement.

📊 Current State

  • generators.ts (1690 lines): Generators (generateWeekAhead, generateCommitteeReports, generatePropositions, generateMotions, generateDeepInspection) produce content by calling generateArticleContent() from data-transformers.ts, which builds the analysis from hardcoded template strings
  • buildDeepInspectionSections(): Creates SWOT entries from document metadata titles and hardcoded SWOT_DEFAULTS map with template strings like "Policy initiative and agenda-setting on %t" — this is NOT AI analysis
  • generateDeepInspectionContent(): Generates content sections using document titles/types rather than deep AI-powered analysis of document content
  • Content flow: MCP data → static TypeScript transforms → HTML template → output. No AI analysis step exists in the pipeline
  • Stakeholder analysis: Fixed 3 perspectives (Government, Parliament/Opposition, Private Sector) with hardcoded SWOT text
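
To make the template-driven behaviour concrete, here is a minimal sketch of what a `%t`-style default map could look like. The map shape, the `SWOT_DEFAULTS_SKETCH` and `fillSwotDefault` names, the quadrant assignment, and every string other than the quoted "Policy initiative and agenda-setting on %t" are assumptions for illustration, not the actual contents of generators.ts.

```typescript
// Hypothetical reconstruction of a "%t" template map; not the real generators.ts code.
type SwotQuadrant = "strengths" | "weaknesses" | "opportunities" | "threats";

const SWOT_DEFAULTS_SKETCH: Record<SwotQuadrant, string> = {
  // Quoted in the issue; assigning it to "strengths" is an assumption.
  strengths: "Policy initiative and agenda-setting on %t",
  // The remaining strings are invented placeholders.
  weaknesses: "Implementation risk on %t",
  opportunities: "Room for cross-party agreement on %t",
  threats: "Opposition criticism of %t",
};

// Substitute the document title into the canned sentence.
function fillSwotDefault(quadrant: SwotQuadrant, title: string): string {
  // A replacer function avoids "$"-pattern surprises if the title contains "$".
  return SWOT_DEFAULTS_SKETCH[quadrant].replace("%t", () => title);
}

// Two unrelated documents receive the same sentence with only the title swapped.
const energyEntry = fillSwotDefault("strengths", "Energy taxation");
const schoolEntry = fillSwotDefault("strengths", "School reform");
```

Both entries differ only in the substituted title, which is the gap the issue describes: no step in this flow reads the document content.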

🚀 Desired State

  • AI-first pipeline: MCP data → document enrichment → AI analysis (iteration 1) → structured analysis output → AI quality review (iteration 2) → AI stakeholder expansion (iteration 3) → HTML template rendering → output
  • Templates provide structure only: Article HTML structure, CSS classes, accessibility markup, and Schema.org structured data remain template-driven for consistency
  • AI generates all analysis: SWOT entries, political assessments, stakeholder impact analysis, policy domain identification, risk scoring, and watch point generation are all AI-powered
  • Multi-iteration refinement: Each article goes through 2-3 AI passes: (1) initial analysis, (2) fact-checking and depth enhancement, (3) stakeholder completeness review
  • Configurable analysis depth: CLI flags or workflow inputs control analysis depth (quick/standard/deep) affecting number of AI iterations
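
The depth-controlled flow above could be sketched as follows. This is a minimal illustration under stated assumptions: the mapping quick = 1, standard = 2, deep = 3 passes is a guess (the issue only says depth affects the number of iterations), and `Draft`, `Pass`, `runPipeline`, and the three pass functions are hypothetical stand-ins for real LLM calls.

```typescript
type AnalysisDepth = "quick" | "standard" | "deep";

// Assumed mapping; the issue does not fix the number of passes per depth.
const PASSES_BY_DEPTH: Record<AnalysisDepth, number> = { quick: 1, standard: 2, deep: 3 };

interface Draft {
  text: string;
  passes: string[]; // names of the passes applied so far
}

type Pass = (draft: Draft) => Promise<Draft>;

// Build a stand-in pass that tags the draft instead of calling an LLM.
function makePass(name: string): Pass {
  return async (draft) => ({
    text: `${draft.text} [${name}]`,
    passes: [...draft.passes, name],
  });
}

// Order follows the issue: analysis, then quality review, then stakeholder expansion.
const ORDERED_PASSES: Pass[] = [
  makePass("initial-analysis"),
  makePass("quality-review"),
  makePass("stakeholder-expansion"),
];

// Run only as many passes as the selected depth allows.
async function runPipeline(enrichedInput: string, depth: AnalysisDepth): Promise<Draft> {
  let draft: Draft = { text: enrichedInput, passes: [] };
  for (const pass of ORDERED_PASSES.slice(0, PASSES_BY_DEPTH[depth])) {
    draft = await pass(draft);
  }
  return draft;
}
```

Keeping the passes in an ordered array means a new depth level only needs a new entry in the mapping, not a new code path.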

🔧 Implementation Approach

Phase 1: Create AI Analysis Pipeline Interface

// New file: scripts/ai-analysis/pipeline.ts
interface AnalysisPipeline {
  analyzeDocuments(docs: RawDocument[], topic: string | null, lang: Language): Promise<AnalysisResult>;
  refineAnalysis(initial: AnalysisResult, docs: RawDocument[]): Promise<AnalysisResult>;
  validateCompleteness(analysis: AnalysisResult): Promise<ValidationResult>;
}

interface AnalysisResult {
  stakeholderSwot: StakeholderSwot[];
  policyAssessment: PolicyAssessment;
  mindmapBranches: MindmapBranch[];
  dashboardData: DashboardData;
  watchPoints: WatchPoint[];
  narrativeContent: string;
  confidenceScore: number;
}
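
As a rough illustration of how these interfaces might be exercised, the sketch below wires a stub implementation through analyze, refine, and validate. The types are trimmed placeholders: the issue does not define `RawDocument`, `Language`, or `ValidationResult`, so their fields here are assumptions, and `StubPipeline` and `analyzeOnce` are hypothetical names.

```typescript
// Placeholder types; the real definitions are not shown in the issue.
type Language = string;
interface RawDocument { id: string; title: string; }
interface AnalysisResult { narrativeContent: string; confidenceScore: number; } // trimmed
interface ValidationResult { complete: boolean; missing: string[]; } // assumed shape

interface AnalysisPipeline {
  analyzeDocuments(docs: RawDocument[], topic: string | null, lang: Language): Promise<AnalysisResult>;
  refineAnalysis(initial: AnalysisResult, docs: RawDocument[]): Promise<AnalysisResult>;
  validateCompleteness(analysis: AnalysisResult): Promise<ValidationResult>;
}

// Stub that returns canned values where a real implementation would call an LLM.
class StubPipeline implements AnalysisPipeline {
  async analyzeDocuments(docs: RawDocument[], topic: string | null, lang: Language): Promise<AnalysisResult> {
    return {
      narrativeContent: `${docs.length} document(s) on ${topic ?? "general"} (${lang})`,
      confidenceScore: 0.5,
    };
  }

  async refineAnalysis(initial: AnalysisResult, _docs: RawDocument[]): Promise<AnalysisResult> {
    // Pretend the second pass raised confidence, capped at 1.
    return { ...initial, confidenceScore: Math.min(1, initial.confidenceScore + 0.2) };
  }

  async validateCompleteness(analysis: AnalysisResult): Promise<ValidationResult> {
    const missing = analysis.narrativeContent.trim() === "" ? ["narrativeContent"] : [];
    return { complete: missing.length === 0, missing };
  }
}

// One full cycle: initial analysis, refinement, then completeness check.
async function analyzeOnce(pipeline: AnalysisPipeline, docs: RawDocument[], topic: string | null, lang: Language) {
  const initial = await pipeline.analyzeDocuments(docs, topic, lang);
  const refined = await pipeline.refineAnalysis(initial, docs);
  const validation = await pipeline.validateCompleteness(refined);
  return { refined, validation };
}
```

A stub like this also lets the generators be tested without network access, since anything typed as `AnalysisPipeline` can be swapped in.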

Phase 2: Refactor Generator Functions

  • Extract the hardcoded SWOT defaults from generators.ts lines 1069-1086 into AI analysis prompts
  • Replace buildDeepInspectionSections() with pipeline.analyzeDocuments()
  • Replace generateDeepInspectionContent() with AI-generated narrative that references actual document content
  • Keep generateArticleHTML() template rendering unchanged (consistent styling)
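
The split between template structure and AI-supplied text might look like the sketch below. It does not reproduce generateArticleHTML(); the `StakeholderSwotSketch` fields, the `renderSwot` and `escapeHtml` helpers, and the CSS class names are all assumptions. The one design point it illustrates is that AI output is treated as untrusted text and escaped before it enters the fixed markup.

```typescript
// Assumed shape for one stakeholder's SWOT block.
interface StakeholderSwotSketch {
  stakeholder: string;
  strengths: string[];
  weaknesses: string[];
  opportunities: string[];
  threats: string[];
}

// Escape the five HTML-significant characters; "&" must be handled first.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// The template owns tags and class names; the analysis owns only the text.
function renderSwot(swot: StakeholderSwotSketch): string {
  const quadrant = (cssClass: string, items: string[]): string =>
    `<ul class="${cssClass}">${items.map((item) => `<li>${escapeHtml(item)}</li>`).join("")}</ul>`;

  return [
    `<section class="swot">`,
    `<h3>${escapeHtml(swot.stakeholder)}</h3>`,
    quadrant("swot-strengths", swot.strengths),
    quadrant("swot-weaknesses", swot.weaknesses),
    quadrant("swot-opportunities", swot.opportunities),
    quadrant("swot-threats", swot.threats),
    `</section>`,
  ].join("\n");
}
```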

Phase 3: Integrate AI Iterations in Agentic Workflows

  • Update .github/workflows/news-*.md to include analysis depth parameters
  • Add iteration tracking in news/metadata/ for audit trail
  • Implement quality threshold checks between iterations
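
A quality gate between iterations, with a record kept for the audit trail, could be sketched like this. The threshold value, the `IterationRecord` fields, and the `iterateUntilGoodEnough` name are assumptions; the issue does not specify how iteration tracking under news/metadata/ is structured, so the sketch only builds the records and leaves persistence out.

```typescript
interface Scored { confidenceScore: number; }

// Assumed audit record; one entry per evaluated iteration.
interface IterationRecord {
  iteration: number;
  confidenceScore: number;
  meetsThreshold: boolean;
}

// Refine until the score clears the threshold or the iteration budget is spent.
async function iterateUntilGoodEnough<T extends Scored>(
  initial: T,
  refine: (current: T) => Promise<T>,
  threshold: number,
  maxIterations: number,
): Promise<{ result: T; audit: IterationRecord[] }> {
  let current = initial;
  const audit: IterationRecord[] = [];

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const meetsThreshold = current.confidenceScore >= threshold;
    audit.push({ iteration, confidenceScore: current.confidenceScore, meetsThreshold });

    // Stop early on success, and never refine past the final allowed iteration.
    if (meetsThreshold || iteration === maxIterations) break;
    current = await refine(current);
  }

  return { result: current, audit };
}
```

The returned `audit` array can be serialized with `JSON.stringify` and written wherever the workflow keeps its metadata.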

Phase 4: Apply to All Article Types

  • Extend AI analysis to committee reports, propositions, motions (not just deep-inspection)
  • Each article type gets appropriate analysis depth (e.g., committee-reports = standard, deep-inspection = deep)
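
Per-type defaults could be expressed as a small lookup with an optional override. Only the committee-reports and deep-inspection entries come from the issue; the other slugs are derived from the generator function names and their depths are assumptions, as is the `resolveDepth` helper.

```typescript
type AnalysisDepth = "quick" | "standard" | "deep";

// Slugs other than "committee-reports" and "deep-inspection" are hypothetical.
type ArticleType = "week-ahead" | "committee-reports" | "propositions" | "motions" | "deep-inspection";

const DEFAULT_DEPTH: Record<ArticleType, AnalysisDepth> = {
  "committee-reports": "standard", // stated in the issue
  "deep-inspection": "deep",       // stated in the issue
  "week-ahead": "quick",           // assumption
  propositions: "standard",        // assumption
  motions: "standard",             // assumption
};

// An explicit override (for example from a CLI flag or workflow input) wins over the default.
function resolveDepth(type: ArticleType, override?: AnalysisDepth): AnalysisDepth {
  return override ?? DEFAULT_DEPTH[type];
}
```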

🤖 Recommended Agent

code-quality-engineer — This is a major architectural refactoring of the generator pipeline requiring careful decomposition of the existing 1690-line generators.ts into a clean AI-first architecture while preserving backward compatibility with all 12 agentic workflows.

✅ Acceptance Criteria

  • New scripts/ai-analysis/pipeline.ts module with typed interfaces for AI analysis pipeline
  • All analysis text in generated articles originates from AI, not hardcoded templates
  • Multi-iteration analy...



Labels

  • agentic-workflow: Agentic workflow changes
  • ci-cd: CI/CD pipeline changes
  • documentation: Documentation updates
  • javascript: JavaScript code changes
  • news: News articles and content generation
  • refactor: Code refactoring
  • size-xl: Extra large change (> 1000 lines)
  • testing: Test coverage
  • workflow: GitHub Actions workflows

Development

Successfully merging this pull request may close these issues.

Redesign Article Generator Architecture for AI-First Multi-Iteration Analysis

3 participants