Skip to content

Latest commit

 

History

History
643 lines (492 loc) · 23 KB

File metadata and controls

643 lines (492 loc) · 23 KB

AI Extension Specification — Excel Workbook Risk Diagnostic Tool

This document specifies the generative AI extension to the existing Excel Workbook Risk Diagnostic Tool. It is a self-contained build specification intended for Claude Code. Read the existing README.md first for project context, then implement everything described here.


1. Objectives

Extend the tool with four AI-powered commentary features that help a user quickly understand an unfamiliar Excel workbook before (or alongside) the deterministic risk findings:

# Feature One-line description
1 Findings Narrative Plain-English synthesis of the deterministic risk findings
2 Workbook Purpose + Sheet Narratives What the workbook does and what each tab contributes
3 Key Formula Explanations Plain-English explanation of the most complex formulas
4 Assumption / Input Commentary Discovery-first identification and plausibility review of input parameters

All four features are optional and gated behind a user toggle in the Streamlit sidebar. The deterministic checks continue to work entirely offline and are unaffected.


2. New Files to Create

digest_builder.py     ← Extracts a token-efficient workbook digest for AI
ai_analyser.py        ← Makes API calls and returns AICommentary

3. Existing Files to Modify

File Change
models.py Add new dataclasses: WorkbookDigest, SheetSummary, FormulaExplanation, SheetNarrative, AICommentary
settings.py Add AI-related constants
workbook_analyser.py Call ai_analyser after deterministic checks when AI is enabled
app.py Add AI controls to sidebar; display AI commentary sections in browser
reporter.py Insert AI commentary sections into the PDF report

4. New Dataclasses (models.py)

Add the following dataclasses. Do not modify existing dataclasses.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SheetSummary:
    name: str
    row_count: int
    col_count: int
    is_sensitive: bool          # True if name matches actuarial keywords
    headers: list[str]          # First-row and first-column string values, deduplicated
    top_formulas: list[str]     # Up to AI_TOP_FORMULAS_PER_SHEET, sorted by complexity score descending
    named_ranges: list[str]     # Names of named ranges scoped to this sheet
    references_sheets: list[str]  # Other sheet names this sheet's formulas reference

@dataclass
class WorkbookDigest:
    file_name: str
    sheet_summaries: list[SheetSummary]
    workbook_named_ranges: dict[str, str]   # name -> formula or scalar value as string
    vba_present: bool
    vba_module_names: list[str]

@dataclass
class FormulaExplanation:
    sheet_name: str
    cell_address: str
    formula: str
    explanation: str

@dataclass
class SheetNarrative:
    sheet_name: str
    narrative: str

@dataclass
class AICommentary:
    findings_narrative: Optional[str] = None
    workbook_purpose: Optional[str] = None
    sheet_narratives: list[SheetNarrative] = field(default_factory=list)
    formula_explanations: list[FormulaExplanation] = field(default_factory=list)
    assumption_commentary: Optional[str] = None
    api_error: Optional[str] = None   # Populated if any API call fails; partial results still used

5. New Constants (settings.py)

Add the following block at the end of settings.py:

# ---------------------------------------------------------------------------
# AI Commentary settings
# ---------------------------------------------------------------------------
AI_MODEL = "claude-sonnet-4-6"
AI_MAX_TOKENS = 2048
AI_TOP_FORMULAS_PER_SHEET = 15      # Max formulas extracted per sheet for digest
AI_MAX_HEADER_VALUES = 30           # Max header strings extracted per sheet
AI_MAX_SHEETS_FOR_NARRATIVE = 25    # Sheets beyond this count are summarised by name only
AI_MAX_FORMULA_CHARS = 400          # Formulas longer than this are truncated with "..."

6. digest_builder.py — Full Specification

Purpose

Extracts a structured, token-efficient digest from a workbook using openpyxl. Does not call the AI. Returns a WorkbookDigest.

Formula complexity score

Used to rank formulas so the most important ones surface first.

score(formula_string) = len(formula_string) + (formula_string.count("(") * 5)

Header extraction

For a given sheet, collect:

  • All string values from row 1 (first row)
  • All string values from column A (first column)
  • Deduplicate, remove empty/None, truncate list to AI_MAX_HEADER_VALUES

Sheet dependency extraction

Parse each formula string and extract any sheet references of the form SheetName! or 'Sheet Name'!. Return the list of unique referenced sheet names (excluding the current sheet).

Named range extraction

For workbook-level named ranges, store name -> str(range_or_value). For sheet-level named ranges, add the name to the relevant SheetSummary.named_ranges.

VBA extraction

vba_present: True if the file extension is .xlsm. vba_module_names: If vba_checker has already run and its findings are available, extract unique module names from finding details. Otherwise return an empty list. Do not re-extract VBA source here.

Graceful handling

Wrap the entire function in try/except. On any error, return a minimal WorkbookDigest with an empty sheet_summaries list and log the error.

Signature

def build_digest(wb_path: Path, workbook_analyser_result=None) -> WorkbookDigest:
    ...

workbook_analyser_result is the WorkbookAnalysisResult from the deterministic pass; pass it in so the builder can check is_sensitive flags and VBA module names without re-opening the file. It is optional (may be None).


7. ai_analyser.py — Full Specification

Purpose

Takes a WorkbookDigest and a WorkbookAnalysisResult and returns an AICommentary. Makes one API call per feature. Features that are disabled by the user are skipped.

Anthropic SDK usage

Use the anthropic Python SDK (add anthropic>=0.25.0 to requirements.txt).

import anthropic

client = anthropic.Anthropic(api_key=api_key)

response = client.messages.create(
    model=AI_MODEL,
    max_tokens=AI_MAX_TOKENS,
    system=SYSTEM_PROMPT,
    messages=[{"role": "user", "content": user_prompt}]
)
return response.content[0].text

System prompt (shared across all four calls)

You are an expert actuarial model reviewer with deep experience in life
insurance, general insurance, and financial modelling in Excel. You are
reviewing an Excel workbook on behalf of a qualified actuary. Your role is
to provide clear, concise, technically accurate commentary that helps the
actuary quickly understand the workbook's purpose, structure, and risks.

Write in plain English. Be specific and direct. Do not pad responses with
generic statements. If you are uncertain about something, say so briefly and
explain what additional information would resolve the uncertainty. Never
fabricate values or claim to have seen things not present in the data
provided to you.

Signature

def run_ai_analysis(
    digest: WorkbookDigest,
    findings_result: WorkbookAnalysisResult,
    api_key: str,
    run_findings_narrative: bool = True,
    run_purpose_and_sheets: bool = True,
    run_formula_explanations: bool = True,
    run_assumption_commentary: bool = True,
) -> AICommentary:
    ...

Populate AICommentary.api_error with a short description if any individual call fails; continue with remaining calls rather than aborting entirely.


8. AI Feature Prompts — Exact Specification

Each sub-section below specifies: (a) what context to include in the user prompt, (b) the exact prompt template, and (c) how to parse the response.


Feature 1 — Findings Narrative

Context to inject:

  • Workbook file name
  • RAG rating and total risk score
  • High finding count, Medium finding count, Low finding count
  • Full findings list formatted as: [SEVERITY] Category — Description (Sheet: X, Cell: Y)
    • Sort: High first, then Medium, then Low
    • Cap at 40 findings to avoid token overflow; if more exist, append a line: (plus N additional Low findings not shown)

Prompt template:

Workbook: {file_name}
RAG Rating: {rag_rating} (Risk Score: {risk_score})
Findings: {high_count} High, {medium_count} Medium, {low_count} Low

Findings detail:
{findings_list}

---

Write a findings narrative for the executive summary of a model governance
report. The narrative should:
- Open with an overall characterisation of the risk level and the most
  important issue(s) to address
- Group related findings into themes rather than listing them one by one
- Call out any findings on actuarially sensitive sheets specifically
- Close with a brief statement of what actions are recommended before the
  workbook is used in production
- Length: 150 to 250 words
- Tone: professional, direct, suitable for a senior actuary or risk manager

Parsing: Use response text directly. No structured output required.


Feature 2 — Workbook Purpose and Sheet Narratives

Context to inject:

  • Workbook file name
  • List of all sheet names (in workbook order), flagging sensitive sheets
  • For each sheet (up to AI_MAX_SHEETS_FOR_NARRATIVE): name, row/col count, headers (up to AI_MAX_HEADER_VALUES), top 5 formula strings, named ranges, sheets it references
  • Workbook-level named ranges (name only, not values, to avoid token bloat)
  • VBA present flag and module names

Prompt template:

Workbook: {file_name}
Sheets (in order): {sheet_list_with_sensitive_flags}

Sheet details:
{per_sheet_detail_block}

Workbook-level named ranges: {named_ranges_list}
VBA: {vba_summary}

---

Provide two things:

1. WORKBOOK PURPOSE (100–180 words)
A plain-English description of what this workbook does, the type of model
or analysis it represents, the likely business context, and the overall
calculation flow from inputs to outputs. If the workbook type is uncertain,
state the most likely interpretation and briefly note what is ambiguous.

2. SHEET NARRATIVES
For each sheet, write 2–4 sentences describing: what the sheet contains or
calculates, its role in the overall model, and any notable structural
observations (e.g. it appears to be a lookup table, a results output, a
control sheet, an assumption repository). If a sheet name is ambiguous,
say so.

Format your response as valid JSON matching this schema exactly:
{{
  "workbook_purpose": "<string>",
  "sheet_narratives": {{
    "<sheet_name>": "<narrative string>",
    ...
  }}
}}
Return only the JSON object. No preamble, no markdown fences.

Parsing: Parse as JSON. Extract workbook_purpose as a string and sheet_narratives as a dict[str, str]. Convert to list[SheetNarrative]. If JSON parsing fails, store the raw response text in workbook_purpose and leave sheet_narratives empty — do not raise an exception.


Feature 3 — Key Formula Explanations

Context to inject: Select the top formulas across the entire workbook by complexity score. Cap at 25 formulas total. For each, include: sheet name, cell address, formula string (truncated to AI_MAX_FORMULA_CHARS if needed).

Prioritise formulas from sensitive sheets: fill the first 15 slots from sensitive sheets (if enough exist), then fill remaining slots from other sheets.

Prompt template:

Workbook: {file_name}

The following are the most structurally complex formulas in this workbook,
selected by formula length and nesting depth:

{formula_list}
(Format: SHEET::CELL — FORMULA)

---

For each formula, provide a plain-English explanation of what it calculates.
Your explanation should:
- Describe the purpose of the calculation in business/actuarial terms where
  inferable from context (sheet name, cell address, surrounding named ranges)
- Identify the key inputs and what they represent
- Flag any structural concerns (e.g. deeply nested IFs that are hard to audit,
  approximate-match lookups, volatile functions)
- Be 2–5 sentences per formula

Format your response as valid JSON matching this schema exactly:
[
  {{
    "sheet_name": "<string>",
    "cell_address": "<string>",
    "formula": "<string>",
    "explanation": "<string>"
  }},
  ...
]
Return only the JSON array. No preamble, no markdown fences.

Parsing: Parse as a JSON array. Convert each element to a FormulaExplanation. If JSON parsing fails, store one FormulaExplanation with sheet_name="Parse Error", cell_address="", formula="", and the raw response text as explanation.


Feature 4 — Assumption / Input Commentary

Context to inject: For each sheet, include: sheet name, is_sensitive flag, headers, all formula strings that are not themselves formulas (i.e. cells whose value is a number or string but which appear to be referenced by formulas on other sheets — proxy this by checking if the sheet is referenced in WorkbookDigest.sheet_summaries[other].references_sheets), and the full list of named ranges defined on the sheet with their values where available.

Also include the 5 most "formula-free" sheets (sheets with the lowest ratio of formula cells to total populated cells) as these are most likely to be input/assumption sheets.

Prompt template:

Workbook: {file_name}

The following information has been extracted from sheets that appear to
contain input parameters or assumptions (identified by low formula density
and/or being referenced by other sheets):

{input_sheet_detail_block}

Named ranges with values:
{named_ranges_with_values}

---

Provide assumption and input parameter commentary covering:

1. PARAMETER DISCOVERY
Identify what appear to be the key input parameters or assumption sets in
this workbook. Do not restrict yourself to actuarial assumptions — identify
whatever inputs drive the calculations, regardless of domain. Group them
logically (e.g. "rate assumptions", "volume inputs", "expense parameters").

2. PLAUSIBILITY COMMENTARY
For each parameter group identified, comment on whether the values appear
internally consistent (e.g. rates within a plausible range, consistent
units, no obvious sign errors, no suspicious round numbers that might
indicate placeholders). Be specific about any values that look anomalous.
If you cannot assess plausibility from the data available, say so briefly.

3. STRUCTURAL OBSERVATIONS
Comment on how assumptions flow into calculations — e.g. whether they are
centralised in one sheet or scattered, whether named ranges are used
consistently, whether there are hardcoded values in formula sheets that
should instead be parameterised.

Length: 200–350 words total.
Tone: professional, advisory, suitable for a model governance review.

Parsing: Use response text directly. No structured output required.


9. workbook_analyser.py — Modifications

After the existing deterministic analysis block, add:

# AI Commentary (optional)
ai_commentary = None
if ai_config is not None and ai_config.get("enabled"):
    from digest_builder import build_digest
    from ai_analyser import run_ai_analysis
    digest = build_digest(wb_path, result)
    ai_commentary = run_ai_analysis(
        digest=digest,
        findings_result=result,
        api_key=ai_config["api_key"],
        run_findings_narrative=ai_config.get("findings_narrative", True),
        run_purpose_and_sheets=ai_config.get("purpose_and_sheets", True),
        run_formula_explanations=ai_config.get("formula_explanations", True),
        run_assumption_commentary=ai_config.get("assumption_commentary", True),
    )
result.ai_commentary = ai_commentary

Add ai_commentary: Optional[AICommentary] = None to WorkbookAnalysisResult in models.py.

Pass ai_config as a new optional parameter to analyse_workbook():

def analyse_workbook(wb_path: Path, settings: Settings = None, ai_config: dict = None) -> WorkbookAnalysisResult:

10. app.py — Sidebar UI Additions

Add a new expander section in the sidebar titled "AI Commentary (Optional)", positioned below the existing Settings panel.

┌─────────────────────────────────────────┐
│ 🤖 AI Commentary (Optional)             │
│                                         │
│ Enable AI commentary  [checkbox]        │
│                                         │
│ [When enabled and no key is set:]       │
│                                         │
│ Anthropic API Key  [password input]     │
│ [Set API Key]  button                   │
│ (stored in session only, never saved)   │
│                                         │
│ [When key has been set:]                │
│ ✅ API key set for this session.        │
│ [Remove API key]  button                │
│                                         │
│ Features to include:                    │
│  ☑ Findings narrative                  │
│  ☑ Workbook purpose & sheet summaries  │
│  ☑ Key formula explanations            │
│  ☑ Assumption commentary               │
│                                         │
│ ⚠ AI analysis adds ~30–60 seconds.     │
└─────────────────────────────────────────┘

Collect these values and pass them as ai_config dict to analyse_workbook.

In the main results area, after the existing RAG badge and finding counts, add an "AI Commentary" tab alongside the existing findings tabs (or as a new expander section if tabs are not already used). Display:

  • Workbook purpose statement (if generated)
  • Sheet narratives as an expandable list (one expander per sheet)
  • Assumption commentary (if generated)
  • Formula explanations as an expandable list (one per formula)
  • Findings narrative (if generated)

If ai_commentary.api_error is not None, show a yellow warning banner with the error text above the AI section.


11. reporter.py — PDF Additions

Insert a new section titled "AI Workbook Intelligence" between the Executive Summary and the Findings by Category section. This section only appears if result.ai_commentary is not None.

Section structure:

11.1 Executive Summary (updated) If findings_narrative is populated, replace the current auto-generated executive summary text with the AI narrative. If not populated, keep the existing deterministic summary.

11.2 AI Workbook Intelligence section (new, AI only)

Sub-sections (omit any that are not populated):

  1. Workbook Purpose Render workbook_purpose as a paragraph with a light blue background box. Label: "AI-Generated — Interpretive Only"

  2. Sheet-by-Sheet Narratives Render as a two-column table: Sheet Name (left, 30% width) | Narrative (right, 70% width). Sensitive sheets should have their name rendered in bold.

  3. Key Formula Explanations Render as a table with columns: Sheet | Cell | Formula (monospace, truncated to 60 chars with "…") | Explanation.

  4. Assumption and Input Commentary Render assumption_commentary as a paragraph with a light yellow background box. Label: "AI-Generated — Interpretive Only"

Disclaimer footer (add to every AI section): "AI commentary is generated from formula metadata and structural analysis. It is interpretive and may not fully reflect model intent. Always validate with the model owner."

Style guidance: Use a distinct but subtle background colour (light blue #EBF4FF for purpose/sheets, light yellow #FFFDE7 for assumptions) to visually distinguish AI content from deterministic findings throughout the report.


12. requirements.txt — Addition

Add:

anthropic>=0.25.0

13. Error Handling Principles

  • Any failure in digest_builder or ai_analyser must never cause the deterministic analysis to fail. Wrap all AI calls in try/except.
  • If the API key is missing or invalid, surface the error in AICommentary.api_error and in a Streamlit warning banner. Do not raise.
  • If an individual feature call fails (e.g. formula explanations), populate api_error with a message and continue with the remaining features.
  • Token limits: if the digest exceeds an estimated 6,000 tokens for any single prompt, truncate sheet details starting from the least sensitive, smallest sheets first until the estimate is within limit. Log a warning if truncation occurs.

14. Testing Guidance

Use the existing test_formulas.xlsx test file. With AI enabled, expected outputs include:

  • Workbook purpose: should identify a life insurance assumption/projection model
  • Sheet narratives: should describe "Assumptions" and "Mortality" sheets correctly
  • Formula explanations: =NOW() and =INDIRECT("A1") should be flagged as volatile; the VLOOKUP without 4th argument should be explained as approximate match risk
  • Findings narrative: should synthesise the formula and structural findings into a coherent paragraph

There is no automated assertion framework required — manual review of outputs is sufficient for this phase.


15. Implementation Order (Recommended)

  1. Add dataclasses to models.py
  2. Add constants to settings.py
  3. Build digest_builder.py and test with a sample workbook
  4. Build ai_analyser.py — implement Feature 1 first (simplest, no JSON parsing)
  5. Add Features 2, 3, 4 to ai_analyser.py
  6. Modify workbook_analyser.py to wire in AI config
  7. Update app.py sidebar and results display
  8. Update reporter.py with AI sections
  9. Update requirements.txt
  10. End-to-end test with all three test workbooks


Implementation Status

All items in this specification have been implemented. The build is complete and the app runs on Python 3.9+.

Post-build fixes applied

Fix Files changed Description
Nested expanders app.py Streamlit forbids st.expander inside another st.expander. The two inner expanders for sheet narratives (Section 10) and formula explanations (Section 10) have been replaced with HTML <details>/<summary> elements rendered via unsafe_allow_html=True. The outer per-file expander is unchanged.
API key UX app.py Replaced the always-visible password input with a form-based entry: the field clears on submit (clear_on_submit=True) and is replaced by a confirmation message once the key is set. A Remove API key button allows the user to clear it. The key is still held in session state only.
PDF overflow crash reporter.py AI-generated text in the formula explanations table, sheet narratives table, and coloured-box sections had no character limit. When the AI returned a long explanation, a single table row could exceed ReportLab's 688pt page frame, crashing PDF generation with "Flowable … too large on page N". Fixed by capping cell content before wrapping in a Paragraph: fe.explanation[:600], sn.narrative[:800], and _coloured_box text [:1500]. Limits are calibrated from column geometry so no row can exceed ~28% of the page frame.

End of specification.