This document specifies the generative AI extension to the existing Excel
Workbook Risk Diagnostic Tool. It is a self-contained build specification
intended for Claude Code. Read the existing README.md first for project
context, then implement everything described here.
Extend the tool with four AI-powered commentary features that help a user quickly understand an unfamiliar Excel workbook before (or alongside) the deterministic risk findings:
| # | Feature | One-line description |
|---|---|---|
| 1 | Findings Narrative | Plain-English synthesis of the deterministic risk findings |
| 2 | Workbook Purpose + Sheet Narratives | What the workbook does and what each tab contributes |
| 3 | Key Formula Explanations | Plain-English explanation of the most complex formulas |
| 4 | Assumption / Input Commentary | Discovery-first identification and plausibility review of input parameters |
All four features are optional and gated behind a user toggle in the Streamlit sidebar. The deterministic checks continue to work entirely offline and are unaffected.
digest_builder.py ← Extracts a token-efficient workbook digest for AI
ai_analyser.py ← Makes API calls and returns AICommentary
| File | Change |
|---|---|
models.py |
Add new dataclasses: WorkbookDigest, SheetSummary, FormulaExplanation, SheetNarrative, AICommentary |
settings.py |
Add AI-related constants |
workbook_analyser.py |
Call ai_analyser after deterministic checks when AI is enabled |
app.py |
Add AI controls to sidebar; display AI commentary sections in browser |
reporter.py |
Insert AI commentary sections into the PDF report |
Add the following dataclasses. Do not modify existing dataclasses.
from dataclasses import dataclass, field
from typing import Optional
@dataclass
class SheetSummary:
name: str
row_count: int
col_count: int
is_sensitive: bool # True if name matches actuarial keywords
headers: list[str] # First-row and first-column string values, deduplicated
top_formulas: list[str] # Up to AI_TOP_FORMULAS_PER_SHEET, sorted by complexity score descending
named_ranges: list[str] # Names of named ranges scoped to this sheet
references_sheets: list[str] # Other sheet names this sheet's formulas reference
@dataclass
class WorkbookDigest:
file_name: str
sheet_summaries: list[SheetSummary]
workbook_named_ranges: dict[str, str] # name -> formula or scalar value as string
vba_present: bool
vba_module_names: list[str]
@dataclass
class FormulaExplanation:
sheet_name: str
cell_address: str
formula: str
explanation: str
@dataclass
class SheetNarrative:
sheet_name: str
narrative: str
@dataclass
class AICommentary:
findings_narrative: Optional[str] = None
workbook_purpose: Optional[str] = None
sheet_narratives: list[SheetNarrative] = field(default_factory=list)
formula_explanations: list[FormulaExplanation] = field(default_factory=list)
assumption_commentary: Optional[str] = None
api_error: Optional[str] = None # Populated if any API call fails; partial results still usedAdd the following block at the end of settings.py:
# ---------------------------------------------------------------------------
# AI Commentary settings
# ---------------------------------------------------------------------------
AI_MODEL = "claude-sonnet-4-6"
AI_MAX_TOKENS = 2048
AI_TOP_FORMULAS_PER_SHEET = 15 # Max formulas extracted per sheet for digest
AI_MAX_HEADER_VALUES = 30 # Max header strings extracted per sheet
AI_MAX_SHEETS_FOR_NARRATIVE = 25 # Sheets beyond this count are summarised by name only
AI_MAX_FORMULA_CHARS = 400 # Formulas longer than this are truncated with "..."Extracts a structured, token-efficient digest from a workbook using openpyxl.
Does not call the AI. Returns a WorkbookDigest.
Used to rank formulas so the most important ones surface first.
score(formula_string) = len(formula_string) + (formula_string.count("(") * 5)
For a given sheet, collect:
- All string values from row 1 (first row)
- All string values from column A (first column)
- Deduplicate, remove empty/None, truncate list to
AI_MAX_HEADER_VALUES
Parse each formula string and extract any sheet references of the form
SheetName! or 'Sheet Name'!. Return the list of unique referenced sheet
names (excluding the current sheet).
For workbook-level named ranges, store name -> str(range_or_value).
For sheet-level named ranges, add the name to the relevant SheetSummary.named_ranges.
vba_present: True if the file extension is .xlsm.
vba_module_names: If vba_checker has already run and its findings are
available, extract unique module names from finding details. Otherwise return
an empty list. Do not re-extract VBA source here.
Wrap the entire function in try/except. On any error, return a minimal
WorkbookDigest with an empty sheet_summaries list and log the error.
def build_digest(wb_path: Path, workbook_analyser_result=None) -> WorkbookDigest:
...workbook_analyser_result is the WorkbookAnalysisResult from the
deterministic pass; pass it in so the builder can check is_sensitive flags
and VBA module names without re-opening the file. It is optional (may be
None).
Takes a WorkbookDigest and a WorkbookAnalysisResult and returns an
AICommentary. Makes one API call per feature. Features that are disabled
by the user are skipped.
Use the anthropic Python SDK (add anthropic>=0.25.0 to requirements.txt).
import anthropic
client = anthropic.Anthropic(api_key=api_key)
response = client.messages.create(
model=AI_MODEL,
max_tokens=AI_MAX_TOKENS,
system=SYSTEM_PROMPT,
messages=[{"role": "user", "content": user_prompt}]
)
return response.content[0].textYou are an expert actuarial model reviewer with deep experience in life
insurance, general insurance, and financial modelling in Excel. You are
reviewing an Excel workbook on behalf of a qualified actuary. Your role is
to provide clear, concise, technically accurate commentary that helps the
actuary quickly understand the workbook's purpose, structure, and risks.
Write in plain English. Be specific and direct. Do not pad responses with
generic statements. If you are uncertain about something, say so briefly and
explain what additional information would resolve the uncertainty. Never
fabricate values or claim to have seen things not present in the data
provided to you.
def run_ai_analysis(
digest: WorkbookDigest,
findings_result: WorkbookAnalysisResult,
api_key: str,
run_findings_narrative: bool = True,
run_purpose_and_sheets: bool = True,
run_formula_explanations: bool = True,
run_assumption_commentary: bool = True,
) -> AICommentary:
...Populate AICommentary.api_error with a short description if any individual
call fails; continue with remaining calls rather than aborting entirely.
Each sub-section below specifies: (a) what context to include in the user prompt, (b) the exact prompt template, and (c) how to parse the response.
Context to inject:
- Workbook file name
- RAG rating and total risk score
- High finding count, Medium finding count, Low finding count
- Full findings list formatted as:
[SEVERITY] Category — Description (Sheet: X, Cell: Y)- Sort: High first, then Medium, then Low
- Cap at 40 findings to avoid token overflow; if more exist, append a line:
(plus N additional Low findings not shown)
Prompt template:
Workbook: {file_name}
RAG Rating: {rag_rating} (Risk Score: {risk_score})
Findings: {high_count} High, {medium_count} Medium, {low_count} Low
Findings detail:
{findings_list}
---
Write a findings narrative for the executive summary of a model governance
report. The narrative should:
- Open with an overall characterisation of the risk level and the most
important issue(s) to address
- Group related findings into themes rather than listing them one by one
- Call out any findings on actuarially sensitive sheets specifically
- Close with a brief statement of what actions are recommended before the
workbook is used in production
- Length: 150 to 250 words
- Tone: professional, direct, suitable for a senior actuary or risk manager
Parsing: Use response text directly. No structured output required.
Context to inject:
- Workbook file name
- List of all sheet names (in workbook order), flagging sensitive sheets
- For each sheet (up to
AI_MAX_SHEETS_FOR_NARRATIVE): name, row/col count, headers (up toAI_MAX_HEADER_VALUES), top 5 formula strings, named ranges, sheets it references - Workbook-level named ranges (name only, not values, to avoid token bloat)
- VBA present flag and module names
Prompt template:
Workbook: {file_name}
Sheets (in order): {sheet_list_with_sensitive_flags}
Sheet details:
{per_sheet_detail_block}
Workbook-level named ranges: {named_ranges_list}
VBA: {vba_summary}
---
Provide two things:
1. WORKBOOK PURPOSE (100–180 words)
A plain-English description of what this workbook does, the type of model
or analysis it represents, the likely business context, and the overall
calculation flow from inputs to outputs. If the workbook type is uncertain,
state the most likely interpretation and briefly note what is ambiguous.
2. SHEET NARRATIVES
For each sheet, write 2–4 sentences describing: what the sheet contains or
calculates, its role in the overall model, and any notable structural
observations (e.g. it appears to be a lookup table, a results output, a
control sheet, an assumption repository). If a sheet name is ambiguous,
say so.
Format your response as valid JSON matching this schema exactly:
{{
"workbook_purpose": "<string>",
"sheet_narratives": {{
"<sheet_name>": "<narrative string>",
...
}}
}}
Return only the JSON object. No preamble, no markdown fences.
Parsing:
Parse as JSON. Extract workbook_purpose as a string and
sheet_narratives as a dict[str, str]. Convert to list[SheetNarrative].
If JSON parsing fails, store the raw response text in workbook_purpose and
leave sheet_narratives empty — do not raise an exception.
Context to inject:
Select the top formulas across the entire workbook by complexity score.
Cap at 25 formulas total. For each, include: sheet name, cell address, formula
string (truncated to AI_MAX_FORMULA_CHARS if needed).
Prioritise formulas from sensitive sheets: fill the first 15 slots from sensitive sheets (if enough exist), then fill remaining slots from other sheets.
Prompt template:
Workbook: {file_name}
The following are the most structurally complex formulas in this workbook,
selected by formula length and nesting depth:
{formula_list}
(Format: SHEET::CELL — FORMULA)
---
For each formula, provide a plain-English explanation of what it calculates.
Your explanation should:
- Describe the purpose of the calculation in business/actuarial terms where
inferable from context (sheet name, cell address, surrounding named ranges)
- Identify the key inputs and what they represent
- Flag any structural concerns (e.g. deeply nested IFs that are hard to audit,
approximate-match lookups, volatile functions)
- Be 2–5 sentences per formula
Format your response as valid JSON matching this schema exactly:
[
{{
"sheet_name": "<string>",
"cell_address": "<string>",
"formula": "<string>",
"explanation": "<string>"
}},
...
]
Return only the JSON array. No preamble, no markdown fences.
Parsing:
Parse as a JSON array. Convert each element to a FormulaExplanation.
If JSON parsing fails, store one FormulaExplanation with sheet_name="Parse Error",
cell_address="", formula="", and the raw response text as explanation.
Context to inject:
For each sheet, include: sheet name, is_sensitive flag, headers, all
formula strings that are not themselves formulas (i.e. cells whose value
is a number or string but which appear to be referenced by formulas on other
sheets — proxy this by checking if the sheet is referenced in
WorkbookDigest.sheet_summaries[other].references_sheets), and the full
list of named ranges defined on the sheet with their values where available.
Also include the 5 most "formula-free" sheets (sheets with the lowest ratio of formula cells to total populated cells) as these are most likely to be input/assumption sheets.
Prompt template:
Workbook: {file_name}
The following information has been extracted from sheets that appear to
contain input parameters or assumptions (identified by low formula density
and/or being referenced by other sheets):
{input_sheet_detail_block}
Named ranges with values:
{named_ranges_with_values}
---
Provide assumption and input parameter commentary covering:
1. PARAMETER DISCOVERY
Identify what appear to be the key input parameters or assumption sets in
this workbook. Do not restrict yourself to actuarial assumptions — identify
whatever inputs drive the calculations, regardless of domain. Group them
logically (e.g. "rate assumptions", "volume inputs", "expense parameters").
2. PLAUSIBILITY COMMENTARY
For each parameter group identified, comment on whether the values appear
internally consistent (e.g. rates within a plausible range, consistent
units, no obvious sign errors, no suspicious round numbers that might
indicate placeholders). Be specific about any values that look anomalous.
If you cannot assess plausibility from the data available, say so briefly.
3. STRUCTURAL OBSERVATIONS
Comment on how assumptions flow into calculations — e.g. whether they are
centralised in one sheet or scattered, whether named ranges are used
consistently, whether there are hardcoded values in formula sheets that
should instead be parameterised.
Length: 200–350 words total.
Tone: professional, advisory, suitable for a model governance review.
Parsing: Use response text directly. No structured output required.
After the existing deterministic analysis block, add:
# AI Commentary (optional)
ai_commentary = None
if ai_config is not None and ai_config.get("enabled"):
from digest_builder import build_digest
from ai_analyser import run_ai_analysis
digest = build_digest(wb_path, result)
ai_commentary = run_ai_analysis(
digest=digest,
findings_result=result,
api_key=ai_config["api_key"],
run_findings_narrative=ai_config.get("findings_narrative", True),
run_purpose_and_sheets=ai_config.get("purpose_and_sheets", True),
run_formula_explanations=ai_config.get("formula_explanations", True),
run_assumption_commentary=ai_config.get("assumption_commentary", True),
)
result.ai_commentary = ai_commentaryAdd ai_commentary: Optional[AICommentary] = None to WorkbookAnalysisResult
in models.py.
Pass ai_config as a new optional parameter to analyse_workbook():
def analyse_workbook(wb_path: Path, settings: Settings = None, ai_config: dict = None) -> WorkbookAnalysisResult:Add a new expander section in the sidebar titled "AI Commentary (Optional)", positioned below the existing Settings panel.
┌─────────────────────────────────────────┐
│ 🤖 AI Commentary (Optional) │
│ │
│ Enable AI commentary [checkbox] │
│ │
│ [When enabled and no key is set:] │
│ │
│ Anthropic API Key [password input] │
│ [Set API Key] button │
│ (stored in session only, never saved) │
│ │
│ [When key has been set:] │
│ ✅ API key set for this session. │
│ [Remove API key] button │
│ │
│ Features to include: │
│ ☑ Findings narrative │
│ ☑ Workbook purpose & sheet summaries │
│ ☑ Key formula explanations │
│ ☑ Assumption commentary │
│ │
│ ⚠ AI analysis adds ~30–60 seconds. │
└─────────────────────────────────────────┘
Collect these values and pass them as ai_config dict to analyse_workbook.
In the main results area, after the existing RAG badge and finding counts, add an "AI Commentary" tab alongside the existing findings tabs (or as a new expander section if tabs are not already used). Display:
- Workbook purpose statement (if generated)
- Sheet narratives as an expandable list (one expander per sheet)
- Assumption commentary (if generated)
- Formula explanations as an expandable list (one per formula)
- Findings narrative (if generated)
If ai_commentary.api_error is not None, show a yellow warning banner with
the error text above the AI section.
Insert a new section titled "AI Workbook Intelligence" between the
Executive Summary and the Findings by Category section. This section only
appears if result.ai_commentary is not None.
11.1 Executive Summary (updated)
If findings_narrative is populated, replace the current auto-generated
executive summary text with the AI narrative. If not populated, keep the
existing deterministic summary.
11.2 AI Workbook Intelligence section (new, AI only)
Sub-sections (omit any that are not populated):
-
Workbook Purpose Render
workbook_purposeas a paragraph with a light blue background box. Label: "AI-Generated — Interpretive Only" -
Sheet-by-Sheet Narratives Render as a two-column table: Sheet Name (left, 30% width) | Narrative (right, 70% width). Sensitive sheets should have their name rendered in bold.
-
Key Formula Explanations Render as a table with columns: Sheet | Cell | Formula (monospace, truncated to 60 chars with "…") | Explanation.
-
Assumption and Input Commentary Render
assumption_commentaryas a paragraph with a light yellow background box. Label: "AI-Generated — Interpretive Only"
Disclaimer footer (add to every AI section): "AI commentary is generated from formula metadata and structural analysis. It is interpretive and may not fully reflect model intent. Always validate with the model owner."
Style guidance: Use a distinct but subtle background colour (light blue #EBF4FF
for purpose/sheets, light yellow #FFFDE7 for assumptions) to visually
distinguish AI content from deterministic findings throughout the report.
Add:
anthropic>=0.25.0
- Any failure in
digest_builderorai_analysermust never cause the deterministic analysis to fail. Wrap all AI calls in try/except. - If the API key is missing or invalid, surface the error in
AICommentary.api_errorand in a Streamlit warning banner. Do not raise. - If an individual feature call fails (e.g. formula explanations), populate
api_errorwith a message and continue with the remaining features. - Token limits: if the digest exceeds an estimated 6,000 tokens for any single prompt, truncate sheet details starting from the least sensitive, smallest sheets first until the estimate is within limit. Log a warning if truncation occurs.
Use the existing test_formulas.xlsx test file. With AI enabled, expected
outputs include:
- Workbook purpose: should identify a life insurance assumption/projection model
- Sheet narratives: should describe "Assumptions" and "Mortality" sheets correctly
- Formula explanations:
=NOW()and=INDIRECT("A1")should be flagged as volatile; the VLOOKUP without 4th argument should be explained as approximate match risk - Findings narrative: should synthesise the formula and structural findings into a coherent paragraph
There is no automated assertion framework required — manual review of outputs is sufficient for this phase.
- Add dataclasses to
models.py - Add constants to
settings.py - Build
digest_builder.pyand test with a sample workbook - Build
ai_analyser.py— implement Feature 1 first (simplest, no JSON parsing) - Add Features 2, 3, 4 to
ai_analyser.py - Modify
workbook_analyser.pyto wire in AI config - Update
app.pysidebar and results display - Update
reporter.pywith AI sections - Update
requirements.txt - End-to-end test with all three test workbooks
All items in this specification have been implemented. The build is complete and the app runs on Python 3.9+.
| Fix | Files changed | Description |
|---|---|---|
| Nested expanders | app.py |
Streamlit forbids st.expander inside another st.expander. The two inner expanders for sheet narratives (Section 10) and formula explanations (Section 10) have been replaced with HTML <details>/<summary> elements rendered via unsafe_allow_html=True. The outer per-file expander is unchanged. |
| API key UX | app.py |
Replaced the always-visible password input with a form-based entry: the field clears on submit (clear_on_submit=True) and is replaced by a confirmation message once the key is set. A Remove API key button allows the user to clear it. The key is still held in session state only. |
| PDF overflow crash | reporter.py |
AI-generated text in the formula explanations table, sheet narratives table, and coloured-box sections had no character limit. When the AI returned a long explanation, a single table row could exceed ReportLab's 688pt page frame, crashing PDF generation with "Flowable … too large on page N". Fixed by capping cell content before wrapping in a Paragraph: fe.explanation[:600], sn.narrative[:800], and _coloured_box text [:1500]. Limits are calibrated from column geometry so no row can exceed ~28% of the page frame. |
End of specification.