Commit 994b34c
fix: close 9 of 10 identified pipeline gaps (ops stubs, semantic matching) (#95)
* fix(gaps-2-4-7): wire AuditRow callers, calendar deadline logic, flag resolution API
Gap 2: append_audit_row() now called from PdfIngestOp and IngestStatementOp after
successful ingest, populating the AUDIT.log sheet that previously received only headers.
Gap 4: CheckTaxDeadlineOp::execute() now looks up the deadline in ctx.calendar,
computes next_due via BusinessCalendar::next_due, and emits an advisory issue when
the deadline falls within warn_days_before days. No-op when calendar is unconfigured.
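The Gap 4 check can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `Advisory` and `check_deadline` are hypothetical names, dates are plain day counts so no date crate is needed, and `next_due` being `None` stands in for an unconfigured calendar. Treating an already-passed deadline as "no advisory" is also an assumption.

```rust
/// Hypothetical stand-in for the advisory issue type (the real type is not shown here).
#[derive(Debug, PartialEq)]
pub struct Advisory {
    pub message: String,
}

/// Emits an advisory when `next_due` is at most `warn_days_before` days after `today`.
/// Dates are day counts from an arbitrary epoch to keep the sketch dependency-free.
/// `next_due == None` models an unconfigured calendar and results in a no-op.
pub fn check_deadline(
    today: i64,
    next_due: Option<i64>,
    warn_days_before: i64,
) -> Option<Advisory> {
    let due = next_due?; // no calendar configured: nothing to report
    let days_left = due - today;
    if (0..=warn_days_before).contains(&days_left) {
        Some(Advisory {
            message: format!("deadline due in {days_left} day(s)"),
        })
    } else {
        None
    }
}
```

With `warn_days_before = 14`, a deadline 10 days out produces an advisory while one 30 days out does not.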
Gap 7: ClassificationEngine::resolve_flag() transitions Open→Resolved flags by tx_id.
MCP bulk_resolve_flags() is wired to use it instead of returning a hard-coded error;
dry_run path preserved, live path now resolves flags through the engine.
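A rough sketch of the Gap 7 transition, under assumptions: the `Flag` and `FlagStatus` shapes below are hypothetical, the functions are free functions rather than engine or MCP methods, and returning a count is a choice made for illustration. The dry-run branch only counts what would change.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum FlagStatus {
    Open,
    Resolved,
}

/// Hypothetical flag record; the real struct likely carries more fields.
#[derive(Debug, Clone)]
pub struct Flag {
    pub tx_id: String,
    pub status: FlagStatus,
}

/// Moves every Open flag for `tx_id` to Resolved and returns how many changed.
pub fn resolve_flag(flags: &mut [Flag], tx_id: &str) -> usize {
    let mut changed = 0;
    for flag in flags
        .iter_mut()
        .filter(|f| f.tx_id == tx_id && f.status == FlagStatus::Open)
    {
        flag.status = FlagStatus::Resolved;
        changed += 1;
    }
    changed
}

/// With `dry_run` set, reports how many flags would be resolved without mutating anything.
pub fn bulk_resolve_flags(flags: &mut [Flag], tx_ids: &[&str], dry_run: bool) -> usize {
    if dry_run {
        return flags
            .iter()
            .filter(|f| f.status == FlagStatus::Open && tx_ids.contains(&f.tx_id.as_str()))
            .count();
    }
    tx_ids.iter().map(|id| resolve_flag(flags, id)).sum()
}
```

Already-resolved flags are left untouched, so calling the live path twice is harmless in this sketch.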
Adds 7 unit tests (3 for resolve_flag, 4 for check_tax_deadline).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gaps-1-3): PDF routing error + work queue Ambiguity/Blocker/DocumentIssue
Gap 1 (PDF routing):
- IngestStatementOp::execute() now detects DocType::Pdf early and returns a
clear InvalidInput error directing callers to PdfIngestOp or the MCP
ingest_pdf tool, instead of crashing inside calamine with a parse error.
- PdfIngestOp doc comment updated: removed "Phase 2 stub" label (the op is
implemented), added subprocess note clarifying reqif-opa-mcp is current and
docling is the intended long-term replacement.
- Added ledger_ops unit test: ingest_statement_op_rejects_pdf_with_clear_error.
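The early PDF rejection described above might look something like this. It is a simplified sketch: `detect_doc_type`, the two-variant `DocType`, and the `OpError` enum are assumptions made for illustration, and detection by file extension is a guess at how the type is determined.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum DocType {
    Pdf,
    Spreadsheet,
}

/// Hypothetical error type; only the InvalidInput variant is named in the commit.
#[derive(Debug, PartialEq)]
pub enum OpError {
    InvalidInput(String),
}

/// Assumption: the document type is inferred from the file extension.
pub fn detect_doc_type(path: &str) -> DocType {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("pdf") => DocType::Pdf,
        _ => DocType::Spreadsheet,
    }
}

/// Rejects PDFs up front so the spreadsheet parser never sees them.
pub fn ingest_statement(path: &str) -> Result<(), OpError> {
    if detect_doc_type(path) == DocType::Pdf {
        return Err(OpError::InvalidInput(format!(
            "{path} is a PDF; use PdfIngestOp or the ingest_pdf MCP tool"
        )));
    }
    // Spreadsheet parsing would happen here.
    Ok(())
}
```

Failing before the parser runs means the caller sees a routing hint rather than a low-level parse error.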
Gap 3 (work queue):
- Ambiguity branch: queries classification_state.classifications for tx_ids
with confidence < 60%; emits QueueItemType::Ambiguity items for each.
- Blocker branch: queries document_registry for DocumentStatus::Processing
entries (stuck documents); emits QueueItemType::Blocker as Critical severity.
- DocumentIssue branch: queries document_registry for DocumentStatus::Error(msg)
entries (failed ingests); emits QueueItemType::DocumentIssue as High severity.
All three branches previously returned empty results with TODO comments.
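The three branches could be sketched as below. The data shapes are assumptions: classifications and registry entries are passed as plain slices, `QueueItem` is a hypothetical struct, the 60% threshold is expressed as `0.60`, and the `Medium` severity for ambiguity items is a placeholder because the commit does not state one.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum DocumentStatus {
    Processing,
    Done,
    Error(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Severity {
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum QueueItemType {
    Ambiguity,
    Blocker,
    DocumentIssue,
}

#[derive(Debug, Clone, PartialEq)]
pub struct QueueItem {
    pub item_type: QueueItemType,
    pub severity: Severity,
    pub subject: String,
    pub detail: String,
}

const AMBIGUITY_THRESHOLD: f64 = 0.60;

pub fn build_work_queue(
    classifications: &[(&str, f64)],
    documents: &[(&str, DocumentStatus)],
) -> Vec<QueueItem> {
    let mut queue = Vec::new();
    // Ambiguity: classifications below the confidence threshold.
    for (tx_id, confidence) in classifications {
        if *confidence < AMBIGUITY_THRESHOLD {
            queue.push(QueueItem {
                item_type: QueueItemType::Ambiguity,
                severity: Severity::Medium, // assumed; not stated in the commit
                subject: tx_id.to_string(),
                detail: format!("confidence {:.0}%", *confidence * 100.0),
            });
        }
    }
    for (doc_id, status) in documents {
        match status {
            // Blocker: documents stuck in Processing.
            DocumentStatus::Processing => queue.push(QueueItem {
                item_type: QueueItemType::Blocker,
                severity: Severity::Critical,
                subject: doc_id.to_string(),
                detail: "document stuck in Processing".to_string(),
            }),
            // DocumentIssue: failed ingests carry their error message.
            DocumentStatus::Error(msg) => queue.push(QueueItem {
                item_type: QueueItemType::DocumentIssue,
                severity: Severity::High,
                subject: doc_id.to_string(),
                detail: msg.clone(),
            }),
            DocumentStatus::Done => {}
        }
    }
    queue
}
```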
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(gap-9): promote batch AllOrNothing rollback guidance to doc comment
Converted the inline TODO comment above batch_classify() into a proper ///
doc comment describing the failure recovery procedure for AllOrNothing mode:
re-query affected tx_ids, reverse via classify_transaction, and why full
transactional rollback is intentionally absent.
Removed a stale TODO above bulk_resolve_flags() — that function has no
batch_mode parameter and the note was not applicable there.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gap-5): wire semantic rule selection into classify_waterfall
The SemanticRuleSelector trait and its lexical-similarity implementation
were already complete but fully disconnected from the production path:
- build_embedding_index() was never called — semantic_index always empty
- classify_waterfall() called select_rules_deterministic() directly,
bypassing select_rules_semantic() entirely
Changes:
- load_from_dir() now calls build_embedding_index() eagerly after construction,
so the Jaccard/token-similarity index is always populated.
- classify_waterfall() now calls select_rules_semantic(top_k=all_rules) instead
of select_rules_deterministic(); select_rules_semantic falls back to
deterministic automatically when the index is empty, so behaviour is identical
when no index exists and improves (similarity-ranked selection) when it does.
- Updated module-level status comments and SemanticRuleSelector trait doc to
reflect implemented state and the clear upgrade path to real embeddings.
- Updated the cross-lingual integration test ignore message: the test remains
ignored because it requires vector embeddings (cross-lingual matching is out
of reach for Jaccard), but the stale "unimplemented!()" notes are corrected.
- Added 5 unit tests: load_from_dir_builds_semantic_index,
select_rules_semantic_returns_all_rules_for_unrelated_tx,
classify_waterfall_uses_semantic_path, and two lexical_similarity tests.
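The semantic path with its deterministic fallback can be illustrated like this. The sketch assumes a simplified `Rule` struct, a token-set index, and free functions in place of the trait methods; the similarity threshold is left out for brevity, and "deterministic" is taken to mean declaration order.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified rule; real rules carry conditions and actions.
pub struct Rule {
    pub id: String,
    pub description: String,
}

fn tokenize(text: &str) -> HashSet<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(String::from)
        .collect()
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
pub fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Builds one token set per rule, in rule order.
pub fn build_embedding_index(rules: &[Rule]) -> Vec<HashSet<String>> {
    rules.iter().map(|r| tokenize(&r.description)).collect()
}

/// Ranks rules by similarity to `query`; falls back to declaration order
/// when the index is empty or out of sync with the rule list.
pub fn select_rules_semantic<'a>(
    rules: &'a [Rule],
    index: &[HashSet<String>],
    query: &str,
    top_k: usize,
) -> Vec<&'a Rule> {
    if index.is_empty() || index.len() != rules.len() {
        return rules.iter().take(top_k).collect();
    }
    let q = tokenize(query);
    let mut scored: Vec<(f64, &Rule)> = rules
        .iter()
        .zip(index)
        .map(|(rule, toks)| (jaccard(&q, toks), rule))
        .collect();
    // Stable sort: ties keep declaration order.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(top_k).map(|(_, r)| r).collect()
}
```

Passing `top_k = rules.len()` returns every rule, only reordered, which is why an unrelated transaction still sees the full rule set.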
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gaps-1-3-5): wire ClassifyTransactionsOp, GenerateAuditTrailOp, remove slint_viz dead code
ClassifyTransactionsOp now reads TRANSACTIONS sheet via calamine, runs
RuleRegistry::classify_waterfall over Unclassified rows, and records
each classification decision to MUTATION_HISTORY. Respects dry_run and
account_filter. Closes the scheduler→classify loop (gap priority 1).
GenerateAuditTrailOp now reads TRANSACTIONS and MUTATION_HISTORY from the
source workbook, filters rows by year, and writes a two-sheet audit XLSX
to output_path. Gives CPAs a year-scoped transaction + mutation view
(gap priority 3).
slint_viz: deleted slint_viz.rs, removed its pub mod and pub use re-export
from lib.rs, and dropped it from book/src/SUMMARY.md. Zero callers existed;
misplaced in ledger-core (gap priority 5).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gaps-4-6): local ReconcileAccountOp + cross-lingual semantic matching
ReconcileAccountOp now performs a local-only pass over the TRANSACTIONS
sheet: detects duplicate tx_ids, date gaps > 90 days, and amount outliers
(|amount| > mean + 3σ). Anomalies are written to MUTATION_HISTORY and
returned as issues. Xero integration remains a documented future pass.
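The three local checks can be sketched as follows, with several assumptions: `Tx` and `Anomaly` are hypothetical types, dates are day counts, the mean and σ are taken over absolute amounts using the population standard deviation, and nothing is written to MUTATION_HISTORY here.

```rust
use std::collections::HashSet;

/// Hypothetical transaction row; `day` is a day count from an arbitrary epoch.
pub struct Tx {
    pub tx_id: String,
    pub day: i64,
    pub amount: f64,
}

#[derive(Debug, PartialEq)]
pub enum Anomaly {
    DuplicateTxId(String),
    DateGap { from_day: i64, to_day: i64 },
    AmountOutlier(String),
}

pub fn reconcile_local(txs: &[Tx]) -> Vec<Anomaly> {
    let mut anomalies = Vec::new();

    // 1. Duplicate tx_ids: report each repeat after the first occurrence.
    let mut seen = HashSet::new();
    for tx in txs {
        if !seen.insert(tx.tx_id.as_str()) {
            anomalies.push(Anomaly::DuplicateTxId(tx.tx_id.clone()));
        }
    }

    // 2. Date gaps longer than 90 days between consecutive transactions.
    let mut days: Vec<i64> = txs.iter().map(|t| t.day).collect();
    days.sort_unstable();
    for pair in days.windows(2) {
        if pair[1] - pair[0] > 90 {
            anomalies.push(Anomaly::DateGap { from_day: pair[0], to_day: pair[1] });
        }
    }

    // 3. Amount outliers: |amount| > mean + 3σ (statistics over |amount|, assumed).
    if !txs.is_empty() {
        let n = txs.len() as f64;
        let mean = txs.iter().map(|t| t.amount.abs()).sum::<f64>() / n;
        let variance = txs
            .iter()
            .map(|t| (t.amount.abs() - mean).powi(2))
            .sum::<f64>()
            / n;
        let threshold = mean + 3.0 * variance.sqrt();
        for tx in txs {
            if tx.amount.abs() > threshold {
                anomalies.push(Anomaly::AmountOutlier(tx.tx_id.clone()));
            }
        }
    }
    anomalies
}
```

A 3σ rule only fires on reasonably large samples: with the population standard deviation, a single extreme value cannot exceed mean + 3σ unless there are at least 11 rows.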
Cross-lingual semantic matching (P6): adds normalize_unicode() (ü→ue,
ä→ae, ö→oe, ß→ss) so German compound words survive tokenization intact.
Adds expand_financial_tokens() with a German/French → English financial
glossary (ausland→foreign, ueberweisung→transfer, arbeitgeber→employer/
income, etc.) applied to the query side of select_rules_semantic. Lowers
MIN_LEXICAL_SIMILARITY 0.05→0.02 to account for larger expanded query
sets. Un-ignores test_semantic_rule_selector_selects_by_embedding: it now
passes via the expansion path, not just the deterministic fallback.
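A sketch of the normalization and query-side expansion, limited to the glossary entries named above. The function bodies are assumptions: lowercasing, handling of uppercase umlauts, and matching glossary keys as substrings (so compounds such as "Auslandsüberweisung" expand) are illustrative choices, not confirmed behaviour.

```rust
use std::collections::HashSet;

/// Transliterates German umlauts and ß, lowercasing everything else.
pub fn normalize_unicode(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            'ü' | 'Ü' => out.push_str("ue"),
            'ä' | 'Ä' => out.push_str("ae"),
            'ö' | 'Ö' => out.push_str("oe"),
            'ß' => out.push_str("ss"),
            other => out.extend(other.to_lowercase()),
        }
    }
    out
}

/// Only the entries named in the commit message; the real glossary is larger.
const GLOSSARY: &[(&str, &[&str])] = &[
    ("ausland", &["foreign"]),
    ("ueberweisung", &["transfer"]),
    ("arbeitgeber", &["employer", "income"]),
];

/// Returns the query's normalized tokens plus English expansions.
/// Assumption: glossary keys match as substrings of a token.
pub fn expand_financial_tokens(query: &str) -> HashSet<String> {
    let normalized = normalize_unicode(query);
    let mut tokens = HashSet::new();
    for tok in normalized
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
    {
        tokens.insert(tok.to_string());
        for (key, english) in GLOSSARY {
            if tok.contains(*key) {
                for word in english.iter() {
                    tokens.insert((*word).to_string());
                }
            }
        }
    }
    tokens
}
```

Expansion enlarges the query token set, which shrinks Jaccard scores because the union grows; that is consistent with lowering the similarity floor from 0.05 to 0.02.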
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet (coordinator) <coordinator@promptexecution.com.au>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 5da2f0a commit 994b34c
8 files changed
Lines changed: 978 additions & 169 deletions
File tree
- book/src
- crates
- ledger-core/src
- ledgerr-mcp/src