Commit 9479ea7
Add civic-AI safety caveats to CKAN response formatters
Wire provenance, sample-size, freshness, type-fidelity, and field-validation
guidance into the CKAN plugin's response formatters so the model gets
explicit warnings instead of silently over-trusting open-data responses:
- Provenance headers + echoed query params on every response (_wrap_response,
_format_provenance_header, _params_repr).
- Sample-size caveats: SMALL SAMPLE / single-record banners so counts and
percentages are not generalized from tiny result sets.
- Data freshness: flag datasets last edited beyond their update cadence
(_parse_ckan_iso, _frequency_days, _format_freshness_caveat).
- Stringly-typed columns: warn when date/number values sit in TEXT columns.
- NULL-like frequency: DATA QUALITY caveat for columns that are mostly
empty / "N/A" / "Unknown".
- Field-name validation with "did you mean" suggestions against the real
schema; search-ambiguity detection across candidate datasets.
- search_and_query composite formatter for one-call keyword-to-rows.
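
As an illustration of the data-freshness item above, here is a minimal sketch of how such a caveat could be computed. It is not the plugin's code: the function names, the cadence table, and the banner wording are all assumptions made for this example, and the real _parse_ckan_iso, _frequency_days, and _format_freshness_caveat helpers may behave differently.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical mapping from an update-frequency label to a cadence in days.
# The labels and day counts are assumptions, not values taken from the plugin.
CADENCE_DAYS = {
    "daily": 1,
    "weekly": 7,
    "monthly": 31,
    "quarterly": 92,
    "annually": 366,
}


def parse_iso(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating timezone-naive values as UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def freshness_caveat(last_modified: str, frequency: str, now: datetime) -> Optional[str]:
    """Return a warning string if the dataset is older than its cadence, else None."""
    cadence = CADENCE_DAYS.get(frequency.strip().lower())
    if cadence is None:
        # Unknown cadence: nothing to compare against, so emit no caveat.
        return None
    age_days = (now - parse_iso(last_modified)).days
    if age_days <= cadence:
        return None
    return (
        f"STALE DATA: last modified {age_days} days ago; "
        f"expected update cadence is {cadence} days ({frequency})."
    )


if __name__ == "__main__":
    reference = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print(freshness_caveat("2024-01-01T00:00:00.000000", "Monthly", reference))
```

Passing the reference time in explicitly, rather than reading the clock inside the function, keeps the check deterministic under test.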
Adds 44 tests covering the new formatters. Full suite: 441 passed.
Pre-existing work from the working tree; committed as-is (formatting
normalized with ruff).
1 parent 8256db3 commit 9479ea7
2 files changed
Lines changed: 3085 additions & 449 deletions
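
The field-name validation with "did you mean" suggestions described in the commit message could look roughly like the sketch below. It uses the standard library's difflib.get_close_matches; the function name validate_fields, the warning wording, and the 0.6 similarity cutoff are assumptions for illustration and are not taken from the commit.

```python
import difflib
from typing import List


def validate_fields(requested: List[str], schema_fields: List[str]) -> List[str]:
    """Return one warning per requested field that is absent from the schema."""
    known = set(schema_fields)
    warnings = []
    for field in requested:
        if field in known:
            continue
        # Rank schema fields by string similarity to the unknown name.
        matches = difflib.get_close_matches(field, schema_fields, n=3, cutoff=0.6)
        if matches:
            warnings.append(
                f'Unknown field "{field}". Did you mean: {", ".join(matches)}?'
            )
        else:
            # No close match: list the real schema so the caller can pick one.
            warnings.append(
                f'Unknown field "{field}". Available fields: {", ".join(schema_fields)}'
            )
    return warnings


if __name__ == "__main__":
    schema = ["ward", "permit_type", "issued_date"]
    for line in validate_fields(["permit_typ", "ward", "zzz"], schema):
        print(line)
```

Matching here is case-sensitive, which is a choice made for this sketch; a real formatter might normalize case before comparing.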