Release/5.48.0#68

Merged
datakitchen-devops merged 118 commits into main from release/5.48.0 on Jun 3, 2026

@aarthy-dk
Contributor

Features

  • server: harden API + MCP server for production deployments (TG-1065) (4257369)
  • mcp: add run status & history tools (TG-1050) (ac8baa2)
  • mcp: add test definition CRUD tools (TG-1054) (6b2c390)
  • mcp: profiling L3 — cross-column search, frequent values, patterns (TG-1067) (11bba62)
  • add server-side pagination for test definitions (TG-1041) (8d9b600)
  • mcp: profiling L4 — cross-run comparison, trends, schema history (TG-1068) (7eb4c2c)
  • salesforce: add Salesforce Data 360 flavor (354aa95)
  • mcp: schedule CRUD tools (TG-1071) (918088c)
  • TG-1001: exclude monitor suites from all queries (5f4b3b8)
  • mcp: add new tool get_quality_scores (cec8098)
  • mcp: add CRUD tools for quality scores (3818ff0)
  • mcp: single-arg compare_test_runs (TG-1056) (fe37c41)
  • retention: add per-project data retention cleanup (TG-1063) (077c70d)
  • mcp: test definition note CRUD tools (TG-1086) (b95ccea)
  • add feedback popup and help item (e9b3c0e)
  • mcp: add CRUD mcp tools for notifications (8e5d3ae)
  • ui: log UI render errors and show a custom error page (a3173bf)

Bug Fixes

  • monitors: freshness-gate Volume_Trend/Metric_Trend prediction (c44ec72)
  • scorecards: filter categories by CDE (470fc1e)
  • drop args column from quick-start seed insert (39d4798)
  • standalone: resolve embedded host/port at connection-build time (f3a1582)
  • standalone: revert Windows signal forwarding to TerminateProcess (1a6150d)
  • scoring: accept leading-dot decimals in fn_eval (e33ef2f)
  • salesforce: apply MR review feedback (55d6a79)
  • TG-1080: cross-flavor template fixes for QUERY-style tests (183805c)
  • common-models: get_previous returns self in TestRun and ProfilingRun (3e800d9)
  • ui: handle out-of-range dates when serializing results to JSON (b26e147)
  • generation: correct Freshness_Trend tran_date_cols filter precedence (1049e1d)
  • profiling: guard empty SPLIT_PART casts in pattern anomaly criteria (a2ece91)
  • scorecard: improve category layout (2704c7a)
  • reports: correct Column Tags and link layout in test issue report PDF (3a09de1)
  • source-data: preserve datetimes for source-data queries and reports (753fc8e)
  • source-data: handle fractional-second timestamps in parse_fuzzy_date (73b66ef)
  • address review feedback (6c95948)

Refactors

  • ui: add data-value to help e2e tests (571bf41)
  • mcp: add get_column_profile_detail tool (55cca1a)
  • mcp: apply TG-1054 review feedback (e02afc3)
  • consolidate row-limiting clauses into FlavorService (079331d)
  • centralize /api/v1 prefix in api package router (e5f6ac0)
  • extract _check_access helper for API resolvers (61509f3)
  • drop vestigial args column from job_executions and job_schedules (158331d)
  • consolidate cross-cutting enums into common.enums (453203b)
  • gate public job exposure by job_key allowlist, not source (207f8e4)
  • TG-1041: address reviewer feedback on pagination implementation (75eafe5)
  • mcp: apply TG-1067 review feedback (ce8cca3)
  • mcp: apply TG-1068 review feedback (49c3d2c)
  • mcp: apply TG-1071 review feedback (a4037d8)
  • mcp: update inventory tool to display scorecards (167a7b1)
  • TG-1041: address second round of reviewer feedback (b104cb1)
  • models: decouple Streamlit cache from common layer (9f0a452)
  • mcp: apply TG-1086 review feedback (a36ee7d)
  • mcp: remove redundant session flush in schedule tools (e4fef2c)

Documentation

  • mcp: change doc group for test definitions (563dc0c)

luis-dk and others added 30 commits May 4, 2026 18:52
…1065)

Four bundled fixes for production readiness:

- DNS rebinding: pass explicit transport_security to FastMCP with
  allowlist derived from BASE_URL + loopback + TG_MCP_EXTRA_ALLOWED_HOSTS.
  Fixes 421 Misdirected Request for external clients caused by FastMCP's
  loopback-only auto-allowlist.
- Security headers: pure-ASGI SecurityHeadersMiddleware injects HSTS
  (TLS-only by default), X-Content-Type-Options, Referrer-Policy, and
  CSP frame-ancestors on success and error responses across /api/*,
  /oauth/*, /.well-known/*, /mcp.
- Body-size cap: pure-ASGI BodySizeLimitMiddleware rejects requests
  exceeding TG_API_MAX_REQUEST_BODY_BYTES (default 10 MiB) with 413,
  enforced via Content-Length fast-reject and a streaming guard with
  a latch to prevent post-disconnect bypass.
- Graceful shutdown: timeout_graceful_shutdown plumbed to uvicorn.run
  via TG_API_GRACEFUL_SHUTDOWN_TIMEOUT (default 30s).

All settings env-overridable. Pure-ASGI middlewares chosen over
BaseHTTPMiddleware to preserve MCP's text/event-stream transport.

Tests: 11 cases for the two middlewares (covers latch regression),
7 cases for the transport_security helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
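
The body-size cap described above can be sketched as a pure-ASGI middleware. This is a minimal illustration under assumptions, not the project's implementation: the class body, the `BodyTooLarge` exception, the toy `read_all_app`, and the `status_for` driver are all hypothetical, and the latch here simply means that once the cap trips, every later `receive()` call fails as well.

```python
import asyncio


class BodyTooLarge(Exception):
    """Raised inside the wrapped receive() once the cap is exceeded."""


class BodySizeLimitMiddleware:
    """Illustrative pure-ASGI body cap; not the project's actual class."""

    def __init__(self, app, max_bytes=10 * 1024 * 1024):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Fast reject: check a declared Content-Length before reading any body.
        declared = dict(scope.get("headers", [])).get(b"content-length")
        if declared is not None and declared.isdigit() and int(declared) > self.max_bytes:
            await self._reject(send)
            return

        seen = 0
        tripped = False  # latch: once set, every later receive() fails too
        started = False

        async def guarded_receive():
            nonlocal seen, tripped
            if tripped:
                raise BodyTooLarge()
            message = await receive()
            if message["type"] == "http.request":
                seen += len(message.get("body", b""))
                if seen > self.max_bytes:
                    tripped = True
                    raise BodyTooLarge()
            return message

        async def tracking_send(message):
            nonlocal started
            if message["type"] == "http.response.start":
                started = True
            await send(message)

        try:
            await self.app(scope, guarded_receive, tracking_send)
        except BodyTooLarge:
            if not started:  # the status cannot change once headers are out
                await self._reject(send)

    async def _reject(self, send):
        body = b"Request body too large"
        headers = [(b"content-type", b"text/plain"), (b"content-length", str(len(body)).encode())]
        await send({"type": "http.response.start", "status": 413, "headers": headers})
        await send({"type": "http.response.body", "body": body})


async def read_all_app(scope, receive, send):
    """Toy downstream app: drains the body, then answers 200."""
    more = True
    while more:
        message = await receive()
        more = message.get("more_body", False)
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(chunks, max_bytes, content_length=None):
    """Drive one request (at least one chunk) through the middleware; return the status."""
    headers = [] if content_length is None else [(b"content-length", str(content_length).encode())]
    scope = {"type": "http", "method": "POST", "path": "/", "headers": headers}
    last = len(chunks) - 1
    pending = [{"type": "http.request", "body": c, "more_body": i < last} for i, c in enumerate(chunks)]
    sent = []

    async def receive():
        return pending.pop(0)

    async def send(message):
        sent.append(message)

    asyncio.run(BodySizeLimitMiddleware(read_all_app, max_bytes)(scope, receive, send))
    return sent[0]["status"]
```

Because the wrapper only intercepts `receive` and `send`, response messages pass through unbuffered, which is the property the commit cites for keeping the text/event-stream transport working.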
… 'enterprise'

feat(server): harden API + MCP server for production deployments (TG-1065)

See merge request dkinternal/testgen/dataops-testgen!507
Adds list_profiling_runs, get_profiling_run, and get_test_run.
Renames get_recent_test_runs -> list_test_runs (adding status and
table_group_id filters) and get_test_result_history ->
list_test_result_history (TG-1036).

Pending/queued JEs surface in a dedicated "Pending" section when scoped
by suite or table group via the new JobExecution.select_active_by_kwargs
helper. The same kwargs-search pattern is added on JobSchedule for the
"Next scheduled run" lookup. select_summary on TestRun and ProfilingRun
gains job_execution_id and statuses filters; ProfilingRunSummary now
exposes project_code so the by-id tools no longer need a second query.

ProfilingRun.select_table_breakdown is the per-table breakdown used by
get_profiling_run, written in ORM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(mcp): add run status & history tools (TG-1050)

See merge request dkinternal/testgen/dataops-testgen!509
Add create_test, update_test, validate_custom_test, and bulk_update_tests
MCP tools, gated on the edit permission. Consolidate validation onto
TestDefinition with editable_fields() and validate() methods, enforcing a
whitelist at the MCP boundary so extra_params cannot override identity or
internal columns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
same refactor branch with e2e updates

See merge request dkinternal/testgen/dataops-testgen!463
New tool to get the deep profile for one column.
refactor(mcp): add get_column_profile_detail tool

See merge request dkinternal/testgen/dataops-testgen!508
Address review comments on MR !511:

- create_test takes a single ``fields: dict`` instead of explicit kwargs;
  no field can bypass the editable_fields() whitelist.
- editable_fields() gates column_name (column / custom scopes) and
  impact_dimension (custom / referential scopes) per UI logic.
- Extract ``validate_custom_query`` to ``testgen/common/custom_test_validation.py``;
  used by both UI's validate_test and MCP's validate_custom_test.
- validate_custom_test now wraps user SQL in
  ``SELECT COUNT(*) FROM (...) ERR_TABLE`` for the count and applies a
  flavor-aware ``LIMIT 1`` for the preview row. Matches the test runtime's
  wrapping pattern (correctness parity, blocks DDL/DML).
- Output wording uses "rows matching the failure criteria" throughout;
  PII footer drops the ``view_pii`` jargon.
- bulk_update_tests uses ``result.rowcount`` instead of materialising
  every UUID via ``.returning(id).all()``.
- UI: validate_test calls the shared helper; impact_dimension gate
  simplifies to ``test_scope in ('custom', 'referential')``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
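
The whitelist gating described in these review notes can be sketched roughly as follows. Only the scope rules for `column_name` and `impact_dimension` come from the commit message; the base field names, the function signatures, and the error type are assumptions made for illustration.

```python
# Hypothetical base set of always-editable fields; the real list is not shown in this PR.
BASE_EDITABLE = frozenset({"test_description", "threshold_value", "severity", "test_active"})


def editable_fields(test_scope: str) -> frozenset:
    """Return the field names a caller may set for a test of the given scope."""
    fields = set(BASE_EDITABLE)
    if test_scope in ("column", "custom"):
        fields.add("column_name")
    if test_scope in ("custom", "referential"):
        fields.add("impact_dimension")
    return frozenset(fields)


def validate(fields: dict, test_scope: str) -> dict:
    """Reject any key outside the whitelist, so identity or internal columns cannot be set."""
    rejected = sorted(set(fields) - editable_fields(test_scope))
    if rejected:
        raise ValueError(f"Fields not editable for scope '{test_scope}': {', '.join(rejected)}")
    return dict(fields)
```

Taking a single `fields` dict and checking every key against one set means there is no keyword argument that can sidestep the check, which appears to be the motivation for the `create_test` signature change.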
…p-test-definition-crud

# Conflicts:
#	testgen/mcp/tools/common.py
- Add ``FlavorService.row_limit_clauses(n)`` returning ``(prefix, suffix)``
  SQL fragments for the flavor's row-limiting style (``LIMIT``/``TOP``/
  ``FETCH FIRST``). Replaces three duplicate inline switches.
- ``data_catalog.py`` and ``refresh_data_chars_query.py`` now call the
  method instead of branching on ``row_limiting_clause``.
- Normalise the access-check projection to literal ``1`` across flavors
  (was ``*`` for ``TOP``, ``1`` for ``LIMIT``/``FETCH``); add a
  parametrised test asserting the SQL per flavor.
- Fix a bug in ``validate_custom_query``: ``fetch_from_target_db``
  returns ``RowMapping`` (column-name access), not tuples. Alias the
  count as ``row_count`` and access by name. Unit-test mocks updated to
  the real return shape.
- Drop the now-redundant ``from __future__ import annotations`` from
  ``custom_test_validation.py``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
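
A rough sketch of the `(prefix, suffix)` idea, written as a free function for brevity. The style keys (`"top"`, `"limit"`, `"fetch"`) and the `access_check_sql` helper are assumptions; the commit only states that the method returns SQL fragments for the `LIMIT`, `TOP`, and `FETCH FIRST` styles and that the access check now projects a literal `1`.

```python
def row_limit_clauses(style: str, n: int) -> tuple[str, str]:
    """Return (prefix, suffix) fragments that limit a SELECT to n rows."""
    if style == "top":
        return f"TOP {n} ", ""  # goes right after SELECT
    if style == "limit":
        return "", f" LIMIT {n}"  # goes at the end of the statement
    if style == "fetch":
        return "", f" FETCH FIRST {n} ROWS ONLY"
    raise ValueError(f"Unknown row-limiting style: {style}")


def access_check_sql(style: str, table: str) -> str:
    """Build a one-row probe that projects a literal 1 for every flavor."""
    prefix, suffix = row_limit_clauses(style, 1)
    return f"SELECT {prefix}1 FROM {table}{suffix}"
```

Returning both fragments lets each caller write one template and leave the flavor-specific placement to the service, instead of branching at every call site.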
feat(mcp): add test definition CRUD tools (TG-1054)

See merge request dkinternal/testgen/dataops-testgen!511
…rns (TG-1067)

Adds three new MCP tools and extends list_column_profiles with predicate filters:

- list_column_profiles: 17 new optional filter args (null/distinct/filled ratios, scores,
  pii/cde booleans, suggested_data_type enum, scoping enums) + ordering enum
- get_column_frequent_values: top-N values for one column with PII redaction
- get_column_patterns: top character patterns for one string column
- search_columns: cross-scope column-name search with per-project match summary

Folds the ticket's original find_columns_by_profile into list_column_profiles since
input scope and output row shape were identical. CDE filter coalesces column- and
table-level flags; MANUAL pii_flag is included in the High risk-level filter
(matches the dq_score_weight_defaults seed).
Stairstep volume/metric series (e.g. weekly-refreshed tables) collapsed
the SARIMAX SE estimate, so every refresh tripped the band as a false
positive.

When the same table has an active Freshness_Trend monitor, prediction
now fits SARIMAX on the value series filtered to fingerprint-change
runs and emits a baseline. Execution dual-branches: band check when
Freshness fired this run, `<> baseline` during stale periods (catches
silent writes that the band check alone would miss). Falls back to a
raw-history SARIMAX fit when the filtered fit cannot run.

Drops the post-resample SARIMAX minimum to 8 and lifts the suite-level
predict_min_lookback to TestThresholdsPrediction.run() so it gates the
raw history once for every branch.

Surfaces the gated baseline as "Threshold" on the sparkline tooltip
alongside the lower/upper bound.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The four API v1 routers each declared `prefix="/api/v1"` at construction.
Move the prefix to a single aggregator router exposed by `testgen.api`,
so the version prefix lives in one place. Each sub-router now declares
only its tags, dependencies, and responses.

`server/__init__.py` mounts the aggregator once instead of including
four routers individually. OpenAPI paths and tag distribution are
unchanged.

The unit test that mounted `jobs.router` directly on a `FastAPI()` test
app now passes `prefix="/api/v1"` at include time to mirror production
wiring; otherwise its `/api/v1/...` requests would 404.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The entity resolver factories in `api/deps.py` each repeated the pattern
"if entity and user has permission on its project, return; else raise
404". Extract this into `_check_access`, where the no-info-leakage
security intent (not-found and unauthorized both surface as the same
404) is documented once.

`resolve_project_code` keeps its own shape — it has no entity lookup,
so it doesn't fit the pattern.

Move the four function-local model imports (TableGroup, TestSuite,
JobExecution, sqlalchemy.select) to module level. No circular import.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…dules

`args` was never read by the dispatch path — `exec_job` invokes
`handler(**je.kwargs)` and ignores args entirely. `JobExecution.submit()`
doesn't accept it, and every UI/plugin call site was passing `args=[]`.
Remove the column from both tables and the call sites that supplied it.

The job_schedules uniqueness constraint included args; the migration
resolves the auto-generated PG constraint name dynamically (truncation
varies), drops the column, and recreates the constraint as
`job_schedules_uniq` over `(project_code, key, kwargs, cron_expr, cron_tz)`.

`get_job_arguments` in the schedule dialog returned `tuple[list, dict]`;
collapse it to `dict` now that the list half is dead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move six enums that cross subsystem boundaries into
``testgen/common/enums.py`` so the model, API, MCP, scheduler, and UI
layers share a single source of truth:

- ``JobKey``, ``JobSource`` (were in ``api/schemas.py``)
- ``JobStatus`` (was in ``common/models/job_execution.py``)
- ``Disposition``, ``IssueLikelihood``, ``PiiRisk``
  (were in ``common/models/hygiene_issue.py``)

Every call site is updated to import from ``common.enums`` directly —
no re-exports. ``api/schemas.py`` and the model files import their own
enums back from the new home.

While here, fix ``source="user"`` in the project-settings recalculate
trigger. ``user`` was never in the ``JobSource`` enum, was used in
exactly one place, and is semantically indistinguishable from ``ui``
(a logged-in user triggering a job from a UI page). Coerce to ``ui``
so the audit label matches every other UI-initiated job. No DB
backfill — the API filter for surfaced sources will be tightened
separately.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``source`` column on ``job_executions`` is a pure audit label — it
records which surface submitted a job. The previous gating filters
(``source != 'system'`` in ``api/jobs.py``, ``api/deps.py``, and
``mcp/tools/execution.py``) conflated audit label with visibility rule
and were under-inclusive: ``run-monitors`` and ``recalculate-project-
scores`` were leaking through to public API listings because they
weren't submitted with ``source='system'``.

Introduce ``PUBLIC_JOB_KEYS`` in ``common/models/job_execution.py``:
the frozenset of ``JobKey`` values that external consumers may see.
Replace the two API source filters with ``job_key.in_(PUBLIC_JOB_KEYS)``.
Delete the MCP filter entirely — each cancel tool pins ``expected_job_key``
to a public kind, so the source filter was redundant there.

Tighten the contract so the audit label can never silently take a
typo or stale value:
- ``JobExecution.submit(source: JobSource)`` instead of ``str``
- ``JobContext.source: JobSource = JobSource.cli`` instead of the bare
  ``"CLI"`` default (downstream Mixpanel sites already ``.upper()`` so
  no telemetry change)
- Add ``system`` to ``JobSource`` so the enum stops omitting a value
  the codebase already writes from the score-rollup callback
- Migrate every production call site from string-literal ``source=...``
  to the matching ``JobSource.<member>``

Behavior change worth noting: the new filter is strictly stronger.
``run-monitors`` and ``recalculate-project-scores`` are now correctly
hidden from ``/api/v1/projects/.../jobs`` and per-job lookup. Public
kinds (``run_profile``, ``run_tests``, ``run_test_generation``) are
unchanged. No DB backfill — existing rows keep their historical source
labels, which are still valid for analytics.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
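
To make the difference between the two filters concrete, here is a small in-memory sketch. The enum member names and values, the `Job` dataclass, and the sample rows are assumptions; the real code filters in SQL with `job_key.in_(PUBLIC_JOB_KEYS)`.

```python
from dataclasses import dataclass
from enum import Enum


class JobKey(str, Enum):
    run_profile = "run_profile"
    run_tests = "run_tests"
    run_test_generation = "run_test_generation"
    run_monitors = "run-monitors"
    recalculate_project_scores = "recalculate-project-scores"


class JobSource(str, Enum):
    # Only the members named in the commit message; the real enum may have more.
    ui = "ui"
    cli = "cli"
    system = "system"


PUBLIC_JOB_KEYS = frozenset({JobKey.run_profile, JobKey.run_tests, JobKey.run_test_generation})


@dataclass(frozen=True)
class Job:
    job_key: JobKey
    source: JobSource


def visible_by_source(jobs):
    """Old rule: hide only rows labelled as system-submitted."""
    return [job for job in jobs if job.source != JobSource.system]


def visible_by_key(jobs):
    """New rule: expose only job kinds on the allowlist."""
    return [job for job in jobs if job.job_key in PUBLIC_JOB_KEYS]


JOBS = [
    Job(JobKey.run_tests, JobSource.ui),
    Job(JobKey.run_monitors, JobSource.ui),  # internal kind, but not labelled 'system'
    Job(JobKey.recalculate_project_scores, JobSource.system),
]
```

With these rows, the source-based rule still returns the `run_monitors` job, while the allowlist returns only `run_tests`, regardless of how the row was labelled.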
rboni-dk and others added 26 commits May 28, 2026 13:04
Add create_test_note, update_test_note, and delete_test_note. All gated
on `edit` permission; update and delete additionally require the caller
to be the note's author. Extends `add_note` to return the persisted
instance so the new note ID and timestamp can be surfaced without a
second roundtrip. Adds `Test note ID` column to `list_test_notes` output
to keep the producer→consumer chain workable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(retention): add per-project data retention cleanup

See merge request dkinternal/testgen/dataops-testgen!510
fix CI and remove noisy streamlit logs

See merge request dkinternal/testgen/dataops-testgen!528
- create_test_note: escape note body before rendering via doc.field (MdDoc.escape)
- TestDefinitionNote.update_note: use datetime.now(UTC) to match add_note and the codebase clock convention

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rprise'

feat(mcp): test definition note CRUD tools (TG-1086)

See merge request dkinternal/testgen/dataops-testgen!527
feat(TG-1006): add periodic feedback widget with Slack notifications

See merge request dkinternal/testgen/dataops-testgen!474
- list_notifications
- get_notification
- create_notification
- update_notification
- delete_notification
Entity.save() already flushes the instance, so the explicit
get_current_session().flush() after each sched.save() was redundant.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(mcp): add CRUD mcp tools for notifications

See merge request dkinternal/testgen/dataops-testgen!524
DataFrame.to_json forces datetimes through pandas' nanosecond Timestamp, raising OverflowError on dates outside 1677-2262 (e.g. SQL Server sentinel dates like 9999-12-31), which crashed the profiling/test results and test definitions pages. Add utils.dataframe_to_json_records() that serializes each cell via make_json_safe, harden make_json_safe to map NaT to null, and swap all to_json(date_unit=s) call sites. Includes regression tests.

TG-1101

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
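
The cell-by-cell approach can be illustrated without pandas. In this sketch, plain dict records stand in for DataFrame rows and a float NaN stands in for a pandas missing value; `records_to_json` and this version of `make_json_safe` are hypothetical, and the epoch-seconds encoding for naive datetimes is an assumption about the target format.

```python
import calendar
import json
import math
from datetime import date, datetime


def make_json_safe(value):
    """Convert one cell to a JSON-friendly value without any nanosecond conversion."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None  # missing values become null
    if isinstance(value, datetime):
        # Naive datetimes are treated as UTC; plain integer arithmetic, so year 9999 is fine.
        return calendar.timegm(value.utctimetuple())
    if isinstance(value, date):
        return calendar.timegm(value.timetuple())
    return value


def records_to_json(records):
    """Serialize a list of row dicts, converting each cell individually."""
    return json.dumps([{key: make_json_safe(val) for key, val in row.items()} for row in records])
```

A sentinel such as `datetime(9999, 12, 31)` serializes to a large integer here instead of raising, because no step forces the value through a fixed-width nanosecond type.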
…ence

WHERE general_type IN (...) AND a OR b OR c parsed as (... AND a) OR b OR c, letting columns with general_type outside ('A','D','N') into the fingerprint CASE (which has no matching branch), collapsing custom_query to NULL and producing CAST( AS VARCHAR(MAX)) -> SQL Server syntax error 156. Parenthesize the OR group across all 7 flavor templates so every selected column matches a CASE branch.

TG-1102

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
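
The precedence problem is easy to reproduce with SQLite, where `AND` also binds tighter than `OR`. The table, column names, and the flags `a`, `b`, `c` are placeholders for the real template predicates, which are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cols (name TEXT, general_type TEXT, a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO cols VALUES (?, ?, ?, ?, ?)",
    [
        ("order_date", "D", 1, 0, 0),  # allowed type, matches a
        ("payload", "X", 0, 1, 0),     # disallowed type, but matches b
        ("amount", "N", 0, 0, 0),      # allowed type, matches nothing
    ],
)


def names(where: str) -> list:
    """Return the column names selected by the given WHERE clause."""
    return [row[0] for row in conn.execute(f"SELECT name FROM cols WHERE {where} ORDER BY name")]


UNPARENTHESIZED = "general_type IN ('A', 'D', 'N') AND a = 1 OR b = 1 OR c = 1"
PARENTHESIZED = "general_type IN ('A', 'D', 'N') AND (a = 1 OR b = 1 OR c = 1)"
```

`names(UNPARENTHESIZED)` includes `payload` even though its type is outside the allowed list, while `names(PARENTHESIZED)` returns only `order_date`.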
The pattern-anomaly criteria cast SPLIT_PART(top_patterns, '|', N)::NUMERIC. When a SPLIT_PART yields '' (single-pattern top_patterns), ''::NUMERIC raises 'invalid input syntax for type numeric'. Postgres gives no WHERE short-circuit guarantee, so the cast is evaluated on rows the other predicates exclude -- crashing the anomaly screen intermittently. Wrap each SPLIT_PART(top_patterns,...) cast in NULLIF(..., '') across Column_Pattern_Mismatch, Table_Pattern_Mismatch, and Invalid_Zip3_USA. Static metadata; refreshed on upgrade.

TG-1103

Co-Authored-By: Claude <noreply@anthropic.com>
Wrap the top-level render dispatch in app.py (the single per-rerun entry covering page, sidebar, logo, auth, config) so uncaught exceptions are logged with a full traceback and a short reference id to the testgen logger -- landing in app.log, which the in-app Application Logs dialog reads and can download. Render a friendly error message (with the reference and support email) instead of Streamlit's generic page; the sidebar stays available to navigate away. Streamlit rerun/stop signals are BaseException subclasses and pass through uncaught.

TG-1104

Co-Authored-By: Claude <noreply@anthropic.com>
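
The wrapping pattern can be sketched as below. `render_safely`, `RerunSignal`, the 8-character reference format, and the message text are all assumptions; the sketch only shows why catching `Exception` logs real failures while letting `BaseException`-based control-flow signals through.

```python
import logging
import uuid

LOG = logging.getLogger("testgen")


class RerunSignal(BaseException):
    """Hypothetical stand-in for a framework control-flow signal."""


def render_safely(render, show_error):
    """Run render(); on an ordinary exception, log it and show a friendly message.

    Returns the reference id when an error was handled, otherwise None.
    """
    try:
        render()
    except Exception:
        reference = uuid.uuid4().hex[:8]
        # LOG.exception records the full traceback alongside the reference id.
        LOG.exception("UI render error (ref %s)", reference)
        show_error(f"Something went wrong. Please contact support with reference {reference}.")
        return reference
    return None
```

Because `except Exception` does not match `BaseException` subclasses, a signal like `RerunSignal` propagates untouched and is never reported as an error.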
…t PDF

The test issue report's summary table addressed its SPAN and background
TableStyle commands by absolute (col, row) coordinates adapted from the
hygiene report, but the test report has one extra data row (separate
Measured Value and Threshold Value rows where hygiene has a single
Detail row). The resulting off-by-one left the Column Tags value cell
unspanned -- its values were dropped and the label pushed to the far
right -- and the "View on TestGen" link in a narrow left cell instead
of spanning the full row.

Shift the Column Tags value span and the link row/background to their
correct indices so the layout matches the hygiene report.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hygiene and test issue dicts bound for source-data lookups and PDF
reports were routed through the frontend JSON serializer
(make_json_safe / dataframe_to_json_records), which encodes datetimes
as epoch integers. That epoch then leaked into:

- the source-data SQL, where {PROFILE_RUN_DATE} / {TEST_DATE} became
  e.g. CAST('1780...' AS DATE) -- a conversion error on SQL Server (and
  every other flavor);
- the PDF report filename (epoch read by pd.Timestamp as nanoseconds ->
  1970);
- the PDF body dates and the result-history row highlight.

Pass the un-serialized records (NaN -> None) to the query builders and
PDF generators so dates arrive as real datetimes. Using to_dict instead
of to_json also avoids the OverflowError on out-of-range sentinel dates
(it never invokes pandas' nanosecond conversion). Frontend-bound paths
keep the epoch serialization the VanJS components expect.

Also normalize PROFILE_RUN_DATE to a date-only string: Oracle and SAP
HANA templates use TO_DATE(..., 'YYYY-MM-DD'), which rejects a time
component, and the anomaly criteria boundary is date-based
(CURRENT_DATE + INTERVAL '30 year').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…date

profiling_starttime / test_date reach the source-data query builders as DB
timestamp strings that include microseconds (e.g. "2026-06-02 06:54:30.105548").
parse_fuzzy_date parsed the string branch with
datetime.strptime(value, "%Y-%m-%d %H:%M:%S"), which has no fractional-seconds
directive and raised "unconverted data remains: .105548". This surfaced once
PROFILE_RUN_DATE was routed through parse_fuzzy_date, failing the snowflake
functional tests (test_main, test_sampling).

Parse with datetime.fromisoformat, which handles fractional seconds and the
'T'/space separator. Add regression cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
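
The difference between the two parsers is easy to check in isolation. The helper names below are hypothetical and cover only the string branch described in the commit message.

```python
from datetime import datetime


def parse_db_timestamp(value: str) -> datetime:
    """Parse an ISO-style timestamp with or without fractional seconds."""
    return datetime.fromisoformat(value)


def strptime_rejects(value: str) -> bool:
    """Return True when the old fixed-format parse fails on this value."""
    try:
        datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return True
    return False
```

For `"2026-06-02 06:54:30.105548"`, `strptime_rejects` returns `True` while `parse_db_timestamp` yields a datetime with `microsecond == 105548`; both the space and the `T` separator are accepted.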
…terprise'

fix: SQL Server profiling crash and Freshness_Trend generation fixes

See merge request dkinternal/testgen/dataops-testgen!532
@datakitchen-devops datakitchen-devops merged commit 0ded486 into main Jun 3, 2026
2 checks passed
@datakitchen-devops datakitchen-devops deleted the release/5.48.0 branch June 3, 2026 02:29
@github-actions

github-actions Bot commented Jun 3, 2026

Coverage

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 4 | 0 💤 | 0 ❌ | 4 🔥 | 18.153s ⏱️ |

6 participants