Release/5.48.0 #68
Merged
…1065) Four bundled fixes for production readiness:
- DNS rebinding: pass explicit transport_security to FastMCP with an allowlist derived from BASE_URL + loopback + TG_MCP_EXTRA_ALLOWED_HOSTS. Fixes 421 Misdirected Request for external clients caused by FastMCP's loopback-only auto-allowlist.
- Security headers: pure-ASGI SecurityHeadersMiddleware injects HSTS (TLS-only by default), X-Content-Type-Options, Referrer-Policy, and CSP frame-ancestors on success and error responses across /api/*, /oauth/*, /.well-known/*, /mcp.
- Body-size cap: pure-ASGI BodySizeLimitMiddleware rejects requests exceeding TG_API_MAX_REQUEST_BODY_BYTES (default 10 MiB) with 413, enforced via Content-Length fast-reject and a streaming guard with a latch to prevent post-disconnect bypass.
- Graceful shutdown: timeout_graceful_shutdown plumbed to uvicorn.run via TG_API_GRACEFUL_SHUTDOWN_TIMEOUT (default 30s).
All settings are env-overridable. Pure-ASGI middlewares were chosen over BaseHTTPMiddleware to preserve MCP's text/event-stream transport.
Tests: 11 cases for the two middlewares (covers latch regression), 7 cases for the transport_security helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… 'enterprise' feat(server): harden API + MCP server for production deployments (TG-1065) See merge request dkinternal/testgen/dataops-testgen!507
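The body-size cap described in the commit above (Content-Length fast-reject plus a streaming guard with a latch) can be sketched as a pure-ASGI middleware. This is an illustrative sketch only, not the TestGen implementation: the class body, the `max_bytes` parameter, and the response text are assumptions; only the 10 MiB default and the 413 status come from the commit message.

```python
MAX_BODY_BYTES = 10 * 1024 * 1024  # 10 MiB, the default named in the commit message


class BodySizeLimitMiddleware:
    """Hypothetical pure-ASGI body-size cap (sketch, not the real class)."""

    def __init__(self, app, max_bytes=MAX_BODY_BYTES):
        self.app = app
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Fast reject: a declared Content-Length above the cap never reaches the app.
        declared = dict(scope.get("headers", [])).get(b"content-length")
        if declared is not None and declared.isdigit() and int(declared) > self.max_bytes:
            await self._reject(send)
            return

        seen = 0
        tripped = False  # latch: once set, the app only ever sees a disconnect
        response_started = False

        async def guarded_receive():
            nonlocal seen, tripped
            if tripped:
                return {"type": "http.disconnect"}
            message = await receive()
            if message["type"] == "http.request":
                seen += len(message.get("body", b""))
                if seen > self.max_bytes:
                    tripped = True
                    return {"type": "http.disconnect"}
            return message

        async def guarded_send(message):
            nonlocal response_started
            if tripped:
                return  # drop anything the app sends after the cap trips
            if message["type"] == "http.response.start":
                response_started = True
            await send(message)

        await self.app(scope, guarded_receive, guarded_send)
        if tripped and not response_started:
            await self._reject(send)

    @staticmethod
    async def _reject(send):
        await send({"type": "http.response.start", "status": 413,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"Request body too large"})
```

Because the wrapper only intercepts `receive` and `send` and never buffers the response, a streaming response such as text/event-stream passes through chunk by chunk, which is the property the commit cites for preferring pure-ASGI middleware.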
Adds list_profiling_runs, get_profiling_run, and get_test_run. Renames get_recent_test_runs -> list_test_runs (adding status and table_group_id filters) and get_test_result_history -> list_test_result_history (TG-1036).
Pending/queued JEs surface in a dedicated "Pending" section when scoped by suite or table group via the new JobExecution.select_active_by_kwargs helper. The same kwargs-search pattern is added on JobSchedule for the "Next scheduled run" lookup.
select_summary on TestRun and ProfilingRun gains job_execution_id and statuses filters; ProfilingRunSummary now exposes project_code so the by-id tools no longer need a second query. ProfilingRun.select_table_breakdown is the per-table breakdown used by get_profiling_run, written in ORM.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(mcp): add run status & history tools (TG-1050) See merge request dkinternal/testgen/dataops-testgen!509
Add create_test, update_test, validate_custom_test, and bulk_update_tests MCP tools, gated on the edit permission. Consolidate validation onto TestDefinition with editable_fields() and validate() methods, enforcing a whitelist at the MCP boundary so extra_params cannot override identity or internal columns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
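A whitelist enforced at the tool boundary, as this commit describes, might look roughly like the following. The field names and the `reject_non_editable` helper are hypothetical; the real editable set comes from `TestDefinition.editable_fields()`, whose contents are not shown in this PR.

```python
# Hypothetical whitelist; stands in for the result of TestDefinition.editable_fields().
EDITABLE = frozenset({"test_description", "threshold_value", "severity"})


def reject_non_editable(fields: dict, editable: frozenset = EDITABLE) -> dict:
    """Return a copy of ``fields`` or raise if any key is outside the whitelist."""
    rejected = sorted(set(fields) - editable)
    if rejected:
        # Identity or internal columns (e.g. an "id" key) end up here.
        raise ValueError(f"Fields not editable: {', '.join(rejected)}")
    return dict(fields)
```

Rejecting unknown keys outright, instead of silently dropping them, makes an attempt to override an identity column visible to the caller.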
same refactor branch with e2e updates See merge request dkinternal/testgen/dataops-testgen!463
…p-test-definition-crud
New tool to get the deep profile for one column.
refactor(mcp): add get_column_profile_detail tool See merge request dkinternal/testgen/dataops-testgen!508
Address review comments on MR !511:
- create_test takes a single ``fields: dict`` instead of explicit kwargs;
no field can bypass the editable_fields() whitelist.
- editable_fields() gates column_name (column / custom scopes) and
impact_dimension (custom / referential scopes) per UI logic.
- Extract ``validate_custom_query`` to ``testgen/common/custom_test_validation.py``;
used by both UI's validate_test and MCP's validate_custom_test.
- validate_custom_test now wraps user SQL in
``SELECT COUNT(*) FROM (...) ERR_TABLE`` for the count and applies a
flavor-aware ``LIMIT 1`` for the preview row. Matches the test runtime's
wrapping pattern (correctness parity, blocks DDL/DML).
- Output wording uses "rows matching the failure criteria" throughout;
PII footer drops the ``view_pii`` jargon.
- bulk_update_tests uses ``result.rowcount`` instead of materialising
every UUID via ``.returning(id).all()``.
- UI: validate_test calls the shared helper; impact_dimension gate
simplifies to ``test_scope in ('custom', 'referential')``.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
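The count wrapper mentioned in the review notes above can be illustrated with SQLite as a stand-in target database. This is a sketch under assumptions: `wrap_count`, `count_failures`, the `orders` table, and the trailing-semicolon stripping are all hypothetical; only the `SELECT COUNT(*) FROM (...) ERR_TABLE` shape comes from the commit message.

```python
import sqlite3


def wrap_count(user_sql: str) -> str:
    """Wrap user SQL as a subquery so only a row-returning SELECT can parse."""
    inner = user_sql.strip().rstrip(";")  # assumption: tolerate a trailing semicolon
    return f"SELECT COUNT(*) FROM ({inner}) ERR_TABLE"


def count_failures(conn: sqlite3.Connection, user_sql: str) -> int:
    """Number of rows matching the failure criteria of the custom query."""
    return conn.execute(wrap_count(user_sql)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, -5.0), (3, -1.0)])
```

With this setup, `count_failures(conn, "SELECT * FROM orders WHERE amount < 0")` counts the two negative rows, while a statement such as `DROP TABLE orders` fails to parse inside the subquery and raises `sqlite3.OperationalError` before anything is executed.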
…p-test-definition-crud # Conflicts: # testgen/mcp/tools/common.py
- Add ``FlavorService.row_limit_clauses(n)`` returning ``(prefix, suffix)`` SQL fragments for the flavor's row-limiting style (``LIMIT``/``TOP``/``FETCH FIRST``). Replaces three duplicate inline switches.
- ``data_catalog.py`` and ``refresh_data_chars_query.py`` now call the method instead of branching on ``row_limiting_clause``.
- Normalise the access-check projection to literal ``1`` across flavors (was ``*`` for ``TOP``, ``1`` for ``LIMIT``/``FETCH``); add a parametrised test asserting the SQL per flavor.
- Fix a bug in ``validate_custom_query``: ``fetch_from_target_db`` returns ``RowMapping`` (column-name access), not tuples. Alias the count as ``row_count`` and access by name. Unit-test mocks updated to the real return shape.
- Drop the now-redundant ``from __future__ import annotations`` from ``custom_test_validation.py``.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
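A ``(prefix, suffix)`` helper of the kind this commit describes could be sketched as below. The free function, the ``style`` keys, and ``access_check_sql`` are assumptions for illustration; the actual method lives on ``FlavorService`` and its internals are not shown here.

```python
def row_limit_clauses(style: str, n: int) -> tuple[str, str]:
    """Return (prefix, suffix) fragments for a row-limiting style.

    The style keys "top", "limit", and "fetch" are hypothetical labels.
    """
    if style == "top":
        return (f"TOP {n} ", "")
    if style == "limit":
        return ("", f" LIMIT {n}")
    if style == "fetch":
        return ("", f" FETCH FIRST {n} ROWS ONLY")
    raise ValueError(f"Unknown row-limiting style: {style}")


def access_check_sql(style: str, table: str) -> str:
    """Build an access-check query that projects the literal 1 for every style."""
    prefix, suffix = row_limit_clauses(style, 1)
    return f"SELECT {prefix}1 FROM {table}{suffix}"
```

Returning both fragments lets a caller build one template, ``SELECT {prefix}... FROM ...{suffix}``, without branching on the flavor at each call site.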
feat(mcp): add test definition CRUD tools (TG-1054) See merge request dkinternal/testgen/dataops-testgen!511
…rns (TG-1067) Adds three new MCP tools and extends list_column_profiles with predicate filters:
- list_column_profiles: 17 new optional filter args (null/distinct/filled ratios, scores, pii/cde booleans, suggested_data_type enum, scoping enums) + ordering enum
- get_column_frequent_values: top-N values for one column with PII redaction
- get_column_patterns: top character patterns for one string column
- search_columns: cross-scope column-name search with per-project match summary
Folds the ticket's original find_columns_by_profile into list_column_profiles since input scope and output row shape were identical. CDE filter coalesces column- and table-level flags; MANUAL pii_flag is included in the High risk-level filter (matches the dq_score_weight_defaults seed).
Stairstep volume/metric series (e.g. weekly-refreshed tables) collapsed the SARIMAX SE estimate, so every refresh tripped the band as a false positive. When the same table has an active Freshness_Trend monitor, prediction now fits SARIMAX on the value series filtered to fingerprint-change runs and emits a baseline. Execution dual-branches: band check when Freshness fired this run, `<> baseline` during stale periods (catches silent writes that the band check alone would miss). Falls back to a raw-history SARIMAX fit when the filtered fit cannot run. Drops the post-resample SARIMAX minimum to 8 and lifts the suite-level predict_min_lookback to TestThresholdsPrediction.run() so it gates the raw history once for every branch. Surfaces the gated baseline as "Threshold" on the sparkline tooltip alongside the lower/upper bound. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The four API v1 routers each declared `prefix="/api/v1"` at construction. Move the prefix to a single aggregator router exposed by `testgen.api`, so the version prefix lives in one place. Each sub-router now declares only its tags, dependencies, and responses. `server/__init__.py` mounts the aggregator once instead of including four routers individually. OpenAPI paths and tag distribution are unchanged. The unit test that mounted `jobs.router` directly on a `FastAPI()` test app now passes `prefix="/api/v1"` at include time to mirror production wiring; otherwise its `/api/v1/...` requests would 404. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The entity resolver factories in `api/deps.py` each repeated the pattern "if entity and user has permission on its project, return; else raise 404". Extract this into `_check_access`, where the no-info-leakage security intent (not-found and unauthorized both surface as the same 404) is documented once. `resolve_project_code` keeps its own shape — it has no entity lookup, so it doesn't fit the pattern. Move the four function-local model imports (TableGroup, TestSuite, JobExecution, sqlalchemy.select) to module level. No circular import. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
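The "same 404 for not-found and unauthorized" rule from this commit can be sketched without any web framework. Everything here is a stand-in: `NotFound` plays the role of an HTTP 404, and `Entity`, `check_access`, and the `allowed_projects` argument are hypothetical names, not the signature of `_check_access`.

```python
from dataclasses import dataclass
from typing import Optional


class NotFound(Exception):
    """Stand-in for an HTTP 404 response."""


@dataclass
class Entity:
    id: str
    project_code: str


def check_access(entity: Optional[Entity], allowed_projects: set, label: str) -> Entity:
    """Return the entity, or raise the same NotFound for missing and forbidden."""
    if entity is None or entity.project_code not in allowed_projects:
        # Identical error in both cases, so callers cannot probe for existence.
        raise NotFound(f"{label} not found")
    return entity
```

A caller without permission on a project therefore gets the same answer as one asking for an ID that does not exist.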
…dules `args` was never read by the dispatch path — `exec_job` invokes `handler(**je.kwargs)` and ignores args entirely. `JobExecution.submit()` doesn't accept it, and every UI/plugin call site was passing `args=[]`. Remove the column from both tables and the call sites that supplied it. The job_schedules uniqueness constraint included args; the migration resolves the auto-generated PG constraint name dynamically (truncation varies), drops the column, and recreates the constraint as `job_schedules_uniq` over `(project_code, key, kwargs, cron_expr, cron_tz)`. `get_job_arguments` in the schedule dialog returned `tuple[list, dict]`; collapse it to `dict` now that the list half is dead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Move six enums that cross subsystem boundaries into ``testgen/common/enums.py`` so the model, API, MCP, scheduler, and UI layers share a single source of truth:
- ``JobKey``, ``JobSource`` (were in ``api/schemas.py``)
- ``JobStatus`` (was in ``common/models/job_execution.py``)
- ``Disposition``, ``IssueLikelihood``, ``PiiRisk`` (were in ``common/models/hygiene_issue.py``)
Every call site is updated to import from ``common.enums`` directly, with no re-exports. ``api/schemas.py`` and the model files import their own enums back from the new home.
While here, fix ``source="user"`` in the project-settings recalculate trigger. ``user`` was never in the ``JobSource`` enum, was used in exactly one place, and is semantically indistinguishable from ``ui`` (a logged-in user triggering a job from a UI page). Coerce to ``ui`` so the audit label matches every other UI-initiated job. No DB backfill: the API filter for surfaced sources will be tightened separately.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``source`` column on ``job_executions`` is a pure audit label: it records which surface submitted a job. The previous gating filters (``source != 'system'`` in ``api/jobs.py``, ``api/deps.py``, and ``mcp/tools/execution.py``) conflated audit label with visibility rule and were under-inclusive: ``run-monitors`` and ``recalculate-project-scores`` were leaking through to public API listings because they weren't submitted with ``source='system'``.
Introduce ``PUBLIC_JOB_KEYS`` in ``common/models/job_execution.py``: the frozenset of ``JobKey`` values that external consumers may see. Replace the two API source filters with ``job_key.in_(PUBLIC_JOB_KEYS)``. Delete the MCP filter entirely, since each cancel tool pins ``expected_job_key`` to a public kind and the source filter was redundant there.
Tighten the contract so the audit label can never silently take a typo or stale value:
- ``JobExecution.submit(source: JobSource)`` instead of ``str``
- ``JobContext.source: JobSource = JobSource.cli`` instead of the bare ``"CLI"`` default (downstream Mixpanel sites already ``.upper()`` so no telemetry change)
- Add ``system`` to ``JobSource`` so the enum stops omitting a value the codebase already writes from the score-rollup callback
- Migrate every production call site from string-literal ``source=...`` to the matching ``JobSource.<member>``
Behavior change worth noting: the new filter is strictly stronger. ``run-monitors`` and ``recalculate-project-scores`` are now correctly hidden from ``/api/v1/projects/.../jobs`` and per-job lookup. Public kinds (``run_profile``, ``run_tests``, ``run_test_generation``) are unchanged.
No DB backfill: existing rows keep their historical source labels, which are still valid for analytics.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
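The separation of audit label from visibility rule can be sketched with plain enums. The member names and string values below are guesses based on the identifiers quoted in the commit message, ``JobSource`` lists only the members the message names, and ``visible_jobs`` is a hypothetical in-memory stand-in for the ``job_key.in_(PUBLIC_JOB_KEYS)`` query filter.

```python
from enum import Enum


class JobKey(str, Enum):
    # Values are assumptions modelled on the names quoted in the commit message.
    run_profile = "run_profile"
    run_tests = "run_tests"
    run_test_generation = "run_test_generation"
    run_monitors = "run-monitors"
    recalculate_project_scores = "recalculate-project-scores"


class JobSource(str, Enum):
    # Only the members named in the commit message; others are omitted here.
    ui = "ui"
    cli = "cli"
    system = "system"


PUBLIC_JOB_KEYS = frozenset(
    {JobKey.run_profile, JobKey.run_tests, JobKey.run_test_generation}
)


def visible_jobs(jobs: list) -> list:
    """Visibility depends on the job kind only, never on the audit label."""
    return [job for job in jobs if job["job_key"] in PUBLIC_JOB_KEYS]
```

Typing the label as an enum also means a stale value such as ``JobSource("user")`` raises ``ValueError`` at the call site instead of being stored silently.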
Add create_test_note, update_test_note, and delete_test_note. All gated on `edit` permission; update and delete additionally require the caller to be the note's author. Extends `add_note` to return the persisted instance so the new note ID and timestamp can be surfaced without a second roundtrip. Adds `Test note ID` column to `list_test_notes` output to keep the producer→consumer chain workable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(retention): add per-project data retention cleanup See merge request dkinternal/testgen/dataops-testgen!510
fix CI and remove noisy streamlit logs See merge request dkinternal/testgen/dataops-testgen!528
…p-test-definition-notes-crud
- create_test_note: escape note body before rendering via doc.field (MdDoc.escape)
- TestDefinitionNote.update_note: use datetime.now(UTC) to match add_note and the codebase clock convention
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rprise' feat(mcp): test definition note CRUD tools (TG-1086) See merge request dkinternal/testgen/dataops-testgen!527
feat(TG-1006): add periodic feedback widget with Slack notifications See merge request dkinternal/testgen/dataops-testgen!474
- list_notifications
- get_notification
- create_notification
- update_notification
- delete_notification
Entity.save() already flushes the instance, so the explicit get_current_session().flush() after each sched.save() was redundant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(mcp): add CRUD mcp tools for notifications See merge request dkinternal/testgen/dataops-testgen!524
DataFrame.to_json forces datetimes through pandas' nanosecond Timestamp, raising OverflowError on dates outside 1677-2262 (e.g. SQL Server sentinel dates like 9999-12-31), which crashed the profiling/test results and test definitions pages. Add utils.dataframe_to_json_records() that serializes each cell via make_json_safe, harden make_json_safe to map NaT to null, and swap all to_json(date_unit=s) call sites. Includes regression tests. TG-1101 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
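The per-cell approach can be illustrated with the standard library alone. This is a loose sketch, not the TestGen helpers: the function bodies are assumptions, naive datetimes are treated as UTC for the example, and pandas-specific values such as NaT are not covered because pandas is deliberately not imported.

```python
import json
import math
from datetime import datetime, timezone


def make_json_safe(value):
    """Convert one cell to a JSON-friendly value (hypothetical sketch)."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None  # NaN is not valid JSON
    if isinstance(value, datetime):
        # Assumption: naive datetimes are UTC. Plain datetime arithmetic has
        # no nanosecond range limit, so year 9999 converts without overflow.
        aware = value if value.tzinfo else value.replace(tzinfo=timezone.utc)
        return int(aware.timestamp())
    return value


def records_to_json(records: list) -> str:
    """Serialize row dicts cell by cell instead of going through to_json."""
    return json.dumps(
        [{key: make_json_safe(cell) for key, cell in row.items()} for row in records]
    )
```

A sentinel such as `datetime(9999, 12, 31)` then serializes to epoch seconds like any other date.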
…ence
WHERE general_type IN (...) AND a OR b OR c parsed as (... AND a) OR b OR c, letting columns with general_type outside ('A','D','N') into the fingerprint CASE (which has no matching branch), collapsing custom_query to NULL and producing CAST( AS VARCHAR(MAX)) -> SQL Server syntax error 156. Parenthesize the OR group across all 7 flavor templates so every selected column matches a CASE branch.
TG-1102
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
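The precedence issue is easy to reproduce in any SQL engine, since `AND` binds tighter than `OR`. The sketch below uses SQLite with a made-up `cols` table and placeholder flags `a`, `b`, `c` standing in for the real predicates, which are not shown in the commit message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cols (name TEXT, general_type TEXT, a INT, b INT, c INT)")
conn.executemany(
    "INSERT INTO cols VALUES (?, ?, ?, ?, ?)",
    [
        ("amount", "N", 1, 0, 0),   # allowed type, flag set
        ("payload", "X", 0, 1, 0),  # type outside the list, flag set
        ("notes", "A", 0, 0, 0),    # allowed type, no flag
    ],
)

# Parses as (general_type IN (...) AND a = 1) OR b = 1 OR c = 1
UNPARENTHESIZED = (
    "SELECT name FROM cols "
    "WHERE general_type IN ('A','D','N') AND a = 1 OR b = 1 OR c = 1 ORDER BY name"
)
# The type filter now applies to the whole OR group
PARENTHESIZED = (
    "SELECT name FROM cols "
    "WHERE general_type IN ('A','D','N') AND (a = 1 OR b = 1 OR c = 1) ORDER BY name"
)


def names(sql: str) -> list:
    return [row[0] for row in conn.execute(sql)]
```

Here `names(UNPARENTHESIZED)` lets the type-`X` column through, while `names(PARENTHESIZED)` returns only columns whose type is in the list.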
The pattern-anomaly criteria cast SPLIT_PART(top_patterns, '|', N)::NUMERIC. When a SPLIT_PART yields '' (single-pattern top_patterns), ''::NUMERIC raises 'invalid input syntax for type numeric'. Postgres gives no WHERE short-circuit guarantee, so the cast is evaluated on rows the other predicates exclude -- crashing the anomaly screen intermittently. Wrap each SPLIT_PART(top_patterns,...) cast in NULLIF(..., '') across Column_Pattern_Mismatch, Table_Pattern_Mismatch, and Invalid_Zip3_USA. Static metadata; refreshed on upgrade. TG-1103 Co-Authored-By: Claude <noreply@anthropic.com>
Wrap the top-level render dispatch in app.py (the single per-rerun entry covering page, sidebar, logo, auth, config) so uncaught exceptions are logged with a full traceback and a short reference id to the testgen logger -- landing in app.log, which the in-app Application Logs dialog reads and can download. Render a friendly error message (with the reference and support email) instead of Streamlit's generic page; the sidebar stays available to navigate away. Streamlit rerun/stop signals are BaseException subclasses and pass through uncaught. TG-1104 Co-Authored-By: Claude <noreply@anthropic.com>
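The boundary described above relies on catching `Exception` but not `BaseException`. A minimal sketch, with assumptions labeled: `render_safely`, the `show_error` callback, the 8-character reference, and `RerunSignal` (a stand-in for Streamlit's rerun/stop control-flow signals) are all hypothetical names.

```python
import logging
import uuid

LOG = logging.getLogger("testgen")


class RerunSignal(BaseException):
    """Stand-in for a control-flow signal that must not be swallowed."""


def render_safely(render, show_error):
    """Run ``render``; on an uncaught error, log it and show a friendly message.

    Returns the reference id when an error was handled, otherwise None.
    """
    try:
        render()
    except Exception:  # BaseException subclasses pass through untouched
        reference = uuid.uuid4().hex[:8]
        # LOG.exception records the full traceback alongside the reference id.
        LOG.exception("Unhandled UI error (ref %s)", reference)
        show_error(reference)
        return reference
    return None
```

The short reference ties what the user sees on screen to the traceback in the log file, so a support request can quote it.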
…t PDF The test issue report's summary table addressed its SPAN and background TableStyle commands by absolute (col, row) coordinates adapted from the hygiene report, but the test report has one extra data row (separate Measured Value and Threshold Value rows where hygiene has a single Detail row). The resulting off-by-one left the Column Tags value cell unspanned -- its values were dropped and the label pushed to the far right -- and the "View on TestGen" link in a narrow left cell instead of spanning the full row. Shift the Column Tags value span and the link row/background to their correct indices so the layout matches the hygiene report. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hygiene and test issue dicts bound for source-data lookups and PDF
reports were routed through the frontend JSON serializer
(make_json_safe / dataframe_to_json_records), which encodes datetimes
as epoch integers. That epoch then leaked into:
- the source-data SQL, where {PROFILE_RUN_DATE} / {TEST_DATE} became
e.g. CAST('1780...' AS DATE) -- a conversion error on SQL Server (and
every other flavor);
- the PDF report filename (epoch read by pd.Timestamp as nanoseconds ->
1970);
- the PDF body dates and the result-history row highlight.
Pass the un-serialized records (NaN -> None) to the query builders and
PDF generators so dates arrive as real datetimes. Using to_dict instead
of to_json also avoids the OverflowError on out-of-range sentinel dates
(it never invokes pandas' nanosecond conversion). Frontend-bound paths
keep the epoch serialization the VanJS components expect.
Also normalize PROFILE_RUN_DATE to a date-only string: Oracle and SAP
HANA templates use TO_DATE(..., 'YYYY-MM-DD'), which rejects a time
component, and the anomaly criteria boundary is date-based
(CURRENT_DATE + INTERVAL '30 year').
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…date profiling_starttime / test_date reach the source-data query builders as DB timestamp strings that include microseconds (e.g. "2026-06-02 06:54:30.105548"). parse_fuzzy_date parsed the string branch with datetime.strptime(value, "%Y-%m-%d %H:%M:%S"), which has no fractional-seconds directive and raised "unconverted data remains: .105548". This surfaced once PROFILE_RUN_DATE was routed through parse_fuzzy_date, failing the snowflake functional tests (test_main, test_sampling). Parse with datetime.fromisoformat, which handles fractional seconds and the 'T'/space separator. Add regression cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
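The difference between the two parsers is small enough to show directly. The helper names below are illustrative and not the body of `parse_fuzzy_date`; the sample timestamp is the one quoted in the commit message.

```python
from datetime import datetime

RAW = "2026-06-02 06:54:30.105548"  # DB timestamp string with microseconds


def parse_with_strptime(value: str) -> datetime:
    # No fractional-seconds directive, so ".105548" is left unconverted.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


def parse_with_fromisoformat(value: str) -> datetime:
    # Accepts fractional seconds and either a space or a "T" separator.
    return datetime.fromisoformat(value)


def to_date_only(value: str) -> str:
    """Date-only string, e.g. for templates that use TO_DATE(..., 'YYYY-MM-DD')."""
    return parse_with_fromisoformat(value).date().isoformat()
```

`parse_with_strptime(RAW)` raises `ValueError`, whereas `parse_with_fromisoformat(RAW)` keeps the microseconds and `to_date_only(RAW)` drops the time component entirely.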
…terprise' fix: SQL Server profiling crash and Freshness_Trend generation fixes See merge request dkinternal/testgen/dataops-testgen!532
datakitchen-devops approved these changes on Jun 3, 2026
Features
Bug Fixes
Refactors
Documentation