Skip to content

Fix: forward aggregate report_type to TriggerReconAggregateService#2401

Closed
moomindani wants to merge 3 commits intodatabrickslabs:mainfrom
moomindani:fix/recon-aggregate-dispatch-in-trigger-recon
Closed

Fix: forward aggregate report_type to TriggerReconAggregateService#2401
moomindani wants to merge 3 commits intodatabrickslabs:mainfrom
moomindani:fix/recon-aggregate-dispatch-in-trigger-recon

Conversation

@moomindani
Copy link
Copy Markdown
Contributor

Changes

What does this PR do?

Forwards report_type=="aggregate" from TriggerReconService.trigger_recon to TriggerReconAggregateService.trigger_recon_aggregates so programmatic callers that pass an aggregate report type get the actual aggregate comparison instead of a silent no-op.

Relevant implementation details

TriggerReconService.trigger_recon validates report_type against _RECON_REPORT_TYPES = {"schema","data","row","all","aggregate"}, so "aggregate" is accepted at entry. However, _do_recon_one only branches on {"schema","all"} and {"data","row","all"}. The aggregate case falls through, no comparison runs, and recon_capture.start() records the run with status=true. Callers see a successful reconciliation even when source and target differ.

The CLI dispatch in execute.py already routes AGG_RECONCILE_OPERATION_NAME to TriggerReconAggregateService.trigger_recon_aggregates, but programmatic users of the public Reconcile API (notebooks, integrations) hit the silent no-op without this change.

This PR adds an early dispatch at the top of trigger_recon. The import is deferred because trigger_recon_aggregate_service already imports trigger_recon_service and a top-level import would be circular.

Caveats/things to watch out for when reviewing:

  • Deferred import inside the function — done to avoid the circular import; behavior identical to execute.py's existing dispatch
  • All existing report types (schema, data, row, all) keep the original path

Linked issues

None

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: programmatic API of TriggerReconService.trigger_recon

Tests

  • manually tested
  • added unit tests
  • added integration tests

Manual end-to-end on Databricks Serverless with two tables of different row counts:

  • Before: aggregate=None, status=true, no rows in recon.aggregate_*
  • After: aggregate=False, mismatch=1, rows present in recon.aggregate_metrics

Unit tests added in tests/unit/reconcile/test_verify_reconcile.py:

  • test_trigger_recon_dispatches_aggregate_to_aggregate_service: forwarding works and the data-path setup is skipped
  • test_trigger_recon_does_not_dispatch_non_aggregate: schema / data / row / all keep the original path

moomindani added 2 commits May 1, 2026 11:27
TriggerReconService.trigger_recon validates that report_type is in
{"schema","data","row","all","aggregate"} via _RECON_REPORT_TYPES, so
"aggregate" is accepted at the entry point. However, _do_recon_one only
branches on {"schema","all"} and {"data","row","all"}; the "aggregate"
case falls through, no comparison is performed, and recon_capture.start()
records the run with status=true. Callers see a successful reconciliation
even when source and target differ.

The aggregate path lives in TriggerReconAggregateService.
trigger_recon_aggregates. CLI dispatch in execute.py already routes the
AGG_RECONCILE_OPERATION_NAME to that service, but programmatic callers
of TriggerReconService.trigger_recon (e.g. notebooks built around the
public Reconcile API) hit the silent no-op.

Forward report_type=="aggregate" to TriggerReconAggregateService at the
top of trigger_recon. The import is deferred because
trigger_recon_aggregate_service already imports trigger_recon_service.

Co-authored-by: Isaac
Cover the routing introduced in this branch:
- report_type='aggregate' is forwarded to
  TriggerReconAggregateService.trigger_recon_aggregates without going
  through create_recon_dependencies of the data path
- The aggregate-specific kwargs (table_recon, reconcile_config,
  local_test_run) are passed through unchanged
- All other report_types ('schema', 'data', 'row', 'all') keep the
  existing path and never call the aggregate service

Without the dispatch, the existing _do_recon_one short-circuits silently
for report_type='aggregate' and the run is recorded as successful. The
first test reproduces the original behavior (no aggregate forward) and
fails on origin/main.

Co-authored-by: Isaac
@moomindani moomindani requested a review from a team as a code owner May 1, 2026 04:42
Copy link
Copy Markdown
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? notebook/python users should use TriggerReconAggregateService.trigger_recon_aggregates

@moomindani
Copy link
Copy Markdown
Contributor Author

Fair question — TriggerReconAggregateService.trigger_recon_aggregates does exist as a direct entry point, so this isn't about adding new capability. The motivation is consistency / discoverability of the public TriggerReconService.trigger_recon surface.

trigger_recon already accepts report_type in {"schema","data","row","all","aggregate"} (validated against _RECON_REPORT_TYPES), and for data / row / all / schema it does the right thing. aggregate is the only value that's accepted at the entrypoint but silently no-ops downstream — _do_recon_one only branches on {"schema","all"} and {"data","row","all"}, so the aggregate case falls through and recon_capture.start() records the run with status=true even when source/target differ.

We've actually seen users get caught by this in practice — they followed the same trigger_recon(..., report_type=...) pattern they already use for data/row, passed "aggregate", got back a successful-looking result, and didn't realize no comparison had run. The CLI dispatch in execute.py already routes the aggregate case to TriggerReconAggregateService, so this PR just brings the programmatic API in line with that, keeping a single trigger_recon entry point that behaves uniformly across all accepted report_type values.

Happy to take an alternate fix if you'd prefer — e.g., narrowing _RECON_REPORT_TYPES in trigger_recon to exclude "aggregate" and explicitly documenting that aggregate users must call TriggerReconAggregateService directly. The current shape was chosen to avoid breaking callers that already pass "aggregate".

…ispatch-in-trigger-recon

# Conflicts:
#	src/databricks/labs/lakebridge/reconcile/trigger_recon_service.py
@moomindani
Copy link
Copy Markdown
Contributor Author

Reopened as #2423 on an upstream branch to bypass the fork-PR OIDC restriction on JFrog auth (CI cannot run on fork PRs). All review comments here are preserved as history; further discussion will happen on #2423.

@moomindani moomindani closed this May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants