Commit a83c94b

Dev-X25874sd-db authored
fix: raise DbtRuntimeError when job run terminates with non-success result state (#1477)
Resolves #

### Description

While reading through `JobRunsApi` I noticed that `_handle_terminal_state` has a gap in its error handling. When `poll_for_completion` sees a terminal lifecycle state and calls into this method, two cases are covered:

- a `CANCELED` result state gets a `DbtRuntimeError`;
- any non-`TERMINATED` lifecycle state (e.g. `SKIPPED`, `INTERNAL_ERROR`) goes through `_handle_non_terminated_failure`.

What is missing is the case where `life_cycle_state` is `TERMINATED` but `result_state` is something like `FAILED`, `TIMEDOUT`, `UPSTREAM_FAILED`, or `UPSTREAM_CANCELED`. In that case the method just falls off the end and returns `None`, so dbt treats the Python model run as successful even though the notebook actually failed on Databricks.

This is the most common production failure path: a notebook raises an unhandled exception, Databricks marks the run `TERMINATED / FAILED`, and the dbt run finishes green with no indication that anything went wrong.

The fix adds an `elif` after the existing `life_cycle_state != "TERMINATED"` block. If we are in `TERMINATED` and `result_state` is anything other than `None`, `SUCCESS`, or `SUCCESS_WITH_FAILURES`, we raise `DbtRuntimeError` with the result state and Databricks state message so the failure is visible.

Added `test_poll_for_completion__failed` to the existing `tests/unit/api_client/test_job_runs_api.py` test class, following the same pattern as `test_poll_for_completion__cancelled` directly above it. The test sets `life_cycle_state = TERMINATED` and `result_state = FAILED`, and asserts that a `DbtRuntimeError` containing `result_state FAILED` is raised. On the original code the call returns silently and the test fails; after the fix it passes.

### Checklist

- [ ] I have run this code in development and it appears to resolve the stated issue
- [ ] This PR includes tests, or tests are not required/relevant for this PR
- [ ] I have updated the `CHANGELOG.md` and added information about my change to the "dbt-databricks next" section.

---------

Co-authored-by: Shubham Dhal <shubham.dhal@databricks.com>
1 parent 73c1320 commit a83c94b

3 files changed

Lines changed: 25 additions & 0 deletions

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@
 
 - Add catalogs.yml v2 support (requires `use_catalogs_v2: true` in dbt-core) ([1440](https://github.com/databricks/dbt-databricks/pull/1440))
 
+### Fixes
+- Raise a `DbtRuntimeError` when a Python model job run terminates with a non-success `result_state` (e.g. `FAILED`/`TIMEDOUT`) instead of returning silently ([#1477](https://github.com/databricks/dbt-databricks/pull/1477))
+
 ### Under the Hood
 
 - Add a functional test for incremental column-mask removal: dropping a `column_mask` from a model with an existing incremental relation issues `ALTER COLUMN ... DROP MASK` and leaves the column unmasked (test-only, no runtime impact). ([#1514](https://github.com/databricks/dbt-databricks/pull/1514))

dbt/adapters/databricks/api_client.py

Lines changed: 5 additions & 0 deletions
@@ -617,6 +617,11 @@ def _handle_terminal_state(self, run: Run) -> None:
             self._handle_non_terminated_failure(
                 run, life_cycle_state, state_message, result_state or ""
             )
+        elif result_state not in (None, "SUCCESS", "SUCCESS_WITH_FAILURES"):
+            raise DbtRuntimeError(
+                f"Python model run ended in result_state {result_state}"
+                f" (run_id: {run.run_id})\nState message: {state_message}"
+            )
 
     def cancel(self, run_id: str) -> None:
         logger.debug(f"Cancelling run id {run_id}")
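The decision added in this hunk can be sketched in isolation. This is a minimal, self-contained illustration and not the adapter's actual code: `RunFailedError` and `check_terminal_state` are hypothetical stand-ins for `DbtRuntimeError` and `_handle_terminal_state`, states are plain strings, the wording of the `CANCELED` message is assumed, and the non-`TERMINATED` branch is simplified to a direct raise where the real method delegates to `_handle_non_terminated_failure`.

```python
from typing import Optional

# Result states treated as success, taken from the condition in the diff above.
SUCCESS_RESULT_STATES = (None, "SUCCESS", "SUCCESS_WITH_FAILURES")


class RunFailedError(Exception):
    """Hypothetical stand-in for DbtRuntimeError."""


def check_terminal_state(
    run_id: int,
    life_cycle_state: str,
    result_state: Optional[str],
    state_message: str = "",
) -> None:
    """Raise if a run that reached a terminal lifecycle state did not succeed."""
    if result_state == "CANCELED":
        # Assumed wording; the commit only says this case raises an error.
        raise RunFailedError(f"Python model run was canceled (run_id: {run_id})")
    if life_cycle_state != "TERMINATED":
        # Simplified: the real code calls _handle_non_terminated_failure here.
        raise RunFailedError(
            f"Python model run ended in state {life_cycle_state} (run_id: {run_id})"
        )
    elif result_state not in SUCCESS_RESULT_STATES:
        # The branch added by this commit: TERMINATED with a failing result.
        raise RunFailedError(
            f"Python model run ended in result_state {result_state}"
            f" (run_id: {run_id})\nState message: {state_message}"
        )
```

Under these assumptions, `check_terminal_state(123, "TERMINATED", "SUCCESS")` returns without error, while `check_terminal_state(123, "TERMINATED", "TIMEDOUT")` raises. Before the fix, the second call would have fallen through and returned `None`.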

tests/unit/api_client/test_job_runs_api.py

Lines changed: 17 additions & 0 deletions
@@ -203,3 +203,20 @@ def test_poll_for_completion__success(self, _, api, workspace_client):
         api.poll_for_completion("123")
 
         workspace_client.jobs.get_run.assert_called_with(run_id=123)
+
+    @freezegun.freeze_time("2020-01-01")
+    @patch("time.sleep")
+    def test_poll_for_completion__failed(self, _, api, workspace_client):
+        mock_run = Mock(spec=Run)
+        mock_state = Mock(spec=RunState)
+        mock_state.life_cycle_state = RunLifeCycleState.TERMINATED
+        mock_state.result_state = RunResultState.FAILED
+        mock_state.state_message = "notebook raised exception"
+        mock_run.state = mock_state
+        mock_run.run_id = 123
+        workspace_client.jobs.get_run.return_value = mock_run
+
+        with pytest.raises(DbtRuntimeError) as exc:
+            api.poll_for_completion("123")
+
+        assert "Python model run ended in result_state FAILED" in str(exc.value)
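To illustrate how a mocked client drives this failure path through a polling loop, here is a hedged, dependency-free sketch. `poll_until_terminal` is a hypothetical simplification of `poll_for_completion` (the real method's loop, timeout, and sleep handling are not shown in this commit), the set of terminal lifecycle states is an assumption, states are plain strings, and `RuntimeError` stands in for `DbtRuntimeError`. The mocked client returns a `RUNNING` run first and a `TERMINATED / FAILED` run second.

```python
from unittest.mock import Mock

# Assumed set of lifecycle states at which polling stops.
TERMINAL_LIFE_CYCLE_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def make_run(life_cycle_state, result_state=None, message=""):
    """Build a mock run exposing the attributes used in the test above."""
    run = Mock()
    run.run_id = 123
    run.state.life_cycle_state = life_cycle_state
    run.state.result_state = result_state
    run.state.state_message = message
    return run


def poll_until_terminal(client, run_id, max_polls=10):
    """Hypothetical poll loop: fetch the run until it reaches a terminal state."""
    for _ in range(max_polls):
        run = client.jobs.get_run(run_id=run_id)
        if run.state.life_cycle_state in TERMINAL_LIFE_CYCLE_STATES:
            break
    else:
        raise TimeoutError(f"run {run_id} did not finish in {max_polls} polls")
    state = run.state
    if state.life_cycle_state != "TERMINATED":
        # Simplified stand-in for the non-TERMINATED failure handling.
        raise RuntimeError(
            f"Python model run ended in state {state.life_cycle_state}"
            f" (run_id: {run.run_id})"
        )
    if state.result_state not in (None, "SUCCESS", "SUCCESS_WITH_FAILURES"):
        # RuntimeError stands in for DbtRuntimeError.
        raise RuntimeError(
            f"Python model run ended in result_state {state.result_state}"
            f" (run_id: {run.run_id})\nState message: {state.state_message}"
        )


# First poll sees a running job, second poll sees the failed terminal state.
client = Mock()
client.jobs.get_run.side_effect = [
    make_run("RUNNING"),
    make_run("TERMINATED", "FAILED", "notebook raised exception"),
]

try:
    poll_until_terminal(client, 123)
    error = None
except RuntimeError as exc:
    error = str(exc)
```

Using `side_effect` with a list makes each `get_run` call return the next run in sequence, so the sketch exercises both the "keep polling" and the "terminal failure" steps. The real test instead uses `return_value` with a single terminal run.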
