[WIP] Add databricks dbconnect init / sync commands #5690
Draft
rugpanov wants to merge 34 commits into
Integration test report
Commit: d787bcb
23 interesting tests: 13 SKIP, 10 RECOVERED
Top 5 slowest tests (at least 2 minutes):
Brainstormed design for porting the dbconnect-init.sh demo into a real CLI subcommand namespace with init + sync commands, a shared phase pipeline, full target resolution, a surgical TOML merge, and a stable --json schema. Co-authored-by: Isaac
Bite-sized, TDD task breakdown (11 tasks) covering the command scaffold, result types, envKey mapping, constraint fetch+cache, surgical TOML merge, target resolution, uv package manager, the phase pipeline, Cobra wiring, acceptance tests, and changelog. Co-authored-by: Isaac
Regenerate the golden from the built binary; the prior hand-written version showed the command Short text instead of the rendered Long help. Co-authored-by: Isaac
- Remove noise doc comments from Error() and Unwrap() (idiomatic for standard interface methods)
- Replace thin NewError doc comment with meaningful info about fmt.Sprintf and nil handling
- Remove YAGNI default case from Mode.String(), use if/return instead
Co-authored-by: Isaac
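A minimal sketch of the shape this cleanup describes. The field layout of `PipelineError`, the `Mode` constant names, and the `NewError` signature are assumptions for illustration; the commit only names the methods.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode distinguishes the two subcommands. The constant names are assumptions.
type Mode int

const (
	ModeInit Mode = iota
	ModeSync
)

// String has only two cases, so if/return replaces a switch with a default.
func (m Mode) String() string {
	if m == ModeInit {
		return "init"
	}
	return "sync"
}

// PipelineError carries a message plus an optional wrapped cause.
// The field layout is hypothetical.
type PipelineError struct {
	Message string
	Cause   error
}

func (e *PipelineError) Error() string { return e.Message }

func (e *PipelineError) Unwrap() error { return e.Cause }

// NewError builds the message with fmt.Sprintf(format, args...).
// A nil cause is allowed; Unwrap then returns nil, so errors.Is on the
// result only ever matches the *PipelineError itself.
func NewError(cause error, format string, args ...any) *PipelineError {
	return &PipelineError{Message: fmt.Sprintf(format, args...), Cause: cause}
}

// errTimeout is a stand-in cause used only by this example.
var errTimeout = errors.New("timeout")

func main() {
	err := NewError(errTimeout, "fetch %s failed", "constraints")
	fmt.Println(err, errors.Is(err, errTimeout), ModeSync)
}
```

Because `Error()` and `Unwrap()` satisfy well-known interfaces, doc comments on them add little; the comment on the constructor is where the formatting and nil behavior are worth stating.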
- Replace double TrimPrefix calls with simpler strings.TrimPrefix(strings.ToLower(version), "v")
- Hoist pythonVersionRe to package-level var to avoid repeated compilation
- Remove noise comment that restated the code
Co-authored-by: Isaac
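The two simplifications above might look roughly like this. The function names and the regular expression are assumptions; only `pythonVersionRe` and the `TrimPrefix`/`ToLower` expression come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pythonVersionRe is compiled once at package load instead of on every call.
// The pattern is an assumption made for this sketch.
var pythonVersionRe = regexp.MustCompile(`^\d+\.\d+$`)

// normalizeVersion lowercases first, then strips one leading "v",
// so "V4" and "v4" both become "4" with a single TrimPrefix call.
func normalizeVersion(version string) string {
	return strings.TrimPrefix(strings.ToLower(version), "v")
}

// isPythonMinor reports whether s looks like "major.minor", e.g. "3.12".
func isPythonMinor(s string) bool {
	return pythonVersionRe.MatchString(s)
}

func main() {
	for _, in := range []string{"V4", "v4", "4"} {
		fmt.Printf("%q -> %q\n", in, normalizeVersion(in))
	}
	fmt.Println(isPythonMinor("3.12"), isPythonMinor("3.x"))
}
```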
The PythonMinorFromRequires call happens after a successful network fetch, so wrapping its error with ErrConstraintFetchFailed was a misattribution. Use ErrValidationFailed instead, which correctly signals that the constraint file content failed to parse rather than that the fetch itself failed. Co-authored-by: Isaac
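A sketch of the attribution fix, assuming both names are sentinel errors checked with `errors.Is`. The `fetch` and `parsePythonMinor` helpers are hypothetical stand-ins, not the real functions.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrConstraintFetchFailed = errors.New("constraint fetch failed")
	ErrValidationFailed      = errors.New("validation failed")
)

// fetch is a hypothetical stand-in for the network call.
func fetch(networkUp bool) (string, error) {
	if !networkUp {
		return "", errors.New("connection refused")
	}
	return "requires-python: unknown", nil
}

// parsePythonMinor is a hypothetical stand-in for PythonMinorFromRequires;
// it always fails here so that the parse branch is exercised.
func parsePythonMinor(content string) (string, error) {
	return "", fmt.Errorf("no Python version in %q", content)
}

// loadConstraints wraps each failure with the sentinel for the step that
// actually failed: the fetch, or the parse that runs after a good fetch.
func loadConstraints(networkUp bool) error {
	content, err := fetch(networkUp)
	if err != nil {
		return fmt.Errorf("%w: %v", ErrConstraintFetchFailed, err)
	}
	if _, err := parsePythonMinor(content); err != nil {
		return fmt.Errorf("%w: %v", ErrValidationFailed, err)
	}
	return nil
}

// kind maps an error to the sentinel it wraps.
func kind(err error) string {
	switch {
	case errors.Is(err, ErrConstraintFetchFailed):
		return "fetch"
	case errors.Is(err, ErrValidationFailed):
		return "validation"
	}
	return "other"
}

func main() {
	fmt.Println(kind(loadConstraints(false))) // network down
	fmt.Println(kind(loadConstraints(true)))  // fetched, but content does not parse
}
```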
…etch Co-authored-by: Isaac
- Add json tags to PipelineError (code/message/-) so --output json emits the documented contract instead of Go field names
- Change uv version probe from "version" subcommand to --version flag to avoid project-scoped failure when no pyproject.toml exists in cwd
- Guard renderResult against nil res: synthesize a minimal Result with error populated so JSON mode always emits a structured object
- Use i+1 for 1-based phase numbering in text output
- Add comment explaining why ValidateTargetFlags is kept alongside MarkFlagsMutuallyExclusive
Co-authored-by: Isaac
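The JSON-related fixes can be illustrated with a small sketch. The `code`/`message`/`-` tags come from the commit message; the reduced `Result` type, the `renderJSON` helper, the example error code, and the phase names are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// PipelineError uses the tags named in the commit: code, message, and "-"
// to keep the wrapped cause out of the output.
type PipelineError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
	Err     error  `json:"-"`
}

// Result is reduced to the one field this example needs.
type Result struct {
	Error *PipelineError `json:"error,omitempty"`
}

// renderJSON never emits a bare null: when res is nil it synthesizes a
// minimal Result that carries the error.
func renderJSON(res *Result, perr *PipelineError) (string, error) {
	if res == nil {
		res = &Result{Error: perr}
	}
	b, err := json.Marshal(res)
	return string(b), err
}

func main() {
	perr := &PipelineError{
		Code:    "E_EXAMPLE", // hypothetical code
		Message: "constraint fetch failed",
		Err:     errors.New("dial tcp: timeout"),
	}
	out, _ := renderJSON(nil, perr)
	fmt.Println(out)

	// Text output numbers phases from 1, hence i+1.
	phases := []string{"first phase", "second phase"} // hypothetical names
	for i, name := range phases {
		fmt.Printf("[%d/%d] %s\n", i+1, len(phases), name)
	}
}
```

Without the tags, `encoding/json` would emit the Go field names (`Code`, `Message`) and attempt to serialize `Err` as well, which is what breaks a documented lowercase contract.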
Add acceptance tests for the dbconnect init/sync feature:
- flag-conflict: verifies Cobra mutual exclusion of --cluster/--serverless/--job
- no-target: verifies error when no compute target is selected
- serverless-check: verifies --serverless v4 --check with stubbed constraint server
- serverless-json: verifies --output json with full Result struct
- cluster-unsupported: verifies constraint fetch failure for unsupported DBR version
- help/test.toml: opts out of bundle-engine matrix for the help case
Each case stubs the test server via [[Server]] in test.toml and uses DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE=$DATABRICKS_HOST to point the constraint fetch at the local test server.
Co-authored-by: Grigory Panov
no-target and cluster-unsupported tests use commands that must fail; musterr asserts this and fails the test if the command unexpectedly succeeds. errcode is for tolerated failures only. Co-authored-by: Isaac
Co-authored-by: Isaac
Also standardize the serverless-json acceptance uv-version replacement regex to the unwrapped form used by the sibling cases. Co-authored-by: Isaac
…d cluster-unsupported scaffolding Co-authored-by: Isaac
…rors Co-authored-by: Isaac
Co-authored-by: Isaac
These are internal process artifacts and don't belong in the databricks/cli tree. Co-authored-by: Isaac
…taxonomy, camelCase JSON
Aligns `databricks dbconnect` with the reconciled cli-spec:
- Collapse `init`+`sync` into a single `dbconnect sync` that auto-detects
greenfield (no pyproject.toml) vs. merge; command path lives in one constant.
- Add `--constraints-only` (Python + constraints, no databricks-connect pin;
still builds .venv, omits dbconnectVersion, skips the DB Connect assertion).
- Rewrite the `--output json` contract to the camelCase schema: schemaVersion,
command, ok, mode, dryRun, target, resolved, greenfield, plan, phases[] (all
six phases with pending), warnings[], error{code,failurePhase,diskMutated}.
- Rename error codes to the E_* set; report failurePhase at the phase that
detects the failure so it always matches the errored phase in phases[].
- Detect non-uv managers (conda/pip) in preflight and exit cleanly with
E_MANAGER_UNSUPPORTED; a plain PEP 621 pyproject.toml resolves to uv.
- Classify a 404 for a resolved env key as E_ENV_UNSUPPORTED (latest-LTS hint,
no cache fallback) vs. transport failure as E_FETCH; add a writable preflight.
- Default the constraint repo to rugpanov/databricks-environments.
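A rough sketch of the camelCase contract listed above, written as Go structs. The key names come from the commit message; the Go types, the `target`/`resolved`/`plan` objects (omitted here because their shapes are not given), the phase names, the `"ok"`/`"error"` status values, and the element type of `warnings` are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Phase struct {
	Name   string `json:"name"`
	Status string `json:"status"`
}

type ErrorInfo struct {
	Code         string `json:"code"`
	FailurePhase string `json:"failurePhase"`
	DiskMutated  bool   `json:"diskMutated"`
}

// Result omits target, resolved, and plan; their shapes are not specified here.
type Result struct {
	SchemaVersion int        `json:"schemaVersion"`
	Command       string     `json:"command"`
	OK            bool       `json:"ok"`
	Mode          string     `json:"mode"`
	DryRun        bool       `json:"dryRun"`
	Greenfield    bool       `json:"greenfield"`
	Phases        []Phase    `json:"phases"`
	Warnings      []string   `json:"warnings"`
	Error         *ErrorInfo `json:"error,omitempty"`
}

// phaseOrder holds six hypothetical phase names.
var phaseOrder = []string{"preflight", "resolve", "fetch", "plan", "apply", "validate"}

// failedResult lists all six phases: earlier ones done, the failing one
// errored, later ones pending. failurePhase is taken from the same slice
// entry, so it always matches the errored phase in phases[].
func failedResult(failIdx int, code string) Result {
	r := Result{SchemaVersion: 1, Command: "dbconnect sync", Mode: "sync", Warnings: []string{}}
	for i, name := range phaseOrder {
		status := "pending"
		switch {
		case i < failIdx:
			status = "ok"
		case i == failIdx:
			status = "error"
		}
		r.Phases = append(r.Phases, Phase{Name: name, Status: status})
	}
	r.Error = &ErrorInfo{Code: code, FailurePhase: phaseOrder[failIdx]}
	return r
}

func main() {
	b, _ := json.MarshalIndent(failedResult(2, "E_FETCH"), "", "  ")
	fmt.Println(string(b))
}
```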
Fixes two bugs the real `uv sync`/validate path exposed (both masked by the
fake package manager and --check in tests):
- uvManager.Validate no longer requires databricks-connect to be importable
(constraints-only left it uninstalled), so validate stops failing after it
has already provisioned the venv.
- Greenfield render now emits project.version, which uv requires for a
[project] table; without it every real greenfield `uv sync` failed.
Co-authored-by: Isaac
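One way to read the E_ENV_UNSUPPORTED / E_FETCH split from this commit, as a pure function over the HTTP status and transport error. The `classify` helper and `fetchOutcome` type are hypothetical, and two behaviors are assumptions rather than statements from the commit: that other non-200 statuses count as E_FETCH, and that E_FETCH is the case where the offline cache may be used.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// fetchOutcome is a hypothetical summary of how a constraint fetch ended.
type fetchOutcome struct {
	Code          string // empty on success
	CacheFallback bool   // whether falling back to a cached copy is allowed
}

// errDial is a stand-in transport error used only by this example.
var errDial = errors.New("dial tcp: connection refused")

// classify separates "the server answered and this env key does not exist"
// (404) from "the server could not be reached or misbehaved".
func classify(status int, transportErr error) fetchOutcome {
	switch {
	case transportErr != nil:
		return fetchOutcome{Code: "E_FETCH", CacheFallback: true}
	case status == http.StatusNotFound:
		// A definitive answer: a stale cache must not hide it.
		return fetchOutcome{Code: "E_ENV_UNSUPPORTED", CacheFallback: false}
	case status != http.StatusOK:
		// Assumption: other non-200 responses are treated like transport failures.
		return fetchOutcome{Code: "E_FETCH", CacheFallback: true}
	}
	return fetchOutcome{}
}

func main() {
	fmt.Printf("%+v\n", classify(http.StatusNotFound, nil))
	fmt.Printf("%+v\n", classify(0, errDial))
	fmt.Printf("%+v\n", classify(http.StatusOK, nil))
}
```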
8d184bf to d787bcb
rugpanov added a commit that referenced this pull request on Jul 3, 2026
First of a stacked series splitting the databricks dbconnect feature (umbrella branch dbconnect-init-sync / PR #5690) into small, single-concern PRs. Each layer is independently reviewable and adds no user-facing surface until the final PR wires the command in.
This PR is the foundation the rest of the stack builds on:
- result.go: the result types and the --json / E_* error contract that every phase reports through (Result, PipelineError, ErrorCode, PhaseName, PhaseStatus, Mode, TargetInfo, ResolvedInfo, Plan, Warning).
- envkey.go: mapping a compute target to an environment key (EnvKeyForServerless, EnvKeyForSparkVersion, NormalizeServerless) and parsing the Python minor from a requires-python specifier.
Nothing imports this package yet, so the CLI is unchanged. The unexported filesystem/artifact constants and the canonical phase-order slice live with the pipeline that consumes them (a later PR in the stack), keeping this layer to just the contract types.
Co-authored-by: Isaac
Draft / do-not-merge: opened for early review; not ready to merge.
Changes
Adds a new `databricks dbconnect` command namespace with two subcommands:
- `databricks dbconnect init`: create a fresh `pyproject.toml` and provision a matched `.venv`.
- `databricks dbconnect sync`: merge managed dependencies into an existing `pyproject.toml` and re-provision.
From the selected Databricks compute target (serverless / cluster / job), the command derives and provisions a local Python environment matched to the runtime: the right Python version, the right `databricks-connect` pin, and dependency constraints so local resolution matches the Databricks runtime. It runs a phase pipeline: discover `uv` → resolve target → fetch the per-environment constraints (configurable base URL, with an offline cache) → plan → apply → ensure Python → `uv sync` → seed pip → validate.
Implementation notes:
- …`cmd/dbconnect/`) over a unit-testable pipeline (`libs/dbconnect/`), with a `PackageManager` interface seam (uv implemented; pip/conda can follow).
- `pyproject.toml` merge that touches only three managed regions and preserves the user's comments, ordering, and their own `[tool.uv]` keys; idempotent.
- …`GetByClusterId` → DBR → envKey, serverless, and job compute) with three-state messaging.
- `~/.config/pip/pip.conf` `index-url` → `UV_INDEX_URL` (uv ignores `pip.conf`).
- `--check` dry-run prints the plan + diff and changes nothing; `--output json` emits a stable structured schema, and `--debug` adds diagnostic logging for troubleshooting on machines we can't access.
Why
Promotes a proven proof-of-concept shell script into a real CLI command so the VS Code extension (and users directly) can set up a local environment matched to their compute, instead of guessing Python and `databricks-connect` versions. Doing the version/constraint resolution from the compute target avoids local/remote drift.
Tests
- `libs/dbconnect/`: merge edge cases (single/multi-line arrays, quote styles, CRLF, idempotency, preserving user `[tool.uv]` keys), envKey mapping + Python-version parsing, target resolution (precedence + three-state), constraint fetch with offline-cache fallback, and pipeline orchestration incl. `--check` gating and validation.
- `acceptance/dbconnect/`: serverless `--check`, `--output json` shape, no-target error, cluster-unsupported, flag conflict, and JSON-mode error exit code.
- …`.venv` with `databricks-connect` 17.x and the injected constraints.
Out of scope for this first cut: pip/conda package managers (interface only) and the nearest-supported envKey fallback.
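To make the phase pipeline and `--check` gating from the Changes section concrete, here is a minimal sketch. The phase names follow the description; the `phase` type, `runPipeline`, `demoPhases`, and the choice of which phases are skipped under `--check` (everything from apply onward) are assumptions, not the actual `libs/dbconnect/` code.

```go
package main

import (
	"errors"
	"fmt"
)

// phase is a hypothetical unit of work in the pipeline.
type phase struct {
	name        string
	skipOnCheck bool // true for phases that would change files on disk
	run         func() error
}

// runPipeline runs phases in order and stops at the first failure.
// With check set, phases marked skipOnCheck are not executed at all.
func runPipeline(phases []phase, check bool) ([]string, error) {
	var ran []string
	for _, p := range phases {
		if check && p.skipOnCheck {
			continue
		}
		if err := p.run(); err != nil {
			return ran, fmt.Errorf("phase %q: %w", p.name, err)
		}
		ran = append(ran, p.name)
	}
	return ran, nil
}

// demoPhases builds the nine phases from the description; the phase named
// failAt (if any) returns a simulated error.
func demoPhases(failAt string) []phase {
	names := []string{
		"discover uv", "resolve target", "fetch constraints", "plan",
		"apply", "ensure Python", "uv sync", "seed pip", "validate",
	}
	phases := make([]phase, 0, len(names))
	for i, n := range names {
		name := n
		phases = append(phases, phase{
			name:        name,
			skipOnCheck: i >= 4, // assumption: apply and later are gated
			run: func() error {
				if name == failAt {
					return errors.New("simulated failure")
				}
				return nil
			},
		})
	}
	return phases
}

func main() {
	ran, err := runPipeline(demoPhases(""), true)
	fmt.Println(ran, err)

	ran, err = runPipeline(demoPhases("uv sync"), false)
	fmt.Println(ran, err)
}
```

Keeping the gating decision in the runner rather than in each phase means a dry run cannot mutate disk by accident when a new phase is added, as long as the new phase is flagged correctly.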
This pull request and its description were written by Isaac.