Skip to content

feat(enable): support GitHub-source pipelines via --service-connection#905

Merged
jamesadevine merged 7 commits into
mainfrom
devinejames/enable-github-source
Jun 8, 2026
Merged

feat(enable): support GitHub-source pipelines via --service-connection#905
jamesadevine merged 7 commits into
mainfrom
devinejames/enable-github-source

Conversation

@jamesadevine

Copy link
Copy Markdown
Collaborator

Summary

Lets ado-aw enable register ADO build definitions whose source YAML lives on GitHub. This unblocks bulk-registering the tests/safe-outputs/*.lock.yml smoke pipelines into msazuresphere/AgentPlayground from a githubnext/ado-aw checkout — previously enable hard-coded "type": "TfsGit" and bailed early on non-ADO remotes (src/enable.rs:148-156, 201-211 on main).

Mental model

AdoContext historically conflated two namespaces. This PR keeps AdoContext strictly as the deployment target ({org_url, project}) and introduces a separate source identity type:

Namespace What it identifies Resolved from
Deployment target Where the pipeline runs (ado_org, ado_project) --org/--project (or git remote when source is ADO Git)
Source identity (RepoSource) Where the markdown lives (provider, owner, repo) parse_git_remote(git remote)

For ADO-source checkouts both halves coincide (no behavioural change). For GitHub source they diverge — the source is (Github, githubnext, ado-aw) while the deployment is (msazuresphere, AgentPlayground).

What's new

Provider autodetect — no --repository-type flag. parse_git_remote tries ADO first (dev.azure.com, legacy *.visualstudio.com), then github.com HTTPS / SSH, then bails. The result feeds into a new RepositoryRef enum that branches build_create_body between the existing TfsGit body and a GitHub body of the form:

"repository": {
  "type": "GitHub",
  "name": "owner/repo",
  "url":  "https://github.com/owner/repo.git",
  "defaultBranch": "<--default-branch>",
  "properties": {
    "connectedServiceId": "<resolved GUID>",
    "reportBuildStatus": "true"
  }
}

Two new flags on enable:

  • --service-connection <name-or-guid> — required when source is GitHub; rejected for ADO Git. Friendly names are resolved via GET /_apis/serviceendpoint/endpoints?type=github&endpointNames=<name> (project-scoped) with operator-friendly 0/>1-match errors. A UUID regex short-circuits the API call when a GUID is passed.
  • --repository-name owner/repo — optional override; auto-detected from a GitHub remote.

Lifecycle commands (list/disable/remove/run/status) need no code changes — they route through the lexical match_definitions matcher, which only inspects process.yamlFilename and name and is therefore provider-agnostic. A pinning test (match_definitions_in_works_for_github_source_definition) locks that contract in. They work from a GitHub checkout via explicit --org/--project.

Operator usage

The new tests/safe-outputs/REGISTERED.md manual-handoff checklist captures the full workflow. Short version:

# One-time portal step: create a GitHub service connection in AgentPlayground
#   Project settings → Service connections → New → GitHub
#   Name: ado-aw-github  (or whatever you like)

# Bulk-register all smoke pipelines from a githubnext/ado-aw checkout:
ado-aw enable `
  --org msazuresphere --project AgentPlayground `
  --service-connection ado-aw-github `
  --folder '\smoke' `
  tests/safe-outputs/

# One-time first run per pipeline (ADO platform quirk — scheduled triggers
# don't fire until each definition has had at least one run):
ado-aw run --org msazuresphere --project AgentPlayground tests/safe-outputs/

Out of scope (deliberate, documented in the plan)

  • GitHub Enterprise (github.example.com) — parser matches github.com only.
  • BitBucket Cloud source type.
  • Persisting --org/--project deployment coords to a local config file. Lifecycle commands against GitHub-source pipelines require these explicitly until that lands.
  • Marker emit-path migration for GitHub source. The # ado-aw-metadata: {org, repo} marker stays a pure source-identity token; GitHub-source compilations continue to emit empty org/repo and rely on the existing wildcard fallback. The URL filter in discovery is sufficient for the immediate use case; cross-talk is theoretical and not exercised by AgentPlayground.
  • secrets --all-repos/--source discovery migrationsecrets --definition-ids works for GitHub source already; the discovery-path migration is its own focused follow-up.

Test plan

  • cargo build — clean.
  • cargo clippy --all-targets --all-features — clean.
  • cargo test1785 unit tests + every integration suite, 0 failures.
  • 21 new tests covering:
    • parse_git_remote — ADO HTTPS/SSH/legacy, GitHub HTTPS/SSH, .git suffix, GHES rejection, unrelated-host rejection, missing-repo-half rejection (9 cases).
    • RepoSource::url() — TfsGit + Github shapes (2 cases).
    • is_uuid_like and pick_service_endpoint_id — UUID short-circuit, 0/1/>1-match triage, missing-value-array bail (6 cases).
    • build_create_body GitHub-shape snapshot + repository.id-omission guarantee (2 cases).
    • resolve_source and split_owner_repo_arg — autodetect rule for all 6 (provider × override) combinations and arg parsing (10 cases).
    • match_definitions_in_works_for_github_source_definition — pins provider-agnostic matching.
    • enable_help_describes_github_source_support integration test confirms the new flags advertise correctly.
  • cargo run -- enable --help manually verified.
  • End-to-end manual verification against msazuresphere/AgentPlayground to follow once a GitHub service connection is set up in the project (the one operator step that can't be automated — no REST API).

Lets `ado-aw enable` register ADO build definitions whose source YAML
lives in a GitHub repository — required to bulk-register
`tests/safe-outputs/*.lock.yml` smoke pipelines in the
`msazuresphere/AgentPlayground` ADO project from a `githubnext/ado-aw`
checkout.

Provider is autodetected from `git remote get-url origin`: ADO Git
remotes emit the existing TfsGit create-definition body; `github.com`
remotes emit a GitHub body whose
`repository.properties.connectedServiceId` references a project-level
GitHub service connection resolved from the new `--service-connection`
flag (accepts friendly name or GUID). `--repository-name owner/repo`
overrides the auto-detected source coords for the rare cross-fork case.

Lifecycle commands (`list`/`disable`/`remove`/`run`/`status`) need no
code changes — they use the lexical `match_definitions` matcher which
only inspects `yamlFilename` and `name`. A pinning test
(`match_definitions_in_works_for_github_source_definition`) locks that
contract in.

Out of scope: GitHub Enterprise, BitBucket Cloud, persisting
`--org`/`--project` to a local config, marker emit-path migration for
GitHub source, and `secrets --all-repos`/`--source` discovery
migration. See the PR description for the rationale.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — clean model separation, solid test coverage. Two issues worth addressing before merge.


Findings

🐛 Bugs / Logic Issues

  • src/enable.rs:479 — silent empty-string failure on invariant break

    // Inside `ResolvedSource::Github { .. }` match arm:
    full_name: github_full_name.as_deref().unwrap_or(""),

    github_full_name is always Some(...) here (set two lines above to Some(format!(...)) for the Github case), so unwrap_or("") is safe today. But it would silently produce a malformed POST body with "repository.name": "" if the invariant were broken by a future refactor, instead of failing fast. Prefer .expect("invariant: github_full_name must be Some for Github source") or restructure the loop to capture full_name directly from the resolved match.


⚠️ Suggestions

  • src/ado/mod.rs:1183endpointNames query value encoded with PATH_SEGMENT

    percent_encoding::utf8_percent_encode(trimmed, PATH_SEGMENT),  // query value

    PATH_SEGMENT deliberately leaves & and = unencoded (they're legal sub-delimiters in URI path segments). For a query-string value that's the encoding for path components, so a service connection named foo&bar would shatter the endpointNames=foo&bar parameter into two separate params, and foo=bar would produce endpointNames=foo=bar which some servers reject. In practice ADO likely constrains names to alphanumeric + hyphens, so this is theoretical — but percent_encoding::NON_ALPHANUMERIC (or the query-component set) would be strictly correct here. The ctx.project on the line above it is correctly encoded with PATH_SEGMENT since it is a path segment.

  • src/ado/mod.rs:222-228RepoSource::url() doc comment is aspirational; gap with normalize_repo_url

    The doc says "Used by discovery's CurrentRepo URL filter ... both shapes flow through normalize_repo_url", but RepoSource::url() has no production callers today — only tests use it. More importantly, if it were wired to apply_scope_filter, there's a latent mismatch: RepoSource::url() returns https://github.com/owner/repo (no .git) while build_create_body stores https://github.com/owner/repo.git in the ADO definition, and normalize_repo_url does not strip .git suffixes. The URL comparison would silently return zero matches for every GitHub-source pipeline. Either drop the "used by discovery" claim from the doc (current behavior is intentional wildcard-fallback) or add .git-stripping to normalize_repo_url before connecting the two.


✅ What Looks Good

  • ResolvedSource enum cleanly separates the ADO-deployment and source-identity namespaces — the two-namespace mental model translates well into the type system.
  • is_uuid_like short-circuit on the service-connection API call is a nice touch.
  • Lifecycle commands correctly route through match_definitions_in (YAML-path-only, provider-agnostic) rather than discover_ado_aw_pipelines's URL filter — this is the right design for cross-provider support.
  • build_create_body_github_omits_repo_id test explicitly pins the absence of repository.id, which is a real ADO API constraint that's easy to accidentally break.
  • All 21 new tests are well-named and cover the important edge cases (UUID short-circuit, 0/1/>1 service connection matches, resolve_source rules for all 6 combinations, GHES rejection, SSH + HTTPS forms).

Generated by Rust PR Reviewer for issue #905 · sonnet46 2.4M ·

Three findings from the automated Rust PR review:

1. enable.rs:479 — silent empty-string failure on invariant break.
   Restructure the per-fixture loop to bind `github_full_name` as an
   unconditional `String` (empty for the `AdoGit` variant, populated
   for `Github`) so `RepositoryRef::Github { full_name }` no longer
   has to `unwrap_or("")` on an Option that, by construction, is
   always `Some`. Fails-fast at the type level if a future refactor
   moves the invariant.

2. ado/mod.rs:1183 — endpointNames query value must encode `&` / `=`.
   `PATH_SEGMENT` deliberately leaves both unescaped (they're legal
   sub-delims in URI path segments). For a query-string value that
   would shatter a name like `foo&bar=baz` across multiple params.
   Switch to `percent_encoding::NON_ALPHANUMERIC` for that one
   substitution; project (still a path segment) keeps `PATH_SEGMENT`.
   New unit test pins the contract.

3. ado/mod.rs RepoSource::url() doc + discovery.rs normalize_repo_url
   `.git` suffix mismatch. `RepoSource::url()` returns the bare GitHub
   URL while `build_create_body` POSTs the `.git` form, and the
   discovery URL filter compared them via `normalize_repo_url` which
   did not strip the suffix — so a future wiring would silently miss
   every GitHub-source pipeline. Fix `normalize_repo_url` to strip
   trailing `.git` (after the existing trailing-`/` trim) and update
   `RepoSource::url()`'s doc to drop the "used by discovery" claim
   (still deferred per the PR's out-of-scope list). New
   `apply_scope_filter` test covers the `.git` round-trip in both
   directions.

Verification: `cargo build` + `cargo clippy --all-targets
--all-features` clean. `cargo test` — 1787 unit tests + every
integration suite, 0 failures (2 new tests added).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — clean design, excellent test coverage, solid error handling. A couple of minor issues worth addressing before merge.


Findings

🐛 Bugs / Logic Issues

  • src/ado/mod.rsRepoSource::url() for AdoGit with project: None

    RepoProvider::AdoGit => format!(
        "https://dev.azure.com/{}/{}/_git/{}",
        self.owner,
        self.project.as_deref().unwrap_or(""),  // empty string → double slash
        self.repo,
    ),

    If project is None, the returned URL contains // (e.g. https://dev.azure.com/myorg//_git/myrepo). In practice parse_git_remote always sets project: Some(...) for AdoGit, so this is not reachable via normal usage today. But the type allows it, and the url() doc says "No trailing .git — the discovery normalizer also strips it" without mentioning the project invariant. The defensive fix is simple:

    let project = self.project.as_deref().unwrap_or_default();
    format!("https://dev.azure.com/{project}/{}/_git/{}", self.owner, self.repo)

    ...but ideally the AdoGit arm of the type would structurally enforce project: String (non-optional). If a breaking change to RepoSource is too wide for this PR, at minimum add a debug_assert!(!project.is_empty()).

⚠️ Suggestions

  • src/enable.rsresolve_ado_context error message is misleading for GitHub-source users

    When a GitHub remote is detected but --org/--project are omitted, resolve_ado_context falls through to:

    Could not determine ADO context: no ADO git remote found and --org/--project not both provided.
    When using --definition-ids outside an ADO repo, both --org and --project are required.
    

    The --definition-ids mention is from the lifecycle-command help path and makes no sense in the enable context — it will confuse operators. This is a pre-existing message, but it's now surfaced in a common new code path. A minimal improvement before merging would be a follow-up issue. Slightly higher priority than other suggestions since this is the first-time-user failure mode for the exact audience this PR targets.

  • src/enable.rs — empty-string sentinels for the unused branch of (repo_id, service_conn_id)

    let (repo_id, service_conn_id) = match &resolved {
        ResolvedSource::AdoGit { .. } => (id, String::new()),   // service_conn_id unused
        ResolvedSource::Github { .. } => (String::new(), id),   // repo_id unused
    };

    This pattern means a future developer adding a third provider (or accidentally swapping the tuple order) would silently get empty strings. The code is correct now and well-commented, but encoding the "only one is meaningful" invariant structurally would be safer. One option: keep the sentinel as-is but add debug_assert!(repo_id.is_empty() != service_conn_id.is_empty() || opts.dry_run) to catch any future mix-up.

  • src/enable.rssplit_owner_repo_arg doesn't strip .git suffix

    split_owner_repo (in mod.rs) strips .git from remote URLs; split_owner_repo_arg (in enable.rs) does not. If an operator passes --repository-name owner/repo.git the registered definition will have repository.name = "owner/repo.git" and repository.url = "https://github.com/owner/repo.git.git". Unlikely in practice, but a one-liner trim would make both helpers consistent:

    let repo = parts[1].trim().trim_end_matches(".git");

✅ What Looks Good

  • URL-encoding of endpointNames= query value uses NON_ALPHANUMERIC (not PATH_SEGMENT) so & and = in endpoint names are correctly escaped. The accompanying unit test pins this contract. Well-caught.
  • is_uuid_like UUID short-circuit avoids a network round-trip when a GUID is passed directly. The byte-level check is correct and the test matrix is thorough.
  • normalize_repo_url .git normalization — chaining .trim_end_matches(".git") after the slash trim is the right place, and the two-direction test (stored without .git / queried with and vice versa) gives confidence in the contract.
  • Test coverage across 21 new cases is comprehensive and the snapshot-test on build_create_body_github_omits_repo_id is a good guard for the ADO API contract.
  • ResolvedSource / RepositoryRef split cleanly separates identity-resolution from body-construction — easy to reason about and extend.

Generated by Rust PR Reviewer for issue #905 · sonnet46 1.5M ·

Four findings from the second Rust PR reviewer comment plus an
observed UX gap from the operator's first-run failure mode:

1. ado/mod.rs RepoSource::url() AdoGit double-slash on missing project.
   Bind `project` to a local, add a `debug_assert!` documenting the
   invariant, and document the release-build fallback in the
   doc-comment. New test exercises the release fallback under
   `cfg(not(debug_assertions))`.

2. enable.rs structural tagging of resolved identifiers. Replace the
   `(repo_id, service_conn_id)` tuple with empty-string sentinels by a
   new `ResolvedIdentifiers { AdoGit { repo_id }, Github { full_name,
   service_connection_id } }` enum. The per-fixture loop now matches
   on `(&resolved, &identifiers)` and bails on the impossible cross-
   variant pairings — a future refactor can't silently swap the two
   or forget to populate one half.

3. enable.rs split_owner_repo_arg `.git` suffix. Mirror the behaviour
   of `split_owner_repo` in `src/ado/mod.rs`: trim a trailing `.git`
   from the repo half so `--repository-name owner/repo.git` round-
   trips identically to `owner/repo`. Guard against the pathological
   `owner/.git` (strips to empty repo) with an explicit re-check.
   Two new tests pin the contract.

4. enable.rs friendly --org/--project pre-check for GitHub source.
   When the autodetected source is GitHub and `--org`/`--project` are
   missing, bail with an audience-appropriate error that names both
   flags and includes a worked example, rather than falling through
   to `resolve_ado_context`'s generic "for `--definition-ids` outside
   an ADO repo, both `--org` and `--project` are required" message
   (which is from the lifecycle-command path and makes no sense for
   the first-time `enable` operator). Matches the failure mode the
   user just observed when running `ado-aw enable ./tests/safe-outputs`
   with no flags.

Verification: `cargo clippy --all-targets --all-features` clean.
`cargo test` — 1789 unit tests (+2 new) + every integration suite,
0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — solid design, comprehensive tests, clear error messages. Two minor items worth noting before merge.

Findings

⚠️ Suggestions

  • src/enable.rsresolve_source catch-all arm: The Ok(mut source) arm matches all non-AdoGit providers, not just Github. The comment says "GitHub remote" but the pattern would silently accept any future RepoProvider variant and construct a ResolvedSource::Github body for it. Currently safe (only two providers exist), but fragile. Consider tightening:

    Ok(mut source) if source.provider == RepoProvider::Github => { /* existing body */ }
    Ok(source) => anyhow::bail!("Unsupported source provider: {:?}", source.provider),

    This turns a silent future regression into a compile-time exhaustiveness gap or a runtime bail.

  • src/ado/discovery.rsnormalize_repo_url strip ordering: The current ordering

    .trim_end_matches('/')
    .trim_end_matches(".git")

    is correct for the real-world cases (repo.git, repo.git/) but leaves a trailing slash for the pathological repo/.git form (repo/ instead of repo). The more robust ordering is .trim_end_matches(".git").trim_end_matches('/'), which covers all three forms with the same two operations. No real-world URL from ADO or GitHub takes the /.git form, so this is purely defensive.

✅ What Looks Good

  • RepoProvider/RepoSource two-namespace model is clean and makes the ADO-source / GitHub-source divergence explicit at the type level rather than with empty-string sentinels.
  • RepositoryRef<'a> enum ensuring ADO Git and GitHub POST bodies stay structurally distinct — the _ bail in the (resolved, identifiers) match is a good defensive guard.
  • NON_ALPHANUMERIC for query-string values is correct; the inline comment and the encoding unit test both lock it in.
  • Pre-check for --org/--project before resolve_ado_context gives a targeted, audience-appropriate error instead of the generic "use --definition-ids" message.
  • Test coverage is excellent — 21 new tests, all meaningful boundary cases (0/1/>1 service connection matches, UUID short-circuit, .git suffix normalisation, all six resolve_source combinator paths, provider-agnostic match_definitions_in contract test).

Generated by Rust PR Reviewer for issue #905 · sonnet46 1.8M ·

Two findings from the third reviewer comment:

1. enable.rs resolve_source catch-all arm. The pattern
   `Ok(mut source) => { /* GitHub */ }` matches every non-AdoGit
   provider, not just Github. Restructure to lift the `Err(_)` arm
   out and then `match parsed.provider` directly on the enum — adding
   a new RepoProvider variant is now a compile-time exhaustiveness
   error rather than a silent fall-through to the GitHub code path.
   No behavioural change for existing callers.

2. ado/discovery.rs normalize_repo_url strip ordering. Current
   `.trim_end_matches('/').trim_end_matches(".git")` collapses
   `repo`, `repo/`, `repo.git`, `repo.git/` cleanly but leaves a
   trailing `/` on the pathological `repo/.git` form. Add a final
   `.trim_end_matches('/')` pass so every realistic and pathological
   suffix combination canonicalises to the same form. New unit test
   `normalize_repo_url_collapses_all_suffix_forms` pins the contract
   across all six input shapes.

Verification: cargo clippy clean. cargo test — 1790 unit tests
(+1 new) + every integration suite, 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good overall — solid type-safe design, excellent test coverage, and a clean separation of concerns between source identity and deployment target. Two items worth fixing before merge.


Findings

🐛 Bugs / Logic Issues

src/enable.rssplit_owner_repo_arg silently accepts extra path segments

split_owner_repo_arg uses splitn(2, '/'), so --repository-name owner/repo/extra splits into ["owner", "repo/extra"]. Both halves are non-empty, so the guard passes and repo silently becomes "repo/extra". That flows into full_name = "owner/repo/extra" in the ADO repository.name POST field. ADO rejects it with an opaque API error instead of the operator seeing the expected "must be in the form 'owner/repo'" message.

The analogous split_owner_repo in src/ado/mod.rs correctly uses splitn(3, '/') and only consumes the first two parts, discarding any extra segments (it silently ignores them too, but that function is for git remote URLs where trailing path noise is expected). For a CLI argument the right behaviour is an early, clear error.

Minimal fix — add a check after splitting:

// Reject inputs with more than one slash (e.g. "owner/repo/extra").
if owner.contains('/') || parts[1].contains('/') {
    anyhow::bail!("--repository-name must be in the form 'owner/repo' (got: '{}')", value);
}

Or mirror split_owner_repo with splitn(3, '/') and assert the third slot is absent.


src/ado/mod.rsparse_git_remote: silent empty owner on org_name() returning None

if let Ok(ctx) = parse_ado_remote(remote_url) {
    let owner = ctx.org_name().unwrap_or("").to_string();

If parse_ado_remote succeeds but org_name() somehow returns None (edge-case ADO URL shape), owner becomes "" and the resulting RepoSource produces a URL with an empty segment (https://dev.azure.com//Project/_git/repo). The error surfaces later as a confusing ADO API rejection rather than a clear parse failure. A targeted bail here would be cleaner:

let owner = ctx
    .org_name()
    .ok_or_else(|| anyhow::anyhow!("could not extract org name from ADO remote '{}'", remote_url))?
    .to_string();

⚠️ Suggestions

src/ado/mod.rsNON_ALPHANUMERIC over-encodes service connection names

percent_encoding::utf8_percent_encode(trimmed, percent_encoding::NON_ALPHANUMERIC)

NON_ALPHANUMERIC encodes every non-alphanumeric byte, so a common connection name like ado-aw-github is emitted as ado%2Daw%2Dgithub. ADO should decode this correctly (it's valid percent-encoding), and the existing test pins & and = are encoded. Still, using a custom AsciiSet that preserves -, _, ., ~ (RFC 3986 unreserved characters) would be more conventional and sidestep any hypothetical strict-matching quirk on the ADO server side. Low priority — worth a note in the comment next to the encoding call at minimum.


✅ What Looks Good

  • Type-level enforcement on the (ResolvedSource, ResolvedIdentifiers) pair. The enum structure makes cross-variant mixups a compile error, and the _ => anyhow::bail!("internal error: ...") fallback in the pipeline loop is the right choice over unreachable!() — future refactors that decouple the two can't silently produce a malformed POST body.
  • Exhaustive match parsed.provider rather than a catch-all arm. The comment explaining why is excellent; a new RepoProvider variant is a compile-time failure everywhere it needs handling.
  • normalize_repo_url three-pass trim correctly handles repo/.git (trailing slash first, then .git, then residual slash). The pinned test normalize_repo_url_collapses_all_suffix_forms is a solid regression guard.
  • Test coverage is thorough: 21 new tests covering all remote-parsing combinations, UUID short-circuit, 0/1/>1 service-connection matches, provider-agnostic match_definitions_in, and snapshot tests for both create-body shapes. The build_create_body_github_omits_repo_id test is a particularly nice contract pin.
  • GitHub-source pre-check before resolve_ado_context gives the right first-time-user error for the most likely failure mode (--org/--project not supplied from a GitHub checkout).

Generated by Rust PR Reviewer for issue #905 · sonnet46 1.6M ·

Two bugs and one suggestion from the fourth reviewer comment:

1. enable.rs split_owner_repo_arg silently accepted extra path
   segments. `splitn(2, '/')` on `owner/repo/extra` yielded
   `["owner", "repo/extra"]` and passed the non-empty guard, sliding
   `repo/extra` into the ADO `repository.name` POST field where it
   would surface as an opaque API error. Switch to `splitn(3, '/')`
   and require exactly two parts — `owner/repo/extra` now produces
   three parts and triggers the targeted "must be in the form
   'owner/repo'" message immediately. Two new tests cover the
   three-segment and four-segment cases.

2. ado/mod.rs parse_git_remote: silent empty owner when
   ado_ctx.org_name() returns None. The path is unreachable in
   practice (every `parse_ado_remote`-accepted shape populates
   `org_url` with a `/{org}` segment) but the type allowed it.
   Replace `.unwrap_or("")` with `.ok_or_else(...)` so a future
   regression in `org_name`'s parser fails fast at the boundary
   rather than producing a `https://dev.azure.com//Project/_git/...`
   URL that surfaces as a confusing ADO API error later.

3. ado/mod.rs new QUERY_VALUE AsciiSet. `NON_ALPHANUMERIC` (the
   previous choice for the service-connection name encoding)
   over-encoded RFC 3986 unreserved chars — `ado-aw-github` became
   `ado%2Daw%2Dgithub`. Define a `QUERY_VALUE` set that preserves
   the unreserved class (`-`, `_`, `.`, `~`) while still encoding
   the query-syntax metacharacters (`&`, `=`, `#`, `?`, `+`, `%`)
   and structural ASCII. Strictly correct either way; the literal
   form is what humans read in logs and what strict server-side
   matchers prefer. Existing test updated to use `QUERY_VALUE`; new
   test pins that unreserved chars pass through literally.

Verification: cargo clippy clean. cargo test — 1792 unit tests
(+2 new) + every integration suite, 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Solid implementation — well-structured, excellent test coverage (21 new tests), clear operator-facing error messages. A couple of minor robustness items worth addressing before merge.


Findings

🐛 Bugs / Logic Issues

  • src/ado/mod.rssplit_owner_repo silently drops extra URL path segments

    split_owner_repo uses splitn(3, '/') and reads only the first two parts, silently ignoring a third. This means https://github.com/owner/repo/tree/main (a GitHub tree URL an operator might accidentally paste) resolves to owner/repo instead of failing:

    let mut parts = path.splitn(3, '/');
    let owner = parts.next()?.trim();   // "owner"
    let repo_raw = parts.next()?.trim(); // "repo"
    // "tree/main" silently dropped

    split_owner_repo_arg in enable.rs correctly rejects len != 2, and the doc comment explicitly notes this inconsistency as "more forgivable" for URL parsing. That reasoning is fine for truly pathological inputs, but a pasted tree/PR/blob URL is a realistic operator mistake. At minimum, a test documenting the silent-drop behaviour (or a None return for path depth > 2) would pin the contract.

⚠️ Suggestions

  • src/enable.rs ~line 1197 — redundant double .with_context()

    resolve_service_connection_id already attaches context ("Failed to look up GitHub service connection '{}' in project '{}'") to the network-level error. The call site in enable.rs wraps it again with "Failed to resolve GitHub service connection '{}' in project '{}'". This causes an error chain that reads:

    Failed to resolve GitHub service connection 'foo' in project 'bar':
      Failed to look up GitHub service connection 'foo' in project 'bar':
        <underlying network error>
    

    The outer .with_context() at the call site adds no new information and can be removed. The bail! messages from pick_service_endpoint_id (0-match, >1-match) are already self-contained and won't be affected.

  • src/ado/mod.rs pick_service_endpoint_id — no non-empty check on returned id

    The single-match branch only checks that an id field exists and is a string; it doesn't guard against "":

    1 => entries[0]
        .get("id")
        .and_then(|v| v.as_str())  // accepts ""
        .map(str::to_string)
        .ok_or_else(|| ...),

    An empty connectedServiceId in the POST body would produce an opaque ADO API error rather than a clear local message. ADO shouldn't return "" for a valid endpoint, but adding .filter(|s| !s.is_empty()) before .map(str::to_string) costs nothing and improves debuggability if it ever does.

✅ What Looks Good

  • Type-safe enum designRepositoryRef/ResolvedSource/ResolvedIdentifiers as tagged enums (rather than tuples with empty-string sentinels) prevents future cross-variant confusion at compile time; the _ => bail! arm in run() is a nice belt-and-suspenders addition.
  • QUERY_VALUE encoding — correctly encodes & and = without over-encoding RFC 3986 unreserved chars (-, _, ., ~); the two pinning tests are exactly the right shape.
  • normalize_repo_url three-pass strip — handles all five .git// suffix combinations correctly, including the unusual /.git form; test table is comprehensive.
  • is_uuid_like short-circuit — clean without pulling in a UUID crate; the "false negative is harmless" design rationale is correctly documented.
  • Pre-flight --org/--project check — surfacing the GitHub-source requirement before resolve_ado_context runs gives operators the targeted message rather than the generic definition-IDs guidance. Good UX lift.
  • resolve_source exhaustiveness — matching on parsed.provider directly (not a catch-all Ok(_)) means a future RepoProvider variant is a compile-time exhaustiveness error. Exactly the right pattern here.

Generated by Rust PR Reviewer for issue #905 · sonnet46 2M ·

jamesadevine and others added 2 commits June 8, 2026 16:23
Running `ado-aw enable PATH` against a subdirectory (e.g.
`tests/safe-outputs`) produced two cascading path bugs that made the
command unusable for the AgentPlayground operator flow:

1. `source_path = repo_path.join(&pipeline.source)` joined the
   *scan root* with the marker's repo-root-relative `source` field,
   producing doubled paths like
   `\\?\C:\repo\tests\safe-outputs\tests\safe-outputs\noop.md` —
   every fixture failed with "The system cannot find the path
   specified" (os error 3).

2. `compute_yaml_filename(&pipeline.yaml_path)` used the scan-relative
   yaml path, so the `process.yamlFilename` we'd POST to ADO was
   `/noop.lock.yml` (scan-relative) rather than the actual
   repo-relative `/tests/safe-outputs/noop.lock.yml`. Even if (1)
   hadn't blown up, ADO would have created pipelines pointing at the
   wrong location.

Fix:

- `find_repo_root` in `src/compile/mod.rs` is now `pub` so non-compile
  CLI commands can reuse the walk-up-to-`.git` resolver.
- `enable::run` now resolves two distinct paths: `scan_root` (still
  the canonicalized `PATH` arg / cwd, used by `detect_pipelines`) and
  `repo_root` (anchor for `pipeline.source` joins and for
  `yamlFilename` derivation).
- The per-fixture loop anchors `source_path` on `repo_root` and
  re-anchors `yaml_path` through `scan_root.join(...).strip_prefix(
  repo_root)` so the value handed to `process_one` is always
  repo-root-relative.
- Bails with a clear "must be run inside a git checkout" error if no
  `.git` ancestor can be found (relevant only for `PATH`s outside any
  git repo — not a regression).

Verification:

- New integration test
  `enable_dry_run_against_subdirectory_uses_repo_root_relative_yaml_path`
  exercises the exact `enable --service-connection <guid>
  tests/safe-outputs --project ... --org ... --dry-run` invocation
  the user just ran and asserts the printed POST body contains
  `"yamlFilename": "/tests/safe-outputs/..."` with no "Failed to read
  source" errors.
- Manual dry-run confirmed: 27 fixtures resolved, every POST body
  carries the correct `/tests/safe-outputs/<x>.lock.yml` filename.
- `cargo clippy --all-targets --all-features` clean.
- `cargo test` — 1792 unit tests (unchanged) + every integration
  suite (4 enable, +1 new), 0 failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI failed because the new
`enable_dry_run_against_subdirectory_uses_repo_root_relative_yaml_path`
test relied on `list_definitions` succeeding (it always runs, even in
--dry-run mode, since `decide_action` needs the existing-definitions
snapshot to know which fixtures already exist). Locally this happened
to work via `az` CLI fallback; CI has no Azure login and no tty, so
the auth resolver fell through to the interactive prompt and crashed
with "The input device is not a TTY" before the dry-run path could
print anything.

Fix: spin up a `wiremock::MockServer` that returns an empty
definitions list, point the binary at it via the existing
`ADO_AW_TEST_ORG_URL` test override, and pass an explicit dummy
`--pat` so `resolve_auth` skips the Azure-CLI / interactive paths
entirely. `wiremock` is already a dev-dependency (used by
`tests/audit_it.rs`).

Verified locally: all 4 enable integration tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

🔍 Rust PR Review

Summary: Looks good — clean design, comprehensive tests, no significant bugs. Two minor issues worth noting.

Findings

⚠️ Suggestions

  • src/enable.rsresolve_source misleading error message (no-remote path): When the user passes --repository-name but omits --service-connection, the first ok_or_else bails with "pass --repository-name owner/repo and --service-connection" even though --repository-name was already supplied. The message should be conditional on whether repository_name_override is already set. Minor UX issue, not a functional bug.

  • src/enable.rs:484-487 — silent fallback in repo_relative_yaml: find_repo_root guarantees the scan root is under the repo root, so strip_prefix should always succeed. But on failure the code silently falls back to pipeline.yaml_path (scan-root-relative), which would register the wrong yamlFilename in ADO with no diagnostic output. A .with_context() bail would make this failure mode visible.

✅ What Looks Good

  • RepoSource / AdoContext / RepositoryRef / ResolvedIdentifiers design: Type-level enforcement of the source-vs-deployment separation and the two ADO body shapes. The comment explaining why ResolvedIdentifiers is an enum rather than a tuple with empty-string sentinels is especially good.
  • scan_root vs repo_root split: Correct fix for the yaml-path anchoring regression; the integration test (enable_dry_run_against_subdirectory_uses_repo_root_relative_yaml_path) pins the exact path-doubling bug described in the comment.
  • QUERY_VALUE encoding set: &, =, #, ?, % are all present. The two unit tests lock in both correctness and human readability of common endpoint names like ado-aw-github.
  • is_uuid_like short-circuit: Skips the API call for already-GUID inputs; the edge-case tests are thorough.
  • GitHub source --org/--project pre-check: Surfaces the error before resolve_ado_context, giving a clearer, audience-appropriate message for the primary new user flow.
  • normalize_repo_url three-pass strip: Handles the repo/.git form that the old two-pass missed; the exhaustive test table documents every form including pathological cases.

Generated by Rust PR Reviewer for issue #905 · sonnet46 2.4M ·

@jamesadevine jamesadevine merged commit 6ab5f9a into main Jun 8, 2026
8 checks passed
@jamesadevine jamesadevine deleted the devinejames/enable-github-source branch June 8, 2026 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant