tsp-client: fetch spec files from GitHub before falling back to git#15607
tsp-client: fetch spec files from GitHub before falling back to git#15607jeremymeng wants to merge 2 commits into
Conversation
Add a GitHub-first download path used by `init` (for `tspconfig.yaml`) and by `sync` (for the spec directory plus any `additionalDirectories`). Strategies are tried in order: 1. `gh` CLI when installed (uses existing `gh auth` for private repos / rate limits) 2. `fetch` against the GitHub REST API and `raw.githubusercontent.com`, sending `Authorization: Bearer` when `GITHUB_TOKEN` (preferred) or `GH_TOKEN` is set 3. existing `cloneRepo` + `sparseCheckout` + `addSpecFiles` + `checkoutCommit` fallback Failures are logged at debug level and silently fall back; downstream behavior (`cp` into `srcDir`, `additionalDirectories` handling, error reporting) is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
I found that when running |
There was a problem hiding this comment.
Pull request overview
This PR updates tools/tsp-client to prefer downloading TypeSpec spec content directly from GitHub (via gh or the GitHub REST API + raw.githubusercontent.com) instead of always doing a sparse git clone, while keeping the existing git-based path as a fallback.
Changes:
- Added a new
githubFetchmodule that can list/download pinned spec trees and files from GitHub with retries and bounded concurrency. - Updated
initCommandandsyncCommandto attempt GitHub-based fetches first, then fall back to the existing sparse-clone workflow. - Added Vitest coverage for URL building, tree walking/truncation handling, auth header propagation, and end-to-end directory/file download behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/tsp-client/src/githubFetch.ts | Implements GitHub-first listing/downloading (gh CLI + REST/raw) and exposes best-effort fetch APIs for callers. |
| tools/tsp-client/src/commands.ts | Wires GitHub-first fetch into init (single file) and sync (directory + additionalDirectories) with git fallback. |
| tools/tsp-client/test/githubFetch.spec.ts | Adds unit/integration-style tests covering token selection, URL builders, tree traversal (incl. truncation), and downloads. |
| tools/tsp-client/CHANGELOG.md | Documents the new default GitHub-first behavior and auth env vars used. |
Comments suppressed due to low confidence (1)
tools/tsp-client/src/githubFetch.ts:487
- Similar to tryFetchSpecFromGitHub: if downloadFileFromGitHub fails after creating parent directories, returning false can leave a non-empty cloneDir and break the subsequent
git clonefallback. Ensure failure leaves the filesystem in a clean/empty state for the fallback path (e.g., write to a temp file then rename, and/or remove created directories on error).
try {
const opts = await resolveGitHubFetchOptions();
Logger.debug(
`Attempting GitHub-based file fetch: ${args.repo}@${args.commit}:${args.path} via ${describeStrategy(opts)}`,
);
await downloadFileFromGitHub({ ...args, opts });
return true;
} catch (err) {
Logger.debug(
`GitHub-based file fetch failed; falling back to git sparse clone: ${(err as Error).message}`,
);
return false;
}
| try { | ||
| const opts = await resolveGitHubFetchOptions(); | ||
| Logger.debug( | ||
| `Attempting GitHub-based spec fetch: ${args.repo}@${args.commit} via ${describeStrategy(opts)}`, | ||
| ); | ||
| await downloadDirectoryFromGitHub({ | ||
| repo: args.repo, | ||
| commit: args.commit, | ||
| directory: args.directory, | ||
| destRoot: args.destRoot, | ||
| opts, | ||
| }); | ||
| for (const dir of args.additionalDirectories ?? []) { | ||
| Logger.debug(`Fetching additional directory from GitHub: ${dir}`); | ||
| await downloadDirectoryFromGitHub({ | ||
| repo: args.repo, | ||
| commit: args.commit, | ||
| directory: dir, | ||
| destRoot: args.destRoot, | ||
| opts, | ||
| }); | ||
| } | ||
| return true; | ||
| } catch (err) { | ||
| Logger.debug( | ||
| `GitHub-based spec fetch failed; falling back to git sparse clone: ${(err as Error).message}`, | ||
| ); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Addressed in 88f098e. tryFetchSpecFromGitHub and tryFetchFileFromGitHub now empty their destination directory contents before returning false, so the subsequent git clone fallback writes into a clean directory. tryFetchFileFromGitHub gained an optional destRoot parameter (wired from initCommand) for this purpose. New tests empties destRoot when a spec fetch fails partway through and empties destRoot when a single-file fetch fails verify the cleanup.
| export async function resolveGitHubFetchOptions(): Promise<GitHubFetchOptions> { | ||
| const useGhCli = await isGhCliAvailable(); | ||
| let token = getGitHubToken(); | ||
| if (!token && useGhCli) { | ||
| token = await readGhAuthToken(); | ||
| } | ||
| return { useGhCli, token }; | ||
| } |
There was a problem hiding this comment.
Addressed in 88f098e. The strategy order is now gh -> REST -> git: when the gh CLI attempt fails (gh installed but unauthenticated, transient gh api failure, etc.), we cleanup partial writes and retry once via the REST API path (preserving any token from GITHUB_TOKEN / GH_TOKEN / gh auth token) before falling back to git. Centralized in a new runWithFallback helper. New test retries via REST when the gh CLI strategy fails covers this via a stubbed gh subprocess runner.
1. On any GitHub-fetch failure, empty destRoot before returning false so the subsequent `git clone` fallback isn't blocked by partial writes. 2. When the gh CLI strategy fails (e.g. gh installed but unauthenticated), transparently retry once via the REST API path before giving up. The strategy order is now gh -> REST -> git instead of gh -> git. Implementation: extract a `runWithFallback` helper that owns strategy selection, retry, and cleanup. Add a `destRoot` cleanup hook to `tryFetchFileFromGitHub` and wire it from `initCommand`. Expose a `_setGhRunnerForTests` seam so the gh subprocess can be stubbed; add 3 new tests covering spec-fetch cleanup, file-fetch cleanup, and gh -> REST retry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Adds a GitHub-first download path to
tsp-clientso spec files are pulled directly fromazure-rest-api-specs/-prinstead of always doing a sparsegit clone. The existing git path stays in place as a silent fallback.Wired into both call sites in
src/commands.ts:initCommandURL branch — fetches justtspconfig.yamlahead of the rest of initsyncCommandclone branch — fetches the spec directory plus everyadditionalDirectoriesentry into the samecloneDir/<directory>layout that downstream code expectsStrategy order
Decided once per call:
ghCLI when installed (uses existinggh authso private repos and higher rate limits work)fetchagainst the GitHub REST API andraw.githubusercontent.com, sendingAuthorization: Bearer <token>whenGITHUB_TOKEN(preferred) orGH_TOKENis setcloneRepo+sparseCheckout+addSpecFiles+checkoutCommitfallbackAny error in 1 or 2 is logged at debug level and we silently fall back to the git path. There is no opt-out flag.
Implementation notes
src/githubFetch.tsexposestryFetchSpecFromGitHubandtryFetchFileFromGitHubreturningbooleanso callers can fall back transparently.recursive=1), transparently handlingtruncated: trueby recursing into immediate subtrees.raw.githubusercontent.com(smaller than the contents API and works for files of any size).resolveTspConfigUrl), so caching at GitHub's edge works well and there's no race with branch updates.Validation
npm run build— cleannpm test— 81 passed / 9 pre-existing skips, including 15 new tests intest/githubFetch.spec.tscovering token preference, URL builders, segment-walkinglistTree, truncation handling, non-2xx → fallback, end-to-end directory download, andAuthorization: Bearerheader propagation.Changelog
Unreleased entry added in
tools/tsp-client/CHANGELOG.md.