| 1 | +# Per-repo filesystem resolution + `{orgID}/{repoName}` paths (HTTP focus) |
| 2 | + |
| 3 | +## Context |
| 4 | + |
| 5 | +Today the `daemon` holds a single static `fs billy.Filesystem` (the whole bucket) |
| 6 | +and a single `loader transport.Loader`, both built once in `main.go`. Every |
| 7 | +transport passes a **raw, unvalidated, variable-depth** path straight into |
| 8 | +`auth.Request.Repo` and into `load`/`loadOrInit`, which `Chroot`s the one bucket |
| 9 | +fs by that path. |
| 10 | + |
| 11 | +We want two coupled changes: |
| 12 | + |
| 13 | +1. **Restrict repo paths to `{orgID}/{repoName}`** — `orgID` is an opaque |
| 14 | + reference a later API call will validate; for now it's accepted as-is. Paths |
| 15 | + that aren't exactly two segments are rejected. The `.git` suffix is stripped |
| 16 | + from the repo name (`org/repo.git` and `org/repo` resolve to the same repo; |
| 17 | + storage key `org/repo/`). |
| 18 | +2. **Discover the billy filesystem per-repo via a pluggable hook**, and **pass |
| 19 | + the HTTP Basic-auth username/password into that hook** so a real backend can |
| 20 | + route an org to its own bucket/credentials based on who's calling. The |
| 21 | + default hook preserves today's behavior (chroot the one bucket fs, ignoring |
| 22 | + the credential). |
| 23 | + |
| 24 | +**Scope:** this pass targets the **HTTP** transport. Redesigning git:// and |
| 25 | +SSH is out of scope: the shared resolution layer is transport-agnostic, so they |
| 26 | +get only the mechanical edits needed to keep compiling (they pass an empty |
| 27 | +credential); their auth semantics are unchanged. |
| 28 | + |
| 29 | +## New package: `internal/repofs` |
| 30 | + |
| 31 | +Transport-neutral, mirroring how `internal/auth` is structured. Imports only |
| 32 | +`context`, `errors`, `path`, `strings`, and `go-billy/v6`. |
| 33 | + |
| 34 | +```go |
| 35 | +package repofs |
| 36 | + |
| 37 | +var ErrInvalidPath = errors.New("repository path must be of the form {orgID}/{repoName}") |
| 38 | + |
| 39 | +// RepoRef identifies a repository. OrgID is opaque (validated later); Name has |
| 40 | +// any trailing ".git" stripped. |
| 41 | +type RepoRef struct { |
| 42 | + OrgID string |
| 43 | + Name string |
| 44 | +} |
| 45 | + |
| 46 | +// Path is the canonical storage/identity path "orgID/name". |
| 47 | +func (r RepoRef) Path() string { return path.Join(r.OrgID, r.Name) } |
| 48 | + |
| 49 | +// Parse trims surrounding slashes, requires exactly two non-empty segments, |
| 50 | +// and strips a trailing ".git" from the name. OrgID is not otherwise validated. |
| 51 | +func Parse(raw string) (RepoRef, error) |
| 52 | + |
| 53 | +// Credential carries the HTTP Basic-auth username/password (zero value = none). |
| 54 | +// Unvalidated; the Resolver decides what to do with it. |
| 55 | +type Credential struct { |
| 56 | + Username string |
| 57 | + Password string |
| 58 | +} |
| 59 | + |
| 60 | +// Resolver maps a RepoRef (plus the caller's credential) to the |
| 61 | +// billy.Filesystem rooted at that repository. This is the hook a real backend |
| 62 | +// implements to route an org to its bucket. |
| 63 | +type Resolver interface { |
| 64 | + Resolve(ctx context.Context, ref RepoRef, cred Credential) (billy.Filesystem, error) |
| 65 | +} |
| 66 | + |
| 67 | +// BucketResolver is the default Resolver: chroot one base filesystem (the whole |
| 68 | +// bucket) to ref.Path(), ignoring the credential. Preserves current behavior. |
| 69 | +type BucketResolver struct{ Base billy.Filesystem } |
| 70 | +func (b BucketResolver) Resolve(_ context.Context, ref RepoRef, _ Credential) (billy.Filesystem, error) { |
| 71 | + return b.Base.Chroot(ref.Path()) |
| 72 | +} |
| 73 | +``` |
| 74 | + |
| 75 | +`Parse` is the single validation path. Add unit tests for valid input, |
| 76 | +missing/extra segments, empty segments, trailing slash, and `.git` stripping. |
| 77 | + |
| 78 | +## `daemon` changes (`cmd/objgitd/git_protocol.go`) |
| 79 | + |
| 80 | +Replace the `fs` and `loader` fields: |
| 81 | + |
| 82 | +```go |
| 83 | +type daemon struct { |
| 84 | + sysFS billy.Filesystem // bucket-level storage (SSH host key); NOT repo-scoped |
| 85 | + resolver repofs.Resolver |
| 86 | + authz auth.Authorizer |
| 87 | + allowHooks bool |
| 88 | + hookTimeout time.Duration |
| 89 | +} |
| 90 | +``` |
| 91 | + |
| 92 | +Rewrite resolution to go through the hook (threading the credential), building |
| 93 | +the storer per resolved fs. Reuse go-git's bare-repo detection |
| 94 | +(`FilesystemLoader.load` returns `ErrRepositoryNotFound` when no `config` exists |
| 95 | +at the chroot root): |
| 96 | + |
| 97 | +```go |
| 98 | +// storerFor returns the bare-repo storer rooted at fs, or |
| 99 | +// transport.ErrRepositoryNotFound when none exists there. |
| 100 | +func storerFor(fs billy.Filesystem) (storage.Storer, error) { |
| 101 | + return transport.NewFilesystemLoader(fs, false).Load(&url.URL{Path: "/"}) |
| 102 | +} |
| 103 | + |
| 104 | +func (d *daemon) load(ctx context.Context, ref repofs.RepoRef, cred repofs.Credential) (storage.Storer, error) { |
| 105 | + fs, err := d.resolver.Resolve(ctx, ref, cred) |
| 106 | + if err != nil { return nil, err } |
| 107 | + st, err := storerFor(fs) |
| 108 | + if err != nil { return nil, err } |
| 109 | + if err := ensureHEAD(st); err != nil { slog.Warn("...", "repo", ref.Path(), "err", err) } |
| 110 | + return st, nil |
| 111 | +} |
| 112 | + |
| 113 | +func (d *daemon) loadOrInit(ctx context.Context, ref repofs.RepoRef, cred repofs.Credential) (storage.Storer, error) { |
| 114 | + fs, err := d.resolver.Resolve(ctx, ref, cred) |
| 115 | + if err != nil { return nil, err } |
| 116 | + st, err := storerFor(fs) |
| 117 | + if err == nil { if herr := ensureHEAD(st); herr != nil { slog.Warn("...", "repo", ref.Path(), "err", herr) }; return st, nil } |
| 118 | + if !errors.Is(err, transport.ErrRepositoryNotFound) { return nil, err } |
| 119 | + st = filesystem.NewStorage(fs, cache.NewObjectLRUDefault()) |
| 120 | + if _, err := git.Init(st, git.WithDefaultBranch(plumbing.NewBranchReferenceName("main"))); err != nil { |
| 121 | + return nil, fmt.Errorf("init bare repo: %w", err) |
| 122 | + } |
| 123 | + metrics.ReposCreated() |
| 124 | + slog.Info("created repository", "repo", ref.Path()) |
| 125 | + return st, nil |
| 126 | +} |
| 127 | +``` |
| 128 | + |
| 129 | +The old `d.fs.Chroot(repoPath)` step is gone — `Resolve` returns the repo-root |
| 130 | +fs directly, so resolution happens once per request. |
| 131 | + |
| 132 | +## HTTP transport (`cmd/objgitd/http.go` + `main.go`) — primary work |
| 133 | + |
| 134 | +Replace the suffix-dispatch `ServeHTTP` with an `http.ServeMux` (built by a new |
| 135 | +`d.httpHandler()` method, wired in `main.go` as the server `Handler`). With a |
| 136 | +fixed two-segment path, the Go 1.22+ wildcard patterns that variable-depth paths ruled out now work: |
| 137 | + |
| 138 | +- `GET /{orgID}/{repoName}/info/refs` |
| 139 | +- `POST /{orgID}/{repoName}/git-upload-pack` |
| 140 | +- `POST /{orgID}/{repoName}/git-receive-pack` |
| 141 | + |
| 142 | +Handlers read `r.PathValue("orgID")`/`r.PathValue("repoName")`, build the ref via |
| 143 | +`repofs.Parse(path.Join(orgID, repoName))`, and respond 400 on `ErrInvalidPath`. |
| 144 | +ServeMux 404s anything that isn't exactly two segments before the suffix, so the |
| 145 | +shape is enforced for free; `Parse` still strips `.git` and rejects an empty name. |
| 146 | + |
| 147 | +`resolve` extracts the Basic-auth credential (for example via a `credFromRequest` helper) and threads it through: |
| 148 | + |
| 149 | +```go |
| 150 | +func credFromRequest(r *http.Request) (auth.Credential, repofs.Credential) { |
| 151 | + if u, p, ok := r.BasicAuth(); ok { |
| 152 | + return auth.BasicAuth{Username: u, Password: p}, repofs.Credential{Username: u, Password: p} |
| 153 | + } |
| 154 | + return auth.Anonymous{}, repofs.Credential{} |
| 155 | +} |
| 156 | +``` |
| 157 | + |
| 158 | +Alternatively, keep the existing `auth` credential helper and build the |
| 159 | +`repofs.Credential` inline. `resolve`, `handleInfoRefs`, `handleRPC`, and |
| 160 | +`d.receivePack` change their `repoPath string` parameter to a `repofs.RepoRef`; |
| 161 | +`resolve` passes the `repofs.Credential` to `load`/`loadOrInit`. Logging and hook |
| 162 | +context use `ref.Path()`. Remove the variable-depth comment block, and the |
| 163 | +`strings` import if it becomes unused. |
| 164 | + |
| 165 | +## git:// and SSH — mechanical only (out of scope) |
| 166 | + |
| 167 | +`git_protocol.go handle` and `ssh.go handleSSH` must adapt to the new |
| 168 | +`load`/`loadOrInit` signatures: parse their raw path with `repofs.Parse` |
| 169 | +(rendering `ErrInvalidPath` in their own dialect — pktline error / stderr+exit) |
| 170 | +and pass an empty `repofs.Credential{}`. `ssh.go`'s host-key load switches from |
| 171 | +`d.fs` to `d.sysFS`. No further redesign of these transports. |
| 172 | + |
| 173 | +## `main.go` changes |
| 174 | + |
| 175 | +- Keep building the base bucket fs (`fsys`) as today. |
| 176 | +- `d := &daemon{ sysFS: fsys, resolver: repofs.BucketResolver{Base: fsys}, authz: ..., allowHooks: ..., hookTimeout: ... }` — drop the `loader` field. |
| 177 | +- HTTP server `Handler: d.httpHandler()` instead of `Handler: d`. |
| 178 | +- Drop the `transport.NewFilesystemLoader` call; remove the `transport` import |
| 179 | + from `main.go` if it becomes unused. |
| 180 | + |
| 181 | +## Behavioral note / migration |
| 182 | + |
| 183 | +Stripping `.git` and requiring an org changes the storage key from `repo.git/` |
| 184 | +to `org/repo/`. Repos created under the old layout won't resolve under the new |
| 185 | +scheme. Acceptable for the current stage; no migration is in scope. |
| 186 | + |
| 187 | +## Tests |
| 188 | + |
| 189 | +- New `internal/repofs/repofs_test.go` — table-driven `Parse` cases (and a tiny |
| 190 | + `BucketResolver.Resolve` check that it chroots to `ref.Path()`). |
| 191 | +- Update `cmd/objgitd/http_test.go` (and the shared helpers in |
| 192 | + `git_protocol_test.go` it reuses): remotes gain an org segment (`/test.git` |
| 193 | + → `/acme/test.git`), and storage-key assertions drop `.git` |
| 194 | + (`/test.git/config` → `/acme/test/config`; |
| 195 | + `assertPackedRepo(t, fs, "/acme/test")`). The git:// tests in |
| 196 | + `git_protocol_test.go` need the same path updates to keep passing. |
| 197 | +- Optionally add an HTTP test that a single-segment path returns 404 and that a |
| 198 | + Basic-auth credential reaches a stub resolver. |
| 199 | + |
| 200 | +## Verification |
| 201 | + |
| 202 | +```text |
| 203 | +go build ./... |
| 204 | +go test ./internal/repofs/... |
| 205 | +go test -run TestSmartHTTP ./cmd/objgitd/... # requires git on PATH |
| 206 | +go test ./cmd/objgitd/... |
| 207 | +``` |
| 208 | + |
| 209 | +End-to-end against a real bucket: |
| 210 | + |
| 211 | +```text |
| 212 | +./objgitd -bucket $BUCKET -http-bind :8080 -allow-push |
| 213 | +git clone http://user:pass@localhost:8080/acme/demo.git # creates acme/demo/ on first push; user/pass reach the resolver |
| 214 | +git clone http://localhost:8080/demo.git # single segment -> 404 |
| 215 | +``` |