| 1 | +# ADR-0001 — S3 auth design |
| 2 | + |
| 3 | +**Status**: Accepted (2026-05-16) |
| 4 | +**Phase**: SPEC §4 Phase E — cross-machine sync beyond git |
| 5 | +**Unblocks**: `ClaudeConfig.sync_to_s3` implementation |
| 6 | + |
| 7 | +## Context |
| 8 | + |
| 9 | +Phase E ships a way to sync the content dir to an S3-compatible |
| 10 | +target (AWS S3, Cloudflare R2, Backblaze B2, MinIO, etc.) so |
| 11 | +operators on machines without git access can still get the latest |
| 12 | +config. The scaffold shipped in v0.6.0 (commit `9976632`); the |
| 13 | +upload itself was blocked on three open questions: |
| 14 | + |
| 15 | +1. **Which credentials chain?** AWS has a documented chain (env |
| 16 | + vars → shared credentials file → IAM role → SSO → instance |
| 17 | + metadata). Which subset do we honour, in what order? |
| 18 | +2. **Do we ship a default profile or require explicit auth?** |
| 19 | +3. **Federated identity (OIDC) for CI?** If `sync_to_s3` runs in |
| 20 | + CI, we want short-lived OIDC-federated credentials, not long-lived keys. |
| 21 | + |
| 22 | +## Decision |
| 23 | + |
| 24 | +### 1. Credentials: boto3's default chain, unmodified |
| 25 | + |
| 26 | +boto3's default credential resolution is well-documented and respects |
| 27 | +the AWS conventions operators already know. Simplified (the boto3 docs list further providers, such as assume-role and IAM Identity Center): |
| 28 | + |
| 29 | +``` |
| 30 | +1. Constructor args (we never pass these — keeps secrets out of code) |
| 31 | +2. AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY env vars |
| 32 | +3. AWS_PROFILE env var (selects a profile from ~/.aws/credentials or ~/.aws/config) |
| 33 | +4. Shared credentials file ~/.aws/credentials (default profile) |
| 34 | +5. Container credentials (ECS task role) |
| 35 | +6. Instance metadata (EC2 / EKS / similar) |
| 36 | +``` |
| 37 | + |
| 38 | +We do NOT override the chain. We pass `Session(profile_name=…)` |
| 39 | +when the user explicitly names a profile via `--profile` on the |
| 40 | +CLI; otherwise we use the default session and let boto3 resolve. |
| 41 | + |
| 42 | +### 2. Required: no implicit profile |
| 43 | + |
| 44 | +We refuse to upload without one of: |
| 45 | + |
| 46 | +- `AWS_PROFILE` env var set |
| 47 | +- `--profile NAME` flag passed |
| 48 | +- `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` in env |
| 49 | + |
| 50 | +If none of these are present, `sync_to_s3` raises `ConfigError` |
| 51 | +with a remediation that points at the boto3 credential docs. |
| 52 | + |
| 53 | +Rationale: silently picking up the `[default]` profile from |
| 54 | +`~/.aws/credentials` on a shared machine has surprised more than one |
| 55 | +operator into uploading to the wrong account. Explicit > implicit. |
| 56 | + |
| 57 | +### 3. CI: OIDC via boto3 + AssumeRoleWithWebIdentity |
| 58 | + |
| 59 | +For GitHub Actions, the standard pattern is |
| 60 | +`aws-actions/configure-aws-credentials`, which trades the job's OIDC |
| 61 | +token for short-lived STS credentials and exports them as |
| 62 | +`AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_SESSION_TOKEN`. boto3 picks those up via #2 of the |
| 63 | +default chain, so we don't need ai-config-kit-specific OIDC code. |
| 64 | + |
| 65 | +The workflow side gets a documented snippet in `docs/sync.md`: |
| 66 | + |
| 67 | +```yaml |
| 68 | +permissions: |
| 69 | +  id-token: write |
| 70 | +  contents: read |
| 71 | +steps: |
| 72 | +  - uses: aws-actions/configure-aws-credentials@v4 |
| 73 | +    with: |
| 74 | +      role-to-assume: arn:aws:iam::123456789012:role/ai-config-sync |
| 75 | +      aws-region: us-east-1 |
| 76 | +  - run: ai-config-kit sync --target s3://my-bucket/config --apply |
| 77 | +``` |
| 78 | + |
| 79 | +### 4. S3-compatible endpoints |
| 80 | + |
| 81 | +Support non-AWS S3 (R2, B2, MinIO) via the `--endpoint-url` |
| 82 | +flag, which maps to `Session.client("s3", endpoint_url=…)`. |
| 83 | +Default is AWS; specifying `--endpoint-url` picks any compatible |
| 84 | +target. |
| 85 | + |
| 86 | +### 5. Upload semantics |
| 87 | + |
| 88 | +Upload is `client.upload_file(path, bucket, key)` for each file |
| 89 | +under `src_dir`. Symlinks resolve before upload (we want the |
| 90 | +target's content, not the symlink path). Files matching the |
| 91 | +secret patterns (`*.credentials.json`, `.env*`, etc.) are |
| 92 | +filtered out by the existing `_is_secret` check. |
| 93 | + |
| 94 | +A manifest file `.ai-config-sync.json` is written to the bucket |
| 95 | +root with the per-file sha256 + content_dir layout so a |
| 96 | +downstream `sync_from_s3` (Phase E.2) can diff before |
| 97 | +overwriting. |
| 98 | + |
| 99 | +## Consequences |
| 100 | + |
| 101 | +### What this enables |
| 102 | + |
| 103 | +- `ClaudeConfig.sync_to_s3(target, profile=None, endpoint_url=None, |
| 104 | + dry_run=True)` can be implemented. |
| 105 | +- New CLI: `ai-config-kit sync --target s3://bucket/path [--profile NAME] |
| 106 | + [--endpoint-url URL] [--apply]`. |
| 107 | +- Operators with multi-machine config get a non-git option. |
| 108 | +- CI workflows can keep config in S3 + pull on cold runners. |
| 109 | + |
| 110 | +### What this costs |
| 111 | + |
| 112 | +- Optional `[s3]` extras dep (boto3) — already scaffolded. |
| 113 | +- A new audit event `s3_sync` recording target, file count, |
| 114 | + bytes transferred. |
| 115 | +- Secret-pattern filtering is the only defense against uploading |
| 116 | + `.env` files; we never override the user's secret patterns |
| 117 | + list. |
| 118 | + |
| 119 | +### What this doesn't fix |
| 120 | + |
| 121 | +- `sync_from_s3` (the reverse direction) is Phase E.2 — separate |
| 122 | + ADR. Today's flow is one-way (laptop → S3). |
| 123 | +- Conflict resolution: if two machines upload to the same key |
| 124 | + prefix, last-write-wins. A v0.7 enhancement could use the |
| 125 | + manifest's sha256 to detect divergence. |
| 126 | + |
| 127 | +## Alternatives considered |
| 128 | + |
| 129 | +### Vendored credential reading (rejected) |
| 130 | + |
| 131 | +Read `~/.aws/credentials` ourselves + handle each backend |
| 132 | +explicitly. |
| 133 | + |
| 134 | +**Why rejected:** reimplementing boto3's chain is fragile and |
| 135 | +makes future credential types (SSO refresh, IMDSv2, etc.) our |
| 136 | +problem. boto3 already handles this correctly. |
| 137 | + |
| 138 | +### git-annex / rclone / rsync wrappers (rejected) |
| 139 | + |
| 140 | +Wrap rclone, git-annex, or rsync for the sync. |
| 141 | + |
| 142 | +**Why rejected:** adds a system-level binary dep that's painful |
| 143 | +on macOS-default + Windows + Linux-fedora-default. boto3 is |
| 144 | +pure-Python and bundles cleanly with `pip install |
| 145 | +'ai-config-kit[s3]'`. |
| 146 | + |
| 147 | +### Mandatory `--profile` (no env-var support) (rejected) |
| 148 | + |
| 149 | +Force every operator to use a named profile. |
| 150 | + |
| 151 | +**Why rejected:** breaks the CI flow where credentials come from |
| 152 | +`configure-aws-credentials` action via env vars. |
| 153 | + |
| 154 | +## Implementation |
| 155 | + |
| 156 | +```python |
| 157 | +def sync_to_s3( |
| 158 | +    self, |
| 159 | +    target: str, |
| 160 | +    profile: str | None = None, |
| 161 | +    endpoint_url: str | None = None, |
| 162 | +    dry_run: bool = True, |
| 163 | +) -> S3SyncReport: |
| 164 | +    if not target.startswith("s3://"): |
| 165 | +        raise ConfigError(f"sync target must be s3://bucket/key, got {target!r}") |
| 166 | + |
| 167 | +    # Refuse implicit auth: explicit profile or explicit env vars only (needs a module-level `import os`). |
| 168 | +    if profile is None and "AWS_PROFILE" not in os.environ and ( |
| 169 | +        "AWS_ACCESS_KEY_ID" not in os.environ |
| 170 | +        or "AWS_SECRET_ACCESS_KEY" not in os.environ |
| 171 | +    ): |
| 172 | +        raise ConfigError( |
| 173 | +            "no AWS auth detected; set AWS_PROFILE, pass --profile NAME, " |
| 174 | +            "or set AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY" |
| 175 | +        ) |
| 176 | + |
| 177 | +    import boto3  # imported here so only sync_to_s3 needs the optional [s3] extra |
| 178 | +    session = boto3.Session(profile_name=profile) if profile else boto3.Session() |
| 179 | +    client = session.client("s3", endpoint_url=endpoint_url) |
| 180 | + |
| 181 | +    bucket, _, key_prefix = target[5:].partition("/") |
| 182 | +    uploaded: list[str] = [] |
| 183 | +    skipped_secrets: list[str] = [] |
| 184 | +    for path in self._files_to_track(include_overlays=True): |
| 185 | +        if self._is_secret(path): |
| 186 | +            skipped_secrets.append(path.relative_to(self.src_dir).as_posix()) |
| 187 | +            continue |
| 188 | +        rel = path.relative_to(self.src_dir).as_posix() |
| 189 | +        key = f"{key_prefix.rstrip('/')}/{rel}" if key_prefix else rel |
| 190 | +        if not dry_run: |
| 191 | +            client.upload_file(str(path), bucket, key) |
| 192 | +        uploaded.append(rel) |
| 193 | +    # Manifest + audit event after the loop. |
| 194 | +    ... |
| 195 | +``` |
| 196 | + |
| 197 | +## Next steps |
| 198 | + |
| 199 | +1. Land the `sync_to_s3` implementation per the snippet above. |
| 200 | +2. Add `S3SyncReport` dataclass. |
| 201 | +3. CLI: `ai-config-kit sync --target s3://... [--profile NAME] |
| 202 | + [--endpoint-url URL] [--apply]`. |
| 203 | +4. Tests using `moto` (the AWS mocking library), so the suite |
| 204 | + needs no network access. |
| 205 | +5. `docs/sync.md` covering the CI snippet + R2/B2/MinIO examples. |
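
A minimal sketch of what the `S3SyncReport` dataclass from step 2 might hold. The fields are assumptions inferred from the `uploaded` and `skipped_secrets` lists in the implementation snippet, and `summary()` is a hypothetical convenience method.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class S3SyncReport:
    """Result of one sync run (hypothetical field set)."""

    target: str
    dry_run: bool
    uploaded: list[str] = field(default_factory=list)
    skipped_secrets: list[str] = field(default_factory=list)

    def summary(self) -> str:
        # In a dry run nothing was sent, so report what would have been uploaded.
        verb = "would upload" if self.dry_run else "uploaded"
        return (
            f"{verb} {len(self.uploaded)} file(s) to {self.target}; "
            f"skipped {len(self.skipped_secrets)} secret file(s)"
        )
```

Keeping `dry_run` on the report lets the CLI print an accurate message without tracking the flag separately.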