
Commit 4f7cb03

feat(s3-sync): Phase E implementation + ADR-0001
Per docs/adr/0001-s3-auth-design.md (new this commit):

- boto3's default credential chain (env vars > shared creds > IAM role > IMDS), unmodified.
- No implicit auth: caller must set AWS_PROFILE env var, pass --profile NAME, or set AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY.
- `endpoint_url` flag for non-AWS S3 (Cloudflare R2, Backblaze B2, MinIO).
- Secret-pattern files (`.env*`, `*.credentials.json`, ...) filtered via the existing `_is_secret` check.

Library: ClaudeConfig.sync_to_s3(target, profile=None, endpoint_url=None, dry_run=True) -> S3SyncReport.

CLI: `ai-config-kit s3-sync --target s3://bucket/path [--profile NAME] [--endpoint-url URL] [--apply]`. (`sync` was already taken by the git commit+push verb; s3-sync is clearer.)

Audit event `sync_to_s3` records target, file count, bytes transferred, skipped secrets count.

The ADR also documents the GitHub Actions OIDC + AssumeRoleWithWebIdentity flow so CI workflows can sync to S3 without long-lived credentials; no ai-config-kit-specific OIDC code is needed since boto3 picks up the env vars set by `aws-actions/configure-aws-credentials`.
1 parent 2ea7589 commit 4f7cb03

4 files changed

Lines changed: 368 additions & 22 deletions


docs/adr/0001-s3-auth-design.md

Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
# ADR-0001: S3 auth design

**Status**: Accepted (2026-05-16)
**Phase**: SPEC §4 Phase E, cross-machine sync beyond git
**Unblocks**: `ClaudeConfig.sync_to_s3` implementation

## Context

Phase E ships a way to sync the content dir to an S3-compatible
target (AWS S3, Cloudflare R2, Backblaze B2, MinIO, etc.) so
operators on machines without git access can still get the latest
config. The scaffold shipped in v0.6.0 (commit `9976632`); the
upload itself was blocked on three open questions:

1. **Which credentials chain?** AWS has a documented chain (env
   vars → shared credentials file → IAM role → SSO → instance
   metadata). Which subset do we honour, in what order?
2. **Do we ship a default profile or require explicit auth?**
3. **Federated identity (OIDC) for CI?** If `sync_to_s3` runs in
   CI, we want short-lived credentials from OIDC federation, not
   long-lived keys.

## Decision

### 1. Credentials: boto3's default chain, unmodified

boto3's default credential resolution is well-documented and
respects the AWS conventions every operator already knows. In
simplified form, the order is:

```
1. Constructor args (we never pass these; keeps secrets out of code)
2. AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY env vars
3. AWS_PROFILE env var (selects a profile from ~/.aws/credentials)
4. Shared credentials file ~/.aws/credentials (default profile)
5. Container credentials (ECS task role)
6. Instance metadata (EC2 / EKS / similar)
```

We do NOT override the chain. We pass `Session(profile_name=…)`
when the user explicitly names a profile via `--profile` on the
CLI; otherwise we use the default session and let boto3 resolve.

### 2. Required: no implicit profile

We refuse to upload without one of:

- `AWS_PROFILE` env var set
- `--profile NAME` flag passed
- `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` in env

If none of these is present, `sync_to_s3` raises `ConfigError`
with a remediation that points at the boto3 credential docs.

Rationale: silently picking up the `[default]` profile from
`~/.aws/credentials` on a shared machine has surprised more than
one operator into uploading to the wrong account. Explicit > implicit.

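The three conditions above can be expressed as one small predicate. This is a minimal sketch, not the shipped code: the function name `has_explicit_auth` and the injectable `env` mapping are assumptions introduced here so the rule can be tested without touching the real environment.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def has_explicit_auth(profile: str | None, env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the caller named a profile or supplied a full key pair.

    `has_explicit_auth` is a hypothetical helper name, not part of ai-config-kit.
    """
    # A --profile flag or an AWS_PROFILE env var both count as explicit.
    if profile is not None or "AWS_PROFILE" in env:
        return True
    # Static keys only count when both halves of the pair are present.
    return "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env
```

A caller would raise `ConfigError` when this returns `False`, before any boto3 session is created.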
### 3. CI: OIDC via boto3 + AssumeRoleWithWebIdentity

For GitHub Actions, the standard pattern is
`aws-actions/configure-aws-credentials`, which sets
`AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` /
`AWS_SESSION_TOKEN` from STS. boto3 picks those up via #2 of the
default chain. We don't need ai-config-kit-specific OIDC code.

The workflow side gets a documented snippet in `docs/sync.md`:

```yaml
permissions:
  id-token: write
  contents: read
steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/ai-config-sync
      aws-region: us-east-1
  - run: ai-config-kit s3-sync --target s3://my-bucket/config --apply
```

### 4. S3-compatible endpoints

Support non-AWS S3 (R2, B2, MinIO) via the `--endpoint-url`
flag, which maps to `Session.client("s3", endpoint_url=…)`.
Default is AWS; specifying `--endpoint-url` picks any compatible
target.

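As an illustration of what `--endpoint-url` values tend to look like, the sketch below builds endpoint URLs for the three providers named above and assembles the matching CLI invocation. The helper names are hypothetical, and the URL shapes reflect each provider's commonly documented S3 endpoint format; check the provider's own docs for your account before relying on them.

```python
from __future__ import annotations


def r2_endpoint(account_id: str) -> str:
    """Cloudflare R2: one endpoint per account (assumed URL shape)."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def b2_endpoint(region: str) -> str:
    """Backblaze B2 S3-compatible API: one endpoint per region (assumed URL shape)."""
    return f"https://s3.{region}.backblazeb2.com"


def minio_endpoint(host: str = "localhost:9000", tls: bool = False) -> str:
    """MinIO: whatever host:port the server listens on."""
    scheme = "https" if tls else "http"
    return f"{scheme}://{host}"


def s3_sync_argv(target: str, endpoint_url: str | None = None, apply: bool = False) -> list[str]:
    """Assemble an `ai-config-kit s3-sync` command line from its documented flags."""
    argv = ["ai-config-kit", "s3-sync", "--target", target]
    if endpoint_url:
        argv += ["--endpoint-url", endpoint_url]
    if apply:
        argv.append("--apply")
    return argv
```

Leaving `endpoint_url` unset keeps the AWS default, which mirrors the behaviour described in this section.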
### 5. Upload semantics

Upload is `client.upload_file(path, bucket, key)` for each file
under `src_dir`. Symlinks resolve before upload (we want the
target's content, not the symlink path). Files matching the
secret patterns (`*.credentials.json`, `.env*`, etc.) are
filtered out by the existing `_is_secret` check.

A manifest file `.ai-config-sync.json` is written to the bucket
root with the per-file sha256 + content_dir layout so a
downstream `sync_from_s3` (Phase E.2) can diff before
overwriting.

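A rough sketch of the filter-then-hash pass described above, using only the standard library. The pattern tuple, the function names, and the manifest schema (`version` plus a `files` map) are all assumptions made for illustration; the real pattern list lives behind `_is_secret`, and the ADR does not pin down the manifest layout.

```python
from __future__ import annotations

import fnmatch
import hashlib
import json
from pathlib import Path

# Illustrative patterns only; the real list is owned by `_is_secret`.
SECRET_PATTERNS = (".env*", "*.credentials.json")


def is_secret(path: Path, patterns: tuple[str, ...] = SECRET_PATTERNS) -> bool:
    """Match on the file name so secrets are caught in any subdirectory."""
    return any(fnmatch.fnmatch(path.name, pat) for pat in patterns)


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(src_dir: Path) -> dict:
    """Map each non-secret file's POSIX-style relative path to its sha256."""
    files: dict[str, str] = {}
    for path in sorted(src_dir.rglob("*")):
        if not path.is_file() or is_secret(path):
            continue
        # resolve() follows symlinks, so the hash covers the target's content.
        files[path.relative_to(src_dir).as_posix()] = sha256_of(path.resolve())
    return {"version": 1, "files": files}


def dump_manifest(manifest: dict) -> bytes:
    """Serialise deterministically so identical trees give identical bytes."""
    return json.dumps(manifest, indent=2, sort_keys=True).encode("utf-8")
```

Sorting keys and paths keeps the manifest stable between runs, which is what makes a later hash comparison meaningful.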
## Consequences

### What this enables

- `ClaudeConfig.sync_to_s3(target, profile=None, endpoint_url=None,
  dry_run=True)` can be implemented.
- New CLI: `ai-config-kit s3-sync --target s3://bucket/path [--profile NAME]
  [--endpoint-url URL] [--apply]`.
- Operators with multi-machine config get a non-git option.
- CI workflows can keep config in S3 + pull on cold runners.

### What this costs

- Optional `[s3]` extras dep (boto3), already scaffolded.
- A new audit event `sync_to_s3` recording target, file count,
  bytes transferred, and skipped-secrets count.
- Secret-pattern filtering is the only defense against uploading
  `.env` files; we never override the user's secret patterns
  list.

### What this doesn't fix

- `sync_from_s3` (the reverse direction) is Phase E.2 and gets a
  separate ADR. Today's flow is one-way (laptop → S3).
- Conflict resolution: if two machines upload to the same key
  prefix, last-write-wins. A v0.7 enhancement could use the
  manifest's sha256 to detect divergence.

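To make the divergence idea concrete, here is one way a manifest comparison could look. It is only a sketch: `diff_manifests` is a hypothetical name, and it assumes each manifest reduces to a `{relative_path: sha256}` mapping, which the ADR does not specify.

```python
from __future__ import annotations


def diff_manifests(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {relative_path: sha256} maps and bucket the differences."""
    shared = local.keys() & remote.keys()
    return {
        # Present locally but never uploaded.
        "only_local": sorted(local.keys() - remote.keys()),
        # Present in the bucket but missing locally.
        "only_remote": sorted(remote.keys() - local.keys()),
        # Same path on both sides with different content.
        "changed": sorted(k for k in shared if local[k] != remote[k]),
    }
```

A comparison like this only reports that two sides differ. Deciding which side is newer would need extra information, such as the manifest from the last successful sync.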
## Alternatives considered

### Vendored credential reading (rejected)

Read `~/.aws/credentials` ourselves + handle each backend
explicitly.

**Why rejected:** reimplementing boto3's chain is fragile and
makes future credential types (SSO refresh, IMDSv2, etc.) our
problem. boto3 already handles this correctly.

### git annex / rclone / rsync wrappers (rejected)

Wrap rclone or git-annex for the sync.

**Why rejected:** adds a system-level binary dep that's painful
to install on default macOS, Windows, and Fedora Linux setups.
boto3 is pure-Python and bundles cleanly with `pip install
'ai-config-kit[s3]'`.

### Mandatory `--profile` (no env-var support) (rejected)

Force every operator to use a named profile.

**Why rejected:** breaks the CI flow where credentials come from
the `configure-aws-credentials` action via env vars.

## Implementation

```python
import os  # module-level import; the auth check below reads os.environ


def sync_to_s3(
    self,
    target: str,
    profile: str | None = None,
    endpoint_url: str | None = None,
    dry_run: bool = True,
) -> S3SyncReport:
    if not target.startswith("s3://"):
        raise ConfigError(f"sync target must be s3://bucket/key, got {target!r}")

    # Refuse implicit auth: explicit profile or explicit env vars only.
    if profile is None and "AWS_PROFILE" not in os.environ and (
        "AWS_ACCESS_KEY_ID" not in os.environ
        or "AWS_SECRET_ACCESS_KEY" not in os.environ
    ):
        raise ConfigError(
            "no AWS auth detected; set AWS_PROFILE, pass --profile NAME, "
            "or set AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY"
        )

    import boto3  # lazy import: boto3 ships only with the optional [s3] extra
    session = boto3.Session(profile_name=profile) if profile else boto3.Session()
    client = session.client("s3", endpoint_url=endpoint_url)

    bucket, _, key_prefix = target[5:].partition("/")
    uploaded: list[str] = []
    skipped_secrets: list[str] = []
    for path in self._files_to_track(include_overlays=True):
        if self._is_secret(path):
            skipped_secrets.append(str(path.relative_to(self.src_dir)))
            continue
        rel = path.relative_to(self.src_dir).as_posix()
        key = f"{key_prefix.rstrip('/')}/{rel}" if key_prefix else rel
        if not dry_run:
            client.upload_file(str(path), bucket, key)
        uploaded.append(rel)  # recorded in dry-run too, as "would upload"
    # Manifest + audit event after the loop.
    ...
```

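The target-splitting and key-building steps in the snippet above can be pulled out and exercised on their own. The sketch below is one possible factoring, under stated assumptions: both function names are hypothetical, `ValueError` stands in for the project's `ConfigError`, and stripping slashes from both ends of the prefix is a slightly stricter choice than the `rstrip('/')` in the snippet.

```python
from __future__ import annotations

S3_SCHEME = "s3://"


def parse_s3_target(target: str) -> tuple[str, str]:
    """Split `s3://bucket/key-prefix` into (bucket, key_prefix)."""
    if not target.startswith(S3_SCHEME):
        raise ValueError(f"sync target must be s3://bucket/key, got {target!r}")
    bucket, _, key_prefix = target[len(S3_SCHEME):].partition("/")
    if not bucket:
        raise ValueError(f"sync target has no bucket name: {target!r}")
    # Strip slashes on both ends so "s3://b/" and "s3://b//" mean "no prefix".
    return bucket, key_prefix.strip("/")


def object_key(key_prefix: str, rel: str) -> str:
    """Join a normalised prefix and a POSIX-style relative path into an object key."""
    return f"{key_prefix}/{rel}" if key_prefix else rel
```

Normalising the prefix once at parse time avoids keys that start with `/` when the target is written with trailing slashes.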
## Next steps

1. Land the `sync_to_s3` implementation per the snippet above.
2. Add `S3SyncReport` dataclass.
3. CLI: `ai-config-kit s3-sync --target s3://... [--profile NAME]
   [--endpoint-url URL] [--apply]`.
4. Tests using `moto` (the AWS mocking library), so the tests
   need no network access.
5. `docs/sync.md` covering the CI snippet + R2/B2/MinIO examples.
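
For step 2, a report type could look roughly like the following. The commit confirms only that `S3SyncReport` exists, that it has a `summary()` method, and that it exposes `skipped_secrets`; every other field, and the exact summary wording, is an assumption made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class S3SyncReport:
    """Outcome of one sync_to_s3 call (field set is illustrative)."""

    target: str
    dry_run: bool
    uploaded: list[str] = field(default_factory=list)
    skipped_secrets: list[str] = field(default_factory=list)
    bytes_transferred: int = 0  # assumed field, mirroring the audit event

    def summary(self) -> str:
        # Dry runs list what would be sent; nothing has been transferred yet.
        verb = "would upload" if self.dry_run else "uploaded"
        return (
            f"s3-sync {self.target}: {verb} {len(self.uploaded)} file(s), "
            f"{self.bytes_transferred} bytes, "
            f"skipped {len(self.skipped_secrets)} secret(s)"
        )
```

Keeping `dry_run` on the report lets the summary line state clearly whether anything actually left the machine.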

src/ai_config_kit/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -55,6 +55,7 @@
     ReconcileReport,
     RepairAction,
     RepairReport,
+    S3SyncReport,
     SettingsMigrateReport,
     SettingsValidateReport,
     StatusReport,
@@ -102,6 +103,7 @@
     "ReconcileReport",
     "RepairAction",
     "RepairReport",
+    "S3SyncReport",
     "SettingsMigrateReport",
     "SettingsValidateReport",
     "StatusReport",

src/ai_config_kit/cli.py

Lines changed: 43 additions & 0 deletions
@@ -288,6 +288,35 @@ def _build_parser() -> argparse.ArgumentParser:
         help="Actually write. Default is dry-run.",
     )
 
+    # s3-sync (Phase E: S3-compatible upload)
+    p_s3sync = sub.add_parser(
+        "s3-sync",
+        help="Sync the content dir to an S3-compatible target (Phase E).",
+    )
+    p_s3sync.add_argument(
+        "--target",
+        type=str,
+        required=True,
+        help="S3 URI: s3://bucket/key-prefix",
+    )
+    p_s3sync.add_argument(
+        "--profile",
+        type=str,
+        default=None,
+        help="Named AWS profile from ~/.aws/credentials.",
+    )
+    p_s3sync.add_argument(
+        "--endpoint-url",
+        type=str,
+        default=None,
+        help="Non-AWS S3-compatible endpoint (R2, B2, MinIO).",
+    )
+    p_s3sync.add_argument(
+        "--apply",
+        action="store_true",
+        help="Actually upload. Default is dry-run.",
+    )
+
     # profiles (Claude Code permission profiles)
     p_profiles = sub.add_parser(
         "profiles",
@@ -671,6 +700,20 @@ def main(argv: list[str] | None = None) -> int:
         print(sm_report.summary())
         return 0
 
+    if args.cmd == "s3-sync":
+        s3_report = cfg.sync_to_s3(
+            target=args.target,
+            profile=args.profile,
+            endpoint_url=args.endpoint_url,
+            dry_run=not args.apply,
+        )
+        print(s3_report.summary())
+        if not args.quiet and s3_report.skipped_secrets:
+            print(" skipped secrets:")
+            for s in s3_report.skipped_secrets:
+                print(f" - {s}")
+        return 0
+
     if args.cmd == "profiles":
         if args.profiles_cmd == "list":
             pl_report = cfg.profiles_list()
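
The `--apply` to `dry_run` inversion in the hunk above is easy to get backwards, so a small standalone check may help. This sketch rebuilds only the `s3-sync` subparser with the same flags; `build_parser` and `sync_kwargs` are hypothetical names, and the real parser has many more subcommands.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Minimal stand-in for the real parser: only the s3-sync subcommand."""
    parser = argparse.ArgumentParser(prog="ai-config-kit")
    sub = parser.add_subparsers(dest="cmd", required=True)
    p_s3sync = sub.add_parser("s3-sync")
    p_s3sync.add_argument("--target", type=str, required=True)
    p_s3sync.add_argument("--profile", type=str, default=None)
    p_s3sync.add_argument("--endpoint-url", type=str, default=None)
    p_s3sync.add_argument("--apply", action="store_true")
    return parser


def sync_kwargs(args: argparse.Namespace) -> dict:
    """Translate parsed flags into sync_to_s3 keyword arguments."""
    return {
        "target": args.target,
        "profile": args.profile,
        # argparse turns --endpoint-url into the attribute endpoint_url.
        "endpoint_url": args.endpoint_url,
        # Dry-run is the default; only an explicit --apply uploads.
        "dry_run": not args.apply,
    }
```

With no `--apply` on the command line, `dry_run` comes out `True`, matching the library default of `dry_run=True`.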
