Commit eabc0ef

Merge TASK-1066 antigravity provider
2 parents 1c749b1 + 6c42f33 commit eabc0ef

18 files changed

Lines changed: 2151 additions & 15 deletions

Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
# TASK-1066 Antigravity Provider Design

## Goal

Add a Garyx provider backed by the local Antigravity CLI (`agy`) so custom
agents can run through the user's local Antigravity OAuth login. The provider
defaults to `Claude Opus 4.6 (Thinking)`.

The implementation must use the `agy` CLI process and Antigravity transcript
files. It must not use the `google-antigravity` Python SDK because that SDK does
not reuse the local OAuth subscription path.

## CLI Contract

First run:

```text
agy -p "<prompt>" --model "Claude Opus 4.6 (Thinking)" --dangerously-skip-permissions
```

Continuation:

```text
agy -p "<prompt>" --model "Claude Opus 4.6 (Thinking)" --conversation "<id>" --dangerously-skip-permissions
```

`--continue` is not used because it resumes the most recent global
conversation, not the Garyx thread-bound conversation.

`ANTIGRAVITY_CONVERSATION_ID` was tested locally and ignored by the current
CLI. Garyx therefore cannot precompute the transcript path on first run. The
provider discovers the new conversation id after spawn, stores it as
`sdk_session_id`, and uses `--conversation` on later runs.
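
The two invocations differ only in the optional `--conversation` pair, so the argument list could be built by one helper. The following is a minimal sketch, not the actual provider code: `build_agy_args` and `DEFAULT_MODEL` are hypothetical names, and only the flags shown in the commands above are used.

```rust
/// Default model named in this design.
const DEFAULT_MODEL: &str = "Claude Opus 4.6 (Thinking)";

/// Builds the `agy` argument list (hypothetical helper).
/// `conversation_id` stays `None` until the provider has discovered the id
/// and stored it as `sdk_session_id` for the thread.
fn build_agy_args(prompt: &str, model: &str, conversation_id: Option<&str>) -> Vec<String> {
    let mut args = vec![
        "-p".to_string(),
        prompt.to_string(),
        "--model".to_string(),
        model.to_string(),
    ];
    // Continuation runs resume by explicit id; `--continue` is never used.
    if let Some(id) = conversation_id {
        args.push("--conversation".to_string());
        args.push(id.to_string());
    }
    args.push("--dangerously-skip-permissions".to_string());
    args
}
```

Passing each argument as a separate element (for example to `std::process::Command::args`) avoids shell quoting issues with prompts and with the model name, which contains spaces and parentheses.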

## Provider Surface

- Add `ProviderType::Antigravity`.
- Canonical slug: `antigravity`.
- Alias: `agy`.
- Add a built-in custom-agent profile with `agent_id: "antigravity"`.
- Add optional default provider registration in bridge lifecycle for config keys
  `antigravity` and `agy`.
- Add a provider-model catalog with the static `agy models` entries, defaulting
  to `Claude Opus 4.6 (Thinking)`. The Gemini entries can remain selectable for
  users who explicitly choose them, but the recommended/default model stays
  Claude because Gemini Antigravity calls can fail with location-gated Google
  AI Platform errors.
- Reuse `AgentProviderConfig` fields for `workspace_dir`, `default_model`,
  `model`, `timeout_seconds`, and `env`.
- Add provider-specific shared config fields: `antigravity_bin` and optional
  `antigravity_brain_root`. The default brain root is derived from
  `$HOME/.gemini/antigravity-cli/brain`, with an env override allowed for tests.
- Add `AntigravityCliConfig` in `garyx-models/src/provider.rs`.
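
As an illustration of the slug and alias handling in the list above, here is a small sketch. It assumes the real `ProviderType` has other variants (omitted here) and that parsing goes through `FromStr`; the `slug` method and the error type are hypothetical.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProviderType {
    Antigravity,
    // Existing variants of the real enum are omitted in this sketch.
}

impl ProviderType {
    /// Canonical slug used when serializing the provider type.
    fn slug(self) -> &'static str {
        match self {
            ProviderType::Antigravity => "antigravity",
        }
    }
}

impl FromStr for ProviderType {
    type Err = String;

    /// Accepts the canonical slug and the `agy` alias.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "antigravity" | "agy" => Ok(ProviderType::Antigravity),
            other => Err(format!("unknown provider type: {}", other)),
        }
    }
}
```

Both spellings parse to the same variant, while `slug()` always returns the canonical form, so a value read as `agy` round-trips to `antigravity`.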

Custom-agent model tri-state semantics stay unchanged: absent preserves on
update, empty string clears to provider default, and a non-empty value sets the
model.
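
The tri-state rule can be read as a small pure function. This sketch assumes the stored model is an `Option<String>` where `None` means "use the provider default"; `apply_model_update` is a hypothetical name.

```rust
/// Applies an update to a custom agent's stored model (hypothetical helper).
/// `update` is `None` when the field is absent from the request.
fn apply_model_update(current: Option<String>, update: Option<&str>) -> Option<String> {
    match update {
        None => current,                        // absent: preserve existing value
        Some("") => None,                       // empty string: clear to provider default
        Some(value) => Some(value.to_string()), // non-empty: set the model
    }
}
```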

## Run Flow

Implement `garyx-bridge/src/antigravity_provider.rs`.

For each run:

1. Resolve `run_id` from metadata using the same fallback order as the Gemini
   provider.
2. Resolve cwd from run options, provider config, or current directory.
3. Resolve binary from `antigravity_bin`, then `agy`.
4. Resolve model from metadata `model`, config `model`, config `default_model`,
   then built-in default `Claude Opus 4.6 (Thinking)`.
5. Build the prompt with existing Garyx prompt helpers. File/image payloads are
   converted to local attachment instructions using existing staging helpers.
6. Spawn `agy -p <prompt> --model <model> --dangerously-skip-permissions`.
   Add `--conversation <id>` when the thread already has a bound session id,
   add `--add-dir <workspace>` when a workspace is available, and add
   `--log-file <temp-run-log>` to make first-run diagnostics deterministic.
7. Pass `--print-timeout` from the provider timeout so `agy` and Garyx agree on
   the maximum run duration.
8. Merge config env, task CLI env, and provider-specific metadata env
   (`desktop_antigravity_env`) into the child environment.
9. Register the child process by run id for abort/shutdown.
10. Tail transcript rows while the child runs.
11. Emit `SessionBound` once the conversation id is known.
12. On process exit, drain remaining transcript rows, emit `Done`, unregister
    the child, and return `ProviderRunResult`.
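
Steps 4 and 8 are both "first match wins" or "layered" lookups, which can be sketched as below. The function names are hypothetical. Two behaviors are assumptions not stated in this design: empty strings are treated as unset during model resolution, and later env layers override earlier ones.

```rust
use std::collections::BTreeMap;

const BUILT_IN_DEFAULT_MODEL: &str = "Claude Opus 4.6 (Thinking)";

/// Step 4: metadata `model`, then config `model`, then config `default_model`,
/// then the built-in default. Empty values are skipped (assumption).
fn resolve_model(
    metadata_model: Option<&str>,
    config_model: Option<&str>,
    config_default_model: Option<&str>,
) -> String {
    [metadata_model, config_model, config_default_model]
        .iter()
        .copied()
        .flatten()
        .map(str::trim)
        .find(|m| !m.is_empty())
        .unwrap_or(BUILT_IN_DEFAULT_MODEL)
        .to_string()
}

/// Step 8: merge env layers in the listed order (config env, task CLI env,
/// metadata env). Later layers win on key conflicts (assumption).
fn merge_env(layers: &[&BTreeMap<String, String>]) -> BTreeMap<String, String> {
    let mut merged = BTreeMap::new();
    for layer in layers {
        for (key, value) in layer.iter() {
            merged.insert(key.clone(), value.clone());
        }
    }
    merged
}
```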

## Conversation Discovery

Known session id:

```text
$HOME/.gemini/antigravity-cli/brain/<id>/.system_generated/logs/transcript.jsonl
```
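
For a known id, the transcript path is a fixed join under the brain root. A minimal sketch follows, with hypothetical helper names; the brain root is taken as a parameter so that a configured `antigravity_brain_root` or a test override can be passed in.

```rust
use std::path::{Path, PathBuf};

/// Default brain root under the user's home directory.
fn default_brain_root(home: &Path) -> PathBuf {
    home.join(".gemini").join("antigravity-cli").join("brain")
}

/// Compact transcript path for a known conversation id.
fn transcript_path(brain_root: &Path, conversation_id: &str) -> PathBuf {
    brain_root
        .join(conversation_id)
        .join(".system_generated")
        .join("logs")
        .join("transcript.jsonl")
}
```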

Fresh first run:

1. Record run start time before spawning `agy`.
2. Hold a short provider-local fresh-session discovery lock. First runs without
   a known conversation id are the only ambiguous case, and serializing just
   this discovery path avoids cross-claiming two new Antigravity conversations.
3. Prefer parsing the conversation id from the per-run `--log-file` if the CLI
   writes it there. If not present, scan
   `$HOME/.gemini/antigravity-cli/conversations/*.db` for files created or
   modified after run start.
4. Prefer the newest candidate whose matching brain log path exists and whose
   turn-0 `USER_INPUT` matches the prompt envelope for this run.
5. Once found, emit `SessionBound { sdk_session_id: <id> }` and persist the id
   in the provider's thread session map.
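
The candidate selection in steps 3 and 4 can be sketched as a filter over already-collected candidates. The `Candidate` struct and its fields are hypothetical: in practice they would be filled from file metadata and from the first transcript row. The sketch treats the brain-log and prompt checks as hard filters, which is one possible reading of "prefer".

```rust
use std::time::SystemTime;

/// One conversation file found during the scan (hypothetical shape).
struct Candidate {
    id: String,
    modified: SystemTime,
    brain_log_exists: bool,
    /// Turn-0 `USER_INPUT` text, if the transcript could be read.
    first_user_input: Option<String>,
}

/// Picks the newest candidate touched after run start whose brain log exists
/// and whose first user input equals this run's prompt envelope.
fn pick_conversation<'a>(
    candidates: &'a [Candidate],
    run_start: SystemTime,
    prompt_envelope: &str,
) -> Option<&'a Candidate> {
    candidates
        .iter()
        .filter(|c| c.modified >= run_start)
        .filter(|c| c.brain_log_exists)
        .filter(|c| c.first_user_input.as_deref() == Some(prompt_envelope))
        .max_by_key(|c| c.modified)
}
```

Matching on the prompt envelope is what keeps two concurrent fresh runs from claiming each other's conversation even if the discovery lock is ever bypassed.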

If `--conversation <id>` fails with a session-not-found style error, clear the
thread's session id and retry once as a fresh conversation. Do not retry on
ordinary model/tool/runtime failures.
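
The retry rule can be sketched as a wrapper around a single run attempt. `RunError` and `run_with_session_retry` are hypothetical names, and how a session-not-found error is detected from the CLI output is left out of the sketch.

```rust
#[derive(Debug, PartialEq)]
enum RunError {
    SessionNotFound,
    Other(String),
}

/// Runs once with the bound session id. If that fails because the session is
/// unknown, clears the id and retries exactly once as a fresh conversation.
fn run_with_session_retry<F>(
    session_id: &mut Option<String>,
    mut run: F,
) -> Result<String, RunError>
where
    F: FnMut(Option<&str>) -> Result<String, RunError>,
{
    let first = run(session_id.as_deref());
    match first {
        Err(RunError::SessionNotFound) if session_id.is_some() => {
            *session_id = None;
            run(None)
        }
        // Success and ordinary model/tool/runtime failures are returned as-is.
        other => other,
    }
}
```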

Forking is out of scope for the first implementation. If
`sdk_session_fork=true` is requested, return `SessionError` explaining that
Antigravity exposes resume-by-id but no safe local fork primitive.

## Transcript Reader

Read compact `transcript.jsonl` first. When the corresponding
`transcript_full.jsonl` exists, keep it available for row replacement. Local
evidence showed compact and full files can differ slightly in encoding or
formatting, and not every row carries a stable `is_truncated` flag, so the
reader should replace a compact row from full only when:

- compact row has `is_truncated: true`, or
- compact row is missing `content`, `thinking`, `tool_calls`, or `error` that
  exists on the full row with the same `step_index`.

Rows are processed once by `step_index`; before a continuation run starts, record
the current max `step_index` and emit only rows with `step_index` greater than
that baseline. If a row is rewritten while being tailed, the later complete
parse wins before emission.
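
The replacement rule and the baseline filter can be sketched over already-parsed rows. The `Row` struct is a hypothetical simplification: JSON parsing is omitted, "missing" is modeled as `None`, and `tool_calls` is kept as an opaque string.

```rust
#[derive(Debug, Clone, Default, PartialEq)]
struct Row {
    step_index: u64,
    is_truncated: bool,
    content: Option<String>,
    thinking: Option<String>,
    tool_calls: Option<String>,
    error: Option<String>,
}

/// True when the compact row should be replaced by the full row.
fn should_replace(compact: &Row, full: &Row) -> bool {
    if compact.step_index != full.step_index {
        return false;
    }
    let missing = |c: &Option<String>, f: &Option<String>| c.is_none() && f.is_some();
    compact.is_truncated
        || missing(&compact.content, &full.content)
        || missing(&compact.thinking, &full.thinking)
        || missing(&compact.tool_calls, &full.tool_calls)
        || missing(&compact.error, &full.error)
}

/// Keeps rows above the continuation baseline and swaps in full rows where
/// the rule above applies. `baseline` is `None` on a fresh conversation.
fn rows_to_emit(compact: Vec<Row>, full: &[Row], baseline: Option<u64>) -> Vec<Row> {
    compact
        .into_iter()
        .filter(|row| baseline.map_or(true, |b| row.step_index > b))
        .map(|row| {
            let replacement = full.iter().find(|f| should_replace(&row, f)).cloned();
            replacement.unwrap_or(row)
        })
        .collect()
}
```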

## Event Mapping

| Antigravity row | Garyx behavior |
| --- | --- |
| `USER_INPUT` | Skip. Garyx already persists the user row. |
| `CONVERSATION_HISTORY` | Skip. |
| `SYSTEM_MESSAGE` | Skip. |
| `CHECKPOINT` | Skip. Checkpoints contain internal summaries and may include local transcript paths. |
| `PLANNER_RESPONSE.thinking` | Store in assistant message metadata as provider reasoning, but do not stream as visible text. |
| `PLANNER_RESPONSE.content` | Emit `StreamEvent::Delta { text }` and append/merge an assistant `ProviderMessage` with source `antigravity`. |
| `PLANNER_RESPONSE.tool_calls[]` | Emit one `ToolUse` per call. Tool call objects have `name` and `args`; `args` arrives as a JSON string and is parsed when possible. |
| `RUN_COMMAND`, `LIST_DIRECTORY`, and other model-source tool result rows | Emit `ToolResult` with raw row content, tool name from row `type`, timestamp, status-derived `is_error`, and source metadata. |
| `ERROR_MESSAGE.error` | Emit a failed `ToolResult` when the run already has useful context; otherwise surface it as the run error. |
| Unknown user/system rows | Skip with debug logging. |
| Unknown model rows with content/error | Emit `ToolResult` to keep provider activity visible. |

The visible run response is only the concatenation of emitted
`PLANNER_RESPONSE.content` strings. The provider must not surface checkpoint,
history, or system rows as visible assistant text.
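
The table reduces to a dispatch on the row `type`, with a fallback on the row's source for unknown types. The sketch below is illustrative only: `RowSource`, `Action`, and both function names are hypothetical, and unknown model rows without content or error are skipped here because the table does not cover that case.

```rust
#[derive(Debug, PartialEq)]
enum RowSource {
    User,
    System,
    Model,
}

#[derive(Debug, PartialEq)]
enum Action {
    Skip,
    SkipWithDebugLog,
    /// Handle `thinking`, `content`, and `tool_calls` of a planner row.
    PlannerResponse,
    ToolResult,
    Error,
}

fn classify_row(row_type: &str, source: RowSource, has_content_or_error: bool) -> Action {
    match row_type {
        "USER_INPUT" | "CONVERSATION_HISTORY" | "SYSTEM_MESSAGE" | "CHECKPOINT" => Action::Skip,
        "PLANNER_RESPONSE" => Action::PlannerResponse,
        "RUN_COMMAND" | "LIST_DIRECTORY" => Action::ToolResult,
        "ERROR_MESSAGE" => Action::Error,
        // Unknown model rows stay visible as tool results when they carry data.
        _ if source == RowSource::Model && has_content_or_error => Action::ToolResult,
        _ => Action::SkipWithDebugLog,
    }
}

/// The visible response is the concatenation of emitted planner content only.
fn visible_response(planner_contents: &[&str]) -> String {
    planner_contents.concat()
}
```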

Tool result rows do not carry explicit call ids in the observed schema. The
provider should assign synthetic ids to `ToolUse` events and pair result rows
with the most recent unmatched tool call of the same normalized name, falling
back to FIFO order for unknown/parallel cases.
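
One way to implement this pairing is a small queue of unmatched calls. In the sketch below, `ToolPairing` and the `agy-tool-N` id format are hypothetical, and normalization is assumed to mean trimming and lowercasing so that a call named `run_command` matches a `RUN_COMMAND` result row.

```rust
use std::collections::VecDeque;

/// Normalizes tool names for matching (assumed rule: trim and lowercase).
fn normalize(name: &str) -> String {
    name.trim().to_ascii_lowercase()
}

#[derive(Default)]
struct ToolPairing {
    next_id: u64,
    /// Unmatched calls in emission order: (synthetic id, normalized name).
    pending: VecDeque<(String, String)>,
}

impl ToolPairing {
    /// Records a `ToolUse` and returns its synthetic id.
    fn on_tool_use(&mut self, name: &str) -> String {
        self.next_id += 1;
        let id = format!("agy-tool-{}", self.next_id);
        self.pending.push_back((id.clone(), normalize(name)));
        id
    }

    /// Returns the id of the call that a result row should be paired with.
    fn on_tool_result(&mut self, row_type: &str) -> Option<String> {
        let wanted = normalize(row_type);
        // Most recent unmatched call with the same normalized name.
        let position = self.pending.iter().rposition(|(_, name)| *name == wanted);
        let matched = match position {
            Some(index) => self.pending.remove(index),
            // Unknown or ambiguous case: fall back to FIFO order.
            None => self.pending.pop_front(),
        };
        matched.map(|(id, _)| id)
    }
}
```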

## Error And Timeout Handling

- `initialize` runs `agy models`; failure returns `ProviderNotReady`.
- Spawn failure returns an internal spawn error.
- Transcript discovery timeout returns `RunFailed`.
- Invalid partial JSONL reads are retried until the row becomes parseable or
  the child exits.
- Non-zero child exit returns a failed `ProviderRunResult` when transcript
  messages were captured; otherwise it returns `RunFailed` with stderr/stdout
  context.
- Obvious overload/rate-limit strings can map to `BridgeError::Overloaded` when
  no useful partial transcript exists.
- Garyx timeout kills the child and returns `BridgeError::Timeout`.
- `abort(run_id)` kills the child and returns `true` when a child existed.
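
The exit-handling bullets can be sketched as one classification function. `ExitOutcome` is a hypothetical stand-in for the real result and error types, and the overload needles (`overloaded`, `rate limit`, `429`) are assumptions, since this design does not list the exact strings.

```rust
#[derive(Debug, PartialEq)]
enum ExitOutcome {
    Success,
    /// Failed `ProviderRunResult` that still carries captured messages.
    FailedWithMessages,
    /// Would map to `BridgeError::Overloaded`.
    Overloaded,
    /// Would map to `RunFailed` with stderr/stdout context.
    RunFailed(String),
}

fn classify_exit(exit_success: bool, captured_messages: usize, output: &str) -> ExitOutcome {
    if exit_success {
        return ExitOutcome::Success;
    }
    // A partial transcript is more useful than a bare error.
    if captured_messages > 0 {
        return ExitOutcome::FailedWithMessages;
    }
    let lower = output.to_ascii_lowercase();
    let overloaded = ["overloaded", "rate limit", "429"]
        .iter()
        .any(|needle| lower.contains(*needle));
    if overloaded {
        ExitOutcome::Overloaded
    } else {
        ExitOutcome::RunFailed(output.to_string())
    }
}
```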

## Tests

Unit tests:

- `ProviderType` slug/alias round trips for `antigravity` and `agy`.
- Config defaults and builder precedence for model, binary path, workspace,
  env, timeout, and brain-root override.
- Command construction for first run, continuation, `--add-dir`, `--log-file`,
  and print timeout.
- Transcript parser:
  - skips user/history/system/checkpoint rows
  - maps planner content to delta and final response
  - maps planner `tool_calls` to `ToolUse`
  - parses JSON-string tool args when possible
  - maps `RUN_COMMAND` and `LIST_DIRECTORY` to `ToolResult`
  - maps `ERROR_MESSAGE.error`
  - replaces compact row data from full transcript when needed
- Conversation discovery using temp Antigravity home directories.
- Baseline `step_index` filtering for continuation runs.
- Abort removes and kills the registered child.

Focused validation:

```text
cargo test -p garyx-models provider
cargo test -p garyx-bridge antigravity_provider
cargo test -p garyx-gateway provider_models
cargo test -p garyx agent_upsert
cargo check --workspace
```

End-to-end validation:

1. Build/install the patched local CLI when validating through the managed
   gateway.
2. Restart the managed gateway only for installed-binary validation.
3. Create a synthetic custom agent bound to provider `antigravity`.
4. Run `garyx thread create --agent-id <agent>` and `garyx thread send` with a
   deterministic prompt.
5. Confirm the Garyx transcript streams the Claude reply.
6. Confirm the thread stores an Antigravity conversation id as `sdk_session_id`.
7. Send a second message on the same thread and confirm the same Antigravity
   transcript receives the new rows.
8. Run a harmless temporary-directory prompt that triggers listing/command
   activity and confirm Garyx shows tool use and result rows.
