| 1 | +# TASK-1066 Antigravity Provider Design |
| 2 | + |
| 3 | +## Goal |
| 4 | + |
| 5 | +Add a Garyx provider backed by the local Antigravity CLI (`agy`) so custom |
| 6 | +agents can run through the user's local Antigravity OAuth login. The provider |
| 7 | +defaults to `Claude Opus 4.6 (Thinking)`. |
| 8 | + |
| 9 | +The implementation must use the `agy` CLI process and Antigravity transcript |
| 10 | +files. It must not use the `google-antigravity` Python SDK because that SDK does |
| 11 | +not reuse the local OAuth subscription path. |
| 12 | + |
| 13 | +## CLI Contract |
| 14 | + |
| 15 | +First run: |
| 16 | + |
| 17 | +```text |
| 18 | +agy -p "<prompt>" --model "Claude Opus 4.6 (Thinking)" --dangerously-skip-permissions |
| 19 | +``` |
| 20 | + |
| 21 | +Continuation: |
| 22 | + |
| 23 | +```text |
| 24 | +agy -p "<prompt>" --model "Claude Opus 4.6 (Thinking)" --conversation "<id>" --dangerously-skip-permissions |
| 25 | +``` |
| 26 | + |
| 27 | +`--continue` is not used because it resumes the most recent global |
| 28 | +conversation, not the Garyx thread-bound conversation. |
| 29 | + |
| 30 | +Setting `ANTIGRAVITY_CONVERSATION_ID` was tested locally, and the current CLI |
| 31 | +ignores it. Garyx therefore cannot precompute the transcript path on first run. The |
| 32 | +provider discovers the new conversation id after spawn, stores it as |
| 33 | +`sdk_session_id`, and uses `--conversation` on later runs. |
| 34 | + |
| 35 | +## Provider Surface |
| 36 | + |
| 37 | +- Add `ProviderType::Antigravity`. |
| 38 | +- Canonical slug: `antigravity`. |
| 39 | +- Alias: `agy`. |
| 40 | +- Add a built-in custom-agent profile with `agent_id: "antigravity"`. |
| 41 | +- Add optional default provider registration in bridge lifecycle for config keys |
| 42 | + `antigravity` and `agy`. |
| 43 | +- Add a provider-model catalog with the static `agy models` entries, defaulting |
| 44 | + to `Claude Opus 4.6 (Thinking)`. The Gemini entries can remain selectable for |
| 45 | + users who explicitly choose them, but the recommended/default model stays |
| 46 | + Claude because Gemini Antigravity calls can fail with location-gated Google |
| 47 | + AI Platform errors. |
| 48 | +- Reuse `AgentProviderConfig` fields for `workspace_dir`, `default_model`, |
| 49 | + `model`, `timeout_seconds`, and `env`. |
| 50 | +- Add provider-specific shared config fields: `antigravity_bin` and optional |
| 51 | + `antigravity_brain_root`. The default brain root is derived from |
| 52 | + `$HOME/.gemini/antigravity-cli/brain`, with an env override allowed for tests. |
| 53 | +- Add `AntigravityCliConfig` in `garyx-models/src/provider.rs`. |
| 54 | + |
| 55 | +Custom-agent model tri-state semantics stay unchanged: absent preserves on |
| 56 | +update, empty string clears to provider default, and a non-empty value sets the |
| 57 | +model. |
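
The tri-state rule above can be read as a small pure function. This is a minimal sketch, not the existing Garyx code: the name `apply_model_update` and the use of `Option<&str>` for the incoming field are assumptions made for illustration.

```rust
/// Hypothetical helper: applies a custom-agent `model` update.
/// `update` is `None` when the field is absent from the request.
/// The return value is the stored override; `None` means "use the provider default".
fn apply_model_update(current: Option<String>, update: Option<&str>) -> Option<String> {
    match update {
        // Absent: preserve whatever is already stored.
        None => current,
        // Empty string: clear the override so the provider default applies.
        Some("") => None,
        // Non-empty value: set the model.
        Some(model) => Some(model.to_string()),
    }
}
```

Keeping "absent" and "empty" as distinct inputs is what lets an update leave the model untouched without also being able to clear it by accident.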
| 58 | + |
| 59 | +## Run Flow |
| 60 | + |
| 61 | +Implement `garyx-bridge/src/antigravity_provider.rs`. |
| 62 | + |
| 63 | +For each run: |
| 64 | + |
| 65 | +1. Resolve `run_id` from metadata using the same fallback order as the Gemini |
| 66 | + provider. |
| 67 | +2. Resolve cwd from run options, provider config, or current directory. |
| 68 | +3. Resolve binary from `antigravity_bin`, then `agy`. |
| 69 | +4. Resolve model from metadata `model`, config `model`, config `default_model`, |
| 70 | + then built-in default `Claude Opus 4.6 (Thinking)`. |
| 71 | +5. Build the prompt with existing Garyx prompt helpers. File/image payloads are |
| 72 | + converted to local attachment instructions using existing staging helpers. |
| 73 | +6. Spawn `agy -p <prompt> --model <model> --dangerously-skip-permissions`. |
| 74 | + Add `--conversation <id>` when the thread already has a bound session id, |
| 75 | + add `--add-dir <workspace>` when a workspace is available, and add |
| 76 | + `--log-file <temp-run-log>` to make first-run diagnostics deterministic. |
| 77 | +7. Pass `--print-timeout` from the provider timeout so `agy` and Garyx agree on |
| 78 | + the maximum run duration. |
| 79 | +8. Merge config env, task CLI env, and provider-specific metadata env |
| 80 | + (`desktop_antigravity_env`) into the child environment. |
| 81 | +9. Register the child process by run id for abort/shutdown. |
| 82 | +10. Tail transcript rows while the child runs. |
| 83 | +11. Emit `SessionBound` once the conversation id is known. |
| 84 | +12. On process exit, drain remaining transcript rows, emit `Done`, unregister |
| 85 | + the child, and return `ProviderRunResult`. |
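
Steps 4, 6, and 7 can be sketched as plain argument construction. The `RunSpec` struct and its field names are hypothetical, and the value format passed to `--print-timeout` (whole seconds) is an assumption that this design does not confirm.

```rust
/// Built-in default model named in this design.
const DEFAULT_MODEL: &str = "Claude Opus 4.6 (Thinking)";

/// Hypothetical per-run inputs; field names are illustrative only.
struct RunSpec<'a> {
    prompt: &'a str,
    metadata_model: Option<&'a str>,
    config_model: Option<&'a str>,
    config_default_model: Option<&'a str>,
    conversation_id: Option<&'a str>,
    workspace: Option<&'a str>,
    log_file: &'a str,
    timeout_seconds: u64,
}

/// Step 4: metadata `model`, config `model`, config `default_model`, built-in default.
fn resolve_model<'a>(spec: &RunSpec<'a>) -> &'a str {
    spec.metadata_model
        .or(spec.config_model)
        .or(spec.config_default_model)
        .unwrap_or(DEFAULT_MODEL)
}

/// Steps 6 and 7: build the argument list passed to the `agy` binary.
fn build_args(spec: &RunSpec) -> Vec<String> {
    let mut args = vec![
        "-p".to_string(),
        spec.prompt.to_string(),
        "--model".to_string(),
        resolve_model(spec).to_string(),
        "--dangerously-skip-permissions".to_string(),
    ];
    if let Some(id) = spec.conversation_id {
        args.push("--conversation".to_string());
        args.push(id.to_string());
    }
    if let Some(workspace) = spec.workspace {
        args.push("--add-dir".to_string());
        args.push(workspace.to_string());
    }
    args.push("--log-file".to_string());
    args.push(spec.log_file.to_string());
    // Assumption: the timeout is passed as whole seconds.
    args.push("--print-timeout".to_string());
    args.push(spec.timeout_seconds.to_string());
    args
}
```

Returning a `Vec<String>` instead of a spawned process keeps command construction testable without the `agy` binary installed.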
| 86 | + |
| 87 | +## Conversation Discovery |
| 88 | + |
| 89 | +Known session id: |
| 90 | + |
| 91 | +```text |
| 92 | +$HOME/.gemini/antigravity-cli/brain/<id>/.system_generated/logs/transcript.jsonl |
| 93 | +``` |
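
For a known session id, the path above can be derived from the brain root. A minimal sketch; the function names `default_brain_root` and `transcript_path` are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Default brain root under the user's home directory.
fn default_brain_root(home: &Path) -> PathBuf {
    home.join(".gemini").join("antigravity-cli").join("brain")
}

/// Compact transcript path for a known conversation id.
fn transcript_path(brain_root: &Path, conversation_id: &str) -> PathBuf {
    brain_root
        .join(conversation_id)
        .join(".system_generated")
        .join("logs")
        .join("transcript.jsonl")
}
```

Taking the brain root as a parameter is what allows tests to point the provider at a temporary directory.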
| 94 | + |
| 95 | +Fresh first run: |
| 96 | + |
| 97 | +1. Record run start time before spawning `agy`. |
| 98 | +2. Hold a short provider-local fresh-session discovery lock. First runs without |
| 99 | + a known conversation id are the only ambiguous case, and serializing just |
| 100 | + this discovery path avoids cross-claiming two new Antigravity conversations. |
| 101 | +3. Prefer parsing the conversation id from the per-run `--log-file` if the CLI |
| 102 | + writes it there. If not present, scan |
| 103 | + `$HOME/.gemini/antigravity-cli/conversations/*.db` for files created or |
| 104 | + modified after run start. |
| 105 | +4. Prefer the newest candidate whose matching brain log path exists and whose |
| 106 | + turn-0 `USER_INPUT` matches the prompt envelope for this run. |
| 107 | +5. Once found, emit `SessionBound { sdk_session_id: <id> }` and persist the id |
| 108 | + in the provider's thread session map. |
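
The candidate selection in steps 3 and 4 can be sketched as a pure function over already-collected facts. The `Candidate` struct is hypothetical, and this sketch treats both "prefer" conditions as hard filters, which is stricter than the wording above.

```rust
use std::time::SystemTime;

/// Hypothetical summary of one conversation database found during the scan.
struct Candidate {
    id: String,
    modified: SystemTime,
    brain_log_exists: bool,
    turn0_matches_prompt: bool,
}

/// Picks the newest candidate touched since `run_start` whose brain log exists
/// and whose turn-0 user input matches this run's prompt envelope.
fn pick_conversation(candidates: &[Candidate], run_start: SystemTime) -> Option<&str> {
    candidates
        .iter()
        // Inclusive comparison tolerates coarse file timestamps (an assumption).
        .filter(|c| c.modified >= run_start)
        .filter(|c| c.brain_log_exists && c.turn0_matches_prompt)
        .max_by_key(|c| c.modified)
        .map(|c| c.id.as_str())
}
```

The prompt match is what keeps two concurrent fresh runs from claiming each other's conversation if the discovery lock is ever bypassed.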
| 109 | + |
| 110 | +If `--conversation <id>` fails with a session-not-found style error, clear the |
| 111 | +thread's session id and retry once as a fresh conversation. Do not retry on |
| 112 | +ordinary model/tool/runtime failures. |
| 113 | + |
| 114 | +Forking is out of scope for the first implementation. If |
| 115 | +`sdk_session_fork=true` is requested, return `SessionError` explaining that |
| 116 | +Antigravity exposes resume-by-id but no safe local fork primitive. |
| 117 | + |
| 118 | +## Transcript Reader |
| 119 | + |
| 120 | +Read compact `transcript.jsonl` first. When the corresponding |
| 121 | +`transcript_full.jsonl` exists, keep it available for row replacement. Local |
| 122 | +evidence showed that compact and full files can differ slightly in encoding and |
| 123 | +formatting, and that not every row carries a reliable `is_truncated` flag, so the |
| 124 | +reader should replace a compact row with the full row only when: |
| 125 | + |
| 126 | +- compact row has `is_truncated: true`, or |
| 127 | +- compact row is missing `content`, `thinking`, `tool_calls`, or `error` that |
| 128 | + exists on the full row with the same `step_index`. |
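
The two replacement conditions can be sketched as follows. The `Row` struct is a simplified, hypothetical stand-in for a parsed transcript row; `tool_calls` is kept as raw text here to avoid depending on a JSON library.

```rust
/// Hypothetical parsed transcript row (simplified).
#[derive(Debug, Clone, PartialEq, Default)]
struct Row {
    step_index: u64,
    is_truncated: bool,
    content: Option<String>,
    thinking: Option<String>,
    tool_calls: Option<String>, // raw JSON text in this sketch
    error: Option<String>,
}

/// True when the compact row should be replaced by the full row.
fn should_replace(compact: &Row, full: &Row) -> bool {
    if compact.step_index != full.step_index {
        return false;
    }
    if compact.is_truncated {
        return true;
    }
    // A field counts as missing only if the full row actually has it.
    let missing = |c: &Option<String>, f: &Option<String>| c.is_none() && f.is_some();
    missing(&compact.content, &full.content)
        || missing(&compact.thinking, &full.thinking)
        || missing(&compact.tool_calls, &full.tool_calls)
        || missing(&compact.error, &full.error)
}

/// Returns the row to emit for this step.
fn merge_row(compact: Row, full: Option<&Row>) -> Row {
    match full {
        Some(f) if should_replace(&compact, f) => f.clone(),
        _ => compact,
    }
}
```

Rows that differ only in encoding or formatting stay on the compact version, so small differences between the two files do not cause spurious replacements.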
| 129 | + |
| 130 | +Rows are processed once by `step_index`; before a continuation run starts, record |
| 131 | +the current max `step_index` and emit only rows with `step_index` greater than |
| 132 | +that baseline. If a row is rewritten while being tailed, the later complete |
| 133 | +parse wins before emission. |
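
The baseline rule can be sketched with a small cursor that is created before the continuation run starts. `StepCursor` is a hypothetical name, not an existing Garyx type.

```rust
use std::collections::BTreeSet;

/// Hypothetical cursor that decides which transcript rows to emit.
struct StepCursor {
    /// Highest `step_index` present before this run started, if any.
    baseline: Option<u64>,
    /// Step indices already emitted during this run.
    emitted: BTreeSet<u64>,
}

impl StepCursor {
    /// `existing` holds the step indices already in the transcript.
    fn new(existing: &[u64]) -> Self {
        StepCursor {
            baseline: existing.iter().copied().max(),
            emitted: BTreeSet::new(),
        }
    }

    /// Returns true exactly once for each new step above the baseline.
    fn should_emit(&mut self, step_index: u64) -> bool {
        if let Some(baseline) = self.baseline {
            if step_index <= baseline {
                return false;
            }
        }
        // `insert` returns false when the index was already emitted.
        self.emitted.insert(step_index)
    }
}
```

A fresh conversation has no baseline, so every row, including `step_index` 0, is eligible.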
| 134 | + |
| 135 | +## Event Mapping |
| 136 | + |
| 137 | +| Antigravity row | Garyx behavior | |
| 138 | +| --- | --- | |
| 139 | +| `USER_INPUT` | Skip. Garyx already persists the user row. | |
| 140 | +| `CONVERSATION_HISTORY` | Skip. | |
| 141 | +| `SYSTEM_MESSAGE` | Skip. | |
| 142 | +| `CHECKPOINT` | Skip. Checkpoints contain internal summaries and may include local transcript paths. | |
| 143 | +| `PLANNER_RESPONSE.thinking` | Store in assistant message metadata as provider reasoning, but do not stream as visible text. | |
| 144 | +| `PLANNER_RESPONSE.content` | Emit `StreamEvent::Delta { text }` and append/merge an assistant `ProviderMessage` with source `antigravity`. | |
| 145 | +| `PLANNER_RESPONSE.tool_calls[]` | Emit one `ToolUse` per call. Tool call objects have `name` and `args`; `args` arrives as a JSON string and is parsed into structured JSON when possible. | |
| 146 | +| `RUN_COMMAND`, `LIST_DIRECTORY`, and other model-source tool result rows | Emit `ToolResult` with raw row content, tool name from row `type`, timestamp, status-derived `is_error`, and source metadata. | |
| 147 | +| `ERROR_MESSAGE.error` | Emit a failed `ToolResult` when the run already has useful context; otherwise surface it as the run error. | |
| 148 | +| Unknown user/system rows | Skip with debug logging. | |
| 149 | +| Unknown model rows with content/error | Emit `ToolResult` to keep provider activity visible. | |
| 150 | + |
| 151 | +The visible run response is only the concatenation of emitted |
| 152 | +`PLANNER_RESPONSE.content` strings. The provider must not surface checkpoint, |
| 153 | +history, or system rows as visible assistant text. |
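
The mapping table and the visible-response rule can be sketched as a row classifier plus a concatenation. `RowAction`, `classify_row`, and the `from_model` flag are hypothetical; the sketch also omits the content/error check on unknown model rows.

```rust
/// Hypothetical high-level action for one transcript row.
#[derive(Debug, PartialEq)]
enum RowAction {
    Skip,
    Planner,
    ToolResult,
    Error,
}

/// Classifies a row by its `type`; `from_model` marks model-source rows.
fn classify_row(row_type: &str, from_model: bool) -> RowAction {
    match row_type {
        "USER_INPUT" | "CONVERSATION_HISTORY" | "SYSTEM_MESSAGE" | "CHECKPOINT" => RowAction::Skip,
        "PLANNER_RESPONSE" => RowAction::Planner,
        "ERROR_MESSAGE" => RowAction::Error,
        // RUN_COMMAND, LIST_DIRECTORY, and unknown model rows stay visible as tool results.
        _ if from_model => RowAction::ToolResult,
        // Unknown user/system rows are skipped.
        _ => RowAction::Skip,
    }
}

/// Visible response: concatenation of planner content only.
/// Each entry is `(row type, content)`.
fn visible_response(rows: &[(&str, &str)]) -> String {
    rows.iter()
        .filter(|(row_type, _)| *row_type == "PLANNER_RESPONSE")
        .map(|(_, content)| *content)
        .collect()
}
```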
| 154 | + |
| 155 | +Tool result rows do not carry explicit call ids in the observed schema. The |
| 156 | +provider should assign synthetic ids to `ToolUse` events and pair result rows |
| 157 | +with the most recent unmatched tool call of the same normalized name, falling |
| 158 | +back to FIFO order for unknown/parallel cases. |
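
The pairing rule can be sketched with a list of pending calls. The `ToolPairing` type, the `agy-tool-N` id format, and the normalization (lowercase, alphanumerics only) are all assumptions for illustration.

```rust
/// A tool call that has not yet been matched to a result row.
struct PendingCall {
    id: String,
    name: String,
}

/// Hypothetical tracker that assigns synthetic ids and pairs results with calls.
struct ToolPairing {
    next_id: u64,
    pending: Vec<PendingCall>,
}

/// Assumed normalization: lowercase, alphanumerics only ("RUN_COMMAND" -> "runcommand").
fn normalize(name: &str) -> String {
    name.chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

impl ToolPairing {
    fn new() -> Self {
        ToolPairing { next_id: 0, pending: Vec::new() }
    }

    /// Registers a `ToolUse` and returns its synthetic id.
    fn register_call(&mut self, name: &str) -> String {
        self.next_id += 1;
        let id = format!("agy-tool-{}", self.next_id);
        self.pending.push(PendingCall { id: id.clone(), name: normalize(name) });
        id
    }

    /// Pairs a result row with the most recent unmatched call of the same
    /// normalized name, falling back to the oldest pending call (FIFO).
    fn match_result(&mut self, row_type: &str) -> Option<String> {
        let wanted = normalize(row_type);
        let by_name = self.pending.iter().rposition(|c| c.name == wanted);
        let fifo = if self.pending.is_empty() { None } else { Some(0) };
        by_name.or(fifo).map(|i| self.pending.remove(i).id)
    }
}
```

A result that arrives with no pending call yields `None`, leaving the caller to decide whether to emit it unpaired.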
| 159 | + |
| 160 | +## Error And Timeout Handling |
| 161 | + |
| 162 | +- `initialize` runs `agy models`; failure returns `ProviderNotReady`. |
| 163 | +- Spawn failure returns an internal spawn error. |
| 164 | +- Transcript discovery timeout returns `RunFailed`. |
| 165 | +- Invalid partial JSONL reads are retried until the row becomes parseable or |
| 166 | + the child exits. |
| 167 | +- Non-zero child exit returns a failed `ProviderRunResult` when transcript |
| 168 | + messages were captured; otherwise it returns `RunFailed` with stderr/stdout |
| 169 | + context. |
| 170 | +- Obvious overload/rate-limit strings can map to `BridgeError::Overloaded` when |
| 171 | + no useful partial transcript exists. |
| 172 | +- Garyx timeout kills the child and returns `BridgeError::Timeout`. |
| 173 | +- `abort(run_id)` kills the child and returns `true` when a child existed. |
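
The exit-handling rules can be sketched as one classification function. `ExitOutcome` is a hypothetical stand-in for the real result and error types, and the marker strings in `OVERLOAD_MARKERS` are assumptions; the design does not list the actual overload or rate-limit strings.

```rust
/// Hypothetical outcome of a finished child process.
#[derive(Debug, PartialEq)]
enum ExitOutcome {
    Completed,
    /// Non-zero exit, but transcript messages were captured.
    FailedWithTranscript,
    /// Maps to `BridgeError::Overloaded` in the design.
    Overloaded,
    /// Maps to `RunFailed` with stderr/stdout context.
    RunFailed(String),
}

/// Assumed marker strings; the real list is not specified in this design.
const OVERLOAD_MARKERS: [&str; 3] = ["rate limit", "overloaded", "too many requests"];

fn classify_exit(exit_code: i32, captured_messages: usize, stderr: &str) -> ExitOutcome {
    if exit_code == 0 {
        return ExitOutcome::Completed;
    }
    // A partial transcript is more useful than a bare error.
    if captured_messages > 0 {
        return ExitOutcome::FailedWithTranscript;
    }
    let lower = stderr.to_lowercase();
    if OVERLOAD_MARKERS.iter().any(|m| lower.contains(*m)) {
        return ExitOutcome::Overloaded;
    }
    ExitOutcome::RunFailed(stderr.trim().to_string())
}
```

Checking for captured messages before the overload markers matches the rule that overload mapping applies only when no useful partial transcript exists.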
| 174 | + |
| 175 | +## Tests |
| 176 | + |
| 177 | +Unit tests: |
| 178 | + |
| 179 | +- `ProviderType` slug/alias round trips for `antigravity` and `agy`. |
| 180 | +- Config defaults and builder precedence for model, binary path, workspace, |
| 181 | + env, timeout, and brain-root override. |
| 182 | +- Command construction for first run, continuation, `--add-dir`, `--log-file`, |
| 183 | + and print timeout. |
| 184 | +- Transcript parser: |
| 185 | + - skips user/history/system/checkpoint rows |
| 186 | + - maps planner content to delta and final response |
| 187 | + - maps planner `tool_calls` to `ToolUse` |
| 188 | + - parses JSON-string tool args when possible |
| 189 | + - maps `RUN_COMMAND` and `LIST_DIRECTORY` to `ToolResult` |
| 190 | + - maps `ERROR_MESSAGE.error` |
| 191 | + - replaces compact row data from full transcript when needed |
| 192 | +- Conversation discovery using temp Antigravity home directories. |
| 193 | +- Baseline `step_index` filtering for continuation runs. |
| 194 | +- Abort removes and kills the registered child. |
| 195 | + |
| 196 | +Focused validation: |
| 197 | + |
| 198 | +```text |
| 199 | +cargo test -p garyx-models provider |
| 200 | +cargo test -p garyx-bridge antigravity_provider |
| 201 | +cargo test -p garyx-gateway provider_models |
| 202 | +cargo test -p garyx agent_upsert |
| 203 | +cargo check --workspace |
| 204 | +``` |
| 205 | + |
| 206 | +End-to-end validation: |
| 207 | + |
| 208 | +1. Build/install the patched local CLI when validating through the managed |
| 209 | + gateway. |
| 210 | +2. Restart the managed gateway only for installed-binary validation. |
| 211 | +3. Create a synthetic custom agent bound to provider `antigravity`. |
| 212 | +4. Run `garyx thread create --agent-id <agent>` and `garyx thread send` with a |
| 213 | + deterministic prompt. |
| 214 | +5. Confirm the Garyx transcript streams the Claude reply. |
| 215 | +6. Confirm the thread stores an Antigravity conversation id as `sdk_session_id`. |
| 216 | +7. Send a second message on the same thread and confirm the same Antigravity |
| 217 | + transcript receives the new rows. |
| 218 | +8. Run a harmless temporary-directory prompt that triggers listing/command |
| 219 | + activity and confirm Garyx shows tool use and result rows. |