Commit c6e9614
docs(k8s-proxy-llm-workflow): document --cluster for local cloud replay (#865)
* docs(k8s-proxy-llm-workflow): document --cluster for local cloud replay
Discovery step 3: agent caches origin.clusterName via getApp (listApps does
not return it). Pass it as --cluster on every keploy cloud replay so a local
(no --trigger) run resolves the proxy app's identity without requiring an
actively-heartbeating cluster.
Routine A re-validate + Routine B B4 + Prompt B table now show the local-
replay form with --cluster, -c, --container-name, --disableReportUpload=false;
CI / active-cluster runs drop the local flags.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): tighten Case 2b — explicit edit→edit→delete loop
Replace the prose "one or two mock edits then fall back" with a
numbered 1→2→delete_recording loop the agent has to follow exactly.
The previous wording let an agent linger on update_mock attempts
indefinitely or improvise alternative repair strategies (e.g.
editing the on-disk mocks.yaml directly) when an edit didn't take.
Add two explicit DO-NOT rules that surfaced during Scenario 4
validation of the LLM workflow:
* "Do not edit keploy/<test_set>/mocks.yaml on disk." Cloud
replay re-downloads mocks on every run, so the local file is a
per-run snapshot and any local edit is silently overwritten
before the test runs. An agent that didn't know this spiralled
on grep+edit cycles against a file it couldn't actually
change.
* "Do not recompute hash fields by hand." Some recorded mocks
carry derived fingerprints (sqlAstHash on Postgres v3); these
are now recomputed by the proxy from the human-readable fields
on load (api-server PR #1697 strips them on write, integrations
PR #209 recomputes on read). LLMs that previously tried to
rewrite the hash to match a SQL change couldn't compute the
canonical value (libpg_query isn't reachable from a typical
agent runtime) and shipped a mock the matcher couldn't reach.
The new rule directs the agent to edit only what it can
reason about and let the proxy derive the rest.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): scoped vs whole-set re-record in Case 2b fallback
Split today's single "fall back to delete_recording + re-record" path
into two arms based on how many cases in the set are failing.
3a Whole-set: delete_recording (no test_case_ids), re-record all
flows, keploy upload test-set with a fresh --name. Use when most
of the set is failing.
3b Scoped: delete_recording({test_case_ids: [...]}) tombstones
only the failing cases. Re-record just those flows. keploy upload
test-set still produces a NEW test-set with a fresh --name; the
branch ends with two coexisting test-sets (original-minus-
tombstones + the small replacement set) both feeding the next
replay. Use when one / a few cases are failing — preserves the
unrelated passing tests in the same set.
The 3b path is enabled by api-server PR #1697's scope-aware
delete_recording (test_case_ids array param). Until it merges, the
agent has only 3a available; that's why today's runs were destroying
unrelated tests on every Case 2b fallback.
Heuristic: pick 3a when ≥ ~75% of the cases are failing, 3b
otherwise. The upload server enforces test-set name uniqueness per
app, so the agent must mint a fresh --name on every upload (server
returns `test set "X" already exists for this app` on collision).
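The 3a/3b heuristic above can be sketched as a small decision function. This is a minimal illustration, not code from the skill: the function name, its parameters, and the exact 0.75 cutoff (the commit says "~75%") are assumptions made for the example.

```python
def choose_fallback_arm(
    failing: int,
    total: int,
    scoped_delete_available: bool = True,
    threshold: float = 0.75,  # assumption: the commit only says "~75%"
) -> str:
    """Return "3a" (whole-set re-record) or "3b" (scoped re-record).

    Hypothetical helper illustrating the heuristic in the commit message.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of test cases")
    if not 0 <= failing <= total:
        raise ValueError("failing must be between 0 and total")

    # Without the scope-aware delete_recording, only the whole-set arm exists.
    if not scoped_delete_available:
        return "3a"

    # Most of the set failing: wipe and re-record everything.
    # Otherwise tombstone only the failing cases.
    return "3a" if failing / total >= threshold else "3b"


if __name__ == "__main__":
    print(choose_fallback_arm(9, 10))
    print(choose_fallback_arm(1, 10))
```

Either arm still ends with an upload under a fresh `--name`, since the server rejects a name that already exists for the app.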
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): forbid AskUserQuestion; drop hash-recompute note
Two changes to Case 2b guidance:
1. Add "Do NOT ask the dev which path to take" — Routine A is
autonomous by contract. The previous wording ("announce …
otherwise proceed") was read as "ask, then proceed" by an agent
that interpreted AskUserQuestion as a safety net; that's wrong.
Make the rule explicit: announce in plain text, never call
AskUserQuestion, never offer numbered choices, never pause for
confirmation. The dev reviews the streamed transcript and Ctrl-Cs
if the agent is wrong — that's the contract.
2. Remove the "Do NOT recompute hash fields by hand" note. It
referenced keploy/api-server#1697's strip-on-write commit and
keploy/integrations#209's loader recompute, both reverted/closed.
The supported path for Postgres mock drift is now plain Case 2b:
two update_mock attempts (which the LLM may or may not get right
for a Postgres v3 query), then the existing 3a/3b delete +
re-record fallback. No special hash-related guidance needed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): review-driven Case 2b refinements
Addresses inline review comments on PR #865:
* **3b availability banner.** The `test_case_ids` parameter on
`delete_recording` requires a server build that advertises the
scoped delete. Documented the discovery check + the fall-back to 3a
so an agent running against an older api-server can still progress
rather than hitting an unhelpful error.
* **`<name-1>` placeholder + inline comment.** Replaced `<id-1>` with
`<name-1>` and added a comment line clarifying that the values are
the same friendly recording names used elsewhere in the skill
(e.g. `get-api-orders-1`). Prevents agents from sending a
branch-overlay UUID they don't have.
* **Naming convention for --name.** Spelled out the convention
`<original-set-name>--rerec-<YYYYMMDDHHMM>` (with `<short-git-sha>`
as a deterministic alternative). Without a convention, every
agent invents its own and the recordings page becomes a
one-off-named long-tail mess.
* **Cap-retry-3 vs Case 2b loop disambiguation.** A4's "cap retry at
3" was being read as the same budget as the 2-attempt mock-edit
loop. Made the relationship explicit: 3 = total cloud-replay runs
across the Case 2 loop (steps 1, 2, 3); 2 = update_mock attempts
within step 2 before forcing the 3a/3b fallback. Two budgets, not
one.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): round-2 polish on Case 2b 3b path
- Availability banner: route agents to MCP `tools/list` (the protocol-
level introspection that returns each tool's inputSchema) instead of
`listMocks`/`getApp` which are domain endpoints that don't carry tool
shape.
- Naming convention: original-set-name often contains spaces, parens, or
other chars that the api-server's name validator may reject. Switched
to slug(original-set-name) and called out the slug rule explicitly.
- Timestamp: UTC is now explicit (trailing Z is part of the literal name)
so two agents in different timezones running during the same UTC
minute mint the same name, preserving the "original + its re-records
sort together" intent for cross-timezone CI integrations.
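The availability check described in the first item can be sketched as follows, assuming the agent already holds the parsed result of an MCP `tools/list` call (a `tools` array whose entries carry `name` and `inputSchema`). The helper name and the sample payloads are hypothetical; only `delete_recording` and `test_case_ids` come from the commit message.

```python
from typing import Any, Mapping


def supports_scoped_delete(tools_list_result: Mapping[str, Any]) -> bool:
    """Return True if delete_recording advertises a test_case_ids parameter.

    Hypothetical helper: inspects a parsed MCP tools/list result.
    """
    for tool in tools_list_result.get("tools", []):
        if tool.get("name") != "delete_recording":
            continue
        schema = tool.get("inputSchema") or {}
        properties = schema.get("properties") or {}
        return "test_case_ids" in properties
    # Tool not listed at all: treat the scoped path as unavailable.
    return False


if __name__ == "__main__":
    # Illustrative payloads only; real schemas may carry more fields.
    newer_server = {
        "tools": [
            {
                "name": "delete_recording",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "test_case_ids": {"type": "array", "items": {"type": "string"}}
                    },
                },
            }
        ]
    }
    older_server = {
        "tools": [
            {"name": "delete_recording", "inputSchema": {"type": "object", "properties": {}}}
        ]
    }
    print(supports_scoped_delete(newer_server))
    print(supports_scoped_delete(older_server))
```

When the check returns False, the agent falls back to the whole-set 3a path instead of sending a parameter the server does not accept.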
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): close slug-rule ambiguities
Round-3 review caught three ambiguities in the round-2 naming
convention block that would cause different LLM agents to mint
different --name values for the same original-set-name:
1. Trailing literal `Z` collided with the slug charset `[a-z0-9-]` —
a strict-slugger lowercases it; a literal-suffix-preserver doesn't.
2. The worked example silently applied "collapse runs of -" and
"trim leading/trailing -" rules that weren't in the stated rule.
3. `<slug(original-set-name)>` read as a function-call placeholder
that agents might try to invoke (no `slug` tool exists in the
MCP toolset; library fallbacks produce divergent outputs).
Replaced with a two-part declarative spec: (1) build the slug part
with a fully-specified rule + worked example, (2) append a literal
suffix with explicit "do NOT lowercase or slug" instruction. Also
spelled out `<short-git-sha>` = first 7 chars of `git rev-parse HEAD`
in the deterministic alternative.
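A minimal sketch of the two-part primary naming form, under stated assumptions: the slug rule is taken to be "lowercase, replace every run of characters outside `[a-z0-9]` with a single `-`, trim leading and trailing `-`", and the suffix is taken to be `--rerec-<YYYYMMDDHHMM>Z` in UTC. The commit message does not reproduce the skill's exact rule text, so both function names and the rule details are illustrative. The deterministic alternative is not sketched here because its full format is not spelled out in this message.

```python
import re
from datetime import datetime, timezone
from typing import Optional


def slugify(name: str) -> str:
    """Assumed slug rule: lowercase, runs of non-[a-z0-9] become one '-', trim '-'."""
    lowered = name.lower()
    dashed = re.sub(r"[^a-z0-9]+", "-", lowered)
    return dashed.strip("-")


def rerecord_name(original_set_name: str, now: Optional[datetime] = None) -> str:
    """Build '<slug>--rerec-<YYYYMMDDHHMM>Z' (hypothetical helper)."""
    slug = slugify(original_set_name)
    if not slug:
        # An empty slug would yield a name starting with '--', which reads as a CLI flag.
        raise ValueError("original set name has no slug-safe characters")
    # Pass a timezone-aware datetime; a naive one is interpreted as local time.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = moment.strftime("%Y%m%d%H%M")
    # The suffix is appended literally; it is never lowercased or slugged.
    return f"{slug}--rerec-{stamp}Z"


if __name__ == "__main__":
    fixed = datetime(2025, 1, 2, 3, 4, tzinfo=timezone.utc)
    print(rerecord_name("Orders API (v2)", fixed))
```

Keeping the slug step and the suffix step separate is what prevents a strict slugger from lowercasing the trailing `Z`.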
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): note deterministic alt has no leading dash
Round-4 review surfaced a separator inconsistency between the primary
naming form (`<slug>--rerec-<…>`) and the deterministic alternative
(`rerec-<sha>-<…>`). An LLM agent reading both could reflexively copy
the primary's `--rerec-` literal into the alternative and mint
`--rerec-<sha>-<…>`. A leading `--` looks like a CLI flag to shell
escapers and log scrapers, so the result is likely to be a bad command.
Added one parenthetical line to the deterministic-alternative
description: "The alternative begins with the literal `rerec` (NO
leading `-` or `--` — the double-dash in the primary form is the
slug/suffix boundary marker, which this form doesn't have because
there's no slug part)."
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* style(k8s-proxy-llm-workflow): prettier formatting
Blank lines before fenced code blocks after the 3a/3b headers, and
trailing-whitespace trim in one table cell. No content changes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Charan Kamarapu <charan@keploy.io>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 9e0fca1 commit c6e9614
1 file changed
Lines changed: 74 additions & 6 deletions