Commit 038c943
docs(k8s-proxy): developer + LLM workflow playbook + trim to verified form (#871)
* docs(k8s-proxy): add developer + llm workflow playbook page
Sibling to the existing k8s-proxy-developer-workflow page. Documents
an autonomous Keploy workflow driven from an MCP-aware editor (Claude
Code, Cursor, Windsurf, Claude Desktop, VS Code Copilot, Trae). The
developer types one of two prompts; the agent does everything else.
The two prompts:
1. "my keploy cloud replay is failing, please analyse and fix it."
(or "the keploy cloud replay pipeline is failing..." for CI)
2. "Add new keploy tests for my changes."
The page ships a single pasteable playbook that installs as a Claude
Code skill or any other editor's rules / memory file. Inside the
playbook the agent:
- Resolves app_id from `basename $(pwd)` + listApps.
- Resolves branch_id from `git rev-parse --abbrev-ref HEAD` +
create_branch (find-or-create, idempotent, sticky for the session).
- Diagnoses failing runs via two cases: Case 1 (app regression, agent
fixes handler code and announces file:line before applying);
Case 2 (test data stale, with sub-actions 2a noise / 2a response
edit / 2b mock edit / 2b delete_recording + re-record).
- For new tests: git diff to find changed handlers, pre-flight the
dev's local run command, then `keploy record -c "<cmd>" --sync` +
`keploy upload test-set` to land the bundle on the branch.
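The two discovery bullets above can be sketched in Python as follows. This is a minimal illustration, not the playbook's own code: `create_branch` is passed in as a plain callable standing in for the MCP tool of the same name, and the assumption that it returns a branch ID string is mine.

```python
import os
import subprocess


def app_name_from_cwd(cwd=None):
    """Equivalent of `basename $(pwd)`: last path component of the working directory."""
    return os.path.basename(os.path.abspath(cwd or os.getcwd()))


def current_git_branch():
    """Equivalent of `git rev-parse --abbrev-ref HEAD` (must run inside a git checkout)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


class StickyBranch:
    """Resolve branch_id once and reuse it for the rest of the session."""

    def __init__(self, create_branch):
        # create_branch: hypothetical stand-in for the find-or-create MCP call.
        self._create_branch = create_branch
        self._branch_id = None

    def resolve(self, branch_name):
        if self._branch_id is None:
            self._branch_id = self._create_branch(branch_name)
        return self._branch_id
```

Because the underlying call is described as find-or-create and idempotent, caching the first result is enough to make the ID sticky for the session.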
Sidebar updated to surface the page under K8s Proxy.
Signed-off-by: Charan Kamarapu <kamarapucharan@gmail.com>
* docs(k8s-proxy-llm-workflow): trim playbook to verified-working form
Replace the long-form playbook with the trimmed, validated form
(11,305 → 7,939 tok + 2 anti-patterns ≈ 8,095 tok in source). Same
load-bearing rules preserved verbatim:
- Step 0 ALLOWLIST + uncommitted-edit revert mandate
- listTestReports EXACTLY ONCE per session
- getApp memoize (≤1 call/session)
- fields=[...] on getTestReportFull + getApp
- drop listMocks default; targeted getMock instead
- record → upload → delete order for 2b-recapture
- sql_ast_hash CLI mandate (use `keploy mock patch`, not MCP update_mock)
- --disableReportUpload=false and --cluster mandatory
- pipe all keploy/docker output through tail/grep
- two new anti-patterns: ban keploy --help dump, ban Read of
keploy/ local cache files
Verified against S1 scenario at 632k total tokens, 13/13 effective
asserts.
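The "pipe all keploy/docker output through tail/grep" rule amounts to bounding how much command output reaches the agent's context. A rough Python equivalent, assuming the output has already been captured as a string (the default of 20 lines is an arbitrary illustrative value, not one from the playbook):

```python
import re
from collections import deque


def trim_output(text, pattern=None, last=20):
    """Keep only matching lines (like grep), then only the final `last` lines (like tail)."""
    lines = text.splitlines()
    if pattern is not None:
        regex = re.compile(pattern)
        lines = [line for line in lines if regex.search(line)]
    # A bounded deque discards everything but the newest `last` entries.
    return list(deque(lines, maxlen=last))
```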
* docs(k8s-proxy-llm-workflow): make getApp mandatory pre-replay; clarify --cluster error
Routine B used to skip Discovery step 3 (getApp) because B1 starts at
'git diff' — then hit Phase B4 needing --cluster and dropped the flag,
causing `no active clusters found`. Two fixes:
1. Discovery step 3 (`getApp` for cluster/ns/deployment) is now MANDATORY
before any `keploy cloud replay` invocation (both Phase A4 and B4).
2. Phase B4 explicitly tells the agent: if you skipped Discovery
step 3 because Routine B starts at git diff, go back and call getApp
NOW. Plus inline the error-message ambiguity: `no active clusters
found` actually means "you forgot --cluster", not "no cluster is
running".
Source of truth: matches the trimmed verified-working SKILL.md
(`.claude/skills/keploy/SKILL.md`) byte-for-byte.
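The two fixes above can be pictured as a guard plus an error translator. This is a sketch under assumptions: `app_info` is a hypothetical dict holding whatever getApp returned, and the `"cluster"` key name is illustrative rather than the documented response schema.

```python
def require_cluster(app_info):
    """Fail fast if Discovery step 3 (getApp) has not supplied a cluster yet."""
    cluster = (app_info or {}).get("cluster")  # key name is an assumption
    if not cluster:
        raise ValueError("cluster unknown: call getApp before `keploy cloud replay`")
    return cluster


def explain_replay_error(output):
    """Map the ambiguous CLI message to the likely cause described above."""
    if "no active clusters found" in output:
        return "likely missing --cluster: call getApp, then pass its cluster value"
    return None
```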
* docs(k8s-proxy-llm-workflow): keploy mock patch flag is --app (not --app-id)
The CLI registers --app, not --app-id (OSS root pre-registers --app-id
as a deprecated uint64 flag). The prior template told agents to use
--app-id which the CLI rejects with exit 1.
Real-world impact: in the S4 validation run the agent constructed the
documented --app-id command, was rejected, and then confabulated success.
* docs(k8s-proxy-llm-workflow): canonical fields= projection + listTestReports one-shot stricter
Two cost-discipline fixes from validation evidence:
1. Phase A2: replaced the narrow recommended projection
([failed_steps[].diff, mock_mismatches, status, ci_metadata])
with one that covers per-case identity + per-case oss_report.req /
.result / .mock_mismatches / .noise — everything Phase A3 actually
reads. The old projection was too narrow, so agents fell back to
include_oss_report=true (NO fields=) to fetch the full 34k blob
that re-bills every subsequent turn.
2. Phase A1: added "do NOT re-call listTestReports after your own
`keploy cloud replay` finishes — the replay stdout already prints
the new test_run_id in `View test report at: .../tr/<id>`, parse
that line instead of re-querying."
Also added explicit "ADD fields, never drop" rule under "use fields
aggressively" — agents were retrying without fields= to "get everything"
which is the exact failure mode the projection was meant to prevent.
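Parsing the new test_run_id out of the replay stdout (item 2 above) could look like the following sketch. Only the `View test report at: .../tr/<id>` shape comes from the commit message; the regular expression and the sample URL in the usage note are assumptions.

```python
import re

# Matches "View test report at: <anything>/tr/<id>" and captures <id>.
REPORT_LINE = re.compile(r"View test report at:\s*\S*/tr/([^\s/?#]+)")


def parse_test_run_id(stdout):
    """Return the test_run_id from the last report line, or None if absent."""
    ids = REPORT_LINE.findall(stdout)
    return ids[-1] if ids else None
```

For example, given a hypothetical line `View test report at: https://app.example.invalid/x/tr/abc123`, the function returns `abc123`, so no second listTestReports call is needed.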
* docs(k8s-proxy-llm-workflow): correct field names + mock_mismatches_only call
Three skill corrections discovered via S7 deep-dive on the actual
getTestReportFull response schema:
1. Field-name corrections: the canonical fields= projection used wrong
keys that returned null on every call.
test_sets[].name → test_sets[].test_set_name
test_sets[].id → test_sets[].test_set_id
test_sets[].test_cases[].name → test_sets[].test_cases[].test_case_name
test_sets[].test_cases[].id → test_sets[].test_cases[].test_case_id
Plus dropped refs that don't exist anywhere in the response:
failed_steps[].diff (not in response)
top-level mock_mismatches (not in response)
oss_report.failure_info.mock_mismatch (failure_info has no such subkey)
2. mock_mismatches_only=true second call: per-case mock_mismatches data
is NOT included by default in getTestReportFull. Added explicit
instruction that when Phase A3 routes to Case 2b, make a SECOND
projected call with mock_mismatches_only=true to discover mock IDs
from oss_report.mock_mismatches.actual_mocks[].name. This avoids
listMocks (~28k token inventory) for the common Case 2b path.
3. listMocks ban softened: now allowed as fallback when the
mock_mismatches_only call returns empty for the failing test set
(e.g., body-only drift with no consumed mocks).
Verified live on S7 with the corrected skill + the projection bug fixes
(see api-server PR for those): 13/16 strict asserts pass (was 11/16),
A-CR1 fields= now passing 2/2, response payload 22k → 572 bytes on the
projected call.
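The field-name corrections listed in item 1 can be expressed as a small rewrite table. The sketch below uses only the paths quoted in this commit message; the helper name `fix_projection` is hypothetical and is not part of the skill or the API.

```python
# Old key -> corrected key, as listed in the commit message.
RENAMED = {
    "test_sets[].name": "test_sets[].test_set_name",
    "test_sets[].id": "test_sets[].test_set_id",
    "test_sets[].test_cases[].name": "test_sets[].test_cases[].test_case_name",
    "test_sets[].test_cases[].id": "test_sets[].test_cases[].test_case_id",
}

# References that do not exist anywhere in the response.
NONEXISTENT = {
    "failed_steps[].diff",
    "mock_mismatches",
    "oss_report.failure_info.mock_mismatch",
}


def fix_projection(fields):
    """Rename stale keys, drop nonexistent ones, and keep order without duplicates."""
    fixed = []
    for field in fields:
        if field in NONEXISTENT:
            continue
        field = RENAMED.get(field, field)
        if field not in fixed:
            fixed.append(field)
    return fixed
```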
* docs(k8s-proxy-llm-workflow): mandate --disable-mapping=false on keploy record
After investigating S6 (Routine B) end-to-end, found that
`keploy record --sync` alone produces no `mappings.yaml`. The recorder
inherits keploy.yml's `disableMapping` and the auto-orchestrator-forwarded
flag doesn't propagate without an explicit host-side override. Without
mappings.yaml, the upload pipeline persists no `mapping_audits` doc in
mongo, and `getMockMapping` returns empty `mocks: []` for every test
case — forcing the replay matcher onto fragile timestamp windows.
Two skill updates:
1. Phase B2 step 1: `keploy record -c "<cmd>" --sync --disable-mapping=false`
is the canonical incantation, with explicit rationale for why
--disable-mapping=false is mandatory.
2. Case 2b-recapture: same flag pair documented on the record step of
the (record → upload → delete) order.
The --disable-mapping flag was added to `keploy record` upstream
(keploy/keploy PR #4250).
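As an illustration of the canonical record invocation, the argument vector can be assembled programmatically so that neither flag is ever dropped. The helper name is hypothetical; the flags are exactly the ones quoted above.

```python
import shlex


def record_argv(run_command):
    """Build the `keploy record` argument vector with both mandatory flags."""
    return [
        "keploy", "record",
        "-c", run_command,           # the app's run command, passed as one argument
        "--sync",
        "--disable-mapping=false",   # without this, no mappings.yaml is produced
    ]


def record_command_line(run_command):
    """Shell-quoted form of the same command, suitable for logging."""
    return shlex.join(record_argv(run_command))
```

Passing the list form to a process launcher avoids quoting mistakes in the `-c` value.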
* docs(k8s-proxy-llm-workflow): add Installation section + Vale spelling fixes
Address user feedback + Copilot Vale-spelling comments on PR #871:
User feedback (Cursor user): the doc lacked a setup section, so they
went off the older `.cursorrules` instructions in agent-test-generation.md
which is now deprecated. Verified against cursor-agent's built-in
`migrate-to-skills` skill: `.cursor/skills/<name>/SKILL.md` IS the
modern Cursor format, `.cursorrules` and `.cursor/rules/*.mdc` are
being migrated FROM. Added an Installation section at the top of the
page covering the modern Skills mechanism for Cursor / Claude Code /
other agents, with an explicit "do not use .cursorrules" note (the
playbook is ~8k tokens; pinning it as always-on context would bill on
every editor turn).
Vale spelling fixes (Copilot comments r3343-r3369):
- "analyse" → "analyze" (en_US): Prompt A wording + Routine A heading
- "ALLOWLIST" → "Allowlist" (security term, lowercased to match Vale)
+ added `[Aa]llowlist` to the Base vocabulary so future occurrences
pass lint
- "re-bills" → "gets re-added to context" (3 sites) — clearer to
readers and dodges Vale's spelling check
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): prettier — normalize emphasis to _underscore_
CI's prettier check (creyD/prettier_action@v4.6 with prettier 3.8.3)
fails the PR because three emphasis spans in the file use `*…*`
syntax. Prettier 3.x normalizes em-emphasis to `_…_`. Auto-fixed via
`prettier --write`. No prose changes — only the markup style for
the three italic spans (`*values*`, `*shape*`, `*value*`).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(k8s-proxy-llm-workflow): self-review fixes — token count, foreground -c, two-routine wording
Three self-review nits caught on a deep re-read:
1. Installation: "~8k-token playbook" was off — measured the actual file
with tiktoken cl100k_base and got 9,310 tokens. Bumped the warning
to "~9k-token" so the cost rationale is grounded in the real number.
2. Phase B2 capture: clarified that the -c value must be the FOREGROUND
form of the run command. If pre-flight uses `docker compose up -d`
(detached, common in repos without a foreground equivalent declared),
passing the same string to `keploy record -c` makes docker exit
immediately on detach and keploy thinks the app already terminated,
capturing nothing. Example: pre-flight `docker compose up -d`,
record `docker compose up` (no -d).
3. Page description: "exactly two developer prompts" was inaccurate —
Prompt A has two phrasings, so the agent listens for three distinct
surface phrases. Reworded to "two routine prompts (failing-replay
analyze-and-fix; add-tests-for-my-changes)" so the count refers to
the two routines rather than the surface phrases.
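The foreground rule in item 2 can be sketched as a transformation on the pre-flight command. This is an assumption-laden illustration: it handles only `docker compose ... up` with a standalone `-d` or `--detach` token, and leaves every other command untouched.

```python
import shlex

DETACH_FLAGS = {"-d", "--detach"}


def foreground_form(command):
    """Strip the detach flag from `docker compose ... up` so the process stays attached."""
    tokens = shlex.split(command)
    is_compose_up = tokens[:2] == ["docker", "compose"] and "up" in tokens
    if not is_compose_up:
        return command  # not a compose-up command: leave as-is
    return shlex.join(t for t in tokens if t not in DETACH_FLAGS)
```

Combined short flags or a `-d` that is the value of another option are not handled by this sketch.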
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(vale): allow spaced em-dash + logical quotes + tech vocab
CI's Vale doc linter (errata-ai/vale-action@v2.1.1 with vale 3.0.3 and
the project's existing Google + Vale base styles) flagged 89 errors on
the k8s-proxy-llm-workflow page after my Installation section landed.
Categorized:
58× Google.EmDash — "Don't put a space before or after a dash". The
doc uses the spaced em-dash form ` — ` for prose readability;
many other docs in the repo do the same (see hits in
generate-api-tests-using-ai.md, etc.). Disabling the rule
repo-wide is consistent with the seven other Google.* overrides
already in `.vale.ini` and matches the docs' established style.
8× Google.Quotes — "Commas and periods go inside quotation marks".
The docs use period-OUTSIDE-quote when the quoted token is a
literal the reader is supposed to paste verbatim (e.g.
`the exact value "FAILED".`); putting the period inside would
change the visible token. Disabling for consistency with the
other Google.* overrides.
23× Vale.Spelling — tech terms not yet in the Base vocabulary.
Added: branch_id, camelCase, CLI[s]?, cwd, hardcoded,
JSONPath[s]?, matcher, misclassification, mutex, OAuth,
readback, README, snake_case, stdout, test_run, unprojected.
1× Vale.Spelling on "whatever's" — possessive on the indefinite
pronoun that Vale's en_US dictionary doesn't recognize.
Reworded the sentence in-place rather than vocab-ing it; the
possessive form is genuinely unusual and a rewrite is cleaner
than whitelisting it.
Local `vale --config=.vale.ini versioned_docs/.../k8s-proxy-llm-workflow.md`
now reports 0 errors. Prettier still clean.
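The vocabulary entries above that carry regex syntax (`CLI[s]?`, `JSONPath[s]?`, `[Aa]llowlist`) can be sanity-checked with a small script. This sketch assumes Vale treats each entry as a case-sensitive whole-word pattern, which Python's `re.fullmatch` approximates for these simple cases; it is not a substitute for running Vale itself.

```python
import re

# A subset of the entries named in this commit message.
VOCAB = ["[Aa]llowlist", "CLI[s]?", "JSONPath[s]?", "snake_case", "stdout"]


def accepted(word, patterns=VOCAB):
    """True if any vocabulary pattern matches the whole word."""
    return any(re.fullmatch(pattern, word) for pattern in patterns)
```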
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Signed-off-by: Charan Kamarapu <kamarapucharan@gmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 22a3806 commit 038c943
3 files changed
Lines changed: 203 additions & 269 deletions
File tree
- vale_styles/config/vocabularies/Base
- versioned_docs/version-4.0.0/quickstart