docs(k8s-proxy): developer + LLM workflow playbook + trim to verified form (#871)

charankamarapu · claude · web-flow · commit 038c943bc28e · 2026-06-07T21:17:29.000+05:30
* docs(k8s-proxy): add developer + llm workflow playbook page

Sibling to the existing k8s-proxy-developer-workflow page. Documents an autonomous Keploy workflow driven from an MCP-aware editor (Claude Code, Cursor, Windsurf, Claude Desktop, VS Code Copilot, Trae). The developer types one of two prompts; the agent does everything else.

The two prompts:

1. "my keploy cloud replay is failing, please analyse and fix it." (or "the keploy cloud replay pipeline is failing..." for CI)
2. "Add new keploy tests for my changes."

The page ships a single pasteable playbook that installs as a Claude Code skill or any other editor's rules / memory file. Inside the playbook the agent:

- Resolves app_id from `basename $(pwd)` + listApps.
- Resolves branch_id from `git rev-parse --abbrev-ref HEAD` + create_branch (find-or-create, idempotent, sticky for the session).
- Diagnoses failing runs via two cases: Case 1 (app regression, agent fixes handler code and announces file:line before applying); Case 2 (test data stale, with sub-actions 2a noise / 2a response edit / 2b mock edit / 2b delete_recording + re-record).
- For new tests: git diff to find changed handlers, pre-flight the dev's local run command, then `keploy record -c "<cmd>" --sync` + `keploy upload test-set` to land the bundle on the branch.

Sidebar updated to surface the page under K8s Proxy.

Signed-off-by: Charan Kamarapu <kamarapucharan@gmail.com>

* docs(k8s-proxy-llm-workflow): trim playbook to verified-working form

Replace the long-form playbook with the trimmed, validated form (11,305 → 7,939 tok + 2 anti-patterns ≈ 8,095 tok in source). Same load-bearing rules preserved verbatim:

- Step 0 ALLOWLIST + uncommitted-edit revert mandate
- listTestReports EXACTLY ONCE per session
- getApp memoize (≤1 call/session)
- fields=[...] on getTestReportFull + getApp
- drop listMocks default; targeted getMock instead
- record → upload → delete order for 2b-recapture
- sql_ast_hash CLI mandate (use `keploy mock patch`, not MCP update_mock)
- --disableReportUpload=false and --cluster mandatory
- pipe all keploy/docker output through tail/grep
- two new anti-patterns: ban keploy --help dump, ban Read of keploy/ local cache files

Verified against S1 scenario at 632k total tokens, 13/13 effective asserts.

* docs(k8s-proxy-llm-workflow): make getApp mandatory pre-replay; clarify --cluster error

Routine B used to skip Discovery step 3 (getApp) because B1 starts at 'git diff', then hit Phase B4 needing --cluster and dropped the flag, causing `no active clusters found`. Two fixes:

1. Discovery step 3 (`getApp` for cluster/ns/deployment) is now MANDATORY before any `keploy cloud replay` invocation (both Phase A4 and B4).
2. Phase B4 explicitly tells the agent: if you skipped Discovery step 3 because Routine B starts at git diff, go back and call getApp NOW.

Plus inline the error-message ambiguity: `no active clusters found` actually means "you forgot --cluster", not "no cluster is running".

Source of truth: matches the trimmed verified-working SKILL.md (`.claude/skills/keploy/SKILL.md`) byte-for-byte.

* docs(k8s-proxy-llm-workflow): keploy mock patch flag is --app (not --app-id)

The CLI registers --app, not --app-id (OSS root pre-registers --app-id as a deprecated uint64 flag). The prior template told agents to use --app-id which the CLI rejects with exit 1.

Real-world impact: S4 validation run had the agent construct the documented --app-id command, get rejected, confabulate success.

* docs(k8s-proxy-llm-workflow): canonical fields= projection + listTestReports one-shot stricter

Two cost-discipline fixes from validation evidence:

1. Phase A2: replaced the narrow recommended projection ([failed_steps[].diff, mock_mismatches, status, ci_metadata]) with one that covers per-case identity + per-case oss_report.req / .result / .mock_mismatches / .noise (everything Phase A3 actually reads). The old projection was too narrow, agents fell back to include_oss_report=true (NO fields=) to fetch the full 34k blob that re-bills every subsequent turn.
2. Phase A1: added "do NOT re-call listTestReports after your own `keploy cloud replay` finishes; the replay stdout already prints the new test_run_id in `View test report at: .../tr/<id>`, parse that line instead of re-querying."

Also added explicit "ADD fields, never drop" rule under "use fields aggressively"; agents were retrying without fields= to "get everything" which is the exact failure mode the projection was meant to prevent.

* docs(k8s-proxy-llm-workflow): correct field names + mock_mismatches_only call

Two skill corrections discovered via S7 deep-dive on the actual getTestReportFull response schema:

1. Field-name corrections: the canonical fields= projection used wrong keys that returned null on every call.

     test_sets[].name → test_sets[].test_set_name
     test_sets[].id → test_sets[].test_set_id
     test_sets[].test_cases[].name → test_sets[].test_cases[].test_case_name
     test_sets[].test_cases[].id → test_sets[].test_cases[].test_case_id

   Plus dropped refs that don't exist anywhere in the response:

     failed_steps[].diff (not in response)
     top-level mock_mismatches (not in response)
     oss_report.failure_info.mock_mismatch (failure_info has no such subkey)

2. mock_mismatches_only=true second call: per-case mock_mismatches data is NOT included by default in getTestReportFull. Added explicit instruction that when Phase A3 routes to Case 2b, make a SECOND projected call with mock_mismatches_only=true to discover mock IDs from oss_report.mock_mismatches.actual_mocks[].name. This avoids listMocks (~28k token inventory) for the common Case 2b path.
3. listMocks ban softened: now allowed as fallback when the mock_mismatches_only call returns empty for the failing test set (e.g., body-only drift with no consumed mocks).

Verified live: S7 with the corrected skill + the projection bug fixes (see api-server PR for those): 13/16 strict assert pass (was 11/16), A-CR1 fields= now passing 2/2, response payload 22k → 572 bytes on the projected call.

* docs(k8s-proxy-llm-workflow): mandate --disable-mapping=false on keploy record

After investigating S6 (Routine B) end-to-end, found that `keploy record --sync` alone produces no `mappings.yaml`. The recorder inherits keploy.yml's `disableMapping` and the auto-orchestrator-forwarded flag doesn't propagate without an explicit host-side override. Without mappings.yaml, the upload pipeline persists no `mapping_audits` doc in mongo, and `getMockMapping` returns empty `mocks: []` for every test case, forcing the replay matcher onto fragile timestamp windows.

Two skill updates:

1. Phase B2 step 1: `keploy record -c "<cmd>" --sync --disable-mapping=false` is the canonical incantation, with explicit rationale for why --disable-mapping=false is mandatory.
2. Case 2b-recapture: same flag pair documented on the record step of the (record → upload → delete) order.

The --disable-mapping flag was added to `keploy record` upstream (keploy/keploy PR #4250).

* docs(k8s-proxy-llm-workflow): add Installation section + Vale spelling fixes

Address user feedback + Copilot Vale-spelling comments on PR #871:

User feedback (Cursor user): the doc lacked a setup section, so they went off the older `.cursorrules` instructions in agent-test-generation.md which is now deprecated. Verified against cursor-agent's built-in `migrate-to-skills` skill: `.cursor/skills/<name>/SKILL.md` IS the modern Cursor format, `.cursorrules` and `.cursor/rules/*.mdc` are being migrated FROM. Added an Installation section at the top of the page covering the modern Skills mechanism for Cursor / Claude Code / other agents, with an explicit "do not use .cursorrules" note (the playbook is ~8k tokens; pinning it as always-on context would bill on every editor turn).

Vale spelling fixes (Copilot comments r3343-r3369):

- "analyse" → "analyze" (en_US): Prompt A wording + Routine A heading
- "ALLOWLIST" → "Allowlist" (security term, lowercased to match Vale) + added `[Aa]llowlist` to the Base vocabulary so future occurrences pass lint
- "re-bills" → "gets re-added to context" (3 sites): clearer to readers and dodges Vale's spelling check

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(k8s-proxy-llm-workflow): prettier — normalize emphasis to _underscore_

CI's prettier check (creyD/prettier_action@v4.6 with prettier 3.8.3) fails the PR because three emphasis spans in the file use `*…*` syntax. Prettier 3.x normalizes em-emphasis to `_…_`. Auto-fixed via `prettier --write`.

No prose changes, only the markup style for the three italic spans (`*values*`, `*shape*`, `*value*`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(k8s-proxy-llm-workflow): self-review fixes — token count, foreground -c, two-routine wording

Three self-review nits caught on a deep re-read:

1. Installation: "~8k-token playbook" was off; measured the actual file with tiktoken cl100k_base and got 9,310 tokens. Bumped the warning to "~9k-token" so the cost rationale is grounded in the real number.
2. Phase B2 capture: clarified that the -c value must be the FOREGROUND form of the run command. If pre-flight uses `docker compose up -d` (detached, common in repos without a foreground equivalent declared), passing the same string to `keploy record -c` makes docker exit immediately on detach and keploy thinks the app already terminated, capturing nothing. Example: pre-flight `docker compose up -d`, record `docker compose up` (no -d).
3. Page description: "exactly two developer prompts" was inaccurate; Prompt A has two phrasings, so the agent listens for three distinct surface phrases. Reworded to "two routine prompts (failing-replay analyze-and-fix; add-tests-for-my-changes)" so the count refers to the two routines rather than the surface phrases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(vale): allow spaced em-dash + logical quotes + tech vocab

CI's Vale doc linter (errata-ai/vale-action@v2.1.1 with vale 3.0.3 and the project's existing Google + Vale base styles) flagged 89 errors on the k8s-proxy-llm-workflow page after my Installation section landed. Categorized:

58× Google.EmDash: "Don't put a space before or after a dash". The doc uses the spaced em-dash form ` — ` for prose readability; many other docs in the repo do the same (see hits in generate-api-tests-using-ai.md, etc.). Disabling the rule repo-wide is consistent with the seven other Google.* overrides already in `.vale.ini` and matches the docs' established style.

8× Google.Quotes: "Commas and periods go inside quotation marks". The docs use period-OUTSIDE-quote when the quoted token is a literal the reader is supposed to paste verbatim (e.g. `the exact value "FAILED".`); putting the period inside would change the visible token. Disabling for consistency with the other Google.* overrides.

23× Vale.Spelling: tech terms not yet in the Base vocabulary. Added: branch_id, camelCase, CLI[s]?, cwd, hardcoded, JSONPath[s]?, matcher, misclassification, mutex, OAuth, readback, README, snake_case, stdout, test_run, unprojected.

1× Vale.Spelling on "whatever's": possessive on the indefinite pronoun that Vale's en_US dictionary doesn't recognize. Reworded the sentence in-place rather than vocab-ing it; the possessive form is genuinely unusual and a rewrite is cleaner than whitelisting it.

Local `vale --config=.vale.ini versioned_docs/.../k8s-proxy-llm-workflow.md` now reports 0 errors. Prettier still clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Signed-off-by: Charan Kamarapu <kamarapucharan@gmail.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
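
The Phase A1 rule in the message above (parse the `View test report at: .../tr/<id>` line instead of re-calling listTestReports) could be sketched roughly as follows. This is an illustration only: the commit message gives just the `View test report at:` prefix and the `/tr/<id>` suffix, so the rest of the URL, the id format, and the helper name `parse_test_run_id` are assumptions, not part of the playbook.

```python
import re
from typing import Optional

# Assumption: the replay prints a line shaped like
#   "View test report at: <some-url>/tr/<id>"
# Only the prefix and the "/tr/<id>" suffix come from the commit message;
# everything else about the URL is hypothetical.
REPORT_LINE = re.compile(r"View test report at:\s*\S*/tr/([^\s/?#]+)")


def parse_test_run_id(stdout: str) -> Optional[str]:
    """Return the id from the last report line in stdout, or None if absent."""
    matches = REPORT_LINE.findall(stdout)
    # Take the last match so that a re-run printed later in the same
    # captured output wins over an earlier one.
    return matches[-1] if matches else None


if __name__ == "__main__":
    sample = (
        "replay finished\n"
        "View test report at: https://example.invalid/app/tr/abc123\n"
    )
    print(parse_test_run_id(sample))
```

Returning `None` rather than raising leaves the caller free to decide whether a missing line justifies a single fallback query.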
diff --git a/.vale.ini b/.vale.ini
@@ -32,6 +32,8 @@ Google.Exclamation = NO     # Allow exclamation points
 Google.Ellipses = NO        # Allow ellipses in text
 Google.Latin = NO           # Allow "e.g." and "i.e." instead of "for example"
 Google.Units = NO           # Allow "k8s" — Google.Units' \d+s regex matches "8s" inside the token
+Google.EmDash = NO          # Allow spaced em-dashes — used consistently across the repo for prose readability (the Google style wants `—` without spaces; many existing docs use the spaced form intentionally)
+Google.Quotes = NO          # Allow logical (British) punctuation around quotation marks — the docs use period-OUTSIDE-quote for technical tokens (e.g. `the value "FAILED".`) so dropping a period inside doesn't alter the literal token the reader is supposed to paste
 
 # Allow specific terms:
 Vale.Terms=NO
diff --git a/vale_styles/config/vocabularies/Base/accept.txt b/vale_styles/config/vocabularies/Base/accept.txt
@@ -1,5 +1,6 @@
 [Aa]ir-?gap(?:ped|ping)?
 [Aa]uditable
+[Aa]llowlist
 [Cc]group[s]?
 [Cc]leartext
 [Cc]onfigMap[s]?
@@ -201,3 +202,19 @@ Woohoo
 wsl
 WSL
 YAMLs
+branch_id
+camelCase
+CLI[s]?
+cwd
+hardcoded
+JSONPath[s]?
+matcher
+misclassification
+mutex
+OAuth
+readback
+README
+snake_case
+stdout
+test_run
+unprojected
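
The entries added to accept.txt above are patterns, not plain words, which is why `[Aa]llowlist` covers the "Allowlist" spelling the commit message settles on. A minimal sketch of that matching, assuming case-sensitive whole-word matching (the file lists `wsl` and `WSL` separately, which suggests it); Vale's real matcher may differ, and `is_accepted` is a hypothetical helper, not a Vale API:

```python
import re

# Patterns copied from the accept.txt hunk above.
ACCEPTED_PATTERNS = ["[Aa]llowlist", "CLI[s]?", "JSONPath[s]?", "snake_case"]


def is_accepted(word: str, patterns=ACCEPTED_PATTERNS) -> bool:
    """True if any pattern matches the whole word, case-sensitively."""
    return any(re.fullmatch(p, word) is not None for p in patterns)


if __name__ == "__main__":
    for word in ["Allowlist", "allowlist", "ALLOWLIST", "CLIs", "JSONPaths"]:
        print(word, is_accepted(word))
```

Under these assumptions the all-caps "ALLOWLIST" is still rejected, consistent with the commit lowercasing it in the page rather than widening the pattern.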
diff --git a/versioned_docs/version-4.0.0/quickstart/k8s-proxy-llm-workflow.md b/versioned_docs/version-4.0.0/quickstart/k8s-proxy-llm-workflow.md