- Complete the --proxy-url → --k8s-proxy-url rename (and cloud.proxyUrl →
cloud.k8sProxyUrl) to match the CLI.
- Add "Authenticating through the k8s-proxy": the --k8s-proxy-auth /
KEPLOY_K8S_PROXY_AUTH offline/CI mode — required inputs (proxy URL +
KEPLOY_API_KEY), what it does and which api-server calls it skips,
KEPLOY_TLS_SKIP_VERIFY / SSL_CERT_FILE for self-signed proxy certs,
--cluster behaviour, and read+write vs replay:start permissions.
- Add "Troubleshooting self-hosted replay" table (self-signed TLS, empty
clusterName 400, missing mock blob, sysctl/PATH, app image pull, eBPF
agent image tag, --delay for slow apps, telemetry mock mismatches).
- Note the auth axis in the overview, a k8s-proxy-auth quick example, and a
report-upload row in the SaaS-vs-self-hosted table.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Aditya Sharma <aditya282003@gmail.com>
versioned_docs/version-4.0.0/keploy-cloud/cloud-replay.md
Lines changed: 95 additions & 11 deletions
@@ -2,7 +2,7 @@
 id: cloud-replay
 title: Cloud Replay Command Reference
 sidebar_label: Cloud Replay
-description: Complete reference for every flag of the keploy cloud replay command — app and cluster selection, self-hosted vs SaaS targeting, the --proxy-url ingress override, release-tag and branch selection, mapping flags, and trigger vs local replay, with examples.
+description: Complete reference for every flag of the keploy cloud replay command — app and cluster selection, self-hosted vs SaaS targeting, the --k8s-proxy-url ingress override, release-tag and branch selection, mapping flags, and trigger vs local replay, with examples.
 - **Where the replay executes** — locally on your machine (the default) or inside the cluster (`--trigger`). See [Trigger vs local replay](#trigger-vs-local-replay).
-- **Where the test data and cluster live** — Keploy **SaaS** (test assets fetched from the api-server) or **self-hosted** (test assets fetched from your in-cluster k8s-proxy ingress). Keploy detects this from the selected cluster's `deployment_type`; some flags (notably [`--proxy-url`](#--proxy-url)) apply only to self-hosted clusters.
+- **Where the test data and cluster live** — Keploy **SaaS** (test assets fetched from the api-server) or **self-hosted** (test assets fetched from your in-cluster k8s-proxy ingress). Keploy detects this from the selected cluster's `deployment_type`; some flags (notably [`--k8s-proxy-url`](#--k8s-proxy-url)) apply only to self-hosted clusters.
+- **How the CLI authenticates** — directly against the api-server (the default), or **through the k8s-proxy** for offline/air-gapped self-hosted CI (`--k8s-proxy-auth`). See [Authenticating through the k8s-proxy](#authenticating-through-the-k8s-proxy).

 :::tip
 Every flag is also available via `keploy cloud replay --help`. Run with `-v`/`--debug` to see which ingress URL, cluster, and branch a run resolved to.
-By default, a self-hosted cloud replay discovers the k8s-proxy ingress from the selected cluster's **latest heartbeat**. `--proxy-url` makes that endpoint explicit and overridable: when set, it **overrides** the heartbeat-inferred ingress URL for every k8s-proxy call the run makes (fetching test sets, mocks and mappings, and — under `--trigger` — starting the in-cluster replay and streaming its status).
+By default, a self-hosted cloud replay discovers the k8s-proxy ingress from the selected cluster's **latest heartbeat**. `--k8s-proxy-url` makes that endpoint explicit and overridable: when set, it **overrides** the heartbeat-inferred ingress URL for every k8s-proxy call the run makes (fetching test sets, mocks and mappings, and — under `--trigger` — starting the in-cluster replay and streaming its status).

 Reach for it when:
@@ -100,17 +106,71 @@ Rules and behaviour:
 - Must be an **absolute `http://` or `https://` URL with a host**. Malformed values are rejected up front, before any cluster work, with a clear error. A trailing slash is trimmed automatically.
 - Applies to **self-hosted clusters only**. For a SaaS cluster the value is ignored (replay talks to the api-server directly) and the CLI logs a warning so the flag is never silently dropped.
-- It does **not** bypass cluster registration/activity checks — the cluster must still be registered and (for self-hosted) actively heart-beating. `--proxy-url` only changes *which ingress URL is used*, not whether the cluster is considered valid.
-- Can also be set in `keploy.yml` as `cloud.proxyUrl`.
+- It does **not** bypass cluster registration/activity checks — the cluster must still be registered and (for self-hosted) actively heart-beating. `--k8s-proxy-url` only changes *which ingress URL is used*, not whether the cluster is considered valid.
+- Can also be set in `keploy.yml` as `cloud.k8sProxyUrl`.
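The URL rules in this hunk (absolute `http://` or `https://`, a host is required, a trailing slash is trimmed) can be approximated in a CI script before the CLI is ever invoked. This is a minimal sketch, not the CLI's actual validation: `normalize_proxy_url` is a hypothetical helper name and the error wording is invented for illustration.

```bash
# Hypothetical helper (not part of the Keploy CLI): approximates the
# documented rules for --k8s-proxy-url so a script can fail fast.
normalize_proxy_url() {
  url=${1:-}
  # Trim one trailing slash, as the docs say the CLI does.
  url=${url%/}
  # Require an absolute http:// or https:// URL.
  case $url in
    http://?*|https://?*) ;;
    *)
      echo "invalid proxy URL: expected an absolute http:// or https:// URL" >&2
      return 1
      ;;
  esac
  # Require a non-empty host between the scheme and the first slash.
  rest=${url#*://}
  host=${rest%%/*}
  if [ -z "$host" ]; then
    echo "invalid proxy URL: missing host" >&2
    return 1
  fi
  printf '%s\n' "$url"
}
```

A script could call `url=$(normalize_proxy_url "$1") || exit 1` and pass `$url` on to `--k8s-proxy-url`. The check is deliberately looser than a full URL parser; the CLI remains the authority on what it accepts.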
+**Self-hosted only.** An additive, opt-in authentication mode for CI runners and boxes that can reach **only** the k8s-proxy — no direct route to the Keploy api-server and no internet. When enabled, the CLI authenticates your API key **through** the k8s-proxy (which validates it against the api-server on your behalf) and then skips every other direct api-server call for the rest of the run.
+
+This is distinct from [`--k8s-proxy-url`](#--k8s-proxy-url): that flag only chooses *which ingress the data plane targets*; `--k8s-proxy-auth` changes *how the CLI authenticates and which control-plane calls it makes*. You normally use both together.
+
+### Enabling it
+
+| Input | Flag | Environment variable | keploy.yml |
+| --- | --- | --- | --- |
+| Turn the mode on | `--k8s-proxy-auth` | `KEPLOY_K8S_PROXY_AUTH=true` | `cloud.k8sProxyAuth: true` |
+All three are required — the mode needs a proxy URL to reach and a PAT to authenticate. Generate the PAT from **Settings → API Keys** in the dashboard.
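A CI job can check those three inputs before it runs the replay. The sketch below is a hypothetical preflight, not a Keploy command. It assumes the mode is switched on through `KEPLOY_K8S_PROXY_AUTH` and that the PAT is supplied as `KEPLOY_API_KEY` (the variable named in the commit message). The proxy URL is passed as an argument because this excerpt does not show its environment variable.

```bash
# Hypothetical preflight (not part of the Keploy CLI): verifies the three
# inputs the k8s-proxy-auth mode needs. $1 is the proxy URL.
check_proxy_auth_inputs() {
  proxy_url=${1:-}
  missing=""
  # 1. The mode itself must be turned on (here: via the environment).
  [ "${KEPLOY_K8S_PROXY_AUTH:-}" = "true" ] || missing="$missing KEPLOY_K8S_PROXY_AUTH=true"
  # 2. A proxy URL to reach.
  [ -n "$proxy_url" ] || missing="$missing proxy-url"
  # 3. A PAT to authenticate with.
  [ -n "${KEPLOY_API_KEY:-}" ] || missing="$missing KEPLOY_API_KEY"
  if [ -n "$missing" ]; then
    echo "k8s-proxy-auth preflight: missing:$missing" >&2
    return 1
  fi
}
```

Failing here gives a clearer error than letting the run start and stall on an unreachable api-server. If you enable the mode with the `--k8s-proxy-auth` flag or `keploy.yml` instead, drop the first check.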
+- Authenticates by sending your PAT to the proxy's `/relay-auth` endpoint, which relays it to the api-server's `/auth/apikey` and returns only the verdict — the CLI never contacts the api-server directly.
+- Forces the run into **self-hosted** mode: it skips the cluster-list lookup and targets the proxy ingress directly. Because of this, a `--cluster` name is needed (see below).
+- **Skips** every direct api-server step that would otherwise hang offline: the api-key exchange, role/plan checks, IT-usage check, update check, and the end-of-run debug-bundle upload (which targets the api-server). Authorization is still enforced server-side by the proxy's RBAC on every request.
+
+### TLS to a self-signed proxy
+
+Self-hosted proxies commonly serve a self-signed certificate. If the auth preflight fails with `certificate signed by unknown authority`, either trust the CA or skip verification:
+
+```bash
+# Option A — skip verification (matches the self-hosted data plane)
+export KEPLOY_TLS_SKIP_VERIFY=true
+
+# Option B — keep verification on by trusting the proxy's CA
+export SSL_CERT_FILE=/path/to/proxy-ca.pem
+```
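One way to choose between the two options in a script is to prefer Option B whenever a CA bundle is available and fall back to Option A otherwise. This is a sketch under that assumption: `configure_proxy_tls` is a hypothetical helper, and the CA path is whatever your environment provides.

```bash
# Hypothetical helper (not part of the Keploy CLI): picks one of the two
# documented TLS options. $1 is an optional path to the proxy's CA bundle.
configure_proxy_tls() {
  ca_file=${1:-}
  if [ -n "$ca_file" ] && [ -r "$ca_file" ]; then
    # Option B: keep verification on and trust the proxy's CA.
    export SSL_CERT_FILE="$ca_file"
    unset KEPLOY_TLS_SKIP_VERIFY
  else
    # Option A: no readable CA bundle, so skip verification.
    export KEPLOY_TLS_SKIP_VERIFY=true
  fi
}
```

Falling back to skipped verification is a convenience choice for this sketch. A stricter pipeline might exit with an error instead when the CA file is missing.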
+
+### `--cluster` in this mode
+
+Since the mode skips the cluster-list lookup, pass `--cluster <name>` where `<name>` is the cluster the app was **registered/recorded under** (the deployment's `KEPLOY_CLUSTER_NAME`). The CLI adopts it automatically from the app's record when you omit it; a **wrong** name resolves zero apps (`app <ns>.<dep> not found for the cluster <name>`).
+
+### Permissions
+
+- **Local replay** (no `--trigger`) needs a key with **read + write** — enough to fetch tests and upload the report.
+- **In-cluster replay** (`--trigger`) starts an orchestrated replay via the proxy's `/test/start`, which requires the admin-tier `replay:start` permission (or the `ci` scope). Under `KEPLOY_RBAC_MODE=enforce`, a key without it is rejected.
+
+---
+
 ## Trigger vs local replay

 ### `--trigger`
@@ -223,8 +283,32 @@ These take precedence over values auto-detected from the CI provider's environme
 | --- | --- | --- |
 | Where test assets come from | Keploy api-server | In-cluster k8s-proxy ingress |
 | `--cluster` | Optional (single active cluster auto-selected) | Recommended/required; cluster must be registered and active |
+| `--k8s-proxy-auth` | N/A | Optional — authenticate through the proxy for offline/CI runs |
 | Auth to the data plane | api-server token | User PAT validated by the k8s-proxy |
+| Test report upload | api-server | k8s-proxy ingress (proxy DB), authenticated with the PAT |
+
+---
+
+## Troubleshooting self-hosted replay
+
+Local (non-`--trigger`) self-hosted replay boots the app **and** the Keploy eBPF agent locally via docker-compose, so it has a few environment prerequisites the SaaS path doesn't. The failures below are the common ones, with fixes:
+
+| Symptom | Cause | Fix |
+| --- | --- | --- |
+| `k8s-proxy-auth: … certificate signed by unknown authority` | Proxy serves a self-signed cert | `export KEPLOY_TLS_SKIP_VERIFY=true` (or `SSL_CERT_FILE=<ca.pem>`) |
+| `failed to get test sets: unexpected status code: 400` | Empty `clusterName` sent to the proxy | Pass `--cluster <name>` (the app's registered cluster) |
+| `app <ns>.<dep> not found for the cluster <name>` | `--cluster` doesn't match the app's recorded cluster | Use the exact `KEPLOY_CLUSTER_NAME` the app was recorded under |
+| `blob not found` / `…/mocks.bin not found` | The mock blob isn't in the proxy's object store (e.g. the store was reset after recording) | Re-record the test set; keep the metadata DB and object store lifecycles in sync |
+| `sysctl: executable file not found in $PATH` | The privileged (sudo) re-exec stripped `/usr/sbin`/`/sbin` from `PATH` | Handled by the CLI automatically; on older builds `export PATH="$PATH:/usr/sbin:/sbin"` |
+| `pull access denied for <app-image>` | The app image lives only in the cluster, not the local docker daemon | Load it locally, pass `--image <local-tag>`, or use `--trigger` (in-cluster) |
+| `manifest for keploy/enterprise:v<ver> not found` | The eBPF agent image tag derived from the CLI version isn't published (a dev build resolves to `v<version>-dev`) | Use a release binary, or make that tag resolvable to the local docker daemon |
+| `connection reset by peer` → `ACTUAL STATUS 0` on the first test | The app wasn't listening yet when the first request fired | Raise `--delay` (e.g. `--delay 20`) for slow-starting apps (JVM, etc.) |
+| `no matching mock found for POST /v1/logs` (or similar) | The app exports OpenTelemetry/telemetry to a collector that wasn't mocked | Disable telemetry export for the replay, add the endpoint to `test.globalNoise`, or re-record |
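For the `sysctl` row, the `PATH` workaround for older builds can be made idempotent so that repeated CI steps do not keep growing `PATH`. This is a minimal sketch of that idea. `ensure_on_path` is a hypothetical helper name, while the two directories are the ones the table names.

```bash
# Hypothetical helper: append a directory to PATH only if it is missing.
ensure_on_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                # already present, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

# The two directories the troubleshooting table names for sysctl.
ensure_on_path /usr/sbin
ensure_on_path /sbin
export PATH
```

Wrapping `PATH` in colons before matching keeps `/sbin` from being mistaken for a substring of `/usr/sbin`.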
+
+:::tip
+For a cluster-deployed app, in-cluster replay (`--trigger --namespace <ns>`) sidesteps the local docker/image/eBPF prerequisites entirely — it replays where the app and its dependencies already run. It needs a key with `replay:start` (admin-tier or the `ci` scope).