| 1 | +# OpenClaw Upgrade Validation Checklist |
| 2 | + |
| 3 | +Use this reference after reading the KiloClaw controller spec and current scripts. |
| 4 | +File names are stable workflow touchpoints; verify the actual branch diff instead of |
| 5 | +assuming every upgrade needs every file. |
| 6 | + |
| 7 | +## Typical Release Touchpoints |
| 8 | + |
| 9 | +| Path | Check | |
| 10 | +|---|---| |
| 11 | +| `services/kiloclaw/Dockerfile` | Pinned OpenClaw release and build-time compatibility patches | |
| 12 | +| `services/kiloclaw/plugins/kilo-chat/package.json` | OpenClaw peer/dev version alignment | |
| 13 | +| `services/kiloclaw/plugins/kiloclaw-morning-briefing/package.json` | OpenClaw peer/dev version alignment | |
| 14 | +| `pnpm-lock.yaml` | Resolved plugin compile/test dependency graph | |
| 15 | +| `pnpm-workspace.yaml` | Release-age or build-script policy needed for the reviewed pin | |
| 16 | +| `services/kiloclaw/e2e/docker-image-testing.md` | Expected image version in manual checks | |
| 17 | +| `apps/web/src/app/(app)/claw/components/changelog-data.ts` | User-visible release note when applicable | |
| 18 | + |
| 19 | +`services/kiloclaw/Dockerfile.local` installs a developer-provided tarball rather |
| 20 | +than the published production pin; do not update it solely for a release number. |
| 21 | + |
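One way to compare the actual branch diff against the touchpoints above is sketched below. The function name `report_touchpoints` and the `changed`/`unchanged` output format are hypothetical; the path list is copied from the table. Matching is exact, so `services/kiloclaw/Dockerfile.local` is not mistaken for the production Dockerfile.

```bash
# Sketch only: the path list mirrors the touchpoint table; names are hypothetical.
TOUCHPOINTS=(
  "services/kiloclaw/Dockerfile"
  "services/kiloclaw/plugins/kilo-chat/package.json"
  "services/kiloclaw/plugins/kiloclaw-morning-briefing/package.json"
  "pnpm-lock.yaml"
  "pnpm-workspace.yaml"
  "services/kiloclaw/e2e/docker-image-testing.md"
  "apps/web/src/app/(app)/claw/components/changelog-data.ts"
)

# Reads changed paths on stdin (one per line) and prints a status per touchpoint.
report_touchpoints() {
  local changed path
  changed="$(cat)"
  for path in "${TOUCHPOINTS[@]}"; do
    # -F: fixed string, -x: whole-line match, -q: no output.
    if grep -Fxq -- "$path" <<<"$changed"; then
      printf 'changed    %s\n' "$path"
    else
      printf 'unchanged  %s\n' "$path"
    fi
  done
}
```

A typical invocation, assuming `origin/main` is the baseline, would be `git diff --name-only origin/main...HEAD | report_touchpoints`. An `unchanged` row is a prompt to confirm the file really needs no edit for this upgrade, not a failure.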
| 22 | +## Narrow Checks Before Live Validation |
| 23 | + |
| 24 | +Run repository-required formatting before committing. Prefer targeted checks while |
| 25 | +iterating, then allow push hooks or the relevant release process to run broader gates. |
| 26 | + |
| 27 | +```bash |
| 28 | +pnpm install --lockfile-only |
| 29 | +pnpm install --frozen-lockfile |
| 30 | +pnpm format |
| 31 | +bash -n services/kiloclaw/scripts/controller-smoke-helpers.sh |
| 32 | +bash -n services/kiloclaw/scripts/controller-live-provider-smoke-test.sh |
| 33 | +bash -n services/kiloclaw/scripts/controller-openclaw-upgrade-smoke-test.sh |
| 34 | +git diff --check |
| 35 | +bun run script/check-md-table-padding.ts |
| 36 | +pnpm --filter @kiloclaw/kilo-chat test |
| 37 | +pnpm --filter @kiloclaw/kiloclaw-morning-briefing test |
| 38 | +pnpm --filter @kiloclaw/kiloclaw-morning-briefing typecheck |
| 39 | +``` |
| 40 | + |
| 41 | +If pnpm rejects a just-reviewed OpenClaw release because of repository supply-chain |
| 42 | +policy, do not bypass the policy ad hoc. Determine whether an explicit narrow policy |
| 43 | +entry is justified by the pinned image build and successful live upgrade evidence. |
| 44 | + |
| 45 | +Before submitting a KiloClaw change, run the required final gates from |
| 46 | +`services/kiloclaw/AGENTS.md`: |
| 47 | + |
| 48 | +```bash |
| 49 | +# Before tests, confirm Postgres is active or start it with pnpm test:db. |
| 50 | +docker compose -f dev/docker-compose.yml ps postgres |
| 51 | +pnpm typecheck |
| 52 | +pnpm test |
| 53 | +pnpm lint |
| 54 | +``` |
| 55 | + |
| 56 | +If a required final gate cannot be run, state that explicitly in the PR and handoff; |
| 57 | +do not describe narrow checks as full submission validation. |
| 58 | + |
| 59 | +## Official Upgrade Smoke |
| 60 | + |
| 61 | +Run only from a clean committed bump branch; the wrapper builds detached source |
| 62 | +worktrees so ignored local files do not enter either candidate image. |
| 63 | + |
| 64 | +```bash |
| 65 | +bash services/kiloclaw/scripts/controller-openclaw-upgrade-smoke-test.sh |
| 66 | +``` |
| 67 | + |
| 68 | +Expected behaviors: |
| 69 | + |
| 70 | +- It refreshes `origin/main` by default; use `BASE_REF` only when the intended |
| 71 | + upgrade baseline differs and document that reason in the PR. |
| 72 | +- It rejects an identical before/after OpenClaw pin by default. |
| 73 | +- It builds one baseline and one candidate image from checked-in Dockerfiles. |
| 74 | +- It starts the baseline on an empty temporary `/root`, then starts the candidate |
| 75 | + against the same `/root`. |
| 76 | +- The candidate therefore exercises existing-config startup and `openclaw doctor`. |
| 77 | + |
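The identical-pin rejection can be pictured as a plain string comparison. This is an illustrative sketch, not the wrapper's actual implementation: the helper name `require_distinct_pins` is hypothetical, and how each pin is read from its Dockerfile is not covered here, so both versions are passed in as arguments.

```bash
# Hypothetical helper: fail when the baseline and candidate pins are identical.
# Returns 2 on missing input, 1 on identical pins, 0 otherwise.
require_distinct_pins() {
  local baseline="${1:-}" candidate="${2:-}"
  if [[ -z "$baseline" || -z "$candidate" ]]; then
    echo "error: both pins must be non-empty" >&2
    return 2
  fi
  if [[ "$baseline" == "$candidate" ]]; then
    echo "error: baseline and candidate both pin OpenClaw $baseline" >&2
    return 1
  fi
  echo "upgrade: $baseline -> $candidate"
}
```

Failing early on identical pins keeps a run from reporting a successful "upgrade" that only restarted the same release against the persisted `/root`.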
| 78 | +## Pass Criteria |
| 79 | + |
| 80 | +A release candidate is not validated until output proves all of the following: |
| 81 | + |
| 82 | +| Assertion | Why it matters | |
| 83 | +|---|---| |
| 84 | +| `OpenClaw version` for each phase | Images contain the intended packages | |
| 85 | +| `OpenClaw config validate` | Resulting config is accepted explicitly | |
| 86 | +| Gateway status and Control UI proxy | Controller and gateway boot correctly | |
| 87 | +| Configured live smoke model | KiloCode model selection survived boot/upgrade | |
| 88 | +| Kilo Chat plugin load | Packaged extension loads successfully | |
| 89 | +| Kilo Chat diagnostics | New warnings/errors cannot remain invisible | |
| 90 | +| Kilo Chat webhook semantic rejection | Live handler route is registered without side effects | |
| 91 | +| Live Auto Free agent turn | Real Kilo Gateway compatibility and execution work | |
| 92 | + |
| 93 | +## Docker Patch Investigation |
| 94 | + |
| 95 | +OpenClaw bundles may change between releases. If an image build fails around a |
| 96 | +minified bundle patch: |
| 97 | + |
| 98 | +1. Obtain or inspect the intended OpenClaw package without exposing credentials. |
| 99 | +2. Locate provider-specific markers and the exact behavior being patched. |
| 100 | +3. Confirm whether the patch is still necessary or whether upstream added a stable |
| 101 | + production config/env setting. |
| 102 | +4. Change the assertion to target the intended provider/behavior, not whichever |
| 103 | + generic text happens to match first. |
| 104 | +5. Rebuild and rerun the persisted-root live smoke. |
| 105 | + |
| 106 | +The KiloCode model discovery workaround patches KiloCode's own fetch timeout. An |
| 107 | +environment variable that only wraps live-test provider catalog execution does not |
| 108 | +replace that production fetch-level control. |
| 109 | + |
| 110 | +## Diagnostics Policy |
| 111 | + |
| 112 | +Inspect plugin diagnostics through `openclaw plugins inspect kilo-chat --json`. |
| 113 | +Current smoke behavior may explicitly surface an acknowledged cosmetic warning, such |
| 114 | +as missing optional `channelConfigs` metadata, while verifying runtime routing |
| 115 | +separately. Do not expand the allowance without review: |
| 116 | + |
| 117 | +- Fail on any unexpected warning or error. |
| 118 | +- Include the exact accepted diagnostic and impact assessment in the PR. |
| 119 | +- Prefer fixing actionable metadata rather than retaining a permanent allowance. |
| 120 | + |
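The fail-on-unexpected rule can be sketched as an exact-match allowlist. The sketch assumes the diagnostic messages have already been extracted one per line; that extraction step is omitted because the JSON shape of the inspect output is not specified here. The names `ACCEPTED_DIAGNOSTICS` and `check_diagnostics` are hypothetical, and the allowlisted message is a placeholder, not real OpenClaw output.

```bash
# Hypothetical allowlist of reviewed diagnostics. The text is a placeholder;
# replace it with the exact accepted message recorded in the PR.
ACCEPTED_DIAGNOSTICS=(
  "example: optional channelConfigs metadata is missing"
)

# Reads one diagnostic message per line on stdin.
# Prints accepted messages to stdout and unexpected ones to stderr;
# returns 1 if any message is not on the allowlist.
check_diagnostics() {
  local line accepted matched status=0
  while IFS= read -r line || [[ -n "$line" ]]; do
    [[ -z "$line" ]] && continue
    matched=0
    for accepted in "${ACCEPTED_DIAGNOSTICS[@]}"; do
      if [[ "$line" == "$accepted" ]]; then
        matched=1
        break
      fi
    done
    if (( matched )); then
      printf 'accepted    %s\n' "$line"
    else
      printf 'unexpected  %s\n' "$line" >&2
      status=1
    fi
  done
  return "$status"
}
```

Exact matching is deliberate: a substring or regex allowance would let a new, differently worded warning slip through without review.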
| 121 | +## Safe PR Evidence |
| 122 | + |
| 123 | +A PR verification summary may include image tags, version checks, named assertions, |
| 124 | +pass/fail totals, and known diagnostic text. While reviewing or modifying live smoke, |
| 125 | +keep its controller port loopback-only and its default controller/proxy token randomly |
| 126 | +generated. A PR summary must not include: |
| 127 | + |
| 128 | +- API key or organization credential values. |
| 129 | +- Controller/proxy or gateway tokens. |
| 130 | +- Raw provider response bodies or failure logs from live credential runs. |
| 131 | +- Sensitive prompts; use only generated nonce prompts in live smoke tests. |
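Before pasting a verification summary into a PR, a heuristic scan along the following lines may catch obvious leaks. The patterns and the names `SECRET_PATTERNS` and `scan_pr_summary` are assumptions for illustration, not a repository policy; they will produce false positives and miss some secrets, so they complement manual review rather than replace it.

```bash
# Heuristic patterns (assumptions for illustration); tune before relying on them.
SECRET_PATTERNS=(
  '(api[_-]?key|token|secret|password)[[:space:]]*[:=][[:space:]]*[^[:space:]]+'
  'bearer[[:space:]]+[A-Za-z0-9._~+/=-]{16,}'
)

# Scans the file given as $1. Reports matching line numbers only, never the
# matched text, so the scan itself does not echo a secret. Returns 1 on any hit.
scan_pr_summary() {
  local file="$1" pattern hits status=0
  for pattern in "${SECRET_PATTERNS[@]}"; do
    # grep -E: extended regex, -i: case-insensitive, -n: prefix line numbers.
    hits="$(grep -Ein -- "$pattern" "$file" | cut -d: -f1 | paste -sd, -)" || true
    if [[ -n "$hits" ]]; then
      printf 'possible secret on line(s) %s\n' "$hits" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, `scan_pr_summary summary.md` (with `summary.md` as a hypothetical draft file) exits non-zero if any line resembles a key assignment or bearer token.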