Commit 04e599e

NOVA committed
Merge remote-tracking branch 'upstream/main'
# Conflicts: # CHANGELOG.md
2 parents 6dd31cb + e794b79 commit 04e599e

4,915 files changed

Lines changed: 190226 additions & 91026 deletions

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Telegram Maintainer Decisions
+
+Use this page during Telegram PR review. These are intentional maintainer decisions, not incidental implementation details.
+
+Verified against Telegram Bot API 10.0, May 8 2026.
+
+## Streaming
+
+- Do not reintroduce `sendMessageDraft` for answer streaming. Telegram drafts are ephemeral 30-second previews in private chats; final delivery still requires a separate `sendMessage`. OpenClaw uses `sendMessage` plus `editMessageText`, then finalizes in place so the user sees one persistent answer.
+- Streaming owns one visible preview message. Edit it forward. Do not send an extra final bubble unless the final edit genuinely failed.
+- Keep the first-preview debounce. If a provider sends token-sized deltas, coalesce them into cumulative preview text instead of removing the debounce.
+- Respect Telegram limits in the Telegram layer. Text over 4096 chars chains into continuation messages. Polls keep the current Bot API 12-option cap.
+
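
Two of the streaming rules above lend themselves to a small sketch: coalescing token-sized deltas into cumulative preview text, and chaining text over the 4096-character limit into continuation chunks. The names `PreviewBuffer` and `splitForTelegram` are hypothetical and not taken from the OpenClaw source, and counting by UTF-16 code units is a simplifying assumption about how the limit is measured.

```typescript
// Hypothetical helpers for illustration; none of these names come from OpenClaw.
const TELEGRAM_TEXT_LIMIT = 4096; // text cap cited in the list above

// Accumulates provider deltas; a debounce timer would call flush() and edit the
// single preview message with the cumulative text it returns.
class PreviewBuffer {
  private text = "";
  private lastFlushed = "";

  push(delta: string): void {
    this.text += delta;
  }

  // Returns the full text so far, or null when nothing changed since the last flush.
  flush(): string | null {
    if (this.text === this.lastFlushed) return null;
    this.lastFlushed = this.text;
    return this.text;
  }
}

// Splits text into chunks of at most `limit` UTF-16 code units, preferring to
// break just after the last newline inside each window.
function splitForTelegram(text: string, limit: number = TELEGRAM_TEXT_LIMIT): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    const newline = head.lastIndexOf("\n");
    const cut = newline > 0 ? newline + 1 : limit;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

In this sketch the first chunk would go out through the existing preview message and the remaining chunks as continuation messages; a hard cut can still land inside a surrogate pair, which a real implementation would need to guard against.
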
+## Telegram API Ownership
+
+- Prefer grammY primitives and Telegram-native helpers when they model the behavior directly. Avoid custom Bot API wrappers for behavior grammY already owns.
+- Throttling is bot-token scoped. All Telegram API clients for the same token share one grammY `apiThrottler()` instance.
+- Do not silently retry failed topic sends without topic metadata. A wrong-surface success is worse than a loud Telegram error.
+- DM topics and forum topics are distinct. `direct_messages_topic_id` and `message_thread_id` are not interchangeable.
+
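
The token-scoped throttler and the topic distinction above can be sketched as follows. This is an illustration under assumptions: `sharedThrottler`, `TopicTarget`, and `topicParams` are hypothetical names, and the throttler factory is injected so the sketch has no dependencies (in a grammY setup the factory would presumably be `apiThrottler` from `@grammyjs/transformer-throttler`).

```typescript
// Hypothetical registry: one throttler instance per bot token.
const throttlersByToken = new Map<string, unknown>();

// Returns the existing instance for a token, creating it only on first use.
function sharedThrottler<T>(token: string, create: () => T): T {
  if (!throttlersByToken.has(token)) {
    throttlersByToken.set(token, create());
  }
  return throttlersByToken.get(token) as T;
}

// DM topics and forum topics modeled as distinct variants, so one ID cannot
// be passed where the other is expected.
type TopicTarget =
  | { kind: "forum"; message_thread_id: number }
  | { kind: "dm"; direct_messages_topic_id: number }
  | { kind: "none" };

// Builds only the parameter that matches the target surface.
function topicParams(target: TopicTarget): Record<string, number> {
  switch (target.kind) {
    case "forum":
      return { message_thread_id: target.message_thread_id };
    case "dm":
      return { direct_messages_topic_id: target.direct_messages_topic_id };
    case "none":
      return {};
  }
}
```

Because the union carries the topic metadata explicitly, a failed topic send has no way to degrade into an untargeted send without a deliberate code change.
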
+## Context And Authorization
+
+- Reply context comes from OpenClaw-observed messages. Bot API updates expose `reply_to_message`, but there is no arbitrary `getMessage(chat, id)` hydration path later.
+- Current local chat context must outrank stale reply ancestry in the prompt. Old replied-to messages should not look like the active conversation.
+- Pairing is DM-only. Group and topic authorization need explicit config allowlists.
+- Telegram allowlists use numeric sender IDs. Usernames are optional, mutable, and not a reliable arbitrary-user lookup key in the Bot API.
+- Group and channel visible replies are policy-controlled. Normal room replies stay private unless `messages.groupChat.visibleReplies: "automatic"` is set or the agent explicitly calls `message.send`.
+
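
As a rough sketch of the last three rules above, assuming a hypothetical config shape (`GroupAuthConfig` and its field names are illustrative; only the `visibleReplies: "automatic"` value comes from the text):

```typescript
// Hypothetical config shape; only the "automatic" value is taken from the text above.
interface GroupAuthConfig {
  allowedSenderIds: number[]; // numeric Telegram user IDs, never usernames
  visibleReplies?: "automatic"; // stands in for messages.groupChat.visibleReplies
}

// Authorization compares numeric sender IDs only; usernames are ignored because
// they are optional and mutable.
function isSenderAllowed(config: GroupAuthConfig, senderId: number): boolean {
  return config.allowedSenderIds.indexOf(senderId) !== -1;
}

// A room reply becomes visible only when policy allows it or the agent
// explicitly called message.send.
function shouldReplyVisibly(config: GroupAuthConfig, agentCalledMessageSend: boolean): boolean {
  return config.visibleReplies === "automatic" || agentCalledMessageSend;
}
```
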
+## Interactive Surfaces
+
+- Native callbacks stay structured. Approval, native command, plugin, select, and multiselect callbacks must not fall through as raw callback text.
+- Preserve callback values exactly, including delimiters such as `env|prod`.
+- Native slash commands should remain fast-pathable before full workspace and agent-turn setup.
+
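
One way to keep callback values intact is to split the payload on the first separator only. The sketch below assumes a hypothetical `kind:value` wire format; the actual OpenClaw callback encoding is not shown in this page.

```typescript
// Hypothetical wire format "kind:value"; the real encoding may differ.
type CallbackKind = "approval" | "command" | "plugin" | "select" | "multiselect";

interface ParsedCallback {
  kind: CallbackKind;
  value: string;
}

const CALLBACK_KINDS: CallbackKind[] = ["approval", "command", "plugin", "select", "multiselect"];

function encodeCallback(kind: CallbackKind, value: string): string {
  return kind + ":" + value;
}

// Splits on the first ":" only, so the value keeps every later delimiter.
// Returns null for anything that is not a known structured callback.
function parseCallback(data: string): ParsedCallback | null {
  const sep = data.indexOf(":");
  if (sep === -1) return null;
  const rawKind = data.slice(0, sep);
  const kind = CALLBACK_KINDS.filter((k) => k === rawKind)[0];
  if (kind === undefined) return null;
  return { kind, value: data.slice(sep + 1) };
}
```

A `null` result here marks a payload that should be handled explicitly rather than forwarded to the agent as raw callback text.
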
+## Review Standard
+
+Telegram behavior PRs need real Telegram proof when they touch transport, streaming, topics, callbacks, authorization, or reply context. Prefer the bot-to-bot QA lane or an equivalent live Telegram probe over synthetic-only validation.

.agents/skills/crabbox/SKILL.md

Lines changed: 77 additions & 13 deletions
@@ -127,6 +127,7 @@ Read the JSON summary. Useful fields:
 - `provider`: should be `blacksmith-testbox`
 - `leaseId`: `tbx_...`
 - `syncDelegated`: should be `true`
+- `commandPhases`: populated when the command prints `CRABBOX_PHASE:<name>`
 - `commandMs` / `totalMs`
 - `exitCode`
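
A minimal sketch of reading that summary and of how `CRABBOX_PHASE:<name>` lines could map to phase names. The field names come from the list above; the field types, the helper names, and treating a non-zero `exitCode` as a problem are assumptions, and Crabbox's own parser is not shown here.

```typescript
// Field names come from the list above; the types are assumptions.
interface CrabboxSummary {
  provider: string;
  leaseId: string;
  syncDelegated: boolean;
  commandPhases?: unknown; // shape is not specified in the source
  commandMs: number;
  totalMs: number;
  exitCode: number;
}

// Returns human-readable problems for a Blacksmith Testbox run summary.
function summaryProblems(summary: CrabboxSummary): string[] {
  const problems: string[] = [];
  if (summary.provider !== "blacksmith-testbox") problems.push("provider is " + summary.provider);
  if (!summary.leaseId.startsWith("tbx_")) problems.push("leaseId is not a tbx_ id");
  if (summary.syncDelegated !== true) problems.push("syncDelegated is not true");
  if (summary.exitCode !== 0) problems.push("exitCode is " + summary.exitCode);
  return problems;
}

// Collects phase names from output lines of the form CRABBOX_PHASE:<name>.
function extractPhaseNames(output: string): string[] {
  const names: string[] = [];
  for (const line of output.split("\n")) {
    const match = /^CRABBOX_PHASE:(\S+)$/.exec(line.trim());
    if (match !== null && match[1] !== undefined) names.push(match[1]);
  }
  return names;
}
```
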

@@ -138,6 +139,62 @@ unclear:
 blacksmith testbox list
 ```
 
+## Observability Flags
+
+Use these on debugging runs before inventing ad hoc logging:
+
+- `--preflight`: prints run context, workspace mode, SSH target, remote user/cwd,
+  sudo/apt, Node, pnpm, Docker, and bubblewrap. On `blacksmith-testbox`, this
+  prints a delegated-unsupported note because the workflow owns setup.
+- `CRABBOX_ENV_ALLOW=NAME,...`: forwards only listed local env vars for direct
+  providers and prints `set len=N secret=true` style summaries. On
+  `blacksmith-testbox`, env forwarding is unsupported; put secrets in the
+  Testbox workflow instead.
+- `--env-from-profile <file>` plus `--allow-env NAME`: loads simple
+  `export NAME=value` / `NAME=value` lines from a local profile without
+  executing it, then forwards only allowlisted names. `--allow-env` is
+  repeatable and comma-separated. Profile values override ambient allowlisted
+  env values for that run.
+- `--script <file>` / `--script-stdin`: upload a local script into
+  `.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute
+  directly; scripts without a shebang run through `bash`. Arguments after `--`
+  become script args.
+- `--fresh-pr owner/repo#123|URL|number`: skip dirty local sync and create a
+  fresh remote checkout of the GitHub PR. Bare numbers use the current repo's
+  GitHub origin. Add `--apply-local-patch` only when the current local
+  `git diff --binary HEAD` should be applied on top of that PR checkout.
+- `--capture-stdout <path>` / `--capture-stderr <path>`: write remote streams to
+  local files and keep binary/noisy output out of retained logs. Parent
+  directories must already exist. These are direct-provider only.
+- `--capture-on-fail`: on non-zero direct-provider exits, downloads
+  `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`,
+  `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed.
+- `--timing-json`: final machine-readable timing. Add
+  `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell
+  commands; direct providers and Blacksmith Testbox both report them as
+  `commandPhases`.
+
+Live-provider debug template for direct AWS/Hetzner leases:
+
+```sh
+mkdir -p .crabbox/logs
+pnpm crabbox:run -- --provider aws \
+  --preflight \
+  --env-from-profile ~/.profile \
+  --allow-env OPENAI_API_KEY,OPENAI_BASE_URL \
+  --timing-json \
+  --capture-stdout .crabbox/logs/live-provider.stdout.log \
+  --capture-stderr .crabbox/logs/live-provider.stderr.log \
+  --capture-on-fail \
+  --shell -- \
+  "echo CRABBOX_PHASE:install; pnpm install --frozen-lockfile; echo CRABBOX_PHASE:test; pnpm test:live"
+```
+
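
The `--env-from-profile` plus `--allow-env` pair used in the template can be pictured as parse, then allowlist, then summarize. The sketch below is an illustration only: the helper names are hypothetical, quote stripping is an assumption, and the exact summary format is only approximated from the `set len=N secret=true` style mentioned in the flag description.

```typescript
// Parses simple `export NAME=value` / `NAME=value` lines without executing anything.
function parseProfile(content: string): Map<string, string> {
  const values = new Map<string, string>();
  for (const raw of content.split("\n")) {
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$/.exec(raw.trim());
    if (match === null || match[1] === undefined || match[2] === undefined) continue;
    let value = match[2];
    const quote = value.charAt(0);
    // Assumption: strip one pair of matching surrounding quotes.
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.charAt(value.length - 1) === quote) {
      value = value.slice(1, -1);
    }
    values.set(match[1], value);
  }
  return values;
}

// Forwards only allowlisted names; profile values override ambient values.
function selectForwardedEnv(
  profile: Map<string, string>,
  ambient: Record<string, string | undefined>,
  allow: string[],
): Record<string, string> {
  const forwarded: Record<string, string> = {};
  for (const name of allow) {
    const fromProfile = profile.get(name);
    const value = fromProfile !== undefined ? fromProfile : ambient[name];
    if (value !== undefined) forwarded[name] = value;
  }
  return forwarded;
}

// Summarizes a value without printing it.
function describeSecret(value: string): string {
  return "set len=" + value.length + " secret=true";
}
```
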
+Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or
+`--sync-only` to delegated providers. Also do not pass `--script*` or
+`--fresh-pr` there. Crabbox rejects these because the provider owns sync or
+command transport.
+
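
A wrapper script could pre-check its arguments against this rule before dispatching to a delegated provider. The sketch below is hypothetical (it is not Crabbox's own validation); the flag list is taken from the paragraph above, and it expects the arguments that follow pnpm's own `--` separator.

```typescript
// Flags named in the paragraph above; prefixes cover `--capture-*` and `--script*`.
const DIRECT_ONLY_PREFIXES = ["--capture-", "--script"];
const DIRECT_ONLY_FLAGS = ["--download", "--checksum", "--force-sync-large", "--sync-only", "--fresh-pr"];

// Returns the direct-provider-only flags found before the command separator.
function directOnlyFlags(args: string[]): string[] {
  const found: string[] = [];
  for (const arg of args) {
    if (arg === "--") break; // everything after `--` belongs to the remote command
    const eq = arg.indexOf("=");
    const name = eq === -1 ? arg : arg.slice(0, eq);
    const exact = DIRECT_ONLY_FLAGS.indexOf(name) !== -1;
    const prefixed = DIRECT_ONLY_PREFIXES.some((prefix) => name.startsWith(prefix));
    if (exact || prefixed) found.push(name);
  }
  return found;
}
```

An empty result means the argument list is safe to hand to a delegated provider such as `blacksmith-testbox` as far as this rule is concerned.
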
 ## Efficient Bug E2E Verification
 
 Use the smallest Crabbox lane that proves the reported user path, not just the
@@ -179,6 +236,11 @@ Efficient flow:
 Keep it efficient:
 
 - Reuse existing E2E scripts and helper assertions before writing ad hoc shell.
+- Use `--script <file>` or `--script-stdin` for multi-line E2E commands instead
+  of quote-heavy `--shell` strings on direct SSH providers.
+- Use `--fresh-pr <pr>` when validating an upstream PR in isolation from the
+  local dirty tree. Add `--apply-local-patch` only when testing a local fixup on
+  top of that PR.
 - Use one-shot Crabbox for a single proof; use a reusable Testbox only when
   several commands must share built images, installed packages, or live state.
 - Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a
@@ -285,7 +347,9 @@ Common Crabbox-only failures:
 - Slug/claim confusion: use the raw `tbx_...` id, or run one-shot without
   `--id`.
 - Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
-  printed Actions URL.
+  printed Actions URL. Large sync warnings now include top source directories
+  by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
+  those before reaching for `--force-sync-large`.
 - Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you
   created.
 - Testbox queued/capacity pressure: do not convert a broad changed gate or full
@@ -294,18 +358,19 @@ Common Crabbox-only failures:
   report the capacity blocker.
 
 If Crabbox cannot dispatch, sync, attach, or stop but Blacksmith itself works,
-use direct Blacksmith from the repo root:
+first try the same command through the repo wrapper with `--debug` and
+`--timing-json`:
 
 ```sh
-blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90
-blacksmith testbox run --id <tbx_id> "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
-blacksmith testbox stop --id <tbx_id>
+pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed
 ```
 
-Direct full suite:
+Full suite:
 
 ```sh
-blacksmith testbox run --id <tbx_id> "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test"
+pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
 ```
 
 Auth fallback, only when `blacksmith` says auth is missing:
@@ -340,16 +405,15 @@ The hydration workflow owns checkout, Node/pnpm setup, dependency install,
 secrets, ready marker, and keepalive. Crabbox owns dispatch, sync, SSH command
 execution, timing, logs/results, and cleanup.
 
-Minimal direct Blacksmith fallback, from repo root:
+Minimal Blacksmith-backed Crabbox run, from repo root:
 
 ```sh
-blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90
-blacksmith testbox run --id <tbx_id> "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:changed"
-blacksmith testbox stop --id <tbx_id>
+pnpm crabbox:run -- --provider blacksmith-testbox --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:changed
 ```
 
-Use direct Blacksmith only when Crabbox is the broken layer and Blacksmith
-itself still works. Prefer direct `blacksmith testbox list` for cleanup
+Use direct Blacksmith only when Crabbox is the broken layer and you are
+isolating a Crabbox bug. Prefer direct `blacksmith testbox list` for cleanup
 diagnostics, not as a reusable work queue.
 
 Important Blacksmith footguns:
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+---
+name: openclaw-debugging
+description: Debug OpenClaw model, provider, tool-surface, code-mode, streaming, and live/Crabbox behavior by choosing the right logs, probes, and proof path before changing code.
+---
+
+# OpenClaw Debugging
+
+Use this skill when OpenClaw behavior differs between local tests, live models,
+providers, code mode, Tool Search, Crabbox, or CI, and the next move should be a
+debug signal rather than a guess.
+
+## Read First
+
+- `docs/logging.md` for log files, `openclaw logs`, and targeted debug flags.
+- `docs/reference/test.md` for local test commands.
+- `docs/reference/code-mode.md` for code-mode exec/wait and tool catalog rules.
+- Use `$openclaw-testing` for choosing test lanes.
+- Use `$crabbox` for broad, Docker, package, Linux, live-key, or CI-parity proof.
+
+## Default Loop
+
+1. State the suspected boundary: config, tool construction, provider payload,
+   fetch, stream/SSE, transcript replay, worker/runtime, package/dist, or CI.
+2. Add or enable the narrowest signal that proves that boundary.
+3. Reproduce with the same provider/model/config. Do not randomly switch models
+   unless the model itself is the variable being tested.
+4. Compare configured state with actual run activation.
+5. Patch the root cause.
+6. Rerun the exact failing probe, then broaden only if the contract requires it.
+
+## Model Transport Logs
+
+Use targeted env flags instead of global debug when the model request shape or
+stream timing matters:
+
+```bash
+OPENCLAW_DEBUG_MODEL_TRANSPORT=1 openclaw gateway
+OPENCLAW_DEBUG_MODEL_PAYLOAD=tools OPENCLAW_DEBUG_SSE=events openclaw gateway
+OPENCLAW_DEBUG_MODEL_PAYLOAD=full-redacted OPENCLAW_DEBUG_SSE=peek openclaw gateway
+```
+
+Useful flags:
+
+- `OPENCLAW_DEBUG_MODEL_TRANSPORT=1`: request start, fetch response, SDK
+  headers, first SSE event, stream done, and transport errors at `info`.
+- `OPENCLAW_DEBUG_MODEL_PAYLOAD=summary`: bounded payload summary.
+- `OPENCLAW_DEBUG_MODEL_PAYLOAD=tools`: all model-facing tool names.
+- `OPENCLAW_DEBUG_MODEL_PAYLOAD=full-redacted`: capped, redacted JSON payload.
+  Use only while debugging; prompts/message text may still appear.
+- `OPENCLAW_DEBUG_SSE=events`: first-event and stream-completion timing.
+- `OPENCLAW_DEBUG_SSE=peek`: first five redacted SSE events.
+- `OPENCLAW_DEBUG_CODE_MODE=1`: code-mode tool-surface diagnostics.
+
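
The flag set above is small enough to model as a typed value. This sketch assumes a hypothetical `readDebugFlags` helper and treats unrecognized values as "off"; how OpenClaw itself parses these variables is not shown in this diff.

```typescript
type PayloadMode = "summary" | "tools" | "full-redacted";
type SseMode = "events" | "peek";

interface DebugFlags {
  transport: boolean;
  payload: PayloadMode | null;
  sse: SseMode | null;
  codeMode: boolean;
}

// Reads the documented variables from an env-like record.
// Assumption: any value outside the documented set disables that signal.
function readDebugFlags(env: Record<string, string | undefined>): DebugFlags {
  const payload = env["OPENCLAW_DEBUG_MODEL_PAYLOAD"];
  const sse = env["OPENCLAW_DEBUG_SSE"];
  return {
    transport: env["OPENCLAW_DEBUG_MODEL_TRANSPORT"] === "1",
    payload: payload === "summary" || payload === "tools" || payload === "full-redacted" ? payload : null,
    sse: sse === "events" || sse === "peek" ? sse : null,
    codeMode: env["OPENCLAW_DEBUG_CODE_MODE"] === "1",
  };
}
```
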
+Watch logs with:
+
+```bash
+openclaw logs --follow
+```
+
+## Common Boundaries
+
+- **Config vs activation:** config can be enabled while the run disables tools,
+  is raw, has an empty allowlist, or lacks model tool support. Check the actual
+  visible tools before enforcing provider payload invariants.
+- **Tool surface:** inspect final model-visible tool names, not only the tool
+  registry or config. Code mode means exactly `exec` and `wait` only after it
+  actually activates.
+- **Provider payload:** log fields, model id, service tier, reasoning, input
+  size, metadata keys, prompt-cache key presence, and tool names before SDK
+  call.
+- **Fetch vs SSE:** fetch response proves HTTP headers arrived; first SSE event
+  proves provider body progress. A gap here is a stream/body/provider issue, not
+  tool execution.
+- **Worker/dist:** run `pnpm build` when touching workers, dynamic imports,
+  package exports, lazy runtime boundaries, or published paths.
+- **Live keys:** check local `~/.profile` for key presence/length before saying
+  live proof is blocked. Never print secrets.
+
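
The fetch-versus-SSE distinction above reduces to asking which timestamp is the first one missing. A minimal sketch, assuming hypothetical field names for timestamps collected from the transport log lines:

```typescript
// Hypothetical timestamps (ms) gathered from transport debug logs.
interface TransportTimings {
  requestStartMs: number;
  fetchResponseMs?: number; // HTTP headers arrived
  firstSseEventMs?: number; // provider body made progress
  streamDoneMs?: number;
}

type StallBoundary = "fetch" | "first-sse-event" | "stream-completion" | "none";

// Names the first boundary that was never reached.
function stallBoundary(t: TransportTimings): StallBoundary {
  if (t.fetchResponseMs === undefined) return "fetch";
  if (t.firstSseEventMs === undefined) return "first-sse-event";
  if (t.streamDoneMs === undefined) return "stream-completion";
  return "none";
}

// Gap between headers and the first SSE event, or null if either is missing.
function headerToFirstEventGapMs(t: TransportTimings): number | null {
  if (t.fetchResponseMs === undefined || t.firstSseEventMs === undefined) return null;
  return t.firstSseEventMs - t.fetchResponseMs;
}
```

A result of `"first-sse-event"` corresponds to the case described in the list: headers arrived but the body never progressed, which points at the stream or provider rather than tool execution.
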
+## Code Pointers
+
+- Model payload + Responses stream:
+  `src/agents/openai-transport-stream.ts`
+- Guarded fetch/timing:
+  `src/agents/provider-transport-fetch.ts`
+- OpenAI/Codex provider wrappers:
+  `src/agents/pi-embedded-runner/openai-stream-wrappers.ts`
+- Tool construction, Tool Search, code-mode activation:
+  `src/agents/pi-embedded-runner/run/attempt.ts`
+- Code-mode runtime and worker:
+  `src/agents/code-mode.ts`
+  `src/agents/code-mode.worker.ts`
+- Tool Search catalog:
+  `src/agents/tool-search.ts`
+
+## Proof Choice
+
+- Single helper/payload bug: local targeted Vitest.
+- Docs/logging-only: `pnpm check:docs` and `git diff --check`.
+- Worker/dist/lazy import/package surface: targeted tests plus `pnpm build`.
+- Live provider/model behavior: same provider/model with debug flags and a real
+  key if available.
+- Docker/package/Linux/CI-parity: `$crabbox`.
+- CI failure: exact SHA, relevant job only, logs only after failure/completion.
+
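
The proof choices above amount to a lookup from change kind to proof path. The sketch below encodes that table with hypothetical key names; the right-hand strings paraphrase the list and are not commands to run verbatim.

```typescript
// Hypothetical keys; the values paraphrase the list above.
type ChangeKind =
  | "helper-or-payload"
  | "docs-or-logging"
  | "worker-or-dist"
  | "live-provider"
  | "ci-parity"
  | "ci-failure";

const PROOF_BY_CHANGE: Record<ChangeKind, string> = {
  "helper-or-payload": "local targeted Vitest",
  "docs-or-logging": "pnpm check:docs and git diff --check",
  "worker-or-dist": "targeted tests plus pnpm build",
  "live-provider": "same provider/model with debug flags and a real key if available",
  "ci-parity": "$crabbox",
  "ci-failure": "exact SHA, relevant job only, logs only after failure/completion",
};

// Returns the narrowest proof path for a given kind of change.
function proofFor(kind: ChangeKind): string {
  return PROOF_BY_CHANGE[kind];
}
```
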
+## Output Habit
+
+Report:
+
+- boundary tested
+- exact command/env shape, redacted
+- observed signal, such as tool names or first SSE event timing
+- fix location
+- narrow proof and any remaining risk
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+interface:
+  display_name: "OpenClaw Debugging"
+  short_description: "Debug model, tool, stream, and live behavior"
+  default_prompt: "Use $openclaw-debugging to identify the right OpenClaw debug boundary, turn on targeted logs, and choose the narrowest local or Crabbox proof."
