Commit e2a17a4

Merge pull request #496 from ObolNetwork/fix/agent-buy-retry-wrapper
feat(flows): switch default QA LLM to qwen36-deep + 1-retry safety net for agent buy
2 parents 7850332 + 187d820 commit e2a17a4

15 files changed

Lines changed: 117 additions & 73 deletions

.agents/skills/obol-stack-dev/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ OBOL_TOKEN_BASE_SEPOLIA=0x0a09371a8b011d5110656ceBCc70603e53FD2c78
 
 **Payment assertion**: don't bypass the agent buy step with a direct script exec. If the agent times out, diagnose Hermes/LiteLLM/model routing — don't relax the assertion. Required evidence: `PurchaseRequest Ready=True` + paid HTTP 200 + on-chain `Transfer` + exact balance deltas.
 
-**QA LLM**: full seller/buyer QA must route Alice and Bob through `OBOL_LLM_ENDPOINT` (OpenAI-compatible vLLM or llama.cpp on the QA host). Default `OBOL_LLM_MODEL=qwen36-fast`. Sequence: `obol model setup custom` → `obol model prefer` → one `obol model sync`. Local Ollama and cloud-fallback are **not** acceptable green substitutes for full-flow QA.
+**QA LLM**: full seller/buyer QA must route Alice and Bob through `OBOL_LLM_ENDPOINT` (OpenAI-compatible vLLM or llama.cpp on the QA host). Default `OBOL_LLM_MODEL=qwen36-deep` (27B-class). The smaller `qwen36-fast` (~4B) was the previous default but flakes on the long single-shot agent-buy prompt at flow-13/14 step 46 — see the retry-wrapper rationale in `flows/lib-dual-stack.sh::agent_buy_with_retry`. Sequence: `obol model setup custom` → `obol model prefer` → one `obol model sync`. Local Ollama and cloud-fallback are **not** acceptable green substitutes for full-flow QA.
 
 **Public vs private routes**: `/services/*`, `/.well-known/agent-registration.json`, `/skill.md`, and `/` (storefront) are public via the tunnel. **NEVER** remove `hostnames: ["obol.stack"]` from frontend or eRPC HTTPRoutes — exposing them publicly is a critical security flaw.
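
The "1-retry safety net" named in the commit title can be pictured as a small Bash wrapper. This is a minimal sketch under stated assumptions, not the contents of `flows/lib-dual-stack.sh`: the names `run_with_one_retry`, `flaky_once`, and `RETRY_DELAY_S` are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the real agent_buy_with_retry): run a command and,
# if it fails, retry it exactly once before giving up.
run_with_one_retry() {
  local delay="${RETRY_DELAY_S:-0}"   # assumed knob: seconds to wait before the retry
  local attempt
  for attempt in 1 2; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -eq 1 ]; then
      echo "attempt 1 failed; retrying once" >&2
      sleep "$delay"
    fi
  done
  return 1
}

# Demo command: fails on its first call, succeeds on every later call.
demo_state_file="$(mktemp)"
flaky_once() {
  if [ -s "$demo_state_file" ]; then
    return 0
  fi
  echo seen > "$demo_state_file"
  return 1
}
```

Calling `run_with_one_retry flaky_once` succeeds on the second attempt, while a command that always fails is tried twice and then reported as a failure. A wrapper of this shape only re-issues the step; the payment evidence listed above would still have to hold for the attempt that succeeds.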

.agents/skills/obol-stack-dev/references/llm-routing.md

Lines changed: 2 additions & 2 deletions
@@ -47,15 +47,15 @@ Canonical user flow for vLLM / sglang / mlx-lm / a remote GPU box. **No ConfigMa
 obol stack up
 
 # Drop auto-detected Ollama entries — they will out-rank the new custom entry.
-# Internal/model/rank.go parses ":9b" as 90 deci-billions; "qwen36-fast" (no
+# Internal/model/rank.go parses ":9b" as 90 deci-billions; "qwen36-deep" (no
 # ":Nb" tag) ranks 0. Without removing them, the agent stays on slow host Ollama.
 obol model remove qwen3.5:9b
 obol model remove qwen3.5:4b
 
 obol model setup custom \
 --name spark1-vllm \
 --endpoint http://192.168.18.23:8000/v1 \
---model qwen36-fast
+--model qwen36-deep
 # `setup custom` validates the endpoint, patches LiteLLM, and internally calls
 # syncAgentModels → hermes.Sync → rewrites the default agent's deployment files
 # with the new primary model. No manual restart needed.
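
The ranking rule described in the comment (a `:Nb` tag becomes N × 10 deci-billions, and a name without such a tag ranks 0) can be illustrated in Bash. This is a rough sketch of that rule only, assuming integer tags; it is not the logic in `rank.go`, and `model_rank` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical re-creation of the rule from the comment: ":<N>b" ranks as
# N*10 deci-billions; a name with no such tag ranks 0.
model_rank() {
  local re=':([0-9]+)b$'
  if [[ "$1" =~ $re ]]; then
    # 10# forces base 10 so a tag with leading zeros is not read as octal.
    echo $(( 10#${BASH_REMATCH[1]} * 10 ))
  else
    echo 0
  fi
}

# List the three model names from the hunk, highest rank first.
for m in qwen3.5:9b qwen3.5:4b qwen36-deep; do
  printf '%s\t%s\n' "$(model_rank "$m")" "$m"
done | sort -rn
```

Under this rule the two Ollama entries sort ahead of `qwen36-deep`, which is why the flow removes them (or promotes the custom entry) before relying on the new default.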

.agents/skills/obol-stack-dev/references/paid-flows.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ The runner has a `warn_unpaid_base_sepolia_rpc` preflight. The CLI scrubs paid-R
 - Alice ServiceOffer reaches `Ready=True`.
 - ERC-8004 registration tx published to Base Sepolia (`/.well-known/agent-registration.json` reachable via tunnel for live flows).
 - Bob `PurchaseRequest` reaches `Ready=True`.
-- LiteLLM exposes `paid/<OBOL_LLM_MODEL>` (default `qwen36-fast`).
+- LiteLLM exposes `paid/<OBOL_LLM_MODEL>` (default `qwen36-deep`).
 - Paid inference returns HTTP 200 and **final-answer** content (not reasoning metadata or tool-catalogue text).
 - On-chain `Transfer(Bob signer → Alice, <PAID_AMOUNT>)` receipt is archived.
 - Alice balance increases and Bob signer balance decreases by exactly `PAID_AMOUNT` wei (USDC for flow-11, OBOL for flow-13/14).

.agents/skills/obol-stack-dev/references/remote-qa.md

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ Set `OBOL_LLM_MODEL` to an id returned by `/models`.
 cd "$QA"
 export PATH="$QA/.workspace/bin:$FOUNDRY_BIN:$TOOL_ROOT:$PATH"
 export OBOL_LLM_ENDPOINT=${OBOL_LLM_ENDPOINT:-http://127.0.0.1:8000/v1}
-export OBOL_LLM_MODEL=${OBOL_LLM_MODEL:-qwen36-fast}
+export OBOL_LLM_MODEL=${OBOL_LLM_MODEL:-qwen36-deep}
 ts=$(date +%Y%m%d-%H%M%S)
 log="$QA/.tmp/flow-14-$ts.log"
 art="$QA/.tmp/flow-14-$ts-artifacts"
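
Every default touched by this commit uses the `${VAR:-default}` expansion, so an operator-supplied `OBOL_LLM_MODEL` still wins over the new `qwen36-deep` default. A minimal sketch of that behaviour follows; `resolve_llm_model` is a hypothetical helper, not a function from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper showing the fallback used throughout the flows.
resolve_llm_model() {
  # ${VAR:-default} falls back when VAR is unset *or* set to the empty string.
  echo "${OBOL_LLM_MODEL:-qwen36-deep}"
}

( unset OBOL_LLM_MODEL;        echo "unset    -> $(resolve_llm_model)" )
( OBOL_LLM_MODEL="";           echo "empty    -> $(resolve_llm_model)" )
( OBOL_LLM_MODEL=qwen36-fast;  echo "override -> $(resolve_llm_model)" )
```

Anyone who wants the previous behaviour can therefore export `OBOL_LLM_MODEL=qwen36-fast` before running a flow; only environments that leave the variable unset or empty pick up the new default.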

.gitleaks.toml

Lines changed: 6 additions & 0 deletions
@@ -44,6 +44,12 @@ regexes = [
 '''test test test test test test test test test test test junk''',
 # USDC storage slot values (uint256 padded, not secrets)
 '''0x0{50,}[0-9a-fA-F]{1,14}''',
+# Shell variable expansion in HTTP Auth headers — the actual secret
+# comes from $BOB_TOKEN / $LITELLM_KEY / etc. at runtime, not from
+# the literal source text. Matches `Authorization: Bearer $VAR` and
+# `Authorization: Basic ${VAR}` forms only; a hardcoded literal still
+# trips the rule because the allowlist regex requires a literal `$`.
+'''Authorization:\s+(?:Basic|Bearer)\s+\$\{?[A-Za-z_][A-Za-z0-9_]*''',
 ]
 paths = [
 # Gitleaks own config
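
The effect of the new allowlist entry can be checked by hand with Bash's regex matching. The sketch below is an approximation: it translates the pattern into POSIX ERE (`\s` becomes `[[:space:]]` and the non-capturing group becomes a plain group), and the helper `is_allowlisted` and the sample header values are hypothetical.

```shell
#!/usr/bin/env bash
# Assumed ERE translation of the allowlist regex added to .gitleaks.toml.
ALLOW_RE='Authorization:[[:space:]]+(Basic|Bearer)[[:space:]]+\$\{?[A-Za-z_][A-Za-z0-9_]*'

# Hypothetical helper: succeeds when the line matches the allowlist pattern.
is_allowlisted() {
  [[ "$1" =~ $ALLOW_RE ]]
}

# Single quotes keep the `$` literal, as it would appear in source text.
for line in \
  'Authorization: Bearer $BOB_TOKEN' \
  'Authorization: Basic ${LITELLM_KEY}' \
  'Authorization: Bearer sk-hardcoded-example'; do
  if is_allowlisted "$line"; then
    echo "allowlisted:     $line"
  else
    echo "not allowlisted: $line"
  fi
done
```

Only the two variable-expansion forms match, which is consistent with the comment in the hunk: a hardcoded token has no literal `$` after the scheme, so it falls outside the allowlist.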

CLAUDE.md

Lines changed: 3 additions & 3 deletions
@@ -37,7 +37,7 @@ go test -tags integration -v -run TestIntegration_Tunnel_SellDiscoverBuySidecar_
 
 # Release-gate seller/buyer smoke (requires OBOL_LLM_ENDPOINT pointing at OpenAI-compatible vLLM/llama.cpp)
 RELEASE_SMOKE_INCLUDE_OBOL=true RELEASE_SMOKE_INCLUDE_OBOL_FORK=true \
-OBOL_LLM_ENDPOINT=http://127.0.0.1:8000/v1 OBOL_LLM_MODEL=qwen36-fast \
+OBOL_LLM_ENDPOINT=http://127.0.0.1:8000/v1 OBOL_LLM_MODEL=qwen36-deep \
 bash flows/release-smoke.sh
 
 just up # obol stack init + up
@@ -246,13 +246,13 @@ obol model remove qwen3.5:4b
 obol model setup custom \
 --name spark1-vllm \
 --endpoint http://192.168.18.23:8000/v1 \
---model qwen36-fast
+--model qwen36-deep
 # `setup custom` validates the endpoint, patches LiteLLM, and internally calls
 # syncAgentModels → hermes.Sync → rewrites the default agent's deployment files
 # with the new primary model. No manual restart needed.
 
 # (b) OR keep Ollama and force-promote the custom entry to the head:
-obol model prefer qwen36-fast
+obol model prefer qwen36-deep
 obol model sync # propagate to Hermes
 
 obol model list # confirm head of model_list

flows/buy-external.sh

Lines changed: 2 additions & 2 deletions
@@ -60,7 +60,7 @@
 # EXTERNAL_PR_TIMEOUT_S default: 300 (5 min)
 # EXTERNAL_LOG_BLOCKS_BACK default: 30 (~6 min on Base Sepolia at 2s/blk)
 # OBOL_LLM_ENDPOINT default: http://127.0.0.1:8000/v1
-# OBOL_LLM_MODEL default: qwen36-fast
+# OBOL_LLM_MODEL default: qwen36-deep (27B-class)
 # OBOL_LLM_NAME default: external-llm
 #
 # Exit code: 0 on PASS (every step pass), 1 on any FAIL.
@@ -106,7 +106,7 @@ EXTERNAL_PR_TIMEOUT_S="${EXTERNAL_PR_TIMEOUT_S:-300}"
 EXTERNAL_LOG_BLOCKS_BACK="${EXTERNAL_LOG_BLOCKS_BACK:-30}"
 
 OBOL_LLM_ENDPOINT="${OBOL_LLM_ENDPOINT:-http://127.0.0.1:8000/v1}"
-OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
+OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
 OBOL_LLM_NAME="${OBOL_LLM_NAME:-external-llm}"
 
 # Resolve OBOL_ROOT before sourcing helpers — lib.sh re-derives it but

flows/flow-03-inference.sh

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ source "$(dirname "$0")/lib.sh"
 
 if [ -n "${OBOL_LLM_ENDPOINT:-}" ]; then
 run_step "Route LiteLLM through QA LLM endpoint" route_llm_via_obol_cli "$OBOL"
-LITELLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
+LITELLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
 else
 LITELLM_MODEL="$FLOW_MODEL"

flows/flow-04-agent.sh

Lines changed: 2 additions & 2 deletions
@@ -110,8 +110,8 @@ fi
 
 model_name=$("$OBOL" kubectl get cm hermes-config -n "$NS" -o jsonpath='{.data.config\.yaml}' 2>/dev/null | sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1)
 [ -n "$model_name" ] || model_name="qwen3.5:35b"
-if [ -n "${OBOL_LLM_ENDPOINT:-}" ] && [ "$model_name" != "${OBOL_LLM_MODEL:-qwen36-fast}" ]; then
-fail "Hermes default model $model_name does not match QA LLM model ${OBOL_LLM_MODEL:-qwen36-fast}"
+if [ -n "${OBOL_LLM_ENDPOINT:-}" ] && [ "$model_name" != "${OBOL_LLM_MODEL:-qwen36-deep}" ]; then
+fail "Hermes default model $model_name does not match QA LLM model ${OBOL_LLM_MODEL:-qwen36-deep}"
 cleanup_pid "$PF_PID"
 emit_metrics
 exit 0
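
The `sed | tr | head` pipeline in this hunk can be exercised without a cluster by feeding it a sample config. In the sketch below the YAML layout of `config.yaml` is an assumption made for illustration, and `extract_default_model` is a hypothetical wrapper around the same three commands.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the pipeline used in flow-04: print the value of
# the first "default:" key, with surrounding double quotes removed.
extract_default_model() {
  sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1
}

# Assumed shape of the Hermes config; the real file may differ.
sample_config='model:
  default: "qwen36-deep"
  provider: custom'

model_name="$(printf '%s\n' "$sample_config" | extract_default_model)"
# Same fallback as the flow script when no "default:" key is found.
[ -n "$model_name" ] || model_name="qwen3.5:35b"
echo "Hermes default model: $model_name"
```

Because the pattern keys on the first line that starts with `default:`, the check is sensitive to key order in the config; the comparison against `${OBOL_LLM_MODEL:-qwen36-deep}` then decides whether the flow reports a mismatch.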

flows/flow-11-dual-stack.sh

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@
 # FLOW11_BOB_HTTP_PORT FLOW11_BOB_HTTP_ALT_PORT
 # FLOW11_BOB_HTTPS_PORT FLOW11_BOB_HTTPS_ALT_PORT
 # OBOL_LLM_ENDPOINT required vLLM/llama.cpp/OpenAI-compatible endpoint
-# OBOL_LLM_MODEL endpoint model name (default: qwen36-fast)
+# OBOL_LLM_MODEL endpoint model name (default: qwen36-deep)
 source "$(dirname "$0")/lib.sh"
 
 # ═════════════════════════════════════════════════════════════════
@@ -60,7 +60,7 @@ BOB_HTTP_ALT_PORT="${FLOW11_BOB_HTTP_ALT_PORT:-$(pick_free_port)}"
 BOB_HTTPS_PORT="${FLOW11_BOB_HTTPS_PORT:-$(pick_free_port)}"
 BOB_HTTPS_ALT_PORT="${FLOW11_BOB_HTTPS_ALT_PORT:-$(pick_free_port)}"
 FACILITATOR_URL="${FLOW11_FACILITATOR_URL:-https://x402.gcp.obol.tech}"
-OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-fast}"
+OBOL_LLM_MODEL="${OBOL_LLM_MODEL:-qwen36-deep}"
 export OBOL_LLM_MODEL
 FLOW11_ARTIFACT_DIR="${FLOW11_ARTIFACT_DIR:-$OBOL_ROOT/.tmp/flow-11-$(date +%Y%m%d-%H%M%S)}"
 if ! BASE_SEPOLIA_RPC="$(resolve_base_sepolia_rpc "${FLOW11_BASE_SEPOLIA_RPC:-${BASE_SEPOLIA_RPC:-}}")"; then
