Commit 4b96e87

Sodawyx and claude committed
fix(runtime): inject CPU/memory/port defaults; skip endpoints on --no-wait; silence SDK validation warnings
Three problems surfaced when running the README's minimal example
`ar runtime apply -f runtime.yaml`:

1. HTTP 400 "CPU is required; Memory is required; Port is required"

   The CLI passed cpu=null/memory=null/port=null through to the SDK even
   though the docs already promised defaults of 2 cores / 4096 MB / 9000.
   The defaults were never actually applied. Add DEFAULT_CPU /
   DEFAULT_MEMORY_MB / DEFAULT_PORT in runtime_constants and inject them in
   to_runtime_create_input / to_runtime_update_input. `spec.container.port`
   keeps its documented precedence over `spec.port`; both fall back to
   DEFAULT_PORT.

2. HTTP 400 "runtime must be in READY status to create endpoints" under
   --no-wait

   apply_cmd unconditionally called reconcile_endpoints after
   reconcile_runtime. That is fine under --wait (the runtime has already
   been polled to READY), but under --no-wait the runtime is still CREATING
   and the backend rejects endpoint create. Gate reconcile_endpoints +
   poll_many_parallel on `wait`. Under --no-wait we just submit the runtime;
   an interactive run prints a stderr notice telling the user to re-apply
   once the runtime is READY (TTY-only, so it doesn't pollute scripted JSON
   output).

3. SDK pydantic warning spam

   Every `list_all()` call deserializes every runtime in the workspace, and
   the SDK emits "validate type failed" WARNINGs whenever a server-side
   record doesn't match its current schema (other people's runtimes with
   codeConfiguration.language=java17, empty logConfiguration, etc.). A
   single apply emitted ~10 lines of noise. Install a logging.Filter on the
   `agentrun-logger` logger that drops exactly the "validate type failed"
   message. `--debug` removes the filter so debugging shows full logs.

Docs:
- runtime.md (en + zh) apply Options table: document the new --no-wait
  semantics; add a paragraph explaining the auto-injected resource defaults.

Tests:
- test_create_input_user_values_override_defaults: explicit values win.
- test_create_input_container_port_wins_over_spec_port: precedence.
- test_update_input_applies_same_defaults: symmetry with create.
- Existing test_create_input_injects_system_tag_and_container_artifact now
  also asserts cpu=2.0 / memory=4096 / port=9000.
- test_apply_create_happy_path tightened: under --no-wait, create_endpoint
  MUST NOT be called and the endpoints list must be empty.
- test_apply_update_path tightened: under --wait, create_endpoint IS called
  after the runtime reaches READY.

Local gate: ruff + mypy clean, 525/525 tests pass, coverage 95.25%.

Signed-off-by: Sodawyx <sodawyx@126.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 009807b commit 4b96e87

8 files changed

Lines changed: 119 additions & 17 deletions

docs/en/runtime.md

Lines changed: 5 additions & 1 deletion
@@ -43,10 +43,14 @@ ar runtime apply -f FILE [--wait/--no-wait] [--timeout DURATION]
 | Flag | Type | Required | Default | Description |
 |------|------|----------|---------|-------------|
 | `-f`, `--file` | path | yes | | YAML file path (supports multi-document). |
-| `--wait/--no-wait` | flag | no | `--wait` | Poll runtime + endpoints to final status. |
+| `--wait/--no-wait` | flag | no | `--wait` | Poll runtime + endpoints to final status. Under `--no-wait` the runtime is submitted but **endpoints are not reconciled**: the backend rejects endpoint create/update while the runtime is still `CREATING`/`UPDATING`. Re-run apply once it reaches `READY`. |
 | `--timeout` | duration | no | `10m` | Polling timeout. Accepts `Ns`, `Nm`, `Nh`, or bare seconds. |
 | `--prune-endpoints/--no-prune-endpoints` | flag | no | `--prune-endpoints` | Delete remote endpoints absent from the YAML. |

+The CLI injects sensible defaults for `cpu` (2 cores), `memory` (4096 MB) and
+`port` (9000) when the YAML omits them; the backend rejects null values for
+these three fields with HTTP 400.
+
 ### Examples

```bash

docs/zh/runtime.md

Lines changed: 4 additions & 1 deletion
@@ -41,10 +41,13 @@ ar runtime apply -f FILE [--wait/--no-wait] [--timeout DURATION]
 | Flag | Type | Required | Default | Description |
 |------|------|----------|---------|-------------|
 | `-f`, `--file` | path | yes | | YAML file path (supports multi-document). |
-| `--wait/--no-wait` | flag | no | `--wait` | Poll runtime + endpoints to a final status. |
+| `--wait/--no-wait` | flag | no | `--wait` | Poll runtime + endpoints to a final status. With `--no-wait`, only the runtime create/update is submitted and **endpoints are not reconciled**: the backend rejects endpoint create/update while the runtime is `CREATING`/`UPDATING`. Apply again once the runtime reaches `READY`. |
 | `--timeout` | duration | no | `10m` | Polling timeout. Accepts `Ns` / `Nm` / `Nh` or bare seconds. |
 | `--prune-endpoints/--no-prune-endpoints` | flag | no | `--prune-endpoints` | Delete remote endpoints that are absent from the YAML. |

+When the YAML omits `cpu` / `memory` / `port`, the CLI automatically injects sensible defaults (2 cores /
+4096 MB / 9000); the backend responds to null values for these three fields with HTTP 400.
+
 ### Examples

```bash

src/agentrun_cli/_utils/runtime_constants.py

Lines changed: 7 additions & 0 deletions
@@ -14,6 +14,13 @@
 DEFAULT_ENDPOINT_NAME = "default"
 DEFAULT_TARGET_VERSION = "LATEST"

+# Resource defaults: the backend rejects CreateAgentRuntime with HTTP 400
+# "CPU is required; Memory is required; Port is required" when these are null.
+# Injecting them in the render layer keeps the minimal YAML example runnable.
+DEFAULT_CPU = 2.0  # cores
+DEFAULT_MEMORY_MB = 4096
+DEFAULT_PORT = 9000
+
 POLL_INITIAL_INTERVAL = 3.0  # seconds
 POLL_MAX_INTERVAL = 10.0  # seconds (cap of exponential backoff)
 POLL_BACKOFF_FACTOR = 1.5
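
The three `POLL_*` constants in the context lines describe a capped exponential backoff. The following is a minimal sketch of how such a schedule could be generated, assuming the interval is multiplied by the factor after every tick and clamped to the cap. `backoff_intervals` is a hypothetical helper written for illustration; the CLI's actual poller is not shown in this commit and may behave differently.

```python
from itertools import islice
from typing import Iterator

POLL_INITIAL_INTERVAL = 3.0  # seconds
POLL_MAX_INTERVAL = 10.0  # seconds (cap of exponential backoff)
POLL_BACKOFF_FACTOR = 1.5


def backoff_intervals(
    initial: float = POLL_INITIAL_INTERVAL,
    cap: float = POLL_MAX_INTERVAL,
    factor: float = POLL_BACKOFF_FACTOR,
) -> Iterator[float]:
    """Yield sleep intervals that grow geometrically and never exceed cap.

    Hypothetical helper for illustration only.
    """
    interval = initial
    while True:
        yield min(interval, cap)
        interval = min(interval * factor, cap)


# First five sleeps: 3.0, 4.5, 6.75, then 10.0 once the cap is reached.
first_five = list(islice(backoff_intervals(), 5))
```

Under these assumptions the cap is hit on the fourth tick, so a runtime that takes a minute to become READY would be polled roughly every 10 seconds for most of that time.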

src/agentrun_cli/_utils/runtime_render.py

Lines changed: 19 additions & 6 deletions
@@ -17,7 +17,10 @@
 )
 from agentrun_cli._utils.runtime_constants import (
     ARTIFACT_TYPE_CONTAINER,
+    DEFAULT_CPU,
     DEFAULT_ENDPOINT_NAME,
+    DEFAULT_MEMORY_MB,
+    DEFAULT_PORT,
     DEFAULT_TARGET_VERSION,
     SYSTEM_TAG_CLI,
 )
@@ -119,6 +122,16 @@ def _build_container(p: ParsedContainer, m):
     )


+def _resolve_port(p: ParsedAgentRuntime) -> int:
+    """container.port > spec.port > DEFAULT_PORT: matches the documented
+    precedence and prevents the backend's 'Port is required' 400."""
+    if p.container.port is not None:
+        return p.container.port
+    if p.port is not None:
+        return p.port
+    return DEFAULT_PORT
+
+
 def to_runtime_create_input(p: ParsedAgentRuntime):
     m = _sdk_models()
     return m["create_input"](
@@ -129,9 +142,9 @@ def to_runtime_create_input(p: ParsedAgentRuntime):
         artifact_type=ARTIFACT_TYPE_CONTAINER,
         system_tags=[SYSTEM_TAG_CLI],
         container_configuration=_build_container(p.container, m),
-        cpu=p.cpu,
-        memory=p.memory,
-        port=p.port,
+        cpu=p.cpu if p.cpu is not None else DEFAULT_CPU,
+        memory=p.memory if p.memory is not None else DEFAULT_MEMORY_MB,
+        port=_resolve_port(p),
         disk_size=p.disk_size,
         enable_session_isolation=p.enable_session_isolation,
         protocol_configuration=_build_protocol(p.protocol, m),
@@ -157,9 +170,9 @@ def to_runtime_update_input(p: ParsedAgentRuntime):
         artifact_type=ARTIFACT_TYPE_CONTAINER,
         system_tags=[SYSTEM_TAG_CLI],
         container_configuration=_build_container(p.container, m),
-        cpu=p.cpu,
-        memory=p.memory,
-        port=p.port,
+        cpu=p.cpu if p.cpu is not None else DEFAULT_CPU,
+        memory=p.memory if p.memory is not None else DEFAULT_MEMORY_MB,
+        port=_resolve_port(p),
         disk_size=p.disk_size,
         enable_session_isolation=p.enable_session_isolation,
         protocol_configuration=_build_protocol(p.protocol, m),
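
The default-injection and port-precedence rules in this file can be exercised in isolation. The sketch below is self-contained and makes some assumptions: `Container` and `Runtime` are hypothetical stand-ins for `ParsedContainer` and `ParsedAgentRuntime` (whose full definitions are not part of this diff), and `resolve_resources` is an illustrative helper that returns a plain dict instead of building the SDK input model.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_CPU = 2.0  # cores
DEFAULT_MEMORY_MB = 4096
DEFAULT_PORT = 9000


@dataclass
class Container:  # hypothetical stand-in for ParsedContainer
    image: str
    port: Optional[int] = None


@dataclass
class Runtime:  # hypothetical stand-in for ParsedAgentRuntime
    name: str
    container: Container
    cpu: Optional[float] = None
    memory: Optional[int] = None
    port: Optional[int] = None


def resolve_resources(p: Runtime) -> dict:
    """Return cpu/memory/port with defaults filled in for missing values."""
    # Precedence for the port: container.port > spec.port > DEFAULT_PORT.
    if p.container.port is not None:
        port = p.container.port
    elif p.port is not None:
        port = p.port
    else:
        port = DEFAULT_PORT
    return {
        "cpu": p.cpu if p.cpu is not None else DEFAULT_CPU,
        "memory": p.memory if p.memory is not None else DEFAULT_MEMORY_MB,
        "port": port,
    }


# Minimal spec: every resource field falls back to its default.
minimal = resolve_resources(Runtime("my-agent", Container("img:v1")))

# Explicit values win, and container.port beats spec.port.
explicit = resolve_resources(
    Runtime("my-agent", Container("img:v1", port=7777), cpu=4, memory=16384, port=8080)
)
```

Comparing against `None` rather than using `value or DEFAULT` means only a missing field triggers the fallback; an explicit falsy value such as `0` would be passed through unchanged in this sketch.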

src/agentrun_cli/commands/runtime/apply_cmd.py

Lines changed: 19 additions & 7 deletions
@@ -116,21 +116,27 @@ def apply_cmd(ctx, file_path, wait, timeout, prune_endpoints):
         rt_res = reconcile_runtime(parsed, client=runtime_cls)
         runtime = rt_res.runtime

+        ep_actions: list = []
         if wait:
             poll_until_final(
                 runtime,
                 resource_kind="AgentRuntime",
                 cfg=poll_cfg,
                 on_tick=lambda r, e, p=parsed: _progress(sys.stderr, p, r, e),
             )
+            # Endpoint create/update is rejected by the backend with HTTP 400
+            # ("runtime must be in READY status") whenever the runtime isn't
+            # READY yet, so we only reconcile endpoints after the runtime has
+            # reached a final status. Under --no-wait the runtime is still in
+            # CREATING/UPDATING when we return, so we skip endpoint
+            # reconciliation entirely and the user can re-run apply once the
+            # runtime is READY.
+            ep_actions = reconcile_endpoints(
+                runtime,
+                desired=parsed.endpoints,
+                prune=prune_endpoints,
+            )

-        ep_actions = reconcile_endpoints(
-            runtime,
-            desired=parsed.endpoints,
-            prune=prune_endpoints,
-        )
-
-        if wait:
             in_flight = [
                 a.endpoint
                 for a in ep_actions
@@ -143,6 +149,12 @@ def apply_cmd(ctx, file_path, wait, timeout, prune_endpoints):
                 concurrency=ENDPOINT_POLL_CONCURRENCY,
                 on_tick=lambda r, e, p=parsed: _progress(sys.stderr, p, r, e),
             )
+        elif sys.stderr.isatty():
+            sys.stderr.write(
+                f"[runtime {parsed.name}] --no-wait: runtime submitted; "
+                "endpoints will be reconciled on a subsequent apply once the "
+                "runtime reaches READY.\n"
+            )

         results.append(
             {
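
The control flow this hunk introduces (submit the runtime always, touch endpoints only after a successful wait) can be sketched without the CLI or SDK. Everything below is illustrative: `apply_one` and the three callables are hypothetical stand-ins for `reconcile_runtime`, `poll_until_final`, and `reconcile_endpoints`, and the stubs merely record the order in which they are called.

```python
from typing import Callable, List


def apply_one(
    wait: bool,
    submit_runtime: Callable[[], str],
    poll_runtime_ready: Callable[[], None],
    reconcile_endpoints: Callable[[], List[str]],
) -> dict:
    """Submit a runtime and reconcile endpoints only after it is READY.

    All callables are hypothetical stand-ins for the CLI's real helpers.
    """
    action = submit_runtime()
    endpoints: List[str] = []
    if wait:
        poll_runtime_ready()  # blocks until the runtime reaches a final status
        endpoints = reconcile_endpoints()  # only reached after polling finished
    # Without wait the endpoints list stays empty; a later apply fills it in.
    return {"action": action, "endpoints": endpoints}


calls: List[str] = []


def _record(name: str, result):
    """Build a stub that logs its name and returns a fixed result."""

    def stub():
        calls.append(name)
        return result

    return stub


stubs = (_record("submit", "create"), _record("poll", None), _record("endpoints", ["default"]))

no_wait = apply_one(False, *stubs)
calls_no_wait = list(calls)

calls.clear()
waited = apply_one(True, *stubs)
calls_waited = list(calls)
```

In the no-wait run only the submit stub is called, which mirrors the tightened integration test asserting that `create_endpoint` is never invoked under `--no-wait`.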

src/agentrun_cli/main.py

Lines changed: 25 additions & 2 deletions
@@ -10,6 +10,7 @@
 agentrun super-agent run
 """

+import logging
 import os

 import click
@@ -26,6 +27,24 @@
 from agentrun_cli.commands.tool_cmd import tool_group


+class _DropSdkValidationWarnings(logging.Filter):
+    """Drop the SDK's pydantic 'validate type failed' WARNINGs.
+
+    They fire from ``agentrun.utils.model.from_object`` whenever the SDK
+    deserializes a server-side record whose shape doesn't match its current
+    pydantic schema (e.g. a runtime someone else created with
+    ``codeConfiguration.language=java17`` or with an empty ``logConfiguration``).
+    That noise is not actionable for the CLI user; a single ``ar runtime list``
+    can emit a dozen of them. ``--debug`` re-enables full logging.
+    """
+
+    def filter(self, record: logging.LogRecord) -> bool:
+        return "validate type failed" not in record.getMessage()
+
+
+logging.getLogger("agentrun-logger").addFilter(_DropSdkValidationWarnings())
+
+
 class AliasGroup(click.Group):
     """Click Group that supports hidden command aliases."""

@@ -95,9 +114,13 @@ def cli(ctx: click.Context, profile, region, output, debug):
     ctx.obj["output"] = output

     if debug:
-        import logging
-
         logging.basicConfig(level=logging.DEBUG)
+        # In debug mode users want to see the SDK's validation warnings, so
+        # strip the filter we installed at import time.
+        sdk_logger = logging.getLogger("agentrun-logger")
+        for f in list(sdk_logger.filters):
+            if isinstance(f, _DropSdkValidationWarnings):
+                sdk_logger.removeFilter(f)


 # Register sub-command groups
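
The install-then-remove filter pattern used here can be observed with the standard library alone. The sketch below assumes a stand-in logger name (`demo-sdk-logger`) and an in-memory `ListHandler`; both are hypothetical and exist only so the effect is visible without the SDK.

```python
import logging
from typing import List


class DropValidationWarnings(logging.Filter):
    """Reject records whose message contains 'validate type failed'."""

    def filter(self, record: logging.LogRecord) -> bool:
        return "validate type failed" not in record.getMessage()


class ListHandler(logging.Handler):
    """Collect messages in memory so the filter's effect can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


logger = logging.getLogger("demo-sdk-logger")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the demo's output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

noise_filter = DropValidationWarnings()
logger.addFilter(noise_filter)
logger.warning("validate type failed: language=java17")  # dropped by the filter
logger.warning("runtime quota exceeded")  # unrelated warning, kept

logger.removeFilter(noise_filter)  # what a debug switch would do
logger.warning("validate type failed: empty logConfiguration")  # kept now

kept = handler.messages
```

One caveat worth keeping in mind: a filter attached to a logger applies only to records created on that logger, not to records propagated up from its child loggers, so this approach relies on the SDK logging through the one named logger.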

tests/integration/test_runtime_cmd.py

Lines changed: 6 additions & 0 deletions
@@ -174,6 +174,10 @@ def _refresh(self=None, *a, **k):
     assert out[0]["action"] == "create"
     assert out[0]["runtime"]["name"] == "my-agent"
     fake_runtime_cls.create.assert_called_once()
+    # --no-wait must not touch endpoints: the backend rejects endpoint
+    # create while the runtime is CREATING/UPDATING.
+    created.create_endpoint.assert_not_called()
+    assert out[0]["endpoints"] == []


 def test_apply_update_path(monkeypatch):
@@ -205,6 +209,8 @@ def test_apply_update_path(monkeypatch):
     assert result.exit_code == 0, result.output
     out = json.loads(result.output)
     assert out[0]["action"] == "update"
+    # Default --wait path reconciles endpoints after runtime reaches READY.
+    existing.create_endpoint.assert_called_once()


 def test_apply_runtime_failed_exits_5(monkeypatch):

tests/unit/test_runtime_render.py

Lines changed: 34 additions & 0 deletions
@@ -53,6 +53,40 @@ def test_create_input_injects_system_tag_and_container_artifact():
     assert inp.container_configuration.image == "img:v1"
     # code_configuration must not be set
     assert inp.code_configuration is None
+    # Defaults injected: backend rejects nulls for these three fields.
+    assert inp.cpu == 2.0
+    assert inp.memory == 4096
+    assert inp.port == 9000
+
+
+def test_create_input_user_values_override_defaults():
+    p = ParsedAgentRuntime(
+        name="my-agent",
+        container=ParsedContainer(image="img:v1"),
+        cpu=4,
+        memory=16384,
+        port=8080,
+    )
+    inp = to_runtime_create_input(p)
+    assert inp.cpu == 4
+    assert inp.memory == 16384
+    assert inp.port == 8080
+
+
+def test_create_input_container_port_wins_over_spec_port():
+    p = ParsedAgentRuntime(
+        name="my-agent",
+        container=ParsedContainer(image="img:v1", port=7777),
+        port=9000,
+    )
+    assert to_runtime_create_input(p).port == 7777
+
+
+def test_update_input_applies_same_defaults():
+    upd = to_runtime_update_input(_minimal_parsed())
+    assert upd.cpu == 2.0
+    assert upd.memory == 4096
+    assert upd.port == 9000


 def test_endpoints_none_injects_default():
