Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
ce9db15
improve(gr00t_inference): R0 — rebase + baseline snapshot from PR #90…
cagataycali May 23, 2026
ecf5f0f
review(gr00t_inference): R1 — drop silent host=127.0.0.1->0.0.0.0 aut…
cagataycali May 23, 2026
7762868
review(gr00t_inference): R3 — reconcile CHANGELOG to match R1 archite…
cagataycali May 23, 2026
621b3d8
review(gr00t_inference): R4 — fix 4 regex bugs in input validators
cagataycali May 23, 2026
58eef5f
style(gr00t_inference): R4 — apply ruff format
cagataycali May 23, 2026
b7c3b65
test(gr00t_inference): R1 — update test_recreates_when_force for new …
cagataycali May 23, 2026
f5c0a57
review(gr00t_inference): R1 - fix docker image ref regex falsely reje…
cagataycali May 23, 2026
4da630b
review(gr00t_inference): R2 — drop backslash from path-traversal split
strands-agent May 23, 2026
743baf1
review(gr00t_inference): R2 — fix host docstring to match R1 semantics
strands-agent May 23, 2026
c260235
review(gr00t_inference): R2 — remove dead host_was_explicit plumbing
strands-agent May 23, 2026
875b8ca
review(gr00t_inference): R3 - validate hf_repo/hf_subfolder/hf_local_…
cagataycali May 23, 2026
d04f697
review(gr00t_inference): R4 - fix hf_* validation placement + pin --h…
cagataycali May 23, 2026
d7a6690
review(gr00t_inference): R4 - rewrite host-kwarg-excluded test using …
cagataycali May 23, 2026
8079fb6
review(gr00t_inference): R5 - tighten hf_repo segment checks (fix CI …
cagataycali May 23, 2026
86d7142
review(gr00t_inference): R6 -- reject port=bool + hf_repo leading-dot…
cagataycali May 23, 2026
c7aaa6a
review(gr00t): R6 -- remove dual source of truth for loopback default
strands-agent May 23, 2026
fd9f21a
review(gr00t_inference): R7 -- ruff format the validation test file
cagataycali May 23, 2026
33288bd
review(gr00t_inference): R7 -- update _start_container test call site…
cagataycali May 23, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 114 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,120 @@
All notable behavioural changes to `strands-robots` are logged here. Follows
[Keep a Changelog](https://keepachangelog.com/) conventions.

## Unreleased - #196 (gr00t_inference validation hardening, supersedes #90)

### Added

- ``validate_inputs()`` centralises all parameter validation with action-aware
scoping: read-only actions (``find_containers``, ``list``, ``status``,
``stop``) only validate ``port``/``host``/``protocol``; mutating actions
(``start``, ``restart``, ``lifecycle``) validate the full parameter surface.
- ``host`` parameter now accepts both IP addresses and RFC-952 hostnames
(e.g. ``localhost``, ``host.docker.internal``). All-numeric strings that
fail ``ipaddress.ip_address()`` (e.g. ``127.0.01``, ``999.999.999.999``)
are rejected as obvious IP typos.
- ``protocol`` validation moved into ``validate_inputs()`` (previously
hand-rolled outside the helper, breaking the single-entry-point contract).
- ``pgrep`` patterns now match both ``--port N`` and ``--port=N`` forms,
preventing silent miss when the service is started with the ``=`` syntax.
- Integration tests that invoke ``gr00t_inference()`` end-to-end and assert
that invalid inputs are caught (pins the ``try/except ValueError`` wiring).
- End-to-end regression test for ``_stop_service`` cross-port-kill scenario:
verifies that a process on port 8000 is NOT killed when stopping port 80.
- Exception clauses narrowed throughout: ``_is_gr00t_process`` / ``_is_gr00t_host_process``
use ``(OSError, subprocess.SubprocessError, UnicodeDecodeError)``;
``_list_running_services``/``_is_service_running`` use ``OSError``;
``_stop_service`` uses ``(OSError, subprocess.SubprocessError)``;
``_start_service`` uses ``(OSError, RuntimeError)``.
Only ``_download_checkpoint`` retains ``except Exception`` (``# noqa: BLE001``)
because huggingface_hub raises varied, opaque exception types.
- ``action`` parameter validated against a complete allowlist of 10 valid
actions; unknown actions get a clear error with the valid set listed.
- ``image_name``, ``volumes``, and ``container_command`` parameters are now
validated (Docker image reference, path traversal, shell metacharacters).
- ``pgrep`` pattern factored into ``_PGREP_INFERENCE_PORT_FMT`` module-level
constant — single source of truth across all 4 usage sites.
- ``_PGREP_INFERENCE_PORT_FMT`` and ``_is_gr00t_*_process`` now match both
N1.5/N1.6 (``inference_service.py``) and N1.7 (``gr00t.eval.run_gr00t_server``)
entry-points. Closes the N1.7 stop/status identification gap.
- All exception types in process-probe helpers now log (WARNING for
PermissionError, DEBUG for other OSError/SubprocessError/UnicodeDecodeError).

### Changed

- ``validate_inputs()`` parameters are now all required (no defaults).
``gr00t_inference()`` is the single source of truth for default values;
the validator no longer duplicates them (prevents silent drift).
- ``_DOCKER_IMAGE_RE`` extended to support private-registry references with
port numbers (e.g. ``localhost:5000/myorg/img:tag``).
- ``_is_gr00t_process`` / ``_is_gr00t_host_process`` now require ``--port``
in the process cmdline to match — prevents false-killing editors or
log-tailers that happen to touch ``inference_service.py``.
- ``PermissionError`` in process probes now logs at WARNING level instead
of being silently swallowed.
- **BREAKING** Default ``host`` changed from ``0.0.0.0`` to ``127.0.0.1``
(loopback-only, per AGENTS.md > Review Learnings #86 > "Safety Defaults").
The ``host`` kwarg now exclusively controls the docker host-side bind via
``-p {host}:{port}:{port}`` — no silent rewrite. The inference server
inside the container is **always** invoked with ``--host 0.0.0.0``
(required by docker port forwarding; binding to container-loopback would
make the published port unreachable). User intent is honoured at the
docker layer:
- ``host="127.0.0.1"`` (default) → published port reachable on loopback only
- ``host="0.0.0.0"`` (explicit) → published port reachable on every host iface
**Migration:** if your downstream connects from a different host on the
same machine or across the network, pass ``host="0.0.0.0"`` explicitly
on the ``start`` / ``restart`` / ``start_container`` / ``lifecycle`` calls.
- Host-system fallback (``pgrep``) is documented as Linux-only. Non-Linux
platforms will see "No service running" rather than silently succeeding.

### Fixed

- ``hf_repo``, ``hf_subfolder``, ``hf_local_dir`` validation now runs for
``action='download_checkpoint'`` (R3 introduced these checks but placed
them after the ``_image_only_actions`` early-return, silently bypassing
the path-traversal guard for the action that actually consumes those
parameters; R4 hoists them above all action-specific gates).
- Inference server inside the container is now hardcoded to ``--host 0.0.0.0``
(R3 forwarded the user's ``host`` kwarg verbatim, so ``host="127.0.0.1"``
bound the service to container-loopback and the docker port-publish
forwarded to nothing — the headline loopback-default contract was
unreachable end-to-end).
- Duplicate ``torch_mock.manual_seed`` assignment in ``tests/mocks/torch_mock.py``.
- Option-injection guard: ``repo_url``, ``repo_tag``, ``policy_name`` starting
with ``-`` are rejected (prevents git/docker flag injection via subprocess argv).
- Host-system fallback (pgrep) now returns a clear error on non-Linux platforms
instead of silently reporting success.
- All-numeric hostname guard narrowed to multi-label patterns only — single-label
numerics (e.g. ``123``) are valid per RFC-1123.
- ``port`` validator rejects ``bool`` explicitly. ``isinstance(True, int) is True``
in Python (bool subclasses int) and ``1 <= True <= 65535`` evaluates ``True`` --
pre-fix, ``port=True`` passed validation and reached ``--port`` argv as the
string ``"True"`` (R6 review thread, ``gr00t_inference.py:223``). Pinned by
``TestPortBoolRejected``.
- ``hf_repo`` segment validation rejects any segment starting with ``.``
(catches ``.org/name``, ``org/.git``, ``...../name``, ``org/.name``). The
base regex ``[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+`` plus the existing segment
loop only rejected bare ``.`` / ``..`` segments and leading ``-`` -- the
validator now fails closed locally rather than relying on HuggingFace's
API to reject (R6 review thread, ``gr00t_inference.py:253``). Pinned by
``TestHfRepoLeadingDotSegments``.

### Notes

- Host validation is **broader** than before for hostnames (RFC-952 names like
``localhost`` and ``host.docker.internal`` now pass) but **stricter** for
IP-like typos (all-numeric labels like ``127.0.01`` are rejected).
- Validation scope covers ``port``, ``host``, ``protocol``, ``data_config``,
``embodiment_tag``, ``container_name``, TRT dtypes, ``checkpoint_path``,
``trt_engine_path``, ``image_name``, ``volumes``, and ``container_command``.
Parameters ``repo_url``, ``repo_tag``, and ``policy_name`` are
option-injection-guarded (reject leading ``-``). ``hf_repo``,
``hf_subfolder``, and ``hf_local_dir`` are path-validated (reject
traversal, shell metacharacters, and malformed repo IDs). The
``lifecycle`` phase is enum-checked.


## Unreleased - #178 (LiberoOffScreenRenderEngine retired)

### Removed: ``LiberoOffScreenRenderEngine`` simulation backend (BREAKING)
Expand Down
Loading
Loading