Sanitize peer-supplied mDNS labels in discover CLI#1689
Conversation
mDNS service properties (name, mac, version, platform, board) come from anything on the LAN that wants to broadcast _esphomelib._tcp.local. The discover CLI was passing those bytes through `data.decode()` (raising on non-UTF-8 input) and printing them straight to stdout with no control-character filter. A hostile broadcaster could clear or reflow the operator's terminal by stuffing ANSI escapes / newlines into the mac or version field. Route the values through `safe_label_str` (the same sanitizer used on the noise hello + protobuf hello paths) and use a "replace" error handler on the UTF-8 decode so an invalid byte sequence no longer breaks the zeroconf callback. Per-column length caps match the discover output FORMAT widths so a long value can't break the table alignment either. To keep the sanitizer importable from both the discover CLI and the noise frame helper without dragging the framing module's transitive deps, the caps and `safe_label_str` move to a dedicated `aioesphomeapi/_sanitize.py` (with `_sanitize.pxd` for cdef cimports on the noise hot path). `_frame_helper/base.py` keeps the old import points working via re-export so existing callers stay unchanged. Tests added for the discover decode path (None, str, bytes, invalid UTF-8, control-char stripping, length cap, non-ASCII printable survival). Same defensive surface as PRs #1664 / #1660 / #1656, applied to the mDNS channel.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1689 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 25 26 +1
Lines 4159 4165 +6
=========================================
+ Hits 4159 4165 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
PR Review — Sanitize peer-supplied mDNS labels in discover CLISolid, focused hardening of the discover CLI. The same 🟢 Suggestions1. Function name no longer reflects behaviour (`aioesphomeapi/discover.py`, L16-22)
2. Per-column caps drift silently if FORMAT changes (`aioesphomeapi/discover.py`, L17-22)
Low-cost defensive options:
Not blocking — the change as-is is strictly better than the status quo. 3. Invalid-UTF-8 test only checks the type, not the contract (`tests/test_discover.py`, L19-23)
Since Checklist
SummarySolid, focused hardening of the discover CLI. The same Automated review by Kōanabf3c2a |
There was a problem hiding this comment.
Pull request overview
This PR hardens aioesphomeapi-discover against hostile peer-supplied mDNS fields by ensuring all displayed labels are UTF-8 decoded safely, stripped of control characters, and length-capped to prevent terminal/log injection and table misalignment. It also extracts the shared sanitizer into a dedicated module so both the discover CLI and Cythonized noise path can reuse it.
Changes:
- Add
aioesphomeapi/_sanitize.py(+.pxd) withsafe_label_strand MAX_* caps; re-export from_frame_helper/base.pyfor backwards compatibility. - Sanitize all peer-controlled discover CLI fields (service name + mDNS properties) with UTF-8
errors="replace"and per-field length caps. - Add unit tests for
decode_bytes_or_unknowncovering invalid UTF-8, control stripping, and length capping.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
aioesphomeapi/discover.py |
Decode bytes with errors="replace" and sanitize/cap peer-controlled labels before printing. |
aioesphomeapi/_sanitize.py |
Introduce shared sanitizer + MAX_* constants in a lightweight module. |
aioesphomeapi/_sanitize.pxd |
Provide a Cython cpdef signature for safe_label_str for hot paths. |
aioesphomeapi/_frame_helper/base.py |
Import and re-export sanitizer/constants; keep _MAX_* cdef aliases. |
aioesphomeapi/_frame_helper/base.pxd |
Switch Cython import of safe_label_str to .._sanitize. |
aioesphomeapi/_frame_helper/noise.py |
Update imports to use the new sanitizer module. |
aioesphomeapi/_frame_helper/noise.pxd |
Switch Cython safe_label_str import to .._sanitize. |
setup.py |
Add _sanitize.py to the Cythonized extension list. |
tests/test_discover.py |
Add coverage for sanitizing/decoding behavior used by discover. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Derive per-column display caps from FORMAT widths via regex so the caps stay in lock-step with the table layout; an assertion at module load pins COLUMN_NAMES against the derived widths so a future FORMAT edit surfaces immediately. This also corrects the MAC cap that was set to MAX_MAC_LEN (16) while the FORMAT slot was only 12 chars wide, which would have let a 13-16 char MAC widen the column past its slot. - Rename `decode_bytes_or_unknown` to `decode_mdns_label_or_unknown` since the helper now decodes, strips non-printables and length-caps; the new name reflects all three behaviors. - Pin the invalid-UTF-8 contract by asserting the actual U+FFFD output instead of just `isinstance(result, str)`, so a future refactor that silently swaps the "replace" handler for UNKNOWN-or-empty trips a red test. - Add a `test_per_column_caps_match_format_widths` regression test that re-derives the widths from FORMAT and checks every cap, giving a second mechanical pin alongside the module-load assertion. - Add a module docstring and `__all__` to `_sanitize.py` so importers see the intent of the module and `from _sanitize import *` is bounded.
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
WalkthroughThe PR centralizes mDNS label input sanitization by creating a shared ChangesInput Sanitization and Discovery Label Protection
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
What does this implement/fix?
mDNS service properties (name, mac, version, platform, board) come from anything on the LAN that wants to broadcast
_esphomelib._tcp.local.; the discover CLI was decoding them with baredata.decode()(which raises on non-UTF-8) and printing them straight to stdout with no control-character filter. A hostile broadcaster could clear or reflow the operator's terminal by stuffing ANSI escapes or newlines into themac/version/platform/boardfield, or break the column-aligned table with an oversized value.Route every peer-supplied label through
safe_label_str(the same sanitizer the noise hello and protobuf hello paths already use), use a"replace"UTF-8 error handler so a malformed payload no longer raises out of the zeroconf callback, and apply per-column length caps that match the discover outputFORMATwidths so a long value can't break alignment either.To keep the sanitizer importable from both the discover CLI and the noise frame helper without dragging the framing module's transitive deps in, the caps and
safe_label_strmove to a dedicatedaioesphomeapi/_sanitize.py(with_sanitize.pxdforcimporton the noise hot path)._frame_helper/base.pyre-exports the names so existingfrom .base import MAX_NAME_LENcallers keep working unchanged.Same defensive surface as #1664, #1660, #1656; mDNS is just a different peer-controlled channel feeding the same kind of output.
Testing:
tests/test_discover.pycovers theNone/str/bytespaths, invalid UTF-8 (must not raise), ANSI / newline / null / tab stripping, length capping, and non-ASCII printable survival. Full suite green locally (1092 passed, 2 skipped); Cython build withREQUIRE_CYTHON=1rebuilds cleanly.This finishes the work originally opened as #1669; the CodSpeed benchmarks for the sanitize hot path were lifted to #1687 and the unrelated
# type: ignorelint fix to #1688 so the change set here stays focused on the discover surface.Types of changes
Related issue or feature (if applicable):
Pull request in esphome (if applicable):
Checklist:
tests/folder).