Skip to content

review: structured code review of PR #596 (feat: add PRT3 ASCII serial connection type)#597

Draft
Copilot wants to merge 1 commit into
devfrom
copilot/prepare-high-level-review
Draft

review: structured code review of PR #596 (feat: add PRT3 ASCII serial connection type)#597
Copilot wants to merge 1 commit into
devfrom
copilot/prepare-high-level-review

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 27, 2026

Purpose

This PR adds a formal structured code review of PR #596 (feat: add PRT3 ASCII serial connection type) to the repository as docs/pr596-prt3-review.md.

The review was produced by a multi-agent analysis (advocate, sceptic, architect) examining the full diff (5,595 additions, 33 files).


Findings Summary for PR #596

🔴 Blocking Bugs (must fix before merge)

Bug #1async_message_manager.py: Assertion removal weakens all callers
assert isinstance(data, Container) in EventMessageHandler/ErrorMessageHandler was replaced with return False. This removes the contract enforcer for the entire binary-protocol pipeline — mis-routed objects now silently drop instead of crashing loudly. PRT3 dataclasses should travel through a dedicated handler, not through binary-protocol handlers with a softened guard.

Bug #2mqtt/core.py: Late-registrar on_connect passes None args to all MQTT handlers

args = self._last_connect_args or (self.client, None, None, None, None)
cls.on_connect(*args)

When _last_connect_args is None, every late-registering interface gets reason_code=None. Any handler calling reason_code.is_failure will throw AttributeError. Only replay on_connect when _last_connect_args is not None.

Bug #3panel.py: PRT3BufferFull (!) has no fast-fail in _prt3_send_wait
A panel buffer overflow emits ! (PRT3BufferFull). The _prt3_send_wait predicate only matches PRT3CommandEcho, so the overflow is ignored and the caller waits the full IO_TIMEOUT. During load_labels with default config (96 zones + 32 users + 8 areas = 136 polls), a single overflow causes a multi-minute startup freeze before failing. _prt3_send_wait needs a secondary predicate that returns PRT3BufferFull immediately.

Bug #4panel.py + paradox.py: Invalid PRT3_USER_CODE causes unhandled ValueError at arm time
encoder._validate_code() raises ValueError for non-numeric or >6-digit codes. Neither _build_partition_cmd() nor paradox.py's control_partition() catches it. Setting PRT3_USER_CODE = "abc" and issuing a disarm command crashes the async task. Validate PRT3_USER_CODE format at config load time.


🟡 Architecture Issues

Arch #1PRT3Panel(Panel) violates Liskov Substitution Principle
Panel carries binary-protocol surface (_eeprom_*, binary parse_message/get_message) that is dead weight in PRT3. control_zones() and control_outputs() raise NotImplementedError instead of returning bool as the interface contract requires. Suggested fix: extract a BinaryPanel intermediate class.

BasePanel (protocol-neutral)
├── BinaryPanel(BasePanel) — EEPROM, binary framing
│   ├── EvoPanel(BinaryPanel)
│   └── SpectraPanel(BinaryPanel)
└── PRT3Panel(BasePanel) — ASCII only

Arch #2 — 9 CONNECTION_TYPE == "PRT3" string guards erode Panel polymorphism
The existing codebase has zero == "Serial" / == "IP" guards in the orchestrator. Five of the nine new guards are candidates for Panel interface methods (supports_time_sync, send_close_connection, register_protocol_handlers, get_utility_key_configs, send_utility_key) — reducing paradox.py to the 3 genuinely structural connection-selection branches.

Arch #3runtime.py placeholder should be deleted
Empty file with "reserved for future" comment — delete until it has content.

Arch #4mqtt/core.py late-registrar fix should be a separate commit
Legitimate general bug fix, unrelated to PRT3, cannot be cherry-picked as bundled.


🟢 Minor Issues

# Issue Fix
M1 PRT3_SERIAL_BAUD range (2400, 115200) accepts hardware-invalid values Use [9600, 19200]
M2 G065 per-N overrides (exit_delay, entry_delay) bypass EVENT_MAP Document with cross-reference
M3 control_zones() / control_outputs() raise instead of returning False Return False + logger.warning
M4 Bare except Exception in handle_prt3_event_message hides storage inconsistency Break out expected exceptions
M5 User code safety claim ("never written to logs") narrower than stated Note raw bytes pass to connection.write()
M6 No tests for reconnect, buffer-full, or bad-config-at-arm paths Add 3 targeted tests

✅ Strengths

  • Clean parser.py/encoder.py/adapter.py separation
  • Strong test coverage on core parsing/encoding layer
  • Zero regression risk to existing Serial/IP connection types
  • Correct security note on PRT3_USER_CODE in config
  • handlers.py AttributeError guard is elegant
  • The feature itself solves an active, real user problem

Full review: docs/pr596-prt3-review.md

@sonarqubecloud
Copy link
Copy Markdown

@NaanyaBiz
Copy link
Copy Markdown

Thanks for the structured review — the multi-agent framing surfaced several things worth fixing. Before acting on the findings I traced each one against the code so we'd target the right things. Below is what I plan to act on, where I'd push back, and what I'd defer.

Findings I'll act on

# Finding Action
Bug #3 PRT3BufferFull not in _prt3_send_wait predicates Add a PRT3BufferFull predicate and fast-fail/retry on !
Bug #4 Invalid PRT3_USER_CODE raises ValueError at arm time Validate format (^\d{1,6}$) at config load, not first use
Arch #3 Empty runtime.py placeholder Delete
Minor #1 PRT3_SERIAL_BAUD accepts non-hardware values Set validator [9600, 19200] (matches CONNECTION_TYPE pattern)
Minor #2 G065 per-N overrides bypass EVENT_MAP Surface them as a "number_overrides" nested dict in EVENT_MAP
Minor #4 Bare except Exception in handle_prt3_event_message Catch expected (KeyError/AttributeError) explicitly with warning
Minor #5 User code visible if byte-level serial trace is enabled Doc note + mask in retry warning path
Minor #6 Missing tests: reconnect, buffer-full, bad config at arm time Add coverage

Findings whose impact I read differently

Bug #1assertreturn False in async_message_manager.py

For binary users only Container objects ever flow into these handlers, so neither the assert nor the guard ever fires — there's no behaviour change for them. For PRT3 users, EventMessageHandler and ErrorMessageHandler are registered alongside the PRT3 handler in paradox.py _register_connection_handlers. Without the guard, every PRT3 dataclass would trigger AssertionError on the first line of can_handle, making PRT3 mode unusable.

The architectural critique is solid though, and I'd happily take it further: register EventMessageHandler/ErrorMessageHandler only for binary connection types and a dedicated PRT3 handler only for PRT3, so the guard isn't needed at all. That fits naturally into the Arch #2 polymorphism cleanup if/when we tackle it.

Bug #2on_connect fallback args

The synthetic-None fallback exists to cover a microsecond race between self.state = CONNECTED (core.py:222) and self._last_connect_args = (...) (line 223) — the callback runs on the MQTT thread, so a registrar reading self.connected between those two assignments could see True with _last_connect_args still None. Both current on_connect overrides (basic.py:120, homeassistant.py:59) ignore reason_code, so the AttributeError path has no live target today.

That said, the suggested fix (skip replay when _last_connect_args is None) is strictly better — defensive against any future handler that does inspect reason_code. I'll change it. Fair point on the separate commit too; we can't unscramble it now but I'll keep that discipline going forward.

Bug #3 — severity scoping

Real defect, agreed on the fix. The n × IO_TIMEOUT freeze claim doesn't hold up against the code: _prt3_send_wait runs under core.request_lock and each call carries its own IO_TIMEOUT. A single buffer overflow costs one timeout (plus its retry) on the affected command, not a multi-minute cascade across load_labels. Worth fixing for clean retry semantics, not because startup will hang.

Bug #4 — severity scoping

Real defect, agreed on the fix. The "crashing the run loop" framing isn't quite right: the MQTT path is wrapped by mqtt_handle_decorator (basic.py:31-35) which catches Exception and logs; the text path runs under asyncio task supervision. The user-visible symptom is silent failure of every arm/disarm with stack traces in the log — bad UX, not a process crash. Validating at config load is still the right answer.

Minor #3control_zones / control_outputs raise instead of returning False

The orchestrator catches it: paradox.py:497-516 (control_zone) and paradox.py:630-647 (control_output) both except NotImplementedError and return False. The MQTT handler never sees the exception, so the Panel interface contract is honoured at the call site even though the panel-level method raises. I'd leave this as-is unless I'm misreading the path.

Deferring

Arch #1 (LSP) and Arch #2 (string guards) — agreed in principle, both worth doing. The concrete behaviour issues they cite are mitigated by Minor #3 above, so I'd fold the cleanup into a follow-up rather than the v1 merge: extract BinaryPanel, replace the 5 candidate guards with Panel methods, and (per Bug #1 above) wire handler registration through Panel.register_protocol_handlers(connection) so the Container guard becomes unnecessary.

Heads-up

While integration-testing on a live EVO192 panel over the last day, three orthogonal arm/disarm state bugs surfaced and are fixed in 3fe0b14e on the branch:

  • G014 with area=0 (global keypad disarm) was silently dropped because get_container_object("partition", 0) returns None. Now broadcast to all partitions.
  • G013 was mapped as "auto-armed" (arm=True); per PRT3 spec page 18 it's "Disarm with Master". When a master code disarmed, _update_partition_states fell through to the armed_away else branch.
  • HA showed a transient armed_stay state for 4-7 s after disarm-during-exit-delay because RA polls return stale armed_stay for ~hundreds of ms after AD&OK while the panel internally settles. Now applies disarmed state optimistically on &OK echo and freezes arm-key updates from RA for 3 s.

Architecture notes on the three-source state model (RA polling, G-events, command echo) added to docs/prt3-architecture.md. None of it overlaps with this review's findings, but flagging in case it changes how you'd want to sequence the merge.

NaanyaBiz pushed a commit to NaanyaBiz/pai that referenced this pull request Apr 29, 2026
- Bug ParadoxAlarmInterface#2: mqtt/core.py skip on_connect replay when _last_connect_args is None
- Bug ParadoxAlarmInterface#3: panel.py _prt3_send_wait fast-fails on PRT3BufferFull instead of full timeout
- Bug ParadoxAlarmInterface#4: PRT3_USER_CODE validated at config load; _build_partition_cmd catches ValueError
- Minor ParadoxAlarmInterface#1: PRT3_SERIAL_BAUD set validator [9600, 19200]; config validator extended for int lists
- Minor ParadoxAlarmInterface#2: G065 per-N overrides in EVENT_MAP number_overrides; from_prt3 uses generic lookup
- Minor ParadoxAlarmInterface#4: handle_prt3_event_message catches expected exceptions explicitly
- Minor ParadoxAlarmInterface#5: slice comment in retry warning; expanded PRT3_USER_CODE security doc
- Minor ParadoxAlarmInterface#6: tests for buffer-full retry, malformed code, PRT3_USER_CODE/baud validation
1369 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@NaanyaBiz
Copy link
Copy Markdown

Fixed have been pushed to #596 per comments above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants