fix multi-layer false-positive AI diagnoses (lab-observed 2026-05-26) #2

Merged: ehsan6sha merged 1 commit into main from fix-false-positive-ai-diagnoses on May 26, 2026
@ehsan6sha
Member

fix multi-layer false-positive diagnoses (lab-observed 2026-05-26)

Five distinct bugs found while debugging a healthy lab device that the
AI plugin kept (loudly) reporting as broken. Each layer compounded the
next, producing a high-confidence "restart_fula" recommendation when
the device was actually fine.

  1. diag/internet false "discovery unreachable" (HTTP 403 misclassified)

     _helpers.py:https_head returned ok=False for HTTP 403, treating a
     "server responded — you can't HEAD this resource" answer as a
     network failure. discovery.fula.network's /relays only accepts
     POST; HEAD legitimately returns 403, yet the server is alive and
     the network path is fine. Lab ground truth: TLS handshake OK,
     ping 3ms, HTTP 403 — REACHABLE.

     Fix: new https_reachable() that returns ok=True on ANY HTTP
     response (2xx/3xx/4xx/5xx). Only network-level failures (DNS,
     TCP, TLS, timeout) count as unreachable. diag/internet now uses
     https_reachable for the discovery probe; https_head retained for
     google.com (the "internet itself works" canary). Regression test
     test_internet_discovery_403_is_reachable_not_captive guards
     against re-introducing the bug.
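
A minimal sketch of the reachability rule described above, not the actual _helpers.py code. The return shape (ok / http_status / error) and the connection_factory parameter are assumptions added here so the example can be exercised without network access. Any HTTP status, including 403, counts as reachable; only exceptions raised below the HTTP layer count as unreachable.

```python
import http.client


def https_reachable(host, path="/", timeout=5.0,
                    connection_factory=http.client.HTTPSConnection):
    """Return ok=True for ANY HTTP response, ok=False for network failures.

    connection_factory is a hypothetical injection point (not from the PR)
    used only to make the sketch testable offline.
    """
    conn = None
    try:
        conn = connection_factory(host, timeout=timeout)
        conn.request("HEAD", path)
        status = conn.getresponse().status
        # 2xx/3xx/4xx/5xx all prove that DNS, TCP and TLS succeeded.
        return {"ok": True, "http_status": status, "error": None}
    except (OSError, http.client.HTTPException) as exc:
        # OSError covers DNS errors, refused connections, TLS errors
        # (ssl.SSLError) and timeouts; HTTPException covers a peer that
        # closed the connection without sending a valid response.
        return {"ok": False, "http_status": None,
                "error": f"{type(exc).__name__}: {exc}"}
    finally:
        if conn is not None:
            conn.close()
```

Under this rule the lab case (TLS handshake OK, HTTP 403) yields ok=True with http_status=403, so the caller can still report the status code without classifying it as a network failure.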

  2. diag/summary's power threshold turned 1 yellow event into nuclear red

     summary.py had `red if ue > 0 else green`, so a single transient
     undervoltage event flipped power to red and dominated the AI's
     verdict.

     Fix: graduated thresholds: 0=green, 1-2=yellow (acknowledge
     but don't panic), 3+=red (real PSU issue).
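
The graduated thresholds can be sketched as a small pure function. The function name power_status and its parameter name are illustrative and not taken from summary.py; only the 0 / 1-2 / 3+ boundaries come from the description above.

```python
def power_status(undervoltage_events: int) -> str:
    """Map an undervoltage event count to a traffic-light status."""
    if undervoltage_events <= 0:
        return "green"   # no events observed
    if undervoltage_events <= 2:
        return "yellow"  # transient: acknowledge, don't escalate
    return "red"         # repeated events: likely a real PSU issue
```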

  3. Schema-invalid backend event killed the entire stream

     tool_call_loop.py returned on first validation failure, so when
     the 1.5B Qwen hallucinated a tool name (e.g. "diag/discovery"),
     the user's session bombed with [SCHEMA_VIOLATION] and no
     recommendations rendered.

     Fix: invalid tool_call now yields a recoverable error event +
     synthesizes a tool_result with ok=false + "unknown tool 'X'"
     message, so the model can self-correct on its next turn AND
     the UI keeps the session alive. Other invalid events yield a
     non-fatal error and the bridge continues. Regression tests:
     test_schema_invalid_tool_call_yields_synthetic_tool_result +
     updated test_schema_invalid_backend_event_emits_synthetic_error.
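
A rough sketch of the recover-instead-of-return behavior, under assumed names. The event dictionaries, their field names, the VALID_TYPES set, and the bridge_events generator are hypothetical and do not reflect the real tool_call_loop.py schema. What it illustrates is the control flow: an unknown tool produces a recoverable error plus a synthetic tool_result with ok=False, any other invalid event produces a non-fatal error, and the loop keeps going in both cases.

```python
KNOWN_TOOLS = {"diag/internet", "diag/summary"}          # assumed registry
VALID_TYPES = {"tool_call", "tool_result", "text", "recommendation"}  # assumed


def bridge_events(backend_events, known_tools=KNOWN_TOOLS):
    """Yield events to the UI, recovering from schema-invalid ones."""
    for event in backend_events:
        etype = event.get("type") if isinstance(event, dict) else None
        if etype == "tool_call" and event.get("name") not in known_tools:
            message = f"unknown tool '{event.get('name')}'"
            # Recoverable error so the UI keeps the session alive.
            yield {"type": "error", "recoverable": True, "message": message}
            # Synthetic result so the model can self-correct next turn.
            yield {"type": "tool_result", "name": event.get("name"),
                   "ok": False, "message": message}
            continue
        if etype not in VALID_TYPES:
            # Any other invalid event: report it, then keep streaming.
            yield {"type": "error", "recoverable": True,
                   "message": f"invalid event type {etype!r}"}
            continue
        yield event
```

The important difference from the old behavior is the use of continue where the loop previously returned, so one hallucinated tool name no longer ends the stream.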

  4. System prompt let the model pre-confabulate + invent action names

     rkllm_runtime.py SYSTEM_PROMPT_TEMPLATE strengthened with five
     new hard rules:
       - Rule 5: ANY action MUST be in a <recommendation> XML block,
         never markdown prose (prose has no Approve button).
       - Rule 6: Read tool_response field by field — quote the actual
         field name. Don't confuse internet.latency_ms_avg with a
         clock offset (lab observed).
       - Rule 8: If user reports a symptom but diagnostics CONTRADICT
         it (e.g. user says disconnected but heartbeat.status=green
         http_status=200), ASK via <user_question> before acting.
       - Rule 9: NEVER emit a tier-2/3 destructive action at confidence
         > 0.7 when severity != "red".
       - Rule 10: relay.reservation_count=0 + wireguard.active=false
         are NOT problems on their own (normal for LAN-only devices).

     Plus BAD/GOOD examples showing what NOT to do.

  5. Server-side guardrails for when the model ignores rules 8/9 anyway

     1.5B Qwen still pattern-matches "user said disconnected + relay
     yellow → restart_fula at 95%" even with strengthened prompt.
     Belt-and-suspenders defense:

     apply_recommendation_guardrails() runs after recommendation
     parsing, BEFORE emission:
       - DROPS restart_fula entirely when heartbeat.status=green
         http_status=200 AND user prompt mentions disconnect /
         unreachable / can't see / offline. Restarting fula when the
         device IS heartbeating would create the very disconnect the
         user complained about (self-fulfilling bug).
       - CAPS confidence to 0.6 on restart-class actions
         (restart_fula, reset, wireguard.bounce, docker.restart,
         systemctl.restart) when verdict.severity is yellow/green.
         These actions should not be high-confidence on non-red
         severity.

     Six regression-guard tests in test_rkllm_runtime.py cover the
     exact lab scenario + adjacent cases (red severity passes,
     non-restart-class actions unaffected, non-connectivity prompts
     don't trigger the drop, etc.).
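
The two guardrails can be sketched roughly as follows. The function signature, the shape of the recommendation and diagnostics dictionaries, and the substring-based prompt matching are all assumptions for illustration. Only the function name, the action names, the 0.6 cap, and the drop condition are taken from the description above.

```python
RESTART_CLASS = {"restart_fula", "reset", "wireguard.bounce",
                 "docker.restart", "systemctl.restart"}
CONNECTIVITY_HINTS = ("disconnect", "unreachable", "can't see", "offline")
CONFIDENCE_CAP = 0.6


def apply_recommendation_guardrails(recommendations, diagnostics,
                                    user_prompt, severity):
    """Filter and adjust parsed recommendations before emission."""
    heartbeat = diagnostics.get("heartbeat", {})
    heartbeat_ok = (heartbeat.get("status") == "green"
                    and heartbeat.get("http_status") == 200)
    prompt = user_prompt.lower()
    mentions_disconnect = any(hint in prompt for hint in CONNECTIVITY_HINTS)

    kept = []
    for rec in recommendations:
        action = rec.get("action")
        # Drop: restarting a heartbeating device would cause the very
        # disconnect the user is complaining about.
        if action == "restart_fula" and heartbeat_ok and mentions_disconnect:
            continue
        # Cap: restart-class actions stay low-confidence unless severity is red.
        if action in RESTART_CLASS and severity in ("yellow", "green"):
            capped = min(rec.get("confidence", 0.0), CONFIDENCE_CAP)
            rec = {**rec, "confidence": capped}  # copy, don't mutate input
        kept.append(rec)
    return kept
```

Running the checks after parsing and before emission means they hold regardless of what the model generated, which is the point of pairing them with the prompt rules rather than relying on the prompt alone.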

Tests: 230/230 pass.

End-to-end verified on lab pi@192.168.2.159 via hot-patch (before
container recreation wiped the patches): diag/internet correctly
reported discovery reachable, diag/summary returned power=green, AI
emitted proper verdict + recommended_action with HMAC token, and
the model began ASKING about WiFi configuration instead of jumping
to restart_fula.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ehsan6sha ehsan6sha merged commit 8d3a31f into main May 26, 2026
2 checks passed
@ehsan6sha ehsan6sha deleted the fix-false-positive-ai-diagnoses branch May 26, 2026 23:03