Skip to content

fix(engine): close unpinned rule-cache torn-read + review follow-ups to #49#50

Open
marcinpsk wants to merge 3 commits into
mainfrom
fix/rule-cache-review-followups
Open

fix(engine): close unpinned rule-cache torn-read + review follow-ups to #49#50
marcinpsk wants to merge 3 commits into
mainfrom
fix/rule-cache-review-followups

Conversation

@marcinpsk

@marcinpsk marcinpsk commented Jun 26, 2026

Copy link
Copy Markdown
Owner

Follow-ups from code review of the (already-merged) #49 rule-cache work.

Production fixes

  • Unpinned torn-read closed (engine.py) — find_matching_rule's unpinned path returned _RULE_CACHE["exact"]/["regex"]/["memo"] as three separate dict reads while a reload mutated those keys in place. Under threaded workers a concurrent reload (V1→V2) could pair V1's rules with V2's memo and memoize a stale result into the live memo (persisting until the next rule edit). _get_enabled_rules() now reads the global once into a local and a reload publishes a fresh dict via one atomic rebind, so a reader sees the whole old set or the whole new one — never a torn mix. The pin-prime path captures from the same consistent local.
  • min_version 4.2.0 → 4.3.0 (__init__.py) — perf(engine): load the enabled rule set once and memoize find_matching_rule #49 raised the documented floor to NetBox ≥ 4.3.0 in README/docs but left the enforced gate at 4.2.0; a 4.2.x install would load despite the docs. Gate now matches the docs.

Test hardening

  • test_reload_publishes_a_fresh_cache_dict_atomically — asserts the no-in-place-mutation invariant behind the torn-read fix (red-checked: fails against the old in-place code).
  • test_pinned_block_holds_snapshot_across_concurrent_cache_reload rewritten to spawn a real second thread, proving _pin is genuinely thread-local — a plain global now fails it (the old same-thread version could not catch that). Adds try/finally cache restoration.
  • test_memo_is_bounded now checks the cap after every insert, catching a mid-loop leak toward 2×cap that the old end-of-loop snapshot missed.
  • New EnabledRuleFingerprintColumnsTest — fails if any concrete model column is neither in _VERSION_COLUMNS nor an explicitly-justified exclusion, so a future match-affecting field can't silently break fingerprint-based cache invalidation.

Verification

Each fix red-checked (fails against un-fixed code, passes against the fix). Full suite 333 OK, coverage 97% (gate met), ruff clean.

Summary by CodeRabbit

  • Bug Fixes

    • Improved rule-cache refresh handling so rule matching stays consistent during updates, even under concurrent access.
    • Ensured pinned rule checks use a stable snapshot while processing, preventing mixed results during reloads.
  • Chores

    • Raised the minimum supported NetBox version to 4.3.0.
  • Tests

    • Expanded coverage for cache eviction, reload behavior, concurrency safety, and version compatibility.

…pinned torn-read

The unpinned path returned _RULE_CACHE["exact"], ["regex"], ["memo"] as three
separate dict reads while a reload mutated those keys in place, so a concurrent
reload on another worker thread could pair one rule-set version's rules with a
different version's memo and then memoize a stale-version result into the live
memo (persisting until the next rule edit).

_get_enabled_rules() now reads the module global once into a local, and a reload
publishes the new set by rebinding _RULE_CACHE to a fresh dict in a single atomic
assignment. A reader sees the whole old set or the whole new one — never exact
from V1 paired with memo from V2. The pin-prime path captures from the same
consistent local, so it can no longer snapshot a torn pair either.
PR #49 raised the stated compatibility floor to NetBox >= 4.3.0 in README.md and
docs/installation.md but left the enforced PluginConfig.min_version at 4.2.0, so a
4.2.x install would still load despite the docs declaring it unsupported. Align
the gate with the docs and pin them together with a test.
…ustiveness

- Add test_reload_publishes_a_fresh_cache_dict_atomically: asserts a reload swaps
  in a new cache dict instead of mutating the old one in place — the invariant
  behind the unpinned torn-read fix.
- Rewrite test_pinned_block_holds_snapshot_across_concurrent_cache_reload to use a
  real second thread, proving _pin is thread-local (a plain global now fails it),
  with try/finally cache restoration so the simulated reload can't leak.
- test_memo_is_bounded now checks the cap after every insert, catching a mid-loop
  leak toward 2*cap that the old end-of-loop snapshot missed.
- Add EnabledRuleFingerprintColumnsTest: fails if any concrete model column is
  neither fingerprinted nor explicitly excluded, so a future match-affecting field
  can't silently break fingerprint-based cache invalidation.
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

Walkthrough

The plugin raises its minimum supported NetBox version to 4.3.0 and changes enabled-rule cache reloads to publish a fresh shared dict atomically. Tests now cover the version gate, memo bounds, concurrent snapshot stability, and fingerprint column coverage.

Changes

Rule cache version floor and atomic reload

Layer / File(s) Summary
Version floor update
netbox_interface_name_rules/__init__.py, netbox_interface_name_rules/tests/test_misc.py
InterfaceNameRulesConfig.min_version is set to 4.3.0, and a test asserts that exact minimum version.
Atomic enabled-rule cache rebinding
netbox_interface_name_rules/engine.py
_get_enabled_rules() now rebinding _RULE_CACHE to a fresh dict when the fingerprint version changes, and pinned cache reads snapshot exact, regex, and memo from one cache dict.
Cache behavior tests
netbox_interface_name_rules/tests/test_rules.py
Tests now check per-insert memo bounds, concurrent reload snapshot stability, and that every concrete rule field is either fingerprinted or explicitly excluded.

Sequence Diagram(s)

sequenceDiagram
  participant pinned_rule_cache
  participant _get_enabled_rules
  participant _RULE_CACHE
  participant _enabled_rules_version
  pinned_rule_cache->>_get_enabled_rules: resolve exact, regex, and memo snapshot
  _get_enabled_rules->>_RULE_CACHE: read current cache dict
  _get_enabled_rules->>_enabled_rules_version: compare fingerprint version
  _get_enabled_rules->>_RULE_CACHE: publish fresh dict when version changes
  _get_enabled_rules-->>pinned_rule_cache: return exact, regex, and memo
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

I hopped through the cache, neat and bright,
one fresh dict at a time in the night.
At 4.3.0 my burrow sings,
with pinned little rules and tidy springs.
🐇✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly reflects the main engine cache torn-read fix and follow-up work from #49.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/rule-cache-review-followups

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@sonarqubecloud

Copy link
Copy Markdown

@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@marcinpsk

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcinpsk

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@netbox_interface_name_rules/engine.py`:
- Around line 234-242: The pinned branch in the cache-resolving logic is still
sharing the mutable memo dict, so a concurrent clear can race with memo lookups
and trigger KeyError. Update the pinning path in the function that sets
_pin.exact/_pin.regex/_pin.memo so pinned requests use an isolated copy of
cache["memo"] (or otherwise synchronized/sentinel-based access) instead of
aliasing the shared global memo.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20577c4c-99cd-45e9-9f71-3c76291f7c01

📥 Commits

Reviewing files that changed from the base of the PR and between 9e3aecb and 914ddba.

📒 Files selected for processing (4)
  • netbox_interface_name_rules/__init__.py
  • netbox_interface_name_rules/engine.py
  • netbox_interface_name_rules/tests/test_misc.py
  • netbox_interface_name_rules/tests/test_rules.py

Comment on lines 234 to +242
if pinned:
# Pin this thread to the freshly-resolved set for the rest of the block. The tuples are
# immutable and the memo dict is now this thread's own reference, so a later global reload
# leaves our snapshot — and the entries we memoize into it — untouched.
_pin.exact = _RULE_CACHE["exact"]
_pin.regex = _RULE_CACHE["regex"]
_pin.memo = _RULE_CACHE["memo"]
_pin.exact = cache["exact"]
_pin.regex = cache["regex"]
_pin.memo = cache["memo"]
_pin.primed = True
return _pin.exact, _pin.regex, _pin.memo
return _RULE_CACHE["exact"], _RULE_CACHE["regex"], _RULE_CACHE["memo"]
return cache["exact"], cache["regex"], cache["memo"]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Don’t pin the shared mutable memo dict.

_pin.memo = cache["memo"] still aliases the global cache memo; another thread can clear that dict at the memo cap while this pinned request is between sig in memo and memo[sig], causing a sporadic KeyError. Copy the memo for pinned blocks, or make memo access synchronized/sentinel-based.

Suggested localized fix
     if pinned:
         # Pin this thread to the freshly-resolved set for the rest of the block. The tuples are
-        # immutable and the memo dict is now this thread's own reference, so a later global reload
+        # immutable and the memo dict is copied for this thread, so a later global reload
         # leaves our snapshot — and the entries we memoize into it — untouched.
         _pin.exact = cache["exact"]
         _pin.regex = cache["regex"]
-        _pin.memo = cache["memo"]
+        _pin.memo = dict(cache["memo"])
         _pin.primed = True
-    return cache["exact"], cache["regex"], cache["memo"]
+        return _pin.exact, _pin.regex, _pin.memo
+    return cache["exact"], cache["regex"], cache["memo"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if pinned:
# Pin this thread to the freshly-resolved set for the rest of the block. The tuples are
# immutable and the memo dict is now this thread's own reference, so a later global reload
# leaves our snapshot — and the entries we memoize into it — untouched.
_pin.exact = _RULE_CACHE["exact"]
_pin.regex = _RULE_CACHE["regex"]
_pin.memo = _RULE_CACHE["memo"]
_pin.exact = cache["exact"]
_pin.regex = cache["regex"]
_pin.memo = cache["memo"]
_pin.primed = True
return _pin.exact, _pin.regex, _pin.memo
return _RULE_CACHE["exact"], _RULE_CACHE["regex"], _RULE_CACHE["memo"]
return cache["exact"], cache["regex"], cache["memo"]
if pinned:
# Pin this thread to the freshly-resolved set for the rest of the block. The tuples are
# immutable and the memo dict is copied for this thread, so a later global reload
# leaves our snapshot — and the entries we memoize into it — untouched.
_pin.exact = cache["exact"]
_pin.regex = cache["regex"]
_pin.memo = dict(cache["memo"])
_pin.primed = True
return _pin.exact, _pin.regex, _pin.memo
return cache["exact"], cache["regex"], cache["memo"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@netbox_interface_name_rules/engine.py` around lines 234 - 242, The pinned
branch in the cache-resolving logic is still sharing the mutable memo dict, so a
concurrent clear can race with memo lookups and trigger KeyError. Update the
pinning path in the function that sets _pin.exact/_pin.regex/_pin.memo so pinned
requests use an isolated copy of cache["memo"] (or otherwise
synchronized/sentinel-based access) instead of aliasing the shared global memo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant