
fix(proxy): meaningful audit data — strip suffixes + cache title-fast-path matches #15

Merged
unknown0152 merged 1 commit into master from fix/audit-release-name-and-title-cache
May 24, 2026

@unknown0152
Owner

Summary

Fixes two compounding bugs that made every classifier_audit row report mismatch_type=missed_dkaudio regardless of what the proxy actually knew.

The two bugs

1. /learn/imported lookup never matched the cache:
Radarr's radarr_moviefile_scenename env var contains the title the proxy returned, including the appended .DKaudio / .NFODV etc. tags. nfo_cache.release_name stores the un-suffixed canonical title, so WHERE release_name=? could not find existing classifications.

2. Title-fast-path tags never wrote to cache:
In hunt_danish, the fast paths (AUDIO_DK_RE, is_native_dk_title, ATTR matches) set found_hits[nid] and then continue without calling cache_set. Even when the proxy correctly classified a release via title alone, there was no cache row for /learn/imported to find.
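
The control flow behind the second bug can be sketched roughly as follows. This is a minimal illustration, not the code in dksubs-proxy.py: the AUDIO_DK_RE pattern, the in-memory cache dict, and the (nid, title) item shape are all assumptions made for the example. Only the names hunt_danish, found_hits, cache_set, and the source="title" argument come from the PR text.

```python
import re

# Hypothetical stand-in for AUDIO_DK_RE; the real pattern is not shown in this PR.
AUDIO_DK_RE = re.compile(r"\b(danish|dansk)\b", re.IGNORECASE)

# Hypothetical in-memory cache: nid -> (tag, title, source).
cache = {}


def cache_set(nid, tag, title, source):
    """Record a verdict so later lookups can find it (simplified stand-in)."""
    cache[nid] = (tag, title, source)


def hunt_danish(items):
    """Classify (nid, title) pairs via the title fast path; return {nid: tag}."""
    found_hits = {}
    for nid, title in items:
        if AUDIO_DK_RE.search(title):
            found_hits[nid] = ".DKaudio"
            # Pre-fix, the loop hit `continue` here without persisting anything.
            # The fix writes the verdict to the cache first.
            cache_set(nid, ".DKaudio", title, source="title")
            continue
        # Slower NFO / description checks would run here for non-matching titles.
    return found_hits


hits = hunt_danish([
    ("n1", "Movie.2026.DANiSH.1080p-GRP"),
    ("n2", "Movie.2026.1080p-GRP"),
])
```

With the cache_set call removed, hits would be unchanged but cache would stay empty, which is the state /learn/imported was seeing before this PR.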

Fix

  • New strip_proxy_suffix(release_name) helper (regex strips trailing .DKaudio|.DKOK + zero or more .NFOxxx).
  • _handle_learn_imported strips the lookup name before WHERE release_name=? and before cache_set. The audit row keeps the original (suffixed) sceneName for debugging visibility.
  • Every fast-path branch in hunt_danish now calls cache_set(nid, tag, title, source=...) with the spec's source precedence:
    • title regex / native-DK → source="title"
    • attr-based detection → source="attr"
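
A helper along the lines of the first bullet could look like the sketch below. The exact regex in dksubs-proxy.py is not shown in this PR, so the pattern here is an assumption based only on the description (a trailing .DKaudio or .DKOK followed by zero or more .NFOxxx tags). It also assumes the tags are case-sensitive and that the "xxx" part is alphanumeric.

```python
import re

# Assumed pattern: one trailing .DKaudio/.DKOK tag, then any number of .NFOxxx tags,
# anchored at the end of the string so mid-name occurrences are left alone.
_PROXY_SUFFIX_RE = re.compile(r"\.(?:DKaudio|DKOK)(?:\.NFO[A-Za-z0-9]+)*$")


def strip_proxy_suffix(release_name: str) -> str:
    """Return release_name without the proxy-appended trailing tags."""
    return _PROXY_SUFFIX_RE.sub("", release_name)


stripped = strip_proxy_suffix("TestAudit.2026.NORDiC-FOO.DKaudio.NFODV")
untouched = strip_proxy_suffix("TestAudit.2026.NORDiC-FOO")
```

Anchoring at the end of the string means a name that was never tagged by the proxy passes through unchanged, so the helper is safe to apply to every lookup.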

After this

  • classifier_audit.mismatch_type will show meaningful values (agreement, upgrade, missed_dkaudio, false_dkaudio, false_dkok) reflecting real proxy-vs-ffprobe accuracy
  • Title-fast-path verdicts contribute to cross-search cache reuse

Tests

11 new tests across 3 test files:

  • tests/test_strip_proxy_suffix.py (8 tests on the helper)
  • tests/test_learn_endpoint.py (+1 integration: suffixed POST finds un-suffixed cache row → agreement)
  • tests/test_title_path_caches.py (+2: title and attr fast-paths write to cache)

216 tests pass (205 baseline + 11 new).

Verified live

Pre-seed cache with ('test123', '.DKaudio', 'TestAudit.2026.NORDiC-FOO', source='title'). POST {release_name: "TestAudit.2026.NORDiC-FOO.DKaudio.NFODV", audio_languages: ["dan", ...]}. Response:

previous_tag:    .DKaudio   ✓
previous_source: title      ✓
mismatch_type:   agreement  ✓ (was 'missed_dkaudio' pre-fix)
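
Why the same input reported missed_dkaudio before the fix can be reproduced with a small sqlite3 sketch. The table layout below is a hypothetical, simplified stand-in for nfo_cache (the real schema is not shown in this PR); only the release_name column and the WHERE release_name=? lookup are taken from the description.

```python
import sqlite3

# Hypothetical, simplified stand-in for the nfo_cache table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nfo_cache (nid TEXT PRIMARY KEY, tag TEXT, release_name TEXT, source TEXT)"
)
# Same pre-seeded row as in the live verification above.
conn.execute(
    "INSERT INTO nfo_cache VALUES (?, ?, ?, ?)",
    ("test123", ".DKaudio", "TestAudit.2026.NORDiC-FOO", "title"),
)


def lookup(release_name):
    """Exact-match lookup, mirroring WHERE release_name=?."""
    return conn.execute(
        "SELECT tag, source FROM nfo_cache WHERE release_name = ?",
        (release_name,),
    ).fetchone()


# Pre-fix: the suffixed sceneName is used as the key, so nothing is found.
miss = lookup("TestAudit.2026.NORDiC-FOO.DKaudio.NFODV")
# Post-fix: the name is stripped first, so the existing row is found.
hit = lookup("TestAudit.2026.NORDiC-FOO")
```

Here miss is None, which presumably left the audit with no previous tag to compare against, while hit carries the tag and source shown in the response above.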

🤖 Generated with Claude Code

…title-fast-path matches

Two related bugs together made every classifier_audit row report
predicted=NONE / missed_dkaudio, masking real proxy verdicts:

1. /learn/imported's WHERE release_name=? lookup never matched because
   Radarr's sceneName env var contains the proxy's appended .DKaudio +
   .NFOxxx tags (e.g. "Mufasa…-RCDiVX-xpost.DKaudio.NFODV"). nfo_cache
   stores the un-suffixed canonical name. Fix: strip_proxy_suffix()
   helper applied to the lookup name (NOT to the audit row — original
   sceneName is kept there for debugging).

2. Title-fast-path classifications in hunt_danish set found_hits[nid]
   then 'continue' without calling cache_set, so even when the proxy
   correctly classified a release via title regex or attr scan there
   was no cache row for /learn to find. Fix: cache_set on each fast-
   path branch with source='title' or source='attr' per spec
   precedence (ffprobe > nfo > description > attr > title).

After this:
  - Audit rows on already-classified releases show mismatch_type
    'agreement' (proxy and ffprobe agreed) instead of always
    'missed_dkaudio'.
  - title-only fast paths now contribute to cross-search cache hits.

11 new tests (strip helper + lookup integration + 2 fast-path cache writes).
@coderabbitai

coderabbitai Bot commented May 24, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 398ebe4 and a693723.

📒 Files selected for processing (4)
  • dksubs-proxy.py
  • tests/test_learn_endpoint.py
  • tests/test_strip_proxy_suffix.py
  • tests/test_title_path_caches.py

unknown0152 merged commit ca7f8d2 into master May 24, 2026
2 checks passed
unknown0152 deleted the fix/audit-release-name-and-title-cache branch May 24, 2026 21:56
unknown0152 added a commit that referenced this pull request May 31, 2026
…tle-cache

fix(proxy): meaningful audit data — strip suffixes + cache title-fast-path matches