reporting: front-end technique & intent scan report views by stefanoamorelli · Pull Request #1834 · NVIDIA/garak

stefanoamorelli · 2026-06-04T10:41:27Z

Adds tabbed Technique/Intent matrix sections to the report (resolves #1705); the matrix data comes from #1704 and #1807, so this stays hidden until those changes land.

I stumbled across this while looking at the HTML scan report and noticed it had no technique & intent views, so I built them on my fork and am proposing the change upstream.

Add dedicated test files for donotanswer, grandma, phrasing, realtoxicityprompts, and sata probes. Each covers plugin loading, prompt generation, and module-specific behavior (inheritance, pruning, NLTK masking, dynamic class enumeration). Signed-off-by: boao.dong <markdba313@gmail.com>

…ests Pare down tests to validation of functionality unique to each probe module, per review feedback. Generic instantiation and prompt-not-empty checks are already covered by plugins/test_plugin_load.py. Signed-off-by: boao.dong <markdba313@gmail.com>

Signed-off-by: boao.dong <markdba313@gmail.com>

generators.function.Single overrode DEFAULT_PARAMS with only {"kwargs": {}}, so it did not pick up the base Generator.DEFAULT_PARAMS values. Because _apply_missing_instance_defaults iterates over the most-derived DEFAULT_PARAMS, params such as max_tokens, temperature, top_k, context_len, skip_seq_start and skip_seq_end were never set as instance attributes. Accessing them (e.g. generator.max_tokens) raised AttributeError. Merge Generator.DEFAULT_PARAMS into the class dict, matching the pattern used by the other generators (openai, rest, huggingface, cohere, ollama, etc.). Multiple inherits from Single and is covered as well. Adds a regression test asserting both Single and Multiple expose the inherited defaults while keeping the function-specific kwargs default. Fixes NVIDIA#1096 Signed-off-by: Aditya Singh <adisin650@gmail.com>
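The merge pattern described in this commit message can be sketched as follows. This is a minimal illustration, not garak's code: the `Generator` stand-in, its default values, and the `SingleBroken`/`SingleFixed` names are hypothetical, and the constructor only imitates the described behaviour of reading the most-derived `DEFAULT_PARAMS`.

```python
class Generator:
    # Hypothetical stand-in for the base class; the values are illustrative only.
    DEFAULT_PARAMS = {"max_tokens": 150, "temperature": None, "top_k": None}

    def __init__(self):
        # Imitates the described behaviour: only the most-derived
        # DEFAULT_PARAMS dict is consulted when filling in instance attributes.
        for name, value in type(self).DEFAULT_PARAMS.items():
            if not hasattr(self, name):
                setattr(self, name, value)


class SingleBroken(Generator):
    # Replaces the base dict outright, so the inherited defaults are lost.
    DEFAULT_PARAMS = {"kwargs": {}}


class SingleFixed(Generator):
    # Merges the base defaults with the function-specific default (Python 3.9+).
    DEFAULT_PARAMS = Generator.DEFAULT_PARAMS | {"kwargs": {}}


broken = SingleBroken()
fixed = SingleFixed()
print(hasattr(broken, "max_tokens"))  # False: the attribute was never set
print(fixed.max_tokens, fixed.kwargs)
```

With the merged dict, subclasses of `SingleFixed` that do not redefine `DEFAULT_PARAMS` pick up the same defaults, which is how the commit message describes `Multiple` being covered.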

…VIDIA#830) When a user passes a module-name spec (e.g. `-p test`) that maps only to plugins marked `active = False`, `parse_plugin_spec` returns the clause in the rejected list and the CLI raises `Unknown probes: test`. That message is misleading: the module exists, but every plugin in it is inactive. Detect this case at the CLI rejection site by re-enumerating plugins and checking whether any inactive entries share the rejected clause as a namespace prefix. When they do, surface a message that names the module and points the user at calling specific plugins by name. Mixed rejections (some inactive-only, some truly unknown) still report both. The change is contained to `garak/cli.py` so `parse_plugin_spec` keeps its existing `(found, rejected)` signature and `--list_probes` callers are unaffected. A regression test exercises `-p test` and asserts the new wording. Signed-off-by: notnick2 <varun024123@gmail.com>
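The detection step can be sketched roughly as below. The function name `describe_rejected`, the `(name, active)` pair representation, the sample plugin names, and the wording of the inactive-module message are all assumptions made for illustration. Only the `Unknown probes: ...` text comes from the commit message.

```python
def describe_rejected(rejected, plugins, category="probes"):
    """Split rejected clauses into inactive-only modules and unknown names.

    `plugins` is an iterable of (qualified_name, active) pairs, a hypothetical
    stand-in for whatever the plugin enumeration returns.
    """
    inactive_only, unknown = [], []
    for clause in rejected:
        prefix = f"{category}.{clause}."
        # The module exists if any inactive plugin sits under this namespace.
        has_inactive = any(
            name.startswith(prefix) and not active for name, active in plugins
        )
        (inactive_only if has_inactive else unknown).append(clause)

    messages = []
    if inactive_only:
        # Hypothetical wording; the real message is not quoted in the source.
        messages.append(
            f"No active {category} in module(s): {', '.join(inactive_only)}; "
            "request specific plugins by name"
        )
    if unknown:
        messages.append(f"Unknown {category}: {', '.join(unknown)}")
    return messages


sample_plugins = [("probes.test.Blank", False), ("probes.dan.Dan_11_0", True)]
for line in describe_rejected(["test", "nope"], sample_plugins):
    print(line)
```

Keeping this check at the reporting site, rather than inside the spec parser, is what lets the parser keep its `(found, rejected)` return shape.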

Director._scan_payload_dir catches JSONDecodeError/KeyError for a payload file, logs "Invalid payload, skipping", but does not continue. Execution falls through to the code that reads `payload_types` and records the file. For the first/only file this raises UnboundLocalError (payload_types is never bound), crashing the whole payload inventory. If a valid file was processed earlier, the stale `payload_types` is reused, silently registering the invalid file under the previous file's types. Add `continue` so invalid payloads are skipped, matching the log message and the commented-out raise. Adds a regression test. Co-authored-by: Claude Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>
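A self-contained sketch of the corrected control flow is shown here. It is not the garak implementation: `scan_payload_dir` is a simplified, hypothetical stand-in for `Director._scan_payload_dir`, and the file names are made up for the demonstration.

```python
import json
import logging
import tempfile
from pathlib import Path


def scan_payload_dir(directory: Path) -> dict:
    """Simplified stand-in for the payload scan, with the `continue` in place."""
    payloads_found = {}
    for payload_path in sorted(directory.glob("*.json")):
        try:
            with open(payload_path, encoding="utf-8") as f:
                payload_types = json.load(f)["payload_types"]
        except (json.JSONDecodeError, KeyError) as exc:
            msg = f"payload scan: Invalid payload, skipping: {payload_path}"
            logging.debug(msg, exc_info=exc)
            continue  # without this, execution falls through to the code below
        payloads_found[payload_path.stem] = {
            "path": payload_path,
            "types": payload_types,
        }
    return payloads_found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "broken.json").write_text("{not json", encoding="utf-8")
    (root / "no_types.json").write_text('{"payloads": []}', encoding="utf-8")
    (root / "valid.json").write_text(
        '{"payload_types": ["Text"], "payloads": ["x"]}', encoding="utf-8"
    )
    found = scan_payload_dir(root)
    print(sorted(found))  # ['valid']
```

Removing the `continue` from this sketch reproduces the `UnboundLocalError` described above, because `broken.json` sorts first and `payload_types` has not yet been bound.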

…erators Per review, refocus on the reported gap: not every generator exposes max_tokens, so the divergence probe guards the attribute before overriding it, the same way promptinject does with 'in dir(generator)'. Revert the function generator DEFAULT_PARAMS change, which is better handled as a separate enhancement PR. Fixes NVIDIA#1096. Signed-off-by: Aditya Singh <adisin650@gmail.com>
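The guard described here amounts to checking for the attribute before writing to it. A minimal sketch, using hypothetical stand-in generator classes and a hypothetical helper name, might look like this:

```python
class WithMaxTokens:
    # Hypothetical generator that exposes max_tokens.
    max_tokens = 150


class WithoutMaxTokens:
    # Hypothetical generator that does not expose max_tokens.
    pass


def override_max_tokens(generator, new_value):
    """Override max_tokens only when the generator exposes it.

    Returns True if the override was applied, False otherwise.
    """
    if "max_tokens" in dir(generator):
        generator.max_tokens = new_value
        return True
    return False


g1, g2 = WithMaxTokens(), WithoutMaxTokens()
print(override_max_tokens(g1, 2000), g1.max_tokens)
print(override_max_tokens(g2, 2000), hasattr(g2, "max_tokens"))
```

The second call leaves the generator untouched rather than creating a new attribute or raising `AttributeError`.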

sphinx-rtd-theme 3.1.0 (released Jan 2026) added explicit support for Sphinx 9.x. Without a minimum version constraint, users with a recent Sphinx installation (>=9.0) would receive an older sphinx-rtd-theme that fails with: The 'sphinx_rtd_theme' theme does not support this version of Sphinx, because it uses the 'style' field in HTML templates, which was deprecated in Sphinx 5.1 and removed in Sphinx 7.0. Pinning sphinx-rtd-theme>=3.1.0 ensures compatibility with all current Sphinx releases while keeping the existing ReadTheDocs theme and NVIDIA-branded custom CSS intact. Closes NVIDIA#1529 Signed-off-by: Oleksandr Sanin <alexaaander.sanin@gmail.com>

…VIDIA#1811)

…VIDIA#1797)

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

…A#1810)

## Summary

`Director._scan_payload_dir` (`garak/payloads.py`) catches `JSONDecodeError`/`KeyError` for a payload file and logs `"Invalid payload, skipping"`, but it does **not** `continue`. Execution falls through to the code that reads `payload_types` and records the file:

```python
except (json.JSONDecodeError, KeyError) as exc:
    msg = f"payload scan: Invalid payload, skipping: {payload_path}"
    logging.debug(msg, exc_info=exc)
    # raise garak.exception.PayloadFailure(msg) from exc
payload_name = payload_path.stem
payloads_found[payload_name] = {
    "path": payload_path,
    "types": payload_types,  # <- read even when the except fired
}
```

Consequences:

- **Crash:** if the first/only scanned file is invalid, `payload_types` was never bound → `UnboundLocalError`, which aborts the whole payload inventory (`_refresh_payloads`).
- **Silent data corruption:** if a valid file was processed earlier in the loop, the stale `payload_types` is reused, registering the invalid file under the **previous** file's types.

This is reachable in normal use: `PAYLOAD_DIR` includes user / `XDG_DATA_DIR` custom payload directories, so a single stray or malformed `.json` there breaks payload loading.

## Fix

Add `continue` in the `except` block so invalid payloads are skipped, matching the log message ("skipping") and the commented-out `raise`. One line; no other behavior changes.

## Why this is not a duplicate

Checked open PRs (`_scan_payload_dir`, `payload scan`, `payloads` in title) and open issues (`invalid payload`) on NVIDIA/garak; none address this.

## Testing

- Added `tests/test_payloads.py::test_scan_payload_dir_skips_invalid`, which scans a temp dir containing a malformed JSON, a JSON missing `payload_types`, and one valid payload; asserts only the valid payload is registered and no stale types leak. It **fails on `main`** (`UnboundLocalError`) and **passes** with this change.
- Commands run:
  - `python3 -m pytest tests/test_payloads.py::test_scan_payload_dir_skips_invalid tests/test_payloads.py::test_non_json_direct_load tests/test_payloads.py::test_payload_Director` → 3 passed
  - `python3 -m black --check garak/payloads.py tests/test_payloads.py` → clean

## AI assistance

AI assistance (Claude) was used to help locate the bug, implement the fix, and write the test. I reviewed every changed line and ran the tests above.

Signed-off-by: Jeffrey Martin <jemartin@nvidia.com>

Add technique and intent sections to the HTML scan report, resolving NVIDIA#1705 [1] and consuming the technique_intent_matrix digest field from NVIDIA#1704 [2]. The implementation mirrors the existing DetectorsView [3] and ProbesChart [4] panels: a shared pass-rate matrix renders the technique by intent grid, with a technique-first and an intent-first view behind a tab. Intent counts are pooled across techniques rather than averaging scores, matching how the backend pools cells. The section stays hidden for reports that predate the digest field, so older reports are unaffected. [1]: NVIDIA#1705 [2]: NVIDIA#1704 [3]: https://github.com/NVIDIA/garak/blob/main/garak-report/src/components/DetectorsView.tsx [4]: https://github.com/NVIDIA/garak/blob/main/garak-report/src/components/ProbesChart.tsx Signed-off-by: Stefano Amorelli <stefano@amorelli.tech>
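The difference between pooling counts and averaging scores can be illustrated with a small sketch. The cell layout, the `(passed, total)` tuples, and the technique and intent names are hypothetical. The actual shape of the `technique_intent_matrix` digest field is not shown in this PR description, and the front-end itself is TypeScript, so this Python version only illustrates the arithmetic.

```python
from collections import defaultdict

# Hypothetical matrix cells: (technique, intent) -> (passed, total).
cells = {
    ("technique_a", "intent_x"): (9, 10),
    ("technique_b", "intent_x"): (10, 90),
}


def pooled_intent_rates(cells):
    """Sum passed and total counts per intent, then divide once."""
    passed, total = defaultdict(int), defaultdict(int)
    for (_technique, intent), (p, n) in cells.items():
        passed[intent] += p
        total[intent] += n
    return {i: passed[i] / total[i] for i in total if total[i] > 0}


def averaged_intent_rates(cells):
    """Average the per-cell pass rates, ignoring how many attempts each had."""
    rates = defaultdict(list)
    for (_technique, intent), (p, n) in cells.items():
        if n > 0:
            rates[intent].append(p / n)
    return {i: sum(r) / len(r) for i, r in rates.items()}


print(pooled_intent_rates(cells))    # 19 passes out of 100 attempts
print(averaged_intent_rates(cells))  # mean of 0.9 and roughly 0.11
```

Pooling weights each cell by its attempt count, so a sparsely tested technique cannot dominate the intent-level figure the way it does in the averaged version.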

jmartin-tech · 2026-06-04T12:58:57Z

This PR targets the wrong branch: the feature is still in development in the feature branch. I have updated the target, though a rebase may also be needed.

The description does not provide any samples of the proposed UX. Significant design discussion is needed to implement #1705, as that issue did not include a specification for the desired user experience or the level of detail to present to the user.

@otavionvidia this might be something you would be interested in engaging with to see if the ideas here mesh with the paths currently being explored for surfacing Technique & Intent data.

markd88 and others added 21 commits May 5, 2026 15:56

test: remove grandma and RTP probe tests, keep phrasing and sata

07cae20

Signed-off-by: boao.dong <markdba313@gmail.com>

fix: guard max_tokens override when the generator lacks it (NVIDIA#1795)

ac81de8

automatic garak/resources/plugin_cache.json update

d2e3728

fix(docs): pin sphinx-rtd-theme>=3.1.0 for Sphinx 9.x compatibility (N…

e0fccc0

…VIDIA#1811)

fix: clearer error when probespec module has only inactive plugins (N…

78f335a

…VIDIA#1797)

test: add unit tests for phrasing and sata probe modules (NVIDIA#1747)

2b12c25

section on new probe scope

5c1988e

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

decenter linkedin

a830469

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

point to specific docs on contrib/extend

d7693c8

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

update social media accts

5c9f1df

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

add specifics on style; explicitly point to contribution docs

3b942f4

Signed-off-by: Leon Derczynski <lderczynski@nvidia.com>

fix typo

478bf42

Signed-off-by: Jeffrey Martin <jemartin@nvidia.com>

docs: Update contribution, agent instructions (NVIDIA#1819)

6dbfae5

stefanoamorelli force-pushed the feature/frontend-technique-intent-views branch 4 times, most recently from 7f1cbcc to 6e0c96a Compare June 4, 2026 11:17

stefanoamorelli force-pushed the feature/frontend-technique-intent-views branch from 6e0c96a to 0b9e79c Compare June 4, 2026 11:17

jmartin-tech changed the base branch from main to feature/technique_intent June 4, 2026 12:57

stefanoamorelli wants to merge 22 commits into NVIDIA:feature/technique_intent from stefanoamorelli:feature/frontend-technique-intent-views

8 participants
