generator: add native Anthropic generator by NishchayMahor · Pull Request #1809 · NVIDIA/garak

NishchayMahor · 2026-05-28T18:59:17Z

Closes #263.

Adds a native AnthropicGenerator so Claude models can be targeted directly through Anthropic's API, sitting alongside the existing bedrock/litellm paths instead of relying on them.

Approach

Modeled after garak/generators/mistral.py, same shape and level of abstraction. The Anthropic Messages API is a small surface so the generator stays under 100 lines.

ENV_VAR = "ANTHROPIC_API_KEY" to match the other hosted generators.
extra_dependency_names = ["anthropic"] so the SDK loads through _load_deps and the import failure path is consistent with the rest of the codebase.
Default model is claude-sonnet-4-5 (non-dated alias, so it won't 404 when a snapshot is retired).
A small _split_system_and_messages helper pulls system turns out of the Conversation and passes them via the API's top-level system param. Anthropic doesn't accept system as a role inside messages.
Backoff wraps RateLimitError, APIConnectionError, and APIStatusError via GeneratorBackoffTrigger, matching the pattern in mistral.py.

Tests

tests/generators/test_anthropic.py covers:

A mocked end-to-end call against https://api.anthropic.com/v1/messages using respx, parallel to test_mistral.py.
Two unit tests for the system-extraction helper (with and without a system turn).
A live test_anthropic_chat that's skipped unless ANTHROPIC_API_KEY is set.

Local run:

3 passed, 1 skipped in 0.23s

black --check clean.

Notes

Added anthropic>=0.40.0 to pyproject.toml next to the other SDK pins.
New mock fixture at tests/_assets/generators/anthropic.json plus an anthropic_compat_mocks entry in tests/generators/conftest.py.
Did not touch bedrock.py or litellm.py, so the existing Claude-via-Bedrock and Claude-via-LiteLLM paths are unchanged.

Adds AnthropicGenerator backed by the anthropic SDK so Claude models can be exercised directly without going through litellm or bedrock. System turns are pulled out of the Conversation and passed via the Messages API's top-level system param; backoff is wired through the SDK's RateLimitError, APIConnectionError, and APIStatusError. Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>

Two CI checks were red against this branch: - tests/test_docs.py was missing docs/source/generators/anthropic.rst plus the corresponding entry in docs/source/index_generators.rst. - tests/test_reqs.py treats every pyproject.toml dependency as spurious unless it also appears in requirements.txt. Add the docs stub modeled on mistral.rst, link it from the toctree, and mirror the anthropic pin in requirements.txt. Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>

NishchayMahor · 2026-05-29T17:28:39Z

Heads up on the one red check (build (ubuntu-latest, 3.13)): the single failing test is test_detector_detect[detectors.packagehallucination.Dart], which fails with a 504 Gateway Time-out from the HuggingFace API while listing garak-llm/dart-20250811. Looks like an upstream HF blip rather than anything in this PR. The full result was 4500 passed, 102 skipped, 1 failed, and every other Python 3.13 matrix combo (macOS, ARM Linux) came back green. Happy to push a tiny no-op commit to re-trigger if a re-run would help.

patriciapampanelli

Thanks for taking this on, @NishchayMahor

patriciapampanelli · 2026-05-28T20:31:04Z

+    extra_dependency_names = ["anthropic"]
+
+    ENV_VAR = "ANTHROPIC_API_KEY"
+    DEFAULT_PARAMS = Generator.DEFAULT_PARAMS | {


max_tokens is already in Generator.DEFAULT_PARAMS, and name is supplied via --target_name at runtime. I think we can drop this DEFAULT_PARAMS override entirely.

patriciapampanelli · 2026-05-28T20:32:43Z

+    _unsafe_attributes = ["client"]
+
+    def _load_unsafe(self):
+        self.client = self.anthropic.Anthropic(api_key=self.api_key)


Notice OpenAICompatible exposes uri in DEFAULT_PARAMS and passes it as base_url. I think we can mirror that here so endpoint overrides flow through garak config rather than relying only on the SDK env var.

patriciapampanelli · 2026-05-29T19:09:32Z

+        call_kwargs = {
+            "model": self.name,
+            "max_tokens": self.max_tokens,
+            "messages": messages,
+        }
+        if system is not None:
+            call_kwargs["system"] = system
+        if self.temperature is not None:
+            call_kwargs["temperature"] = self.temperature
+        if self.top_k is not None:
+            call_kwargs["top_k"] = self.top_k


The kwargs here are enumerated by hand. The convention in OpenAICompatible is to map the known name mismatch (model from name) explicitly and let inspect.signature(self.client.messages.create).parameters pick up the rest. That catches top_p and any future SDK params for free.

patriciapampanelli · 2026-05-29T20:08:41Z

+            backoff_exception_types = [
+                self.anthropic.RateLimitError,
+                self.anthropic.APIConnectionError,
+                self.anthropic.APIStatusError,


APIStatusError catches every 4xx errors, which can't recover. Combined with no max_tries, this loops indefinitely. We should narrow it to the recoverable subset. Look at the bedrock.py

patriciapampanelli

Thanks for taking this on, @NishchayMahor

- Drop the DEFAULT_PARAMS override for max_tokens and name. max_tokens is inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name. - Add uri to DEFAULT_PARAMS and thread it as base_url on the client, mirroring how OpenAICompatible exposes its endpoint, so config-driven overrides work. - Replace the hand-enumerated call_kwargs with inspect.signature against client.messages.create, matching the openai.py pattern. New SDK params like top_p now flow through without an edit. - Narrow the retry surface from APIStatusError to RateLimitError, APIConnectionError, APITimeoutError, and InternalServerError. 4xx caller bugs no longer loop indefinitely. - Add tests for the new DEFAULT_PARAMS shape and the uri-to-base_url path.

NishchayMahor · 2026-05-29T20:26:02Z

Thanks for the careful review @patriciapampanelli, this was super helpful. All four addressed in a9116b8:

DEFAULT_PARAMS override (line 26): dropped max_tokens and name. max_tokens is inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name at runtime as you said. The override now just adds the Anthropic-specific knobs.
uri for endpoint override (line 34): added uri: None to DEFAULT_PARAMS and threaded it through to the SDK as base_url in _load_unsafe, matching the OpenAICompatible shape. Endpoint overrides now flow through garak config.
inspect.signature for kwargs (line 77): swapped the hand-enumerated kwargs for the same pattern openai.py uses. model is mapped from self.name explicitly; everything else gets picked up from inspect.signature(self.client.messages.create).parameters whenever the attribute exists and is not in suppressed_params. top_p and any future SDK kwargs now ride along for free.
Narrowing the exception catch (line 85): replaced the catch-all APIStatusError with RateLimitError | APIConnectionError | APITimeoutError | InternalServerError. The transient subset retries on backoff; 4xx caller bugs (400, 401, 403, 404, 422) now surface immediately instead of looping forever.

Also added two small unit tests: one for the new DEFAULT_PARAMS shape and one that confirms uri actually lands on the client's base_url. The existing tests still pass.

- Drop the DEFAULT_PARAMS override for max_tokens and name. max_tokens is inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name. - Add uri to DEFAULT_PARAMS and thread it as base_url on the client, mirroring how OpenAICompatible exposes its endpoint, so config-driven overrides work. - Replace the hand-enumerated call_kwargs with inspect.signature against client.messages.create, matching the openai.py pattern. New SDK params like top_p now flow through without an edit. - Narrow the retry surface from APIStatusError to RateLimitError, APIConnectionError, APITimeoutError, and InternalServerError. 4xx caller bugs no longer loop indefinitely. - Add tests for the new DEFAULT_PARAMS shape and the uri-to-base_url path. Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>

NishchayMahor added 2 commits May 28, 2026 11:58

jmartin-tech changed the title ~~generator: add native Anthropic generator (#263)~~ generator: add native Anthropic generator May 29, 2026

patriciapampanelli requested changes May 29, 2026

View reviewed changes

NishchayMahor force-pushed the feat/anthropic-generator-263 branch from a9116b8 to de6275d Compare May 29, 2026 20:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

generator: add native Anthropic generator#1809

generator: add native Anthropic generator#1809
NishchayMahor wants to merge 3 commits into
NVIDIA:mainfrom
NishchayMahor:feat/anthropic-generator-263

NishchayMahor commented May 28, 2026 •

edited

Loading

Uh oh!

NishchayMahor commented May 29, 2026

Uh oh!

patriciapampanelli left a comment

Uh oh!

patriciapampanelli May 28, 2026

Uh oh!

patriciapampanelli May 28, 2026

Uh oh!

patriciapampanelli May 29, 2026

Uh oh!

patriciapampanelli May 29, 2026

Uh oh!

patriciapampanelli left a comment

Uh oh!

NishchayMahor commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

NishchayMahor commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Approach

Tests

Notes

Uh oh!

NishchayMahor commented May 29, 2026

Uh oh!

patriciapampanelli left a comment

Choose a reason for hiding this comment

Uh oh!

patriciapampanelli May 28, 2026

Choose a reason for hiding this comment

Uh oh!

patriciapampanelli May 28, 2026

Choose a reason for hiding this comment

Uh oh!

patriciapampanelli May 29, 2026

Choose a reason for hiding this comment

Uh oh!

patriciapampanelli May 29, 2026

Choose a reason for hiding this comment

Uh oh!

patriciapampanelli left a comment

Choose a reason for hiding this comment

Uh oh!

NishchayMahor commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

NishchayMahor commented May 28, 2026 •

edited

Loading