Skip to content

generator: add native Anthropic generator#1809

Open
NishchayMahor wants to merge 3 commits into
NVIDIA:mainfrom
NishchayMahor:feat/anthropic-generator-263
Open

generator: add native Anthropic generator#1809
NishchayMahor wants to merge 3 commits into
NVIDIA:mainfrom
NishchayMahor:feat/anthropic-generator-263

Conversation

@NishchayMahor
Copy link
Copy Markdown

@NishchayMahor NishchayMahor commented May 28, 2026

Closes #263.

Adds a native AnthropicGenerator so Claude models can be targeted directly through Anthropic's API, sitting alongside the existing bedrock/litellm paths instead of relying on them.

Approach

Modeled after garak/generators/mistral.py, same shape and level of abstraction. The Anthropic Messages API is a small surface so the generator stays under 100 lines.

  • ENV_VAR = "ANTHROPIC_API_KEY" to match the other hosted generators.
  • extra_dependency_names = ["anthropic"] so the SDK loads through _load_deps and the import failure path is consistent with the rest of the codebase.
  • Default model is claude-sonnet-4-5 (non-dated alias, so it won't 404 when a snapshot is retired).
  • A small _split_system_and_messages helper pulls system turns out of the Conversation and passes them via the API's top-level system param. Anthropic doesn't accept system as a role inside messages.
  • Backoff wraps RateLimitError, APIConnectionError, and APIStatusError via GeneratorBackoffTrigger, matching the pattern in mistral.py.

Tests

tests/generators/test_anthropic.py covers:

  • A mocked end-to-end call against https://api.anthropic.com/v1/messages using respx, parallel to test_mistral.py.
  • Two unit tests for the system-extraction helper (with and without a system turn).
  • A live test_anthropic_chat that's skipped unless ANTHROPIC_API_KEY is set.

Local run:

3 passed, 1 skipped in 0.23s

black --check clean.

Notes

  • Added anthropic>=0.40.0 to pyproject.toml next to the other SDK pins.
  • New mock fixture at tests/_assets/generators/anthropic.json plus an anthropic_compat_mocks entry in tests/generators/conftest.py.
  • Did not touch bedrock.py or litellm.py, so the existing Claude-via-Bedrock and Claude-via-LiteLLM paths are unchanged.

Adds AnthropicGenerator backed by the anthropic SDK so Claude models
can be exercised directly without going through litellm or bedrock.
System turns are pulled out of the Conversation and passed via the
Messages API's top-level system param; backoff is wired through the
SDK's RateLimitError, APIConnectionError, and APIStatusError.

Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>
Two CI checks were red against this branch:

- tests/test_docs.py was missing docs/source/generators/anthropic.rst
  plus the corresponding entry in docs/source/index_generators.rst.
- tests/test_reqs.py treats every pyproject.toml dependency as spurious
  unless it also appears in requirements.txt.

Add the docs stub modeled on mistral.rst, link it from the toctree, and
mirror the anthropic pin in requirements.txt.

Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>
@NishchayMahor
Copy link
Copy Markdown
Author

Heads up on the one red check (build (ubuntu-latest, 3.13)): the single failing test is test_detector_detect[detectors.packagehallucination.Dart], which fails with a 504 Gateway Time-out from the HuggingFace API while listing garak-llm/dart-20250811. Looks like an upstream HF blip rather than anything in this PR. The full result was 4500 passed, 102 skipped, 1 failed, and every other Python 3.13 matrix combo (macOS, ARM Linux) came back green. Happy to push a tiny no-op commit to re-trigger if a re-run would help.

@jmartin-tech jmartin-tech changed the title generator: add native Anthropic generator (#263) generator: add native Anthropic generator May 29, 2026
Copy link
Copy Markdown
Collaborator

@patriciapampanelli patriciapampanelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on, @NishchayMahor

extra_dependency_names = ["anthropic"]

ENV_VAR = "ANTHROPIC_API_KEY"
DEFAULT_PARAMS = Generator.DEFAULT_PARAMS | {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max_tokens is already in Generator.DEFAULT_PARAMS, and name is supplied via --target_name at runtime. I think we can drop this DEFAULT_PARAMS override entirely.

Comment thread garak/generators/anthropic.py Outdated
_unsafe_attributes = ["client"]

def _load_unsafe(self):
self.client = self.anthropic.Anthropic(api_key=self.api_key)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice OpenAICompatible exposes uri in DEFAULT_PARAMS and passes it as base_url. I think we can mirror that here so endpoint overrides flow through garak config rather than relying only on the SDK env var.

Comment thread garak/generators/anthropic.py Outdated
Comment on lines +67 to +77
call_kwargs = {
"model": self.name,
"max_tokens": self.max_tokens,
"messages": messages,
}
if system is not None:
call_kwargs["system"] = system
if self.temperature is not None:
call_kwargs["temperature"] = self.temperature
if self.top_k is not None:
call_kwargs["top_k"] = self.top_k
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kwargs here are enumerated by hand. The convention in OpenAICompatible is to map the known name mismatch (model from name) explicitly and let inspect.signature(self.client.messages.create).parameters pick up the rest. That catches top_p and any future SDK params for free.

Comment thread garak/generators/anthropic.py Outdated
Comment on lines +82 to +85
backoff_exception_types = [
self.anthropic.RateLimitError,
self.anthropic.APIConnectionError,
self.anthropic.APIStatusError,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APIStatusError catches every 4xx errors, which can't recover. Combined with no max_tries, this loops indefinitely. We should narrow it to the recoverable subset. Look at the bedrock.py

Copy link
Copy Markdown
Collaborator

@patriciapampanelli patriciapampanelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on, @NishchayMahor

NishchayMahor added a commit to NishchayMahor/garak that referenced this pull request May 29, 2026
- Drop the DEFAULT_PARAMS override for max_tokens and name. max_tokens is
  inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name.
- Add uri to DEFAULT_PARAMS and thread it as base_url on the client, mirroring
  how OpenAICompatible exposes its endpoint, so config-driven overrides work.
- Replace the hand-enumerated call_kwargs with inspect.signature against
  client.messages.create, matching the openai.py pattern. New SDK params like
  top_p now flow through without an edit.
- Narrow the retry surface from APIStatusError to RateLimitError,
  APIConnectionError, APITimeoutError, and InternalServerError. 4xx caller
  bugs no longer loop indefinitely.
- Add tests for the new DEFAULT_PARAMS shape and the uri-to-base_url path.
@NishchayMahor
Copy link
Copy Markdown
Author

Thanks for the careful review @patriciapampanelli, this was super helpful. All four addressed in a9116b8:

  1. DEFAULT_PARAMS override (line 26): dropped max_tokens and name. max_tokens is inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name at runtime as you said. The override now just adds the Anthropic-specific knobs.

  2. uri for endpoint override (line 34): added uri: None to DEFAULT_PARAMS and threaded it through to the SDK as base_url in _load_unsafe, matching the OpenAICompatible shape. Endpoint overrides now flow through garak config.

  3. inspect.signature for kwargs (line 77): swapped the hand-enumerated kwargs for the same pattern openai.py uses. model is mapped from self.name explicitly; everything else gets picked up from inspect.signature(self.client.messages.create).parameters whenever the attribute exists and is not in suppressed_params. top_p and any future SDK kwargs now ride along for free.

  4. Narrowing the exception catch (line 85): replaced the catch-all APIStatusError with RateLimitError | APIConnectionError | APITimeoutError | InternalServerError. The transient subset retries on backoff; 4xx caller bugs (400, 401, 403, 404, 422) now surface immediately instead of looping forever.

Also added two small unit tests: one for the new DEFAULT_PARAMS shape and one that confirms uri actually lands on the client's base_url. The existing tests still pass.

- Drop the DEFAULT_PARAMS override for max_tokens and name. max_tokens is
  inherited from Generator.DEFAULT_PARAMS and name comes in via --target_name.
- Add uri to DEFAULT_PARAMS and thread it as base_url on the client, mirroring
  how OpenAICompatible exposes its endpoint, so config-driven overrides work.
- Replace the hand-enumerated call_kwargs with inspect.signature against
  client.messages.create, matching the openai.py pattern. New SDK params like
  top_p now flow through without an edit.
- Narrow the retry surface from APIStatusError to RateLimitError,
  APIConnectionError, APITimeoutError, and InternalServerError. 4xx caller
  bugs no longer loop indefinitely.
- Add tests for the new DEFAULT_PARAMS shape and the uri-to-base_url path.

Signed-off-by: Nishchay Mahor <nishchaymahor@gmail.com>
@NishchayMahor NishchayMahor force-pushed the feat/anthropic-generator-263 branch from a9116b8 to de6275d Compare May 29, 2026 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

generator: anthropic

2 participants