Implement OpenAIResponsesGenerator by ABeltramo · Pull Request #1806 · NVIDIA/garak

ABeltramo · 2026-05-28T09:26:55Z

Implements a new OpenAI compatible generator OpenAIResponsesGenerator that uses the responses API instead of ChatCompletion.

This started as way to run Agent Breaker against agents that support the responses API, but I think it's valuable as a standalone Generator for any probe.
I'm currently storing the returned tool calls as Message notes, mainly so that I can do manual inspection of the attempts in the report.jsonl file. Ideally, I think there should be a generic mechanism for storing those so that detectors can pick those up properly. Probably outside the scope for this PR, just wanted to know if you already have a direction for integrating tool calling into Garak.

Verification

The new Generator can be used like others OpenAI compatible generators, ex:

  generators:
    openai:
      OpenAIResponsesGenerator:
        uri: http://localhost:1234/v1/
        api_key: dummy 
        tools:
          - type: mcp
            server_label: mcp_server
            server_url: http://localhost:8888/sse
            require_approval: "never"

jmartin-tech

This PR looks pretty useful, the new formats are something we are seeing more middle layers starting to mimic.

Some overall asks:

Please rebase this on main and adjust the PR target.
Given the comments made here about OpenAICompatible this might be functionality that has value in finding a way to integrate there.

jmartin-tech · 2026-05-28T12:46:50Z

    }


+class OpenAIResponsesGenerator(Generator):


This should probably extend OpenAICompatible or possibly be integrated into that class.

If kept separate, I would even go as far as suggesting this should be OpenAIReponsesCompatible as the intent described in the description suggests usage with any service that supports the Responses APi.

I originally picked a new Generator because I saw that OpenAICompatible had a bunch of logic specifics for the completion API. I have now inherited from OpenAICompatible and adapted __init__, _load_unsafe and _call_model

jmartin-tech · 2026-05-28T12:53:43Z

+        # response.output item types to collect into the final text.
+        # "message" captures standard assistant text; "reasoning" captures
+        # the model's reasoning summary (ResponseReasoningItem).
+        "output_types": ["message"],


This needs first class support, reasoning traces are additional data that garak needs to start gathering during the inference phase. As a first to support view I could see storing them as notes on the message but I don't think we should land them as an alternative value collected like this option suggests would be possible.

While is suspect the intent of making this a list is to gather both the option to not include messages becomes implied as valid, I am suggesting that should not be the case.

Removed output_type the rationale was to allow users to decide what should be included in the context for the detector, but it makes sense that we instead give first class support for the additional types like reasoning and tool calling.

jmartin-tech · 2026-05-28T13:02:33Z

+        "uri": None,
+        "instructions": None,
+        "tools": [],
+        "max_output_tokens": 1024,


This value should map to max_tokens. Since this generator is targeting the responses API this class should promote the garak standard generator option of max_tokens to be max_output_tokens during __init__ of this class instead of exposing max_output_tokens.

I understand this diverges from how OpenAIReasoningGenerator handled max_completion_tokens, however the evolution of how max_tokens is treated needs to standardize.

Fixed and aligned with max_tokens

jmartin-tech · 2026-05-28T13:05:37Z

+        super().__init__(self.name, config_root=config_root)
+
+    @staticmethod
+    def _build_input(prompt: Union[Conversation, str]):


No need to support string here, all prompts must be of type Conversation at this time.

Suggested change

def _build_input(prompt: Union[Conversation, str]):

def _build_input(prompt: Conversation):

jmartin-tech · 2026-05-29T14:18:57Z

+            if item_type == "message":
+                for part in item.content:
+                    if getattr(part, "type", None) == "output_text":
+                        text_parts.append(part.text)
+            elif item_type == "reasoning":
+                for summary in getattr(item, "summary", []):
+                    if getattr(summary, "type", None) == "summary_text":
+                        text_parts.append(summary.text)


Merging message text and reasoning text without may marker to delineate between them is not viable. Reasoning either needs to be stored separately or enclosed in something like the <think></think> tag from legacy APIs to ensure garak can properly exclude reasoning from the responses.

Reasoning summaries are now stored exclusively in notes["reasoning"] and never appear in Message.text

jmartin-tech · 2026-05-29T14:36:05Z

+        instructions = self.instructions or self._extract_system_prompt(prompt)
+        create_args = {
+            "model": self.name,
+            "input": self._build_input(prompt),
+            "max_output_tokens": self.max_output_tokens,
+        }


While I understand the format is different here, _extract_system_prompt and _build_input do not take into account additional input modalities supported by the Responses API that are already incorporated in OpenAICompatible._conversation_to_list(). This can be refactored to share that ETL process and enable this generator to support the additional input modalities.

Makes sense, I haven't thought about additional input modalities. _build_input now delegates to OpenAICompatible._conversation_to_list() for the actual turn conversion

jmartin-tech · 2026-05-29T14:39:13Z

+        self.key_env_var = self.ENV_VAR
+        self._load_unsafe()
+        super().__init__(self.name, config_root=config_root)


The Configurable class handles ENV_VAR support no need set here. Also _load_unsafe() should be called after all initialization and validation has completed.

Suggested change

self.key_env_var = self.ENV_VAR

self._load_unsafe()

super().__init__(self.name, config_root=config_root)

super().__init__(self.name, config_root=config_root)

self._load_unsafe()

Fixed. Since the class now inherits OpenAICompatible.__init__, both key_env_var assignment and _load_unsafe ordering are handled by the parent

Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

ABeltramo · 2026-06-02T08:11:44Z

Thanks for the review @jmartin-tech I've set the base branch to main and I should've addressed all comments. Let me know if I've missed anything!

…onses generators CI was during testing hanging because of this. Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

…generator Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

jmartin-tech requested changes May 29, 2026

View reviewed changes

jmartin-tech mentioned this pull request May 29, 2026

probe: pdf injection via hidden text in documents #1757

Open

feat: implement OpenAIResponsesGenerator

8ff80f0

Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

ABeltramo force-pushed the feature/openai-responses-generator branch from 169a5cc to 8ff80f0 Compare June 2, 2026 07:35

ABeltramo changed the base branch from feature/technique_intent to main June 2, 2026 07:36

feat: extend OpenAICompatible and address review comments

cd39f96

Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

ABeltramo added 2 commits June 2, 2026 12:03

fix: don't access client.responses in OpenAICompatible for non-Resp…

c100cf5

…onses generators CI was during testing hanging because of this. Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

fix: hang in CI when testing OpenAICompatible with the new responses …

4afa5bc

…generator Signed-off-by: ABeltramo <beltramo.ale@gmail.com>

	def _build_input(prompt: Union[Conversation, str]):
	def _build_input(prompt: Conversation):

Conversation

ABeltramo commented May 28, 2026

Verification

Uh oh!

jmartin-tech left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ABeltramo commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants