Commit 2cfe1fa

fix!: switch DALLEImageGenerator to gpt-image-2 (#11321)
1 parent df15d35 commit 2cfe1fa

6 files changed

Lines changed: 122 additions & 82 deletions

.github/workflows/slow.yml

Lines changed: 13 additions & 11 deletions
@@ -36,9 +36,9 @@ on:

 jobs:
 check-if-changed:
-# This job checks if the relevant files have been changed.
-# We check for changes in the check-if-changed job instead of using paths/paths-ignore at workflow level.
-# This ensures the "Slow Integration Tests completed" job always runs, which is required by Branch Protection rules.
+# This job checks if the relevant files have been changed.
+# We check for changes in the check-if-changed job instead of using paths/paths-ignore at workflow level.
+# This ensures the "Slow Integration Tests completed" job always runs, which is required by Branch Protection rules.
 name: Check if changed
 runs-on: ubuntu-slim
 permissions:
@@ -71,6 +71,7 @@ jobs:
 - "haystack/components/generators/chat/hugging_face_local.py"
 - "haystack/components/generators/hugging_face_api.py"
 - "haystack/components/generators/hugging_face_local_generator.py"
+- "haystack/components/generators/openai_dalle.py"
 - "haystack/components/preprocessors/embedding_based_document_splitter.py"
 - "haystack/components/rankers/sentence_transformers_diversity.py"
 - "haystack/components/rankers/sentence_transformers_similarity.py"
@@ -94,6 +95,7 @@ jobs:
 - "test/components/generators/chat/test_hugging_face_local.py"
 - "test/components/generators/test_hugging_face_api.py"
 - "test/components/generators/test_hugging_face_local_generator.py"
+- "test/components/generators/test_openai_dalle.py"
 - "test/components/preprocessors/test_embedding_based_document_splitter.py"
 - "test/components/rankers/test_sentence_transformers_diversity.py"
 - "test/components/rankers/test_sentence_transformers_similarity.py"
@@ -168,11 +170,11 @@ jobs:
 needs: slow-integration-tests

 steps:
-- name: Mark tests as completed
-run: |
-if [ "${{ needs.slow-integration-tests.result }}" = "failure" ]; then
-echo "Slow Integration Tests failed!"
-exit 1
-else
-echo "Slow Integration Tests completed!"
-fi
+- name: Mark tests as completed
+run: |
+if [ "${{ needs.slow-integration-tests.result }}" = "failure" ]; then
+echo "Slow Integration Tests failed!"
+exit 1
+else
+echo "Slow Integration Tests completed!"
+fi

docs-website/docs/pipeline-components/generators.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ Generators are responsible for generating text after you give them a prompt. The
 | [CohereChatGenerator](generators/coherechatgenerator.mdx) | Enables chat completion using Cohere's LLMs. ||
 | [CohereGenerator](generators/coheregenerator.mdx) | Queries the LLM using Cohere API. ||
 | [CometAPIChatGenerator](generators/cometapichatgenerator.mdx) | Enables chat completion using AI models through the Comet API. ||
-| [DALLEImageGenerator](generators/dalleimagegenerator.mdx) | Generate images using OpenAI's DALL-E model. ||
+| [DALLEImageGenerator](generators/dalleimagegenerator.mdx) | Generate images using OpenAI's image generation models such as `gpt-image-2`. ||
 | [FallbackChatGenerator](generators/fallbackchatgenerator.mdx) | A ChatGenerator wrapper that tries multiple Chat Generators sequentially until one succeeds. ||
 | [GoogleAIGeminiChatGenerator](generators/googleaigeminichatgenerator.mdx) | Enables chat completion using Google Gemini models. **_This integration will be deprecated soon. We recommend using [GoogleGenAIChatGenerator](generators/googlegenaichatgenerator.mdx) integration instead._** ||
 | [GoogleAIGeminiGenerator](generators/googleaigeminigenerator.mdx) | Enables text generation using Google Gemini models. **_This integration will be deprecated soon. We recommend using [GoogleGenAIChatGenerator](generators/googlegenaichatgenerator.mdx) integration instead._** ||

docs-website/docs/pipeline-components/generators/dalleimagegenerator.mdx

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -2,12 +2,12 @@
 title: "DALLEImageGenerator"
 id: dalleimagegenerator
 slug: "/dalleimagegenerator"
-description: "Generate images using OpenAI's DALL-E model."
+description: "Generate images using OpenAI's image generation models such as `gpt-image-2`."
 ---

 # DALLEImageGenerator

-Generate images using OpenAI's DALL-E model.
+Generate images using OpenAI's image generation models such as `gpt-image-2`.

 <div className="key-value-table">

@@ -25,17 +25,17 @@ Generate images using OpenAI's DALL-E model.

 ## Overview

-The `DALLEImageGenerator` component generates images using OpenAI's DALL-E model.
+The `DALLEImageGenerator` component generates images using OpenAI's image generation models (such as `gpt-image-2`).

-By default, the component uses `dall-e-3` model, standard picture quality, and 1024x1024 resolution. You can change these parameters using `model` (during component initialization), `quality`, and `size` (during component initialization or run) parameters.
+By default, the component uses the `gpt-image-2` model, `"auto"` quality, and 1024x1024 resolution. You can change these parameters using `model` (during component initialization), `quality`, and `size` (during component initialization or run) parameters.

 `DALLEImageGenerator` needs an OpenAI key to work. It uses an `OPENAI_API_KEY` environment variable by default. Otherwise, you can pass an API key at initialization with `api_key`:

 ```
 image_generator = DALLEImageGenerator(api_key=Secret.from_token("<your-api-key>"))
 ```

-Check our [API reference](/reference/generators-api#dalleimagegenerator) for the detailed component parameters description, or the [OpenAI documentation](https://platform.openai.com/docs/api-reference/images/create) for the details on OpenAI API parameters.
+Check our [API reference](/reference/generators-api#dalleimagegenerator) for the detailed component parameters description, or the [OpenAI documentation](https://developers.openai.com/api/reference/resources/images/methods/generate) for the details on OpenAI API parameters.

 ## Usage

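Because this commit makes the component return base64-encoded image data instead of a URL, callers will usually need to decode the string before viewing or storing the image. A minimal sketch of that step, using only the standard library. The helper name `save_base64_image`, the output path, and the stand-in payload are hypothetical, and treating the result as a PNG file is an assumption that this commit does not state:

```python
import base64
from pathlib import Path


def save_base64_image(image_b64: str, path: str) -> int:
    """Decode a base64 string and write the raw bytes to `path`.

    Returns the number of bytes written. Raises ValueError on an empty
    string, which is what the component returns when no image data came back.
    """
    if not image_b64:
        raise ValueError("No image data to decode.")
    image_bytes = base64.b64decode(image_b64)
    Path(path).write_bytes(image_bytes)
    return len(image_bytes)


# Stand-in payload (just the 8-byte PNG signature), used here instead of a
# real API response. In a pipeline this would be
# results["image_generator"]["images"][0].
sample_b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
```

Calling `save_base64_image(sample_b64, "image.png")` writes the decoded bytes to disk; with a real response, the resulting file should open in any image viewer if the assumed format is right.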
@@ -93,6 +93,6 @@ results = pipeline.run(
 generated_images = results["image_generator"]["images"]
 revised_prompt = results["image_generator"]["revised_prompt"]

-print(f"Generated image URL: {generated_images[0]}")
+print(f"Generated image (base64-encoded): {generated_images[0]}")
 print(f"Revised prompt: {revised_prompt}")
 ```

haystack/components/generators/openai_dalle.py

Lines changed: 36 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -8,18 +8,20 @@
 from openai import OpenAI
 from openai.types.image import Image

-from haystack import component, default_from_dict, default_to_dict
+from haystack import component, default_from_dict, default_to_dict, logging
 from haystack.utils import Secret
 from haystack.utils.http_client import init_http_client

+logger = logging.getLogger(__name__)
+

 @component
 class DALLEImageGenerator:
     """
-    Generates images using OpenAI's DALL-E model.
+    Generates images using OpenAI's image generation models such as `gpt-image-2`.

     For details on OpenAI API parameters, see
-    [OpenAI documentation](https://platform.openai.com/docs/api-reference/images/create).
+    [OpenAI documentation](https://developers.openai.com/api/reference/resources/images/methods/generate).

     ### Usage example
     ```python
@@ -32,10 +34,10 @@ class DALLEImageGenerator:

     def __init__(
         self,
-        model: str = "dall-e-3",
-        quality: Literal["standard", "hd"] = "standard",
-        size: Literal["256x256", "512x512", "1024x1024", "1792x1024", "1024x1792"] = "1024x1024",
-        response_format: Literal["url", "b64_json"] = "url",
+        model: str = "gpt-image-2",
+        quality: Literal["auto", "high", "medium", "low"] = "auto",
+        size: Literal["1024x1024", "1024x1536", "1536x1024", "auto"] = "1024x1024",
+        response_format: Literal["b64_json"] = "b64_json",
         api_key: Secret = Secret.from_env_var("OPENAI_API_KEY"),
         api_base_url: str | None = None,
         organization: str | None = None,
@@ -44,14 +46,15 @@ def __init__(
         http_client_kwargs: dict[str, Any] | None = None,
     ) -> None:
         """
-        Creates an instance of DALLEImageGenerator. Unless specified otherwise in `model`, uses OpenAI's dall-e-3.
-
-        :param model: The model to use for image generation. Can be "dall-e-2" or "dall-e-3".
-        :param quality: The quality of the generated image. Can be "standard" or "hd".
-        :param size: The size of the generated images.
-            Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2.
-            Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models.
-        :param response_format: The format of the response. Can be "url" or "b64_json".
+        Creates an instance of DALLEImageGenerator. Unless specified otherwise in `model`, uses OpenAI's gpt-image-2.
+
+        :param model: The model to use for image generation. Model names can be found in the
+            [OpenAI documentation](https://developers.openai.com/api/docs/models/all).
+        :param quality: The quality of the generated image. Can be "auto", "high", "medium", or "low".
+        :param size: The size of the generated images. One of 1024x1024, 1024x1536, 1536x1024, or "auto".
+            `gpt-image-2` also supports arbitrary sizes. You can find more information about supported sizes in
+            the [OpenAI documentation](https://developers.openai.com/api/reference/resources/images/methods/generate).
+        :param response_format: This parameter is ignored and only kept for backward compatibility.
         :param api_key: The OpenAI API key to connect to OpenAI.
         :param api_base_url: An optional base URL.
         :param organization: The Organization ID, defaults to `None`.
@@ -66,9 +69,13 @@ def __init__(
             For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#client).
         """
         self.model = model
+        if quality not in ["auto", "high", "medium", "low"]:
+            logger.warning("Invalid quality: {quality}. Defaulting to 'auto'.", quality=quality)
+            quality = "auto"
         self.quality = quality
         self.size = size
-        self.response_format = response_format
+        if response_format != "b64_json":
+            logger.warning("response_format is ignored. A base64-encoded image will be returned.")
         self.api_key = api_key
         self.api_base_url = api_base_url
         self.organization = organization
@@ -97,40 +104,39 @@ def warm_up(self) -> None:
     def run(
         self,
         prompt: str,
-        size: Literal["256x256", "512x512", "1024x1024", "1792x1024", "1024x1792"] | None = None,
-        quality: Literal["standard", "hd"] | None = None,
-        response_format: Literal["url", "b64_json"] | None = None,
+        size: Literal["1024x1024", "1024x1536", "1536x1024", "auto"] | None = None,
+        quality: Literal["auto", "high", "medium", "low"] | None = None,
+        response_format: Literal["b64_json"] | None = None,  # noqa: ARG002
     ) -> dict[str, Any]:
         """
         Invokes the image generation inference based on the provided prompt and generation parameters.

         :param prompt: The prompt to generate the image.
         :param size: If provided, overrides the size provided during initialization.
         :param quality: If provided, overrides the quality provided during initialization.
-        :param response_format: If provided, overrides the response format provided during initialization.
+        :param response_format: This parameter is ignored and only kept for backward compatibility.

         :returns:
-            A dictionary containing the generated list of images and the revised prompt.
-            Depending on the `response_format` parameter, the list of images can be URLs or base64 encoded JSON strings.
+            A dictionary containing the generated list of images as base64 encoded JSON strings and the revised prompt.
             The revised prompt is the prompt that was used to generate the image, if there was any revision
             to the prompt made by OpenAI.
         """
         if self.client is None:
             self.warm_up()

+        # at this point the client is initialized, but mypy doesn't know that
+        assert self.client is not None
+
         size = size or self.size
         quality = quality or self.quality
-        response_format = response_format or self.response_format
-        response = self.client.images.generate(  # type: ignore[union-attr]
-            model=self.model, prompt=prompt, size=size, quality=quality, response_format=response_format, n=1
-        )
+        response = self.client.images.generate(model=self.model, prompt=prompt, size=size, quality=quality, n=1)
+        image_str = ""
+        revised_prompt = ""
         if response.data is not None:
             image: Image = response.data[0]
-            image_str = image.url or image.b64_json or ""
+            image_str = image.b64_json or ""
             revised_prompt = image.revised_prompt or ""
-        else:
-            image_str = ""
-            revised_prompt = ""
+
         return {"images": [image_str], "revised_prompt": revised_prompt}

     def to_dict(self) -> dict[str, Any]:
@@ -145,7 +151,6 @@ def to_dict(self) -> dict[str, Any]:
             model=self.model,
             quality=self.quality,
             size=self.size,
-            response_format=self.response_format,
             api_key=self.api_key,
             api_base_url=self.api_base_url,
             organization=self.organization,
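The reworked `run()` body initializes both outputs to empty strings and overwrites them only when the response carries data, which removes the old `else` branch. A rough, self-contained sketch of that pattern follows. `FakeImage`, `FakeResponse`, and `extract_result` are hypothetical stand-ins for the OpenAI response types and are not part of Haystack or the OpenAI SDK:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeImage:
    """Hypothetical stand-in for the image object in an API response."""

    b64_json: str | None = None
    revised_prompt: str | None = None


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the images response object."""

    data: list[FakeImage] | None = None


def extract_result(response: FakeResponse) -> dict[str, Any]:
    # Start from empty defaults so a response without data needs no else branch.
    image_str = ""
    revised_prompt = ""
    if response.data is not None:
        image = response.data[0]
        # Missing fields fall back to empty strings, as in the diff above.
        image_str = image.b64_json or ""
        revised_prompt = image.revised_prompt or ""
    return {"images": [image_str], "revised_prompt": revised_prompt}
```

As in the diff, the sketch checks only for `None`, so an empty `data` list would still raise an `IndexError`.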
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+---
+upgrade:
+  - |
+    ``DALLEImageGenerator`` has been updated to account for OpenAI's retirement of the DALL-E models.
+    The default model is now ``gpt-image-2`` (previously ``dall-e-3``). To migrate:
+    - Update ``model`` value: besides ``gpt-image-2``, ``gpt-image-1`` and ``gpt-image-1-mini`` are also supported.
+    - Update ``quality`` value: the new accepted values are ``auto``, ``high``, ``medium``, or ``low``
+      (previously ``standard`` or ``hd``).
+    - Update ``size`` value: the new accepted values are ``1024x1024``, ``1024x1536``, ``1536x1024``,
+      or ``auto``. ``gpt-image-2`` also supports arbitrary sizes.
+    - The ``response_format`` parameter is now ignored. The component always returns base64-encoded JSON.
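The migration steps above could be applied to existing keyword arguments with a small helper along these lines. This is only a sketch: `migrate_kwargs` is a hypothetical name, not part of Haystack. Falling back to ``auto`` for an unrecognized ``quality`` mirrors what the updated constructor does, while replacing any ``dall-e`` model with ``gpt-image-2`` and any unlisted ``size`` with ``auto`` are assumptions made here for illustration (the note says ``gpt-image-2`` also accepts arbitrary sizes).

```python
from typing import Any

# Accepted values taken from the upgrade note above.
NEW_QUALITIES = {"auto", "high", "medium", "low"}
NEW_SIZES = {"1024x1024", "1024x1536", "1536x1024", "auto"}


def migrate_kwargs(old_kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of legacy init kwargs adjusted to the new accepted values."""
    new_kwargs = dict(old_kwargs)
    # response_format is ignored by the updated component, so drop it.
    new_kwargs.pop("response_format", None)
    # Assumption: any DALL-E model name is swapped for the new default model.
    if str(new_kwargs.get("model", "")).startswith("dall-e"):
        new_kwargs["model"] = "gpt-image-2"
    # Mirrors the constructor: unknown quality values fall back to "auto".
    if "quality" in new_kwargs and new_kwargs["quality"] not in NEW_QUALITIES:
        new_kwargs["quality"] = "auto"
    # Assumption: legacy sizes outside the listed set are replaced with "auto".
    if "size" in new_kwargs and new_kwargs["size"] not in NEW_SIZES:
        new_kwargs["size"] = "auto"
    return new_kwargs
```

Arguments that already use the new values pass through unchanged, so the helper is safe to run on configurations that have already been migrated.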
