microsoft
diff --git a/doc/code/targets/4_openai_video_target.ipynb b/doc/code/targets/4_openai_video_target.ipynb
Lines changed: 112 additions & 8 deletions
diff --git a/doc/code/targets/4_openai_video_target.py b/doc/code/targets/4_openai_video_target.py
Lines changed: 73 additions & 1 deletion
diff --git a/pyrit/models/message.py b/pyrit/models/message.py
Lines changed: 51 additions & 0 deletions
@@ -7,15 +7,28 @@
    "source": [
     "# 4. OpenAI Video Target\n",
     "\n",
-    "This example shows how to use the video target to create a video from a text prompt.\n",
+    "`OpenAIVideoTarget` supports three modes:\n",
+    "- **Text-to-video**: Generate a video from a text prompt.\n",
+    "- **Remix**: Create a variation of an existing video (using `video_id` from a prior generation).\n",
+    "- **Text+Image-to-video**: Use an image as the first frame of the generated video.\n",
     "\n",
     "Note that the video scorer requires `opencv`, which is not a default PyRIT dependency. You need to install it manually or using `pip install pyrit[opencv]`."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "1",
+   "metadata": {},
+   "source": [
+    "## Text-to-Video\n",
+    "\n",
+    "This example shows the simplest mode: generating video from text prompts, with scoring."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "1",
+   "id": "2",
    "metadata": {},
    "outputs": [
     {
@@ -53,18 +66,18 @@
   },
   {
    "cell_type": "markdown",
-   "id": "2",
+   "id": "3",
    "metadata": {},
    "source": [
     "## Generating and scoring a video:\n",
     "\n",
-    "Using the video target you can send prompts to generate a video. The video scorer can evaluate the video content itself. Note this section is simply scoring the **video** not the audio.  "
+    "Using the video target you can send prompts to generate a video. The video scorer can evaluate the video content itself. Note this section is simply scoring the **video** not the audio."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "3",
+   "id": "4",
    "metadata": {},
    "outputs": [
     {
@@ -448,7 +461,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "4",
+   "id": "5",
    "metadata": {},
    "source": [
     "## Scoring video and audio **together**:\n",
@@ -461,7 +474,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "5",
+   "id": "6",
    "metadata": {},
    "outputs": [
     {
@@ -661,11 +674,102 @@
     ")\n",
     "\n",
     "for result in results:\n",
-    "    await ConsoleAttackResultPrinter().print_result_async(result=result, include_auxiliary_scores=True)  # type: ignore"
+    "    await ConsoleAttackResultPrinter().print_result_async(result=result, include_auxiliary_scores=True)  # type: ignore\n",
+    "\n",
+    "# Capture video_id from the first result for use in the remix section below\n",
+    "video_id = results[0].last_response.prompt_metadata[\"video_id\"]\n",
+    "print(f\"Video ID for remix: {video_id}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7",
+   "metadata": {},
+   "source": [
+    "## Remix (Video Variation)\n",
+    "\n",
+    "Remix creates a variation of an existing video. After any successful generation, the response\n",
+    "includes a `video_id` in `prompt_metadata`. Pass this back via `prompt_metadata={\"video_id\": \"<id>\"}` to remix."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pyrit.models import Message, MessagePiece\n",
+    "\n",
+    "# Remix using the video_id captured from the text-to-video section above\n",
+    "remix_piece = MessagePiece(\n",
+    "    role=\"user\",\n",
+    "    original_value=\"Make it a watercolor painting style\",\n",
+    "    prompt_metadata={\"video_id\": video_id},\n",
+    ")\n",
+    "remix_result = await video_target.send_prompt_async(message=Message([remix_piece]))  # type: ignore\n",
+    "print(f\"Remixed video: {remix_result[0].message_pieces[0].converted_value}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9",
+   "metadata": {},
+   "source": [
+    "## Text+Image-to-Video\n",
+    "\n",
+    "Use an image as the first frame of the generated video. The input image dimensions must match\n",
+    "the video resolution (e.g. 1280x720). Pass both a text piece and an `image_path` piece in the same message."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "10",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import uuid\n",
+    "\n",
+    "# Resize a sample image to match the video resolution (1280x720)\n",
+    "from PIL import Image\n",
+    "\n",
+    "from pyrit.common.path import HOME_PATH\n",
+    "\n",
+    "sample_image = HOME_PATH / \"assets\" / \"pyrit_architecture.png\"\n",
+    "resized = Image.open(sample_image).resize((1280, 720)).convert(\"RGB\")\n",
+    "\n",
+    "import tempfile\n",
+    "\n",
+    "tmp = tempfile.NamedTemporaryFile(suffix=\".jpg\", delete=False)\n",
+    "resized.save(tmp, format=\"JPEG\")\n",
+    "tmp.close()\n",
+    "image_path = tmp.name\n",
+    "\n",
+    "# Send text + image to the video target\n",
+    "i2v_target = OpenAIVideoTarget()\n",
+    "conversation_id = str(uuid.uuid4())\n",
+    "\n",
+    "text_piece = MessagePiece(\n",
+    "    role=\"user\",\n",
+    "    original_value=\"Animate this image with gentle camera motion\",\n",
+    "    conversation_id=conversation_id,\n",
+    ")\n",
+    "image_piece = MessagePiece(\n",
+    "    role=\"user\",\n",
+    "    original_value=image_path,\n",
+    "    converted_value_data_type=\"image_path\",\n",
+    "    conversation_id=conversation_id,\n",
+    ")\n",
+    "result = await i2v_target.send_prompt_async(message=Message([text_piece, image_piece]))  # type: ignore\n",
+    "print(f\"Text+Image-to-video result: {result[0].message_pieces[0].converted_value}\")"
    ]
   }
  ],
  "metadata": {
+  "jupytext": {
+   "main_language": "python"
+  },
   "language_info": {
    "codemirror_mode": {
     "name": "ipython",
 
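Review note: the notebook reads `video_id` with a direct key lookup. Assuming `prompt_metadata` is a plain dict, that lookup raises `KeyError` when the key is missing, for example if the generation did not succeed. A minimal sketch of a more defensive lookup follows. `extract_video_id` is a hypothetical helper, not part of PyRIT, and it assumes only that the metadata is a mapping or `None`.

```python
from typing import Any, Mapping, Optional


def extract_video_id(prompt_metadata: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return the video_id as a string, or None when it is absent or empty.

    Hypothetical helper for illustration; not part of PyRIT.
    """
    if not prompt_metadata:
        # Covers both None and an empty mapping.
        return None
    video_id = prompt_metadata.get("video_id")
    if video_id is None or video_id == "":
        return None
    return str(video_id)


found = extract_video_id({"video_id": "video_123"})
absent = extract_video_id({"other_key": 1})
no_metadata = extract_video_id(None)
```

In the notebook this could replace the direct lookup, for example `video_id = extract_video_id(results[0].last_response.prompt_metadata)`, followed by a check for `None` before the remix cell runs.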
@@ -11,10 +11,18 @@
 # %% [markdown]
 # # 4. OpenAI Video Target
 #
-# This example shows how to use the video target to create a video from a text prompt.
+# `OpenAIVideoTarget` supports three modes:
+# - **Text-to-video**: Generate a video from a text prompt.
+# - **Remix**: Create a variation of an existing video (using `video_id` from a prior generation).
+# - **Text+Image-to-video**: Use an image as the first frame of the generated video.
 #
 # Note that the video scorer requires `opencv`, which is not a default PyRIT dependency. You need to install it manually or using `pip install pyrit[opencv]`.
 
+# %% [markdown]
+# ## Text-to-Video
+#
+# This example shows the simplest mode: generating video from text prompts, with scoring.
+
 # %%
 from pyrit.executor.attack import (
     AttackExecutor,
@@ -123,3 +131,67 @@
 
 for result in results:
     await ConsoleAttackResultPrinter().print_result_async(result=result, include_auxiliary_scores=True)  # type: ignore
+
+# Capture video_id from the first result for use in the remix section below
+video_id = results[0].last_response.prompt_metadata["video_id"]
+print(f"Video ID for remix: {video_id}")
+
+# %% [markdown]
+# ## Remix (Video Variation)
+#
+# Remix creates a variation of an existing video. After any successful generation, the response
+# includes a `video_id` in `prompt_metadata`. Pass this back via `prompt_metadata={"video_id": "<id>"}` to remix.
+
+# %%
+from pyrit.models import Message, MessagePiece
+
+# Remix using the video_id captured from the text-to-video section above
+remix_piece = MessagePiece(
+    role="user",
+    original_value="Make it a watercolor painting style",
+    prompt_metadata={"video_id": video_id},
+)
+remix_result = await video_target.send_prompt_async(message=Message([remix_piece]))  # type: ignore
+print(f"Remixed video: {remix_result[0].message_pieces[0].converted_value}")
+
+# %% [markdown]
+# ## Text+Image-to-Video
+#
+# Use an image as the first frame of the generated video. The input image dimensions must match
+# the video resolution (e.g. 1280x720). Pass both a text piece and an `image_path` piece in the same message.
+
+# %%
+import uuid
+
+# Resize a sample image to match the video resolution (1280x720)
+from PIL import Image
+
+from pyrit.common.path import HOME_PATH
+
+sample_image = HOME_PATH / "assets" / "pyrit_architecture.png"
+resized = Image.open(sample_image).resize((1280, 720)).convert("RGB")
+
+import tempfile
+
+tmp = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
+resized.save(tmp, format="JPEG")
+tmp.close()
+image_path = tmp.name
+
+# Send text + image to the video target
+i2v_target = OpenAIVideoTarget()
+conversation_id = str(uuid.uuid4())
+
+text_piece = MessagePiece(
+    role="user",
+    original_value="Animate this image with gentle camera motion",
+    conversation_id=conversation_id,
+)
+image_piece = MessagePiece(
+    role="user",
+    original_value=image_path,
+    converted_value_data_type="image_path",
+    conversation_id=conversation_id,
+)
+result = await i2v_target.send_prompt_async(message=Message([text_piece, image_piece]))  # type: ignore
+print(f"Text+Image-to-video result: {result[0].message_pieces[0].converted_value}")
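Review note: the example calls `.resize((1280, 720))` directly, which stretches any source image that is not already 16:9. If distortion matters, one option is to center-crop to the target aspect ratio first. The sketch below computes such a crop box in pure Python. `center_crop_box` is a hypothetical helper introduced for illustration, and the 1280x720 default simply mirrors the resolution used in the example.

```python
def center_crop_box(
    width: int, height: int, target_w: int = 1280, target_h: int = 720
) -> tuple[int, int, int, int]:
    """Return a (left, upper, right, lower) box with the target aspect ratio, centered in the source.

    Hypothetical helper for illustration; not part of PyRIT or Pillow.
    """
    if width <= 0 or height <= 0 or target_w <= 0 or target_h <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to avoid float rounding.
    if width * target_h > height * target_w:
        # Source is wider than the target: keep the full height, trim the sides.
        crop_w = max(1, height * target_w // target_h)
        crop_h = height
    else:
        # Source is taller than (or equal to) the target: keep the full width, trim top and bottom.
        crop_w = width
        crop_h = max(1, width * target_h // target_w)
    left = (width - crop_w) // 2
    upper = (height - crop_h) // 2
    return (left, upper, left + crop_w, upper + crop_h)


already_16_9 = center_crop_box(1280, 720)
square_source = center_crop_box(1000, 1000)
wide_source = center_crop_box(4000, 1000)
```

With Pillow, the box can be applied before resizing, for example `img.crop(center_crop_box(*img.size)).resize((1280, 720))`, since `Image.size` is `(width, height)` and `Image.crop` takes a `(left, upper, right, lower)` tuple.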
@@ -51,6 +51,57 @@ def get_piece(self, n: int = 0) -> MessagePiece:
 
         return self.message_pieces[n]
 
+    def get_pieces_by_type(
+        self,
+        *,
+        data_type: Optional[PromptDataType] = None,
+        original_value_data_type: Optional[PromptDataType] = None,
+        converted_value_data_type: Optional[PromptDataType] = None,
+    ) -> list[MessagePiece]:
+        """
+        Return all message pieces that match every filter provided (all pieces if no filter is given).
+
+        Args:
+            data_type: Alias for converted_value_data_type; ignored if converted_value_data_type is also given.
+            original_value_data_type: The original_value_data_type to filter by.
+            converted_value_data_type: The converted_value_data_type to filter by.
+
+        Returns:
+            A list of matching MessagePiece objects (may be empty).
+        """
+        effective_converted = converted_value_data_type or data_type
+        results = self.message_pieces
+        if effective_converted:
+            results = [p for p in results if p.converted_value_data_type == effective_converted]
+        if original_value_data_type:
+            results = [p for p in results if p.original_value_data_type == original_value_data_type]
+        return list(results)
+
+    def get_piece_by_type(
+        self,
+        *,
+        data_type: Optional[PromptDataType] = None,
+        original_value_data_type: Optional[PromptDataType] = None,
+        converted_value_data_type: Optional[PromptDataType] = None,
+    ) -> Optional[MessagePiece]:
+        """
+        Return the first message piece matching the given data type, or None.
+
+        Args:
+            data_type: Alias for converted_value_data_type; ignored if converted_value_data_type is also given.
+            original_value_data_type: The original_value_data_type to filter by.
+            converted_value_data_type: The converted_value_data_type to filter by.
+
+        Returns:
+            The first matching MessagePiece, or None if no match is found.
+        """
+        pieces = self.get_pieces_by_type(
+            data_type=data_type,
+            original_value_data_type=original_value_data_type,
+            converted_value_data_type=converted_value_data_type,
+        )
+        return pieces[0] if pieces else None
+
     @property
     def api_role(self) -> ChatMessageRole:
         """
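Review note: the filtering semantics of `get_pieces_by_type` are easy to misread, so here is a small self-contained sketch of them. `PieceSketch` and `MessageSketch` are hypothetical stand-ins (not the real `MessagePiece` and `Message` classes) that copy only the filtering logic shown in the diff, with plain strings in place of `PromptDataType`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PieceSketch:
    """Stand-in for MessagePiece with only the two fields the filters read."""

    original_value_data_type: str
    converted_value_data_type: str


@dataclass
class MessageSketch:
    """Stand-in for Message that mirrors the filtering logic in the diff."""

    message_pieces: list[PieceSketch] = field(default_factory=list)

    def get_pieces_by_type(
        self,
        *,
        data_type: Optional[str] = None,
        original_value_data_type: Optional[str] = None,
        converted_value_data_type: Optional[str] = None,
    ) -> list[PieceSketch]:
        # The explicit converted filter wins over the data_type alias.
        effective_converted = converted_value_data_type or data_type
        results = self.message_pieces
        if effective_converted:
            results = [p for p in results if p.converted_value_data_type == effective_converted]
        if original_value_data_type:
            results = [p for p in results if p.original_value_data_type == original_value_data_type]
        # Copy so callers cannot mutate the message's own list.
        return list(results)

    def get_piece_by_type(self, **filters: Optional[str]) -> Optional[PieceSketch]:
        pieces = self.get_pieces_by_type(**filters)
        return pieces[0] if pieces else None


text = PieceSketch("text", "text")
image = PieceSketch("image_path", "image_path")
rendered = PieceSketch("text", "image_path")  # e.g. text turned into an image by a converter
message = MessageSketch([text, image, rendered])

by_alias = message.get_pieces_by_type(data_type="image_path")
both_filters = message.get_pieces_by_type(
    original_value_data_type="text", converted_value_data_type="image_path"
)
no_filters = message.get_pieces_by_type()
alias_overridden = message.get_pieces_by_type(data_type="text", converted_value_data_type="image_path")
missing = message.get_piece_by_type(data_type="audio_path")
```

Three behaviors stand out: filters combine with AND, calling with no filters returns every piece, and `data_type` is silently ignored when `converted_value_data_type` is also passed.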