NVIDIA-NeMo
diff --git a/docs/code_reference/run_config.md
Lines changed: 15 additions & 0 deletions b/docs/code_reference/run_config.md
diff --git a/docs/concepts/columns.md
Lines changed: 3 additions & 3 deletions b/docs/concepts/columns.md
diff --git a/docs/concepts/traces.md
Lines changed: 131 additions & 0 deletions b/docs/concepts/traces.md
diff --git a/mkdocs.yml
Lines changed: 1 addition & 0 deletions b/mkdocs.yml
diff --git a/packages/data-designer-config/src/data_designer/config/column_configs.py
Lines changed: 13 additions & 7 deletions b/packages/data-designer-config/src/data_designer/config/column_configs.py
diff --git a/packages/data-designer-config/src/data_designer/config/run_config.py
Lines changed: 5 additions & 0 deletions b/packages/data-designer-config/src/data_designer/config/run_config.py
diff --git a/packages/data-designer-config/src/data_designer/config/utils/constants.py
Lines changed: 1 addition & 1 deletion b/packages/data-designer-config/src/data_designer/config/utils/constants.py
diff --git a/packages/data-designer-config/tests/config/test_columns.py
Lines changed: 1 addition & 1 deletion b/packages/data-designer-config/tests/config/test_columns.py
diff --git a/packages/data-designer-engine/src/data_designer/engine/column_generators/generators/llm_completion.py
Lines changed: 7 additions & 4 deletions b/packages/data-designer-engine/src/data_designer/engine/column_generators/generators/llm_completion.py
@@ -3,4 +3,19 @@
 The `run_config` module defines runtime settings that control dataset generation behavior,
 including early shutdown thresholds, batch sizing, and non-inference worker concurrency.
 
+## Usage
+
+```python
+import data_designer.config as dd
+from data_designer.interface import DataDesigner
+
+data_designer = DataDesigner()
+data_designer.set_run_config(dd.RunConfig(
+    buffer_size=500,
+    max_conversation_restarts=3,
+))
+```
+
+## API Reference
+
 ::: data_designer.config.run_config
@@ -38,8 +38,8 @@ LLM-Text columns generate natural language text: product descriptions, customer
 
 Use **Jinja2 templating** in prompts to reference other columns. Data Designer automatically manages dependencies and injects the referenced column values into the prompt.
 
-!!! note "Reasoning Traces"
-    Models that support extended thinking (chain-of-thought reasoning) can capture their reasoning process in a separate `{column_name}__reasoning_trace` column–useful for understanding *why* the model generated specific content. This column is automatically added to the dataset if the model and service provider parse and return reasoning content.
+!!! note "Generation Traces"
+    LLM columns can optionally capture a full message trace in a separate `{column_name}__trace` column. Enable traces per-column via `with_trace=True` on the column config, or globally for all columns via `RunConfig(debug_override_save_all_column_traces=True)`. The trace includes the ordered message history for the final generation attempt (system/user/assistant), and may include model reasoning fields when the provider exposes them.
 
 ### 💻 LLM-Code Columns
 
@@ -147,6 +147,6 @@ You read this property for introspection but never set it—always computed from
 
 ### `side_effect_columns`
 
-Computed property listing columns created implicitly alongside the primary column. Currently, only LLM columns produce side effects (reasoning trace columns like `{name}__reasoning_trace` when models use extended thinking).
+Computed property listing columns created implicitly alongside the primary column. Currently, only LLM columns produce side effects (trace columns like `{name}__trace` when `with_trace=True` is set on the column or `debug_override_save_all_column_traces` is enabled globally).
 
 For detailed information on each column type, refer to the [column configuration code reference](../code_reference/column_configs.md).
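The naming convention described in this hunk can be illustrated with a minimal sketch. It assumes only the `__trace` suffix introduced by this change; `trace_column_name` and `split_trace_columns` are hypothetical helpers written for illustration and are not part of the Data Designer API.

```python
# Suffix value taken from the constants.py change in this diff.
TRACE_COLUMN_POSTFIX = "__trace"


def trace_column_name(column_name: str) -> str:
    """Return the side-effect trace column name for an LLM column (hypothetical helper)."""
    return f"{column_name}{TRACE_COLUMN_POSTFIX}"


def split_trace_columns(record: dict) -> tuple[dict, dict]:
    """Split one record into (primary columns, trace columns) by suffix (hypothetical helper)."""
    primary = {k: v for k, v in record.items() if not k.endswith(TRACE_COLUMN_POSTFIX)}
    traces = {k: v for k, v in record.items() if k.endswith(TRACE_COLUMN_POSTFIX)}
    return primary, traces


# Example record shaped like a generated row with tracing enabled on "answer".
record = {
    "question": "What is the capital of France?",
    "answer": "The capital of France is Paris.",
    "answer__trace": [{"role": "user", "content": "What is the capital of France?"}],
}
primary, traces = split_trace_columns(record)
```

A suffix check like this is only a convenience for inspecting output; a user-defined column whose name happens to end in `__trace` would be misclassified.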
@@ -0,0 +1,131 @@
+# Message Traces
+
+Traces capture the full conversation history during LLM generation, including system prompts, user prompts, model reasoning, and the final response. This visibility is essential for understanding model behavior, debugging generation issues, and iterating on prompts.
+
+## Overview
+
+When generating content with LLM columns, you often need to understand what happened during generation:
+
+- What system prompt was used?
+- What did the rendered user prompt look like?
+- Did the model provide any reasoning content?
+- Did the model retry after failures?
+- How did the model arrive at the final answer?
+
+Traces provide this visibility by capturing the ordered message history for each generation, including any multi-turn conversations that occur during retry scenarios.
+
+## Enabling Traces
+
+### Per-Column (Recommended)
+
+Set `with_trace=True` on specific LLM columns:
+
+```python
+import data_designer.config as dd
+
+# `builder` is assumed to be an existing Data Designer config builder created elsewhere.
+builder.add_column(
+    dd.LLMTextColumnConfig(
+        name="answer",
+        prompt="Answer: {{ question }}",
+        model_alias="nvidia-text",
+        with_trace=True,  # Enable trace for this column
+    )
+)
+```
+
+### Global Debug Override
+
+Enable traces for ALL LLM columns (useful during development):
+
+```python
+import data_designer.config as dd
+from data_designer.interface import DataDesigner
+
+data_designer = DataDesigner()
+data_designer.set_run_config(
+    dd.RunConfig(debug_override_save_all_column_traces=True)
+)
+```
+
+## Trace Column Naming
+
+When enabled, LLM columns produce an additional side-effect column:
+
+- `{column_name}__trace`
+
+For example, if your column is named `"answer"`, the trace column will be `"answer__trace"`.
+
+## Trace Data Structure
+
+Each trace is a `list[dict]` where each dict represents a message in the conversation.
+
+### Message Fields by Role
+
+| Role | Fields | Description |
+|------|--------|-------------|
+| `system` | `role`, `content` | System prompt setting model behavior |
+| `user` | `role`, `content` | User prompt (rendered from template) |
+| `assistant` | `role`, `content`, `reasoning_content` | Model response; may include reasoning from extended thinking models |
+
+### Example Trace (Simple Generation)
+
+A basic trace without retries:
+
+```python
+[
+    # System message (if configured)
+    {
+        "role": "system",
+        "content": "You are a helpful assistant that provides clear, concise answers."
+    },
+    # User message (the rendered prompt)
+    {
+        "role": "user",
+        "content": "What is the capital of France?"
+    },
+    # Final assistant response
+    {
+        "role": "assistant",
+        "content": "The capital of France is Paris.",
+        "reasoning_content": None  # May contain reasoning if model supports it
+    }
+]
+```
+
+### Example Trace (With Correction Retry)
+
+When `max_conversation_correction_steps > 0` in `RunConfig` and parsing fails, traces capture the retry conversation:
+
+```python
+[
+    # System message
+    {
+        "role": "system",
+        "content": "Return only valid JSON."
+    },
+    # User message
+    {
+        "role": "user",
+        "content": "Generate a person object with name and age."
+    },
+    # First attempt (invalid)
+    {
+        "role": "assistant",
+        "content": "Here's a person: {name: 'John', age: 30}"  # Invalid JSON
+    },
+    # Error feedback
+    {
+        "role": "user",
+        "content": "JSONDecodeError: Expecting property name enclosed in double quotes"
+    },
+    # Corrected response
+    {
+        "role": "assistant",
+        "content": "{\"name\": \"John\", \"age\": 30}"
+    }
+]
+```
+
+## See Also
+
+- **[Run Config](../code_reference/run_config.md)**: Runtime options including `debug_override_save_all_column_traces`
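As a rough illustration of how the `list[dict]` trace structure documented above might be inspected, the sketch below summarizes a single trace. `summarize_trace` is a hypothetical helper, not part of Data Designer, and it assumes that every assistant message after the first one is a correction round.

```python
def summarize_trace(trace: list[dict]) -> dict:
    """Summarize one trace: final answer, assistant turns, and reasoning presence."""
    assistant_messages = [m for m in trace if m.get("role") == "assistant"]
    final = assistant_messages[-1] if assistant_messages else None
    return {
        "final_content": final.get("content") if final else None,
        "assistant_turns": len(assistant_messages),
        # Assumption: each assistant message after the first is a correction round.
        "correction_rounds": max(len(assistant_messages) - 1, 0),
        "has_reasoning": any(m.get("reasoning_content") for m in assistant_messages),
    }


# The correction-retry example from the documentation above.
trace = [
    {"role": "system", "content": "Return only valid JSON."},
    {"role": "user", "content": "Generate a person object with name and age."},
    {"role": "assistant", "content": "Here's a person: {name: 'John', age: 30}"},
    {"role": "user", "content": "JSONDecodeError: Expecting property name enclosed in double quotes"},
    {"role": "assistant", "content": "{\"name\": \"John\", \"age\": 30}"},
]
summary = summarize_trace(trace)
```

Using `.get()` for `reasoning_content` keeps the helper tolerant of assistant messages that omit the field, as the retry example does.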
@@ -20,6 +20,7 @@ nav:
       - Validators: concepts/validators.md
       - Processors: concepts/processors.md
       - Person Sampling: concepts/person_sampling.md
+      - Traces: concepts/traces.md
   - Tutorials:
       - Overview: notebooks/README.md
       - The Basics: notebooks/1-the-basics.ipynb
 
@@ -14,7 +14,7 @@
 from data_designer.config.models import ImageContext
 from data_designer.config.sampler_params import SamplerParamsT, SamplerType
 from data_designer.config.utils.code_lang import CodeLang
-from data_designer.config.utils.constants import REASONING_TRACE_COLUMN_POSTFIX
+from data_designer.config.utils.constants import TRACE_COLUMN_POSTFIX
 from data_designer.config.utils.misc import assert_valid_jinja2_template, extract_keywords_from_jinja2_template
 from data_designer.config.validator_params import ValidatorParamsT, ValidatorType
 
@@ -143,8 +143,8 @@ class LLMTextColumnConfig(SingleColumnConfig):
 
     LLM text columns generate free-form text content using language models via LiteLLM.
     Prompts support Jinja2 templating to reference values from other columns, enabling
-    context-aware generation. The generated text can optionally include reasoning traces
-    when models support extended thinking.
+    context-aware generation. The generated text can optionally include message traces
+    capturing the full conversation history.
 
     Attributes:
         prompt: Prompt template for text generation. Supports Jinja2 syntax to
@@ -159,13 +159,18 @@ class LLMTextColumnConfig(SingleColumnConfig):
             `LLMStructuredColumnConfig` for structured output, `LLMCodeColumnConfig` for code.
         multi_modal_context: Optional list of image contexts for multi-modal generation.
             Enables vision-capable models to generate text based on image inputs.
+        with_trace: If True, creates a `{column_name}__trace` column containing the full
+            ordered message history (system/user/assistant) for the generation.
+            Can be overridden globally via `RunConfig.debug_override_save_all_column_traces`.
+            Defaults to False.
         column_type: Discriminator field, always "llm-text" for this configuration type.
     """
 
     prompt: str
     model_alias: str
     system_prompt: str | None = None
     multi_modal_context: list[ImageContext] | None = None
+    with_trace: bool = False
     column_type: Literal["llm-text"] = "llm-text"
 
     @staticmethod
@@ -186,14 +191,15 @@ def required_columns(self) -> list[str]:
 
     @property
     def side_effect_columns(self) -> list[str]:
-        """Returns the reasoning trace column, which may be generated alongside the main column.
+        """Returns the trace column, which may be generated alongside the main column.
 
-        Reasoning traces are only returned if the served model parses and returns reasoning content.
+        Traces are generated when `with_trace=True` on the column config or
+        when `RunConfig.debug_override_save_all_column_traces=True` globally.
 
         Returns:
-            List containing the reasoning trace column name.
+            List containing the trace column name.
         """
-        return [f"{self.name}{REASONING_TRACE_COLUMN_POSTFIX}"]
+        return [f"{self.name}{TRACE_COLUMN_POSTFIX}"]
 
     @model_validator(mode="after")
     def assert_prompt_valid_jinja(self) -> Self:
 
@@ -33,6 +33,10 @@ class RunConfig(ConfigBase):
         max_conversation_correction_steps: Maximum number of correction rounds permitted within a
             single conversation when generation tasks call `ModelFacade.generate(...)`. Must be >= 0.
             Default is 0.
+        debug_override_save_all_column_traces: If True, overrides per-column `with_trace` settings
+            and includes `__trace` columns for ALL LLM generations, containing the full ordered
+            message history (system/user/assistant) for the final generation attempt.
+            Useful for debugging. Default is False.
     """
 
     disable_early_shutdown: bool = False
@@ -42,6 +46,7 @@ class RunConfig(ConfigBase):
     non_inference_max_parallel_workers: int = Field(default=4, ge=1)
     max_conversation_restarts: int = Field(default=5, ge=0)
     max_conversation_correction_steps: int = Field(default=0, ge=0)
+    debug_override_save_all_column_traces: bool = False
 
     @model_validator(mode="after")
     def normalize_shutdown_settings(self) -> Self:
 
@@ -166,7 +166,7 @@ class NordColor(Enum):
 MAX_TOP_P = 1.0
 MIN_TOP_P = 0.0
 MIN_MAX_TOKENS = 1
-REASONING_TRACE_COLUMN_POSTFIX = "__reasoning_trace"
+TRACE_COLUMN_POSTFIX = "__trace"
 
 AVAILABLE_LOCALES = [
     "ar_AA",
 
@@ -85,7 +85,7 @@ def test_llm_text_column_config():
     assert llm_text_column_config.system_prompt == stub_system_prompt
     assert llm_text_column_config.column_type == DataDesignerColumnType.LLM_TEXT
     assert set(llm_text_column_config.required_columns) == {"some_column", "some_other_column"}
-    assert llm_text_column_config.side_effect_columns == ["test_llm_text__reasoning_trace"]
+    assert llm_text_column_config.side_effect_columns == ["test_llm_text__trace"]
 
     # invalid prompt
     with pytest.raises(
 
@@ -12,7 +12,7 @@
     LLMStructuredColumnConfig,
     LLMTextColumnConfig,
 )
-from data_designer.config.utils.constants import REASONING_TRACE_COLUMN_POSTFIX
+from data_designer.config.utils.constants import TRACE_COLUMN_POSTFIX
 from data_designer.engine.column_generators.generators.base import ColumnGeneratorWithModel, GenerationStrategy
 from data_designer.engine.column_generators.utils.prompt_renderer import (
     PromptType,
@@ -66,7 +66,7 @@ def generate(self, data: dict) -> dict:
             for context in self.config.multi_modal_context:
                 multi_modal_context.extend(context.get_contexts(deserialized_record))
 
-        response, reasoning_trace = self.model.generate(
+        response, trace = self.model.generate(
             prompt=self.prompt_renderer.render(
                 record=deserialized_record,
                 prompt_template=self.config.prompt,
@@ -87,8 +87,11 @@ def generate(self, data: dict) -> dict:
         serialized_output = self.response_recipe.serialize_output(response)
         data[self.config.name] = self._process_serialized_output(serialized_output)
 
-        if reasoning_trace:
-            data[self.config.name + REASONING_TRACE_COLUMN_POSTFIX] = reasoning_trace
+        should_save_trace = (
+            self.config.with_trace or self.resource_provider.run_config.debug_override_save_all_column_traces
+        )
+        if should_save_trace:
+            data[self.config.name + TRACE_COLUMN_POSTFIX] = [message.to_dict() for message in trace]
 
         return data
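The trace-saving decision in the `generate` change above can be sketched in isolation. `ColumnSettings` and `RunSettings` are hypothetical stand-ins for `LLMTextColumnConfig` and `RunConfig`, and the trace is assumed to already be a list of dicts, whereas the real code calls `message.to_dict()` on each message.

```python
from dataclasses import dataclass


@dataclass
class ColumnSettings:
    """Hypothetical stand-in for the per-column config."""
    name: str
    with_trace: bool = False


@dataclass
class RunSettings:
    """Hypothetical stand-in for the global run config."""
    debug_override_save_all_column_traces: bool = False


def should_save_trace(column: ColumnSettings, run: RunSettings) -> bool:
    """A trace is saved if either the column flag or the global debug flag is set."""
    return column.with_trace or run.debug_override_save_all_column_traces


def attach_trace(data: dict, column: ColumnSettings, run: RunSettings, trace: list[dict]) -> dict:
    """Add the `{name}__trace` entry to the record only when tracing is enabled."""
    if should_save_trace(column, run):
        data[column.name + "__trace"] = list(trace)
    return data


row = attach_trace(
    {"answer": "Paris"},
    ColumnSettings("answer", with_trace=True),
    RunSettings(),
    [{"role": "user", "content": "What is the capital of France?"}],
)
```

Because the two flags are combined with `or`, the global flag can only turn tracing on; it cannot suppress a trace for a column that sets `with_trace=True`.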