| 1 | +# Message Traces |
| 2 | + |
| 3 | +Traces capture the full conversation history during LLM generation, including system prompts, user prompts, model reasoning, and the final response. This visibility is essential for understanding model behavior, debugging generation issues, and iterating on prompts. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +When generating content with LLM columns, you often need to understand what happened during generation: |
| 8 | + |
| 9 | +- What system prompt was used? |
| 10 | +- What did the rendered user prompt look like? |
| 11 | +- Did the model provide any reasoning content? |
| 12 | +- Did the model retry after failures? |
| 13 | +- How did the model arrive at the final answer? |
| 14 | + |
| 15 | +Traces provide this visibility by capturing the ordered message history for each generation, including any multi-turn conversations that occur during retry scenarios. |
| 16 | + |
| 17 | +## Enabling Traces |
| 18 | + |
| 19 | +### Per-Column (Recommended) |
| 20 | + |
| 21 | +Set `with_trace=True` on the specific LLM columns you want to trace: |
| 22 | + |
| 23 | +```python |
| 24 | +import data_designer.config as dd |
| 25 | + |
| 26 | +builder.add_column( |
| 27 | +    dd.LLMTextColumnConfig( |
| 28 | +        name="answer", |
| 29 | +        prompt="Answer: {{ question }}", |
| 30 | +        model_alias="nvidia-text", |
| 31 | +        with_trace=True,  # Enable trace for this column |
| 32 | +    ) |
| 33 | +) |
| 34 | +``` |
| 35 | + |
| 36 | +### Global Debug Override |
| 37 | + |
| 38 | +Enable traces for ALL LLM columns (useful during development): |
| 39 | + |
| 40 | +```python |
| 41 | +import data_designer.config as dd |
| 42 | +from data_designer.interface import DataDesigner |
| 43 | + |
| 44 | +data_designer = DataDesigner() |
| 45 | +data_designer.set_run_config( |
| 46 | +    dd.RunConfig(debug_override_save_all_column_traces=True) |
| 47 | +) |
| 48 | +``` |
| 49 | + |
| 50 | +## Trace Column Naming |
| 51 | + |
| 52 | +When tracing is enabled, each LLM column produces an additional side-effect column: |
| 53 | + |
| 54 | +- `{column_name}__trace` |
| 55 | + |
| 56 | +For example, if your column is named `"answer"`, the trace column will be `"answer__trace"`. |
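
The naming rule above can be sketched in a few lines of Python. The helper names `trace_column_name` and `split_columns` are hypothetical and are not part of Data Designer; they only illustrate how the `__trace` suffix lets you separate trace columns from regular ones in a list of column names.

```python
TRACE_SUFFIX = "__trace"  # suffix described in the section above


def trace_column_name(column_name: str) -> str:
    """Return the side-effect trace column name for an LLM column (hypothetical helper)."""
    return f"{column_name}{TRACE_SUFFIX}"


def split_columns(columns: list[str]) -> tuple[list[str], list[str]]:
    """Split column names into (regular, trace) lists based on the suffix."""
    regular = [c for c in columns if not c.endswith(TRACE_SUFFIX)]
    traces = [c for c in columns if c.endswith(TRACE_SUFFIX)]
    return regular, traces


regular, traces = split_columns(["question", "answer", trace_column_name("answer")])
print(regular, traces)
```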
| 57 | + |
| 58 | +## Trace Data Structure |
| 59 | + |
| 60 | +Each trace is a `list[dict]` where each dict represents a message in the conversation. |
| 61 | + |
| 62 | +### Message Fields by Role |
| 63 | + |
| 64 | +| Role | Fields | Description | |
| 65 | +|------|--------|-------------| |
| 66 | +| `system` | `role`, `content` | System prompt setting model behavior | |
| 67 | +| `user` | `role`, `content` | User prompt (rendered from template) | |
| 68 | +| `assistant` | `role`, `content`, `reasoning_content` | Model response; may include reasoning from extended thinking models | |
| 69 | + |
| 70 | +### Example Trace (Simple Generation) |
| 71 | + |
| 72 | +A basic trace without retries: |
| 73 | + |
| 74 | +```python |
| 75 | +[ |
| 76 | +    # System message (if configured) |
| 77 | +    { |
| 78 | +        "role": "system", |
| 79 | +        "content": "You are a helpful assistant that provides clear, concise answers." |
| 80 | +    }, |
| 81 | +    # User message (the rendered prompt) |
| 82 | +    { |
| 83 | +        "role": "user", |
| 84 | +        "content": "What is the capital of France?" |
| 85 | +    }, |
| 86 | +    # Final assistant response |
| 87 | +    { |
| 88 | +        "role": "assistant", |
| 89 | +        "content": "The capital of France is Paris.", |
| 90 | +        "reasoning_content": None  # May contain reasoning if model supports it |
| 91 | +    } |
| 92 | +] |
| 93 | +``` |
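
Because a trace is a plain `list[dict]`, it can be inspected with ordinary Python. The following is a minimal sketch, assuming only the message fields listed in the table above; `final_response` and `summarize` are hypothetical helpers written for illustration, not part of the library.

```python
from typing import Any, Optional

Trace = list[dict[str, Any]]


def final_response(trace: Trace) -> Optional[str]:
    """Return the content of the last assistant message, or None if there is none."""
    for message in reversed(trace):
        if message.get("role") == "assistant":
            return message.get("content")
    return None


def summarize(trace: Trace) -> list[str]:
    """Build one line per message: the role plus the content truncated to 60 characters."""
    lines = []
    for message in trace:
        content = message.get("content") or ""
        lines.append(f"{message.get('role', '?')}: {content[:60]}")
    return lines


example_trace = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "reasoning_content": None,
    },
]

for line in summarize(example_trace):
    print(line)
print(final_response(example_trace))
```

Using `.get()` rather than direct indexing keeps the helpers tolerant of messages that omit optional fields such as `reasoning_content`.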
| 94 | + |
| 95 | +### Example Trace (With Correction Retry) |
| 96 | + |
| 97 | +When `max_correction_steps > 0` and parsing fails, traces capture the retry conversation: |
| 98 | + |
| 99 | +```python |
| 100 | +[ |
| 101 | +    # System message |
| 102 | +    { |
| 103 | +        "role": "system", |
| 104 | +        "content": "Return only valid JSON." |
| 105 | +    }, |
| 106 | +    # User message |
| 107 | +    { |
| 108 | +        "role": "user", |
| 109 | +        "content": "Generate a person object with name and age." |
| 110 | +    }, |
| 111 | +    # First attempt (invalid) |
| 112 | +    { |
| 113 | +        "role": "assistant", |
| 114 | +        "content": "Here's a person: {name: 'John', age: 30}"  # Invalid JSON |
| 115 | +    }, |
| 116 | +    # Error feedback |
| 117 | +    { |
| 118 | +        "role": "user", |
| 119 | +        "content": "JSONDecodeError: Expecting property name enclosed in double quotes" |
| 120 | +    }, |
| 121 | +    # Corrected response |
| 122 | +    { |
| 123 | +        "role": "assistant", |
| 124 | +        "content": "{\"name\": \"John\", \"age\": 30}" |
| 125 | +    } |
| 126 | +] |
| 127 | +``` |
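
One way to spot rows that needed a correction is to count assistant turns in the trace. This is a sketch under the assumption that every retry appears as an extra assistant message following a user error-feedback message, as in the example above; `count_correction_steps` is a hypothetical helper, not a library function.

```python
from typing import Any


def count_correction_steps(trace: list[dict[str, Any]]) -> int:
    """Count assistant turns beyond the first; each extra turn is treated as one correction."""
    assistant_turns = sum(1 for m in trace if m.get("role") == "assistant")
    return max(assistant_turns - 1, 0)


retry_trace = [
    {"role": "system", "content": "Return only valid JSON."},
    {"role": "user", "content": "Generate a person object with name and age."},
    {"role": "assistant", "content": "Here's a person: {name: 'John', age: 30}"},
    {"role": "user", "content": "JSONDecodeError: Expecting property name enclosed in double quotes"},
    {"role": "assistant", "content": "{\"name\": \"John\", \"age\": 30}"},
]

print(count_correction_steps(retry_trace))
```

A count above zero flags generations worth reviewing when iterating on a prompt, since the first attempt failed to parse.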
| 128 | + |
| 129 | +## See Also |
| 130 | + |
| 131 | +- **[Run Config](../code_reference/run_config.md)**: Runtime options including `debug_override_save_all_column_traces` |