Conversation
`LLMTextColumnConfig.required_columns` and `ImageColumnConfig.required_columns` only extracted dependencies from Jinja2 prompt templates, missing columns referenced via `multi_modal_context`. This caused the async engine's execution graph to dispatch LLM tasks before their multi-modal seed columns were loaded, resulting in KeyError failures under DATA_DESIGNER_ASYNC_ENGINE=1. Made-with: Cursor
|
Docs preview: https://e702c059.dd-docs-preview.pages.dev
|
Greptile SummaryThis PR fixes a missing dependency declaration in
|
| Filename | Overview |
|---|---|
| packages/data-designer-config/src/data_designer/config/column_configs.py | Adds multi_modal_context column names to required_columns in both LLMTextColumnConfig and ImageColumnConfig, with deduplication via set; subclasses (LLMCodeColumnConfig, LLMStructuredColumnConfig, LLMJudgeColumnConfig) inherit the fix automatically. |
| packages/data-designer-config/tests/config/test_columns.py | Adds three new unit tests for multi-modal context inclusion and deduplication in LLMTextColumnConfig and ImageColumnConfig; tests are well-scoped and correct. |
Sequence Diagram
sequenceDiagram
participant DAG as Async Engine / DAG
participant ColCfg as LLMTextColumnConfig / ImageColumnConfig
participant SeedCol as Seed Image Column
Note over ColCfg: required_columns (before fix)
ColCfg->>DAG: returns only Jinja2 template deps
DAG->>ColCfg: schedules LLM task (missing seed dep!)
ColCfg->>SeedCol: KeyError — seed column not yet loaded
Note over ColCfg: required_columns (after fix)
ColCfg->>DAG: returns Jinja2 deps + multi_modal_context column names
DAG->>SeedCol: waits for seed column to be ready
SeedCol-->>DAG: seed column available
DAG->>ColCfg: schedules LLM task with correct ordering
Reviews (1): Last reviewed commit: "fix: include multi_modal_context columns..." | Re-trigger Greptile
Code Review: PR #522 — fix: include multi_modal_context columns in required_columnsSummaryThis PR fixes a bug where Files changed: 2 (+44 / -5)
FindingsCorrectness — No Issues
Tests — Adequate
Minor Observations (Non-blocking)
VerdictLGTM. This is a clean, focused bug fix. The change is minimal, correct, and consistent with the existing code style. The three new tests cover the key scenarios (inclusion, deduplication, both config types). No architectural concerns — the fix stays within the config layer and doesn't violate import direction rules. Ship it. |
📋 Summary
required_columnsonLLMTextColumnConfigandImageColumnConfigonly extracted dependencies from Jinja2 prompt/system_prompt templates, ignoring columns referenced viamulti_modal_context. This caused the async engine's execution graph to dispatch LLM tasks before their multi-modal seed columns were loaded, resulting inKeyErrorfailures underDATA_DESIGNER_ASYNC_ENGINE=1.🔗 Related Issue
Fixes #520
🔄 Changes
LLMTextColumnConfig.required_columnsnow includesmulti_modal_context[*].column_namein the returned dependency listImageColumnConfig.required_columns— same fix applied🧪 Testing
make testpasses✅ Checklist
Made with Cursor