feat: Jinja/dbt template preprocessing for SQL analysis tools #67
Closed
anandgupta42 wants to merge 5 commits into main
SQL analysis tools (analyze, format, optimize, translate) now automatically detect and preprocess Jinja-templated dbt SQL before analysis. This enables meaningful analysis of dbt models without requiring users to manually strip template syntax.

The preprocessor handles: ref(), source(), config(), var(), this, comments, if/elif/else/endif, for/endfor, set, macro blocks, adapter.dispatch, and other common dbt Jinja patterns.

Closes #63

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
variables_found: list[str] = Field(default_factory=list)
macros_removed: list[str] = Field(default_factory=list)
warnings: list[str] = Field(default_factory=list)
error: str = Field(default=None)  # Uses same pattern as rest of file
This comment was marked as outdated.
Spawns `python -m altimate_engine.server` as a real subprocess and communicates via the stdin/stdout JSON-RPC protocol, exactly as the TypeScript CLI bridge does in production.

19 E2E tests covering:
- Core preprocessing (ref, source, var, config, complex dbt models)
- Batch requests in a single server session
- Auto-preprocessing in downstream tools (analyze, format, translate, optimize)
- Error handling (invalid JSON, unknown methods, missing params)
- JSON-RPC 2.0 protocol correctness (response IDs, field presence)

Also fixes unit test assertions for downstream tools to handle environments without altimate_core gracefully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
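The protocol-correctness checks mentioned in this commit (response IDs, field presence) can be illustrated with a small sketch. It does not spawn the engine; it only builds a JSON-RPC 2.0 request line and validates a response line against the spec. The helper names `make_request` and `check_response`, the one-JSON-object-per-line framing, and the `{"sql": ...}` params shape are assumptions for illustration, not taken from the PR.

```python
import json
from typing import Any


def make_request(request_id: int, method: str, params: dict[str, Any]) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line (hypothetical helper)."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def check_response(request_line: str, response_line: str) -> list[str]:
    """Return a list of protocol problems; an empty list means the response is well-formed."""
    request = json.loads(request_line)
    response = json.loads(response_line)
    problems: list[str] = []
    if response.get("jsonrpc") != "2.0":
        problems.append("missing or wrong 'jsonrpc' version")
    if response.get("id") != request["id"]:
        problems.append("response id does not match request id")
    # JSON-RPC 2.0 requires exactly one of 'result' or 'error'.
    if ("result" in response) == ("error" in response):
        problems.append("exactly one of 'result' or 'error' must be present")
    return problems


# Assumed params shape: a single "sql" field.
req = make_request(1, "sql.preprocess_jinja", {"sql": "SELECT * FROM {{ ref('stg_orders') }}"})
good = json.dumps({"jsonrpc": "2.0", "id": 1, "result": {}})
mismatched = json.dumps(
    {"jsonrpc": "2.0", "id": 2, "error": {"code": -32601, "message": "Method not found"}}
)
```

In a real E2E test the request line would be written to the subprocess's stdin and the response read from its stdout; the same checks would then apply to each response.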
Comment on lines 502 to +506
original_sql=params_obj.sql,
optimized_sql=rw.get("rewritten_sql", params_obj.sql),
optimized_sql=rw.get("rewritten_sql", sql_to_optimize),
suggestions=suggestions,
anti_patterns=anti_patterns,
confidence=opt_confidence,
This comment was marked as outdated.
…ine.ts`

Mixing `||` and `??` without explicit parentheses is a JS syntax error. Both `error_message` expressions in `ensureEngineImpl` now correctly group `(e?.stderr?.toString() || e?.message)` before the `??` fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 584 to 591
if jinja_fmt_note and not fmt_error:
fmt_error = jinja_fmt_note
result = SqlFormatResult(
success=raw.get("success", True),
formatted_sql=raw.get("formatted_sql", raw.get("sql")),
statement_count=raw.get("statement_count", 1),
error=raw.get("error"),
error=fmt_error,
)
This comment was marked as outdated.
# 4. {% if ... %}...{% endif %}: keep inner content
_RE_IF_OPEN = re.compile(r"\{%-?\s*if\b[^%]*?-?%\}")
_RE_ELIF = re.compile(r"\{%-?\s*elif\b[^%]*?-?%\}")
_RE_ELSE = re.compile(r"\{%-?\s*else\s*-?%\}")
This comment was marked as outdated.
1. Fix `error` field type in SqlPreprocessJinjaResult: use `str | None`
instead of `str = Field(default=None)` to match Pydantic v2 typing.
2. Fix sql.optimize fallback: use `params_obj.sql` (original) instead of
`sql_to_optimize` (preprocessed) so Jinja-only preprocessing isn't
incorrectly reported as an optimization.
3. Fix sql.format Jinja note: append as SQL comment to formatted output
instead of abusing the `error` field, which clients ignore on success.
4. Fix regex `[^%]*?` in Jinja preprocessor: use `.*?` so modulo
operators (%) inside tags like `{% if loop.index % 2 == 0 %}` are
matched correctly. Added test for this edge case.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
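Fix 4 above can be reproduced in isolation. The sketch below compares the reviewed `if` pattern with a variant that swaps `[^%]*?` for `.*?`, using the example tag from the commit message. The names `RE_IF_OPEN_OLD` and `RE_IF_OPEN_NEW` are illustrative, and whether the merged code also passes `re.DOTALL` (needed for tags that span lines) is not stated in the PR, so it is left out here.

```python
import re

# Pattern as shown in the reviewed diff: the tag body may not contain '%'.
RE_IF_OPEN_OLD = re.compile(r"\{%-?\s*if\b[^%]*?-?%\}")

# Variant described in the fix: any characters, matched lazily up to the first '%}'.
RE_IF_OPEN_NEW = re.compile(r"\{%-?\s*if\b.*?-?%\}")

sql = "{% if loop.index % 2 == 0 %}SELECT 1{% endif %}"

# The old pattern stops at the modulo operator and never reaches the closing '%}'.
old_match = RE_IF_OPEN_OLD.search(sql)

# The new pattern consumes the modulo operator and ends at the tag's own '%}'.
new_match = RE_IF_OPEN_NEW.search(sql)

# Removing only the opening tag keeps the inner content.
stripped = RE_IF_OPEN_NEW.sub("", sql)
```

Because the quantifier is lazy, the new pattern still stops at the first `%}`, so it does not swallow the body of the block or the closing `{% endif %}`.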
- Use __jinja_expr__ placeholder for adapter.dispatch/return/log instead of empty string to avoid invalid SQL in expression positions
- Add Jinja auto-preprocessing to lineage.check and sql.explain
- Improve sql.optimize fallback: preserve original SQL when only Jinja preprocessing occurred (no actual rewrites)
- Simplify sql.format: drop the SQL comment approach, just format the preprocessed SQL cleanly
- Update tests to verify __jinja_expr__ placeholder behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
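The reason for the placeholder is easiest to see side by side. This is a minimal sketch, not the PR's implementation: `replace_expressions` and its catch-all regex are hypothetical, and only the `__jinja_expr__` token comes from the commit message.

```python
import re

# Hypothetical catch-all for any remaining {{ ... }} expression.
_RE_EXPR = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def replace_expressions(sql: str, replacement: str) -> str:
    """Replace every {{ ... }} expression in sql with the given text."""
    return _RE_EXPR.sub(replacement, sql)


sql = "SELECT {{ adapter.dispatch('my_macro')() }} AS v FROM t"

# Empty string: the select list loses its expression, leaving invalid SQL.
blank = replace_expressions(sql, "")

# Placeholder: the expression position is still occupied by an identifier.
kept = replace_expressions(sql, "__jinja_expr__")
```

With the empty replacement the result is `SELECT  AS v FROM t`, which a SQL parser rejects; with the placeholder it is `SELECT __jinja_expr__ AS v FROM t`, which parses as a column reference.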
Summary
- SQL analysis tools (`sql_analyze`, `sql_format`, `sql_optimize`, `sql_translate`) now automatically detect and preprocess Jinja-templated dbt SQL before analysis
- `sql_preprocess_jinja` tool for explicit preprocessing when needed
- `jinja_preprocessor` module that handles all common dbt Jinja patterns

What changed

Python engine (`packages/altimate-engine/`)

New file: `src/altimate_engine/sql/jinja_preprocessor.py`

Core preprocessor that handles:

- `{{ ref('model') }}` → `model`
- `{{ source('src', 'table') }}` → `src__table`
- `{{ config(...) }}`
- `{{ var('name') }}` → `'__var_name__'`
- `{{ this }}` / `{{ this.identifier }}` → `__this__`
- `{# comments #}`
- `{% if %}...{% endif %}`
- `{% for %}...{% endfor %}`
- `{% set %}`, `{% macro %}...{% endmacro %}`
- `{{ adapter.dispatch() }}`, `{{ log() }}`, `{{ return() }}`
- `{{ expr }}` → `__jinja_expr__` + warning

Modified: `src/altimate_engine/server.py`

- `sql.analyze`: auto-preprocesses Jinja, adds note to `confidence_factors`
- `sql.translate`: auto-preprocesses Jinja, adds warning about re-applying templates
- `sql.optimize`: auto-preprocesses Jinja, lowers confidence to "medium"
- `sql.format`: auto-preprocesses Jinja, adds note about plain SQL output
- `sql.preprocess_jinja` RPC method for explicit preprocessing

Modified: `src/altimate_engine/models.py`

- `SqlPreprocessJinjaParams` and `SqlPreprocessJinjaResult` models

TypeScript CLI (`packages/altimate-code/`)

New file: `src/tool/sql-preprocess-jinja.ts`

- `sql_preprocess_jinja` tool registered in the tool registry

Modified: `src/bridge/protocol.ts`

- `SqlPreprocessJinjaParams` and `SqlPreprocessJinjaResult` interfaces
- Added `sql.preprocess_jinja` to the `BridgeMethods` registry

Modified: `src/tool/registry.ts`

- `SqlPreprocessJinjaTool`

Tests (`packages/altimate-engine/tests/`)

New file: `tests/test_jinja_preprocessor.py`, with 60 tests covering:

- `contains_jinja()` detection (7 tests)
- `ref()` macro handling (6 tests)
- `source()` macro handling (4 tests)
- `var()` macro handling (3 tests)
- `config()` removal (3 tests)
- `{{ this }}` handling (3 tests)
- `{% if/elif/else/endif %}` blocks (4 tests)
- `{% for/endfor %}` blocks (1 test)
- `{% set %}` handling (2 tests)
- `{% macro/endmacro %}` blocks (1 test)

Test plan

- `pytest tests/test_jinja_preprocessor.py -k "not ServerDispatch"`
- `tsc --noEmit` (our files clean)
- `altimate run "Analyze: SELECT * FROM {{ ref('stg_orders') }}"`
- `altimate run "Format: SELECT id FROM {{ source('raw', 'events') }} WHERE date > {{ var('start') }}"`

Closes #63
🤖 Generated with Claude Code
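
For readers who want to see the substitutions from the description in action, here is a rough sketch of four of them (`ref`, `source`, `var`, `this`). It is not the code in `jinja_preprocessor.py`: the regexes and the `preprocess` function are assumptions written for illustration, and the reading that `var('name')` becomes a quoted `'__var_<name>__'` with the variable name substituted is inferred from the single example above. `var()` calls with a default argument, control blocks, and comments are deliberately left out.

```python
import re

# Illustrative patterns only; each matches a single- or double-quoted argument.
_RE_REF = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
_RE_SOURCE = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)
_RE_VAR = re.compile(r"\{\{\s*var\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
_RE_THIS = re.compile(r"\{\{\s*this(?:\.\w+)?\s*\}\}")


def preprocess(sql: str) -> str:
    """Replace a few dbt Jinja expressions with plain SQL tokens (hypothetical sketch)."""
    sql = _RE_REF.sub(r"\1", sql)             # {{ ref('model') }} -> model
    sql = _RE_SOURCE.sub(r"\1__\2", sql)      # {{ source('src', 'table') }} -> src__table
    sql = _RE_VAR.sub(r"'__var_\1__'", sql)   # {{ var('name') }} -> '__var_name__'
    sql = _RE_THIS.sub("__this__", sql)       # {{ this }} / {{ this.identifier }} -> __this__
    return sql


analyzed = preprocess("SELECT * FROM {{ ref('stg_orders') }}")
formatted = preprocess(
    "SELECT id FROM {{ source('raw', 'events') }} WHERE date > {{ var('start') }}"
)
```

The two inputs are the queries from the test plan; the outputs are plain SQL that a parser can handle, at the cost of losing the original template expressions.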