Commit cc1c275

Align native contract and SQL correctness (#184)
* Align native contract and SQL correctness
* Project custom join predicate columns
* Accept legacy native metric dependencies
* Fix dotted graph metric SQL generation
* Defer native metric inheritance until model merge
* Detect native metrics-only YAML files
* Ignore root-only SQL frontmatter
* Defer native graph metric inheritance
* Prefer exact graph metrics in Rust queries
* Allow compact SQL graph definitions in Rust loader
* Use graph metric SQL to resolve Rust owner
* Preserve explicit derived inline aggregates
* Resolve graph metric dependency owners
1 parent 3cf030f commit cc1c275

112 files changed

Lines changed: 6324 additions & 681 deletions

docs/native-fixtures.md

Lines changed: 11 additions & 1 deletion
@@ -79,7 +79,8 @@ The suite currently covers:
 - Parameter interpolation in query filters.
 - Pre-aggregation routing shape and DuckDB execution against seeded rollup tables.
 - Semantic SQL rewrite cases for single-model and relationship queries.
-- Query-local table calculations on the Rust SQL compiler path, including Rust-only DuckDB result coverage.
+- Query-local table calculations for the shared Python/Rust subset. Python applies these after fetching rows;
+Rust compiles them into SQL window expressions.
 - Native `.sql` definition files.
 - Native SQL frontmatter model definitions.
 - YAML `sql_metrics` and `sql_segments` blocks.
@@ -112,6 +113,15 @@ The default Rust runner loads every manifest fixture, asserts `expected/validati

 The `adbc-exec` Rust runner executes every query with `expected_result` or `rust_expected_result` through DuckDB ADBC, using the fixture seed SQL and result columns from the manifest. Any Rust-only expected output must include `rust_only_reason`. It is enabled in CI after installing the DuckDB ADBC driver.

+Table-calculation fixture contract:
+
+- Shared table calculations may use `percent_of_total`, `percent_of_previous`, `running_total`, `rank`, `row_number`, or `moving_average`.
+- Shared calculations should include deterministic query `order_by` when row order affects the result.
+- Python evaluates shared calculations with `TableCalculationProcessor` after query execution.
+- Rust evaluates shared calculations by compiling them into SQL expressions.
+- Rust-only table calculation types (`dense_rank`, `difference`, `lead`, `lag`) must use `rust_expected_result` and `rust_only_reason`.
+- Python-only post-query table calculation types (`percent_of_column_total`, `percentile`) stay out of shared native fixtures until Rust supports them.
+
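Review note: the contract splits evaluation between a post-query pass (Python) and SQL window expressions (Rust). The sketch below illustrates that split for `running_total` and `percent_of_total` only. The helper names are hypothetical; this is not the `TableCalculationProcessor` API or the SQL the Rust compiler emits, and treating `percent_of_total` as a 0-100 percentage is an assumption.

```python
from typing import Any

Row = dict[str, Any]


def running_total(rows: list[Row], field: str, alias: str) -> list[Row]:
    """Post-query style: accumulate `field` in the order the rows were fetched."""
    total = 0
    out = []
    for row in rows:
        total += row[field]
        out.append({**row, alias: total})
    return out


def percent_of_total(rows: list[Row], field: str, alias: str) -> list[Row]:
    """Post-query style: each row's share of the column sum (assumed 0-100 scale)."""
    grand_total = sum(row[field] for row in rows)
    return [
        {**row, alias: (row[field] * 100 / grand_total if grand_total else None)}
        for row in rows
    ]


def running_total_sql(field: str, order_by: str, alias: str) -> str:
    """SQL style: the same running total written as a window expression."""
    return (
        f"SUM({field}) OVER (ORDER BY {order_by} "
        f"ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS {alias}"
    )


# Hypothetical result rows, already sorted by the query's order_by (month).
rows = [
    {"month": "2024-01", "revenue": 100},
    {"month": "2024-02", "revenue": 300},
    {"month": "2024-03", "revenue": 600},
]
with_running = running_total(rows, "revenue", "revenue_running")
with_share = percent_of_total(rows, "revenue", "revenue_pct")
window_expr = running_total_sql("revenue", "month", "revenue_running")
```

The post-query version depends entirely on the fetched row order, while the window version depends on its `ORDER BY`. That is why the contract asks shared fixtures to carry a deterministic `order_by`: without one the two paths can disagree.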
 ## Adding Fixtures

 Add the narrowest fixture that proves one semantic behavior. Avoid kitchen-sink fixtures unless the behavior itself is cross-feature interaction.

docs/native-format.md

Lines changed: 46 additions & 1 deletion
@@ -9,6 +9,20 @@ The native format has two source forms:

 The native format is the runtime contract. External formats such as LookML, MetricFlow, Hex, Rill, Malloy, Omni, Superset, GoodData, Snowflake Cortex, ThoughtSpot, Holistics, Tableau, AtScale SML, BSL, Yardstick, and Graphene GSQL should be converted into this format by Python importers before they are expected to run through the Rust native runtime.

+## Rust Loader Scope
+
+The Rust runtime and Rust CLI directory loader intentionally have a smaller direct
+input surface than Python:
+
+- `.yml` / `.yaml`: native Sidemantic YAML or Cube YAML.
+- `.sql`: native Sidemantic SQL definition files.
+
+They do not auto-detect LookML, MetricFlow/dbt manifests, Hex, Rill, Malloy,
+Omni, Superset, GoodData, Snowflake Cortex, ThoughtSpot, Holistics, Tableau,
+AtScale SML, BSL, Yardstick, or other external source formats. Convert those
+formats through the Python CLI/API first, then load the exported native YAML/SQL
+with the Rust runtime.
+
 ## Versioning

 Current native format version: `1`.
@@ -61,6 +75,18 @@ Top-level sections:
 | `metrics` | No | Graph-level metrics. Rust assigns these to exactly one owning model when possible. |
 | `parameters` | No | Graph-level parameters for templates and query-time substitution. |

+Top-level metrics are graph-scoped in the Python runtime. The Rust runtime does not
+store a separate graph-metric namespace at execution time; it assigns each top-level
+metric to one owning model by resolving explicit model references, metric dependencies,
+entity dimensions, or a single-model project fallback. If Rust cannot infer exactly
+one owner, loading fails. Portable native files should therefore make top-level metric
+dependencies explicit, for example `orders.total_revenue` rather than `total_revenue`
+when multiple models define the same local metric name. Dotted top-level metric names
+are allowed and are resolved by exact metric name before `model.metric` parsing.
+
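Review note: the lookup order in the last sentence (exact metric name first, `model.metric` parsing second) can be sketched as follows. This is an illustration under assumptions, not the Rust loader's code. `resolve_metric_ref`, its return shape, and the sample names are hypothetical, and the bare-name step models only the "exactly one owner" rule, not the full owner inference.

```python
def resolve_metric_ref(ref: str, graph_metrics: set, model_metrics: dict) -> tuple:
    """Resolve a metric reference to ("graph", name) or ("model", model, metric).

    graph_metrics: top-level metric names, which may themselves contain dots.
    model_metrics: mapping of model name -> set of local metric names.
    """
    # 1. An exact top-level match wins, even when the name contains a dot.
    if ref in graph_metrics:
        return ("graph", ref)
    # 2. Otherwise try to read the reference as `model.metric`.
    if "." in ref:
        model, metric = ref.split(".", 1)
        if metric in model_metrics.get(model, set()):
            return ("model", model, metric)
    # 3. Bare name (assumption): accept it only when exactly one model defines it.
    owners = sorted(m for m, names in model_metrics.items() if ref in names)
    if len(owners) == 1:
        return ("model", owners[0], ref)
    raise ValueError(f"cannot resolve metric reference {ref!r}; candidates: {owners}")


# Hypothetical project: one dotted graph metric, two models sharing a local name.
graph_metrics = {"finance.net_revenue"}
model_metrics = {
    "orders": {"total_revenue"},
    "refunds": {"total_revenue"},
    "finance": {"net_revenue"},
}

dotted = resolve_metric_ref("finance.net_revenue", graph_metrics, model_metrics)
qualified = resolve_metric_ref("orders.total_revenue", graph_metrics, model_metrics)
```

Here `finance.net_revenue` resolves to the graph metric even though a `finance` model also defines `net_revenue`. A bare `total_revenue` raises because two models define it, which is the case the paragraph says to avoid by writing `orders.total_revenue`.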
+Top-level parameters remain graph-scoped in both runtimes. Query APIs interpolate
+parameter values before SQL compilation.
+
 ## Models

 Models describe physical or logical query sources.
@@ -94,6 +120,12 @@ At least one of `table`, `sql`, or `source_uri` should be present unless the mod
 | `pre_aggregations` | No | List of pre-aggregation definitions. |
 | `default_time_dimension` | No | Time dimension to add by default when the query needs time grouping. |
 | `default_grain` | No | Default time grain for the default time dimension. |
+| `auto_dimensions` | No | Python auto-discovery flag. Rust accepts `false` for compatibility and rejects `true` because it does not perform schema discovery. |
+
+Canonical CLI-authored files should use `metrics` and `sql`. The native loaders
+also accept compatibility input aliases: model-level `measures` for `metrics`,
+dimension/metric `expr` for `sql`, and metric `measure` for `sql`. Exports use
+canonical field names.
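Review note: a minimal sketch of the alias mapping described here, assuming a model arrives as a plain dictionary. `normalize_model` and `rename_aliases` are hypothetical helpers, and letting the canonical key win when both spellings are present is an assumption; the documentation does not say how the loaders handle that case.

```python
def rename_aliases(field: dict, aliases: tuple) -> dict:
    """Move alias keys onto `sql`; an existing `sql` value is kept (assumption)."""
    out = dict(field)
    for alias in aliases:
        if alias in out:
            value = out.pop(alias)
            out.setdefault("sql", value)
    return out


def normalize_model(model: dict) -> dict:
    """Rewrite compatibility aliases to the canonical `metrics` / `sql` names."""
    out = dict(model)
    # Model-level `measures` is an input alias for `metrics`.
    measures = out.pop("measures", None)
    if measures is not None and "metrics" not in out:
        out["metrics"] = measures
    # Dimensions accept `expr`; metrics accept `expr` and `measure`.
    if "dimensions" in out:
        out["dimensions"] = [rename_aliases(d, ("expr",)) for d in out["dimensions"]]
    if "metrics" in out:
        out["metrics"] = [rename_aliases(m, ("expr", "measure")) for m in out["metrics"]]
    return out


# Hypothetical legacy-style input.
legacy = {
    "name": "orders",
    "dimensions": [{"name": "status", "expr": "status"}],
    "measures": [{"name": "revenue", "agg": "sum", "expr": "amount"}],
}
canonical = normalize_model(legacy)
```

After normalization the model carries only `metrics` and `sql`, which matches the statement that exports use canonical field names.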

 Single-column primary key:

@@ -332,8 +364,21 @@ relationships:
 | `primary_key_columns` | Conditional | Explicit target-column list. |
 | `through` | For many-to-many | Junction model. |
 | `through_foreign_key` | For many-to-many | Source-to-through key. |
+| `through_foreign_key_columns` | For many-to-many | Explicit source-to-through key columns. |
 | `related_foreign_key` | For many-to-many | Through-to-target key. |
-| `sql` | No | Custom join SQL using runtime placeholders where supported. |
+| `related_foreign_key_columns` | For many-to-many | Explicit through-to-target key columns. |
+| `sql` | No | Custom join SQL using `{from}` and `{to}` runtime placeholders. |
+
+For CLI-authored native files, prefer explicit `foreign_key` and `primary_key`
+fields. Omitted keys are still supported for compatibility: `many_to_one`
+defaults the source key to `{name}_id`, while `one_to_many` and `one_to_one`
+default the related-side key to `id`; omitted `primary_key` resolves to the
+target model's declared primary key when building graph joins.
+
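Review note: the compatibility defaults in this paragraph can be sketched as below. `resolve_join_keys` is a hypothetical helper, and it reads both the "source key" and the "related-side key" as the relationship's `foreign_key` field, which is an interpretation of the wording rather than confirmed loader behavior. Many-to-many relationships are deliberately out of scope.

```python
def resolve_join_keys(rel: dict, target_primary_key: str) -> tuple:
    """Return (foreign_key, primary_key) with compatibility defaults filled in."""
    foreign_key = rel.get("foreign_key")
    if foreign_key is None:
        if rel["type"] == "many_to_one":
            # Source-side default: `{name}_id`, e.g. `customer` -> `customer_id`.
            foreign_key = f"{rel['name']}_id"
        elif rel["type"] in ("one_to_many", "one_to_one"):
            # Related-side default: `id`.
            foreign_key = "id"
        else:
            raise ValueError(f"no default key for relationship type {rel['type']!r}")
    # An omitted primary_key falls back to the target model's declared primary key.
    primary_key = rel.get("primary_key") or target_primary_key
    return foreign_key, primary_key


# Hypothetical relationships.
implicit = resolve_join_keys({"name": "customer", "type": "many_to_one"}, "id")
explicit = resolve_join_keys(
    {"name": "customer", "type": "many_to_one", "foreign_key": "cust_ref", "primary_key": "uuid"},
    "id",
)
```

Writing the keys out, as in the second call, avoids relying on naming conventions.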
+When `sql` is present, Python and Rust use it instead of the FK/PK-generated
+predicate. `{from}` is replaced with the source model's runtime alias and `{to}`
+with the target model's runtime alias. Reverse graph traversal swaps the
+placeholders automatically.
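Review note: a minimal sketch of the placeholder substitution, assuming the join planner knows the aliases on each side of the edge it is walking and whether it is walking the relationship backwards. `render_join_predicate` and the `_cte` alias names are hypothetical and do not mirror either runtime's internals.

```python
def render_join_predicate(sql: str, left_alias: str, right_alias: str, reverse: bool = False) -> str:
    """Fill `{from}` / `{to}` in a custom join predicate.

    left_alias / right_alias follow traversal order. When the edge is walked in
    reverse, the relationship's source model is on the right, so the aliases are
    swapped before substitution.
    """
    from_alias, to_alias = (right_alias, left_alias) if reverse else (left_alias, right_alias)
    return sql.replace("{from}", from_alias).replace("{to}", to_alias)


# Relationship declared on `orders`, pointing at `customers`.
custom_sql = "{from}.customer_id = {to}.id AND {to}.deleted_at IS NULL"

forward = render_join_predicate(custom_sql, "orders_cte", "customers_cte")
backward = render_join_predicate(custom_sql, "customers_cte", "orders_cte", reverse=True)
```

Both calls render the same predicate: `{from}` always names the model that declares the relationship, regardless of which side the traversal starts from.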

 Relationship types:

docs/runtime-feature-matrix.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ This matrix documents current product support for native Sidemantic projects. It
 | Conversion metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Retention metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Cohort metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
-| Table calculations | Post-query processing | Yes, Rust fixture-covered compile and Rust-only result coverage | No dedicated fixture yet | No dedicated fixture yet |
+| Table calculations | Yes, shared fixture post-query result parity | Yes, shared fixture SQL/window result parity | No dedicated fixture yet | No dedicated fixture yet |
 | Pre-aggregation routing | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Semantic SQL rewrite | Yes | Native subset, fixture-covered | Native subset target | Narrow subset |
 | DuckDB execution | Yes | Via ADBC, fixture result parity in CI | Native DuckDB process | No |

docs/rust-runtime.md

Lines changed: 1 addition & 1 deletion
@@ -105,4 +105,4 @@ cd sidemantic-rs && cargo test --test native_fixtures

 CI runs these in the `Native Compatibility` job.

-The shared fixture suite currently includes executable coverage for basic models, joins, fanout-safe symmetric aggregation, many-to-many joins, parameters in filters, embedded SQL definitions, SQL frontmatter definitions, default time dimensions, segments, derived/ratio metrics, and pre-aggregation routing. Table calculations have Rust-only DuckDB result coverage because Python does not accept `table_calculations` in the native query API yet. `source_uri` is covered as a validation-only load fixture and query compilation rejects it until a concrete table or SQL source is provided.
+The shared fixture suite currently includes executable coverage for basic models, joins, fanout-safe symmetric aggregation, many-to-many joins, parameters in filters, embedded SQL definitions, SQL frontmatter definitions, default time dimensions, segments, derived/ratio metrics, table calculations, and pre-aggregation routing. `source_uri` is covered as a validation-only load fixture and query compilation rejects it until a concrete table or SQL source is provided.

scripts/generate_schema.py

Lines changed: 45 additions & 0 deletions
@@ -2,11 +2,54 @@
 """Generate JSON Schema from Pydantic models for YAML editor support."""

 import json
+from copy import deepcopy
 from pathlib import Path

 from sidemantic import Dimension, Metric, Model, Parameter, Relationship, Segment


+def add_native_relationship_aliases(schema: dict) -> dict:
+    """Expose native YAML relationship aliases that map to Python API fields."""
+    properties = schema.setdefault("properties", {})
+
+    if "foreign_key" in properties and "foreign_key_columns" not in properties:
+        foreign_key_columns = deepcopy(properties["foreign_key"])
+        foreign_key_columns["title"] = "Foreign Key Columns"
+        foreign_key_columns["description"] = "Explicit source-column list (alias for foreign_key)"
+        properties["foreign_key_columns"] = foreign_key_columns
+
+    if "primary_key" in properties and "primary_key_columns" not in properties:
+        primary_key_columns = deepcopy(properties["primary_key"])
+        primary_key_columns["title"] = "Primary Key Columns"
+        primary_key_columns["description"] = "Explicit target-column list (alias for primary_key)"
+        properties["primary_key_columns"] = primary_key_columns
+
+    if "sql" not in properties:
+        properties["sql"] = {
+            "anyOf": [{"type": "string"}, {"type": "null"}],
+            "default": None,
+            "description": "Custom join SQL using {from} and {to} runtime placeholders",
+            "title": "Sql",
+        }
+
+    return schema
+
+
+def patch_relationship_schemas(schema: dict) -> None:
+    """Patch every embedded Relationship schema emitted by Pydantic."""
+    if not isinstance(schema, dict):
+        return
+    if schema.get("title") == "Relationship":
+        add_native_relationship_aliases(schema)
+    for value in schema.values():
+        if isinstance(value, dict):
+            patch_relationship_schemas(value)
+        elif isinstance(value, list):
+            for item in value:
+                if isinstance(item, dict):
+                    patch_relationship_schemas(item)
+
+
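Review note: the effect of the recursive walk can be tried on a small schema without importing `sidemantic`. The block below is a condensed stand-in, not the committed code: `add_alias` and `patch` are hypothetical names, only the `foreign_key_columns` alias is handled, and the `$defs` layout is an assumed example of what Pydantic emits.

```python
from copy import deepcopy


def add_alias(properties: dict, source: str, alias: str) -> None:
    """Copy the schema of `source` to `alias` unless the alias already exists."""
    if source in properties and alias not in properties:
        properties[alias] = deepcopy(properties[source])


def patch(node) -> None:
    """Walk nested dicts and lists, patching every schema titled "Relationship"."""
    if isinstance(node, list):
        for item in node:
            patch(item)
        return
    if not isinstance(node, dict):
        return
    if node.get("title") == "Relationship":
        add_alias(node.setdefault("properties", {}), "foreign_key", "foreign_key_columns")
    for value in node.values():
        patch(value)


# Assumed shape: definitions live under "$defs" and are referenced by "$ref".
schema = {
    "$defs": {
        "Relationship": {
            "title": "Relationship",
            "type": "object",
            "properties": {
                "foreign_key": {"title": "Foreign Key", "anyOf": [{"type": "string"}, {"type": "null"}]},
            },
        },
        "Model": {
            "title": "Model",
            "properties": {
                "relationships": {"type": "array", "items": {"$ref": "#/$defs/Relationship"}},
            },
        },
    }
}
patch(schema)
```

Matching on `title` rather than on a fixed path patches the definition wherever it is embedded, and `deepcopy` keeps the alias from sharing nested objects with the original property.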
 def generate_schema() -> dict:
     """Generate JSON Schema for sidemantic YAML files."""
     # Get schemas from pydantic models
@@ -51,6 +94,8 @@ def generate_schema() -> dict:
         },
     }

+    patch_relationship_schemas(schema)
+
     return schema


sidemantic-rs/examples/parity_adapter.rs

Lines changed: 9 additions & 0 deletions
@@ -28,6 +28,7 @@ enum Request {
         #[serde(default)]
         order_by: Vec<String>,
         limit: Option<usize>,
+        offset: Option<usize>,
         #[serde(default)]
         ungrouped: bool,
         #[serde(default)]
@@ -137,6 +138,7 @@ fn handle(request: Request) -> sidemantic::Result<Response> {
             segments,
             order_by,
             limit,
+            offset,
             ungrouped,
             skip_default_time_dimensions,
             dialect,
@@ -153,6 +155,9 @@ fn handle(request: Request) -> sidemantic::Result<Response> {
             if let Some(limit) = limit {
                 query = query.with_limit(limit);
             }
+            if let Some(offset) = offset {
+                query = query.with_offset(offset);
+            }
             let mut generator = SqlGenerator::new(&graph);
             if let Some(dialect) = dialect {
                 generator = generator.with_dialect(parse_dialect(&dialect)?);
@@ -654,6 +659,10 @@ fn metric_aggregation_name(aggregation: Option<&Aggregation>) -> &'static str {
         Some(Aggregation::Min) => "min",
         Some(Aggregation::Max) => "max",
         Some(Aggregation::Median) => "median",
+        Some(Aggregation::Stddev) => "stddev",
+        Some(Aggregation::StddevPop) => "stddev_pop",
+        Some(Aggregation::Variance) => "variance",
+        Some(Aggregation::VariancePop) => "variance_pop",
         Some(Aggregation::Expression) | None => "sum",
     }
 }
