Commit cd7a08e

Align native contract and SQL correctness
1 parent 326838a commit cd7a08e

108 files changed

Lines changed: 5065 additions & 521 deletions

docs/native-fixtures.md

Lines changed: 11 additions & 1 deletion
@@ -79,7 +79,8 @@ The suite currently covers:
 - Parameter interpolation in query filters.
 - Pre-aggregation routing shape and DuckDB execution against seeded rollup tables.
 - Semantic SQL rewrite cases for single-model and relationship queries.
-- Query-local table calculations on the Rust SQL compiler path, including Rust-only DuckDB result coverage.
+- Query-local table calculations for the shared Python/Rust subset. Python applies these after fetching rows;
+  Rust compiles them into SQL window expressions.
 - Native `.sql` definition files.
 - Native SQL frontmatter model definitions.
 - YAML `sql_metrics` and `sql_segments` blocks.
@@ -112,6 +113,15 @@ The default Rust runner loads every manifest fixture, asserts `expected/validati
 
 The `adbc-exec` Rust runner executes every query with `expected_result` or `rust_expected_result` through DuckDB ADBC, using the fixture seed SQL and result columns from the manifest. Any Rust-only expected output must include `rust_only_reason`. It is enabled in CI after installing the DuckDB ADBC driver.
 
+Table-calculation fixture contract:
+
+- Shared table calculations may use `percent_of_total`, `percent_of_previous`, `running_total`, `rank`, `row_number`, or `moving_average`.
+- Shared calculations should include deterministic query `order_by` when row order affects the result.
+- Python evaluates shared calculations with `TableCalculationProcessor` after query execution.
+- Rust evaluates shared calculations by compiling them into SQL expressions.
+- Rust-only table calculation types (`dense_rank`, `difference`, `lead`, `lag`) must use `rust_expected_result` and `rust_only_reason`.
+- Python-only post-query table calculation types (`percent_of_column_total`, `percentile`) stay out of shared native fixtures until Rust supports them.
+
 ## Adding Fixtures
 
 Add the narrowest fixture that proves one semantic behavior. Avoid kitchen-sink fixtures unless the behavior itself is cross-feature interaction.
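
The fixture contract in the hunk above splits evaluation between the runtimes: Python computes shared table calculations on fetched rows, while Rust compiles them into SQL. As a minimal sketch of the post-query side only, the helpers below compute `running_total` and `percent_of_total` over rows that are already ordered. The function names, the row shape, and the `revenue` and `order_date` fields are hypothetical and are not taken from `TableCalculationProcessor`; the window expression in the trailing comment is likewise only an illustration of a possible SQL-side equivalent, not the compiler's actual output.

```python
from itertools import accumulate


# Hypothetical post-query helpers; not the actual TableCalculationProcessor API.
def running_total(rows, field):
    """Cumulative sum of `field`, following the order of `rows`."""
    return list(accumulate(row[field] for row in rows))


def percent_of_total(rows, field):
    """Each row's share of the column total (None when the total is zero)."""
    total = sum(row[field] for row in rows)
    return [row[field] / total if total else None for row in rows]


# Row order changes the running total, which is why the contract asks for a
# deterministic query `order_by`.
rows = sorted(
    [
        {"order_date": "2024-01-03", "revenue": 60},
        {"order_date": "2024-01-01", "revenue": 10},
        {"order_date": "2024-01-02", "revenue": 30},
    ],
    key=lambda row: row["order_date"],
)

totals = running_total(rows, "revenue")
shares = percent_of_total(rows, "revenue")

# A SQL-side equivalent of running_total might look like (illustrative only):
#   SUM(revenue) OVER (ORDER BY order_date
#                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
```

With the sample rows sorted by date, `totals` accumulates 10, 30, and 60 in that order, so an unordered fetch would produce a different sequence for the same data.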

docs/native-format.md

Lines changed: 46 additions & 1 deletion
@@ -9,6 +9,20 @@ The native format has two source forms:
 
 The native format is the runtime contract. External formats such as LookML, MetricFlow, Hex, Rill, Malloy, Omni, Superset, GoodData, Snowflake Cortex, ThoughtSpot, Holistics, Tableau, AtScale SML, BSL, and Yardstick should be converted into this format by Python importers before they are expected to run through the Rust native runtime.
 
+## Rust Loader Scope
+
+The Rust runtime and Rust CLI directory loader intentionally have a smaller direct
+input surface than Python:
+
+- `.yml` / `.yaml`: native Sidemantic YAML or Cube YAML.
+- `.sql`: native Sidemantic SQL definition files.
+
+They do not auto-detect LookML, MetricFlow/dbt manifests, Hex, Rill, Malloy,
+Omni, Superset, GoodData, Snowflake Cortex, ThoughtSpot, Holistics, Tableau,
+AtScale SML, BSL, Yardstick, or other external source formats. Convert those
+formats through the Python CLI/API first, then load the exported native YAML/SQL
+with the Rust runtime.
+
 ## Versioning
 
 Current native format version: `1`.
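
The "Rust Loader Scope" section added above limits direct Rust input to three file extensions. A minimal sketch of that filter, written in Python for illustration, is shown below. The function name `rust_loader_kind` and the returned labels are hypothetical, and the exact-case suffix match is an assumption: the docs do not say whether the Rust loader treats extensions case-insensitively.

```python
from pathlib import Path
from typing import Optional

# Extensions listed in the "Rust Loader Scope" section; labels are illustrative.
RUST_LOADER_SUFFIXES = {".yml": "yaml", ".yaml": "yaml", ".sql": "sql"}


def rust_loader_kind(path: str) -> Optional[str]:
    """Return the input kind the Rust loader would consider, or None.

    Assumption: suffixes are matched exactly as written (no case folding).
    """
    return RUST_LOADER_SUFFIXES.get(Path(path).suffix)


candidates = ["models/orders.yml", "models/orders.sql", "views/orders.lkml"]
needs_python_conversion = [p for p in candidates if rust_loader_kind(p) is None]
```

Files that land in `needs_python_conversion` correspond to the external formats that should be converted through the Python CLI/API first.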
@@ -61,6 +75,18 @@ Top-level sections:
 | `metrics` | No | Graph-level metrics. Rust assigns these to exactly one owning model when possible. |
 | `parameters` | No | Graph-level parameters for templates and query-time substitution. |
 
+Top-level metrics are graph-scoped in the Python runtime. The Rust runtime does not
+store a separate graph-metric namespace at execution time; it assigns each top-level
+metric to one owning model by resolving explicit model references, metric dependencies,
+entity dimensions, or a single-model project fallback. If Rust cannot infer exactly
+one owner, loading fails. Portable native files should therefore make top-level metric
+dependencies explicit, for example `orders.total_revenue` rather than `total_revenue`
+when multiple models define the same local metric name. Dotted top-level metric names
+are allowed and are resolved by exact metric name before `model.metric` parsing.
+
+Top-level parameters remain graph-scoped in both runtimes. Query APIs interpolate
+parameter values before SQL compilation.
+
 ## Models
 
 Models describe physical or logical query sources.
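
The paragraph on top-level metrics above fixes a lookup order: an exact metric name is tried before the reference is split as `model.metric`. The sketch below illustrates only that ordering. The function `resolve_metric_ref`, its arguments, and the `"<top-level>"` marker are hypothetical, and the bare-name fallback (accept a name only when exactly one model defines it) is an assumption modeled on the "exactly one owner" rule rather than the runtime's real inference.

```python
def resolve_metric_ref(ref, top_level_metrics, model_metrics):
    """Resolve `ref` to an (owner, metric_name) pair.

    top_level_metrics: set of top-level metric names (names may contain dots).
    model_metrics: dict mapping model name -> set of local metric names.
    """
    # 1. An exact top-level metric name wins, even when it contains a dot.
    if ref in top_level_metrics:
        return ("<top-level>", ref)

    # 2. Otherwise interpret the reference as `model.metric`.
    if "." in ref:
        model, metric = ref.split(".", 1)
        if metric in model_metrics.get(model, set()):
            return (model, metric)
        raise ValueError(f"unknown metric reference: {ref}")

    # 3. Assumed fallback: a bare name must belong to exactly one model.
    owners = sorted(m for m, names in model_metrics.items() if ref in names)
    if len(owners) != 1:
        raise ValueError(f"cannot infer a single owner for {ref!r}: {owners}")
    return (owners[0], ref)
```

Under this sketch, a bare `total_revenue` defined by two models raises an error, while `orders.total_revenue` resolves unambiguously, which is the reason the docs recommend the qualified form.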
@@ -94,6 +120,12 @@ At least one of `table`, `sql`, or `source_uri` should be present unless the mod
 | `pre_aggregations` | No | List of pre-aggregation definitions. |
 | `default_time_dimension` | No | Time dimension to add by default when the query needs time grouping. |
 | `default_grain` | No | Default time grain for the default time dimension. |
+| `auto_dimensions` | No | Python auto-discovery flag. Rust accepts `false` for compatibility and rejects `true` because it does not perform schema discovery. |
+
+Canonical CLI-authored files should use `metrics` and `sql`. The native loaders
+also accept compatibility input aliases: model-level `measures` for `metrics`,
+dimension/metric `expr` for `sql`, and metric `measure` for `sql`. Exports use
+canonical field names.
 
 Single-column primary key:

@@ -332,8 +364,21 @@ relationships:
 | `primary_key_columns` | Conditional | Explicit target-column list. |
 | `through` | For many-to-many | Junction model. |
 | `through_foreign_key` | For many-to-many | Source-to-through key. |
+| `through_foreign_key_columns` | For many-to-many | Explicit source-to-through key columns. |
 | `related_foreign_key` | For many-to-many | Through-to-target key. |
-| `sql` | No | Custom join SQL using runtime placeholders where supported. |
+| `related_foreign_key_columns` | For many-to-many | Explicit through-to-target key columns. |
+| `sql` | No | Custom join SQL using `{from}` and `{to}` runtime placeholders. |
+
+For CLI-authored native files, prefer explicit `foreign_key` and `primary_key`
+fields. Omitted keys are still supported for compatibility: `many_to_one`
+defaults the source key to `{name}_id`, while `one_to_many` and `one_to_one`
+default the related-side key to `id`; omitted `primary_key` resolves to the
+target model's declared primary key when building graph joins.
+
+When `sql` is present, Python and Rust use it instead of the FK/PK-generated
+predicate. `{from}` is replaced with the source model's runtime alias and `{to}`
+with the target model's runtime alias. Reverse graph traversal swaps the
+placeholders automatically.
 
 Relationship types:
 
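
The relationship notes above describe two ways a join predicate is produced: a default built from key-naming conventions, and a custom `sql` string with `{from}` and `{to}` placeholders. The sketch below illustrates both for the `many_to_one` case. The function names and the `*_cte` aliases are hypothetical, and the `reverse` flag is only one possible way to model "reverse graph traversal swaps the placeholders"; neither runtime's actual join builder is shown here.

```python
def default_many_to_one_predicate(rel_name, source_alias, target_alias, target_primary_key):
    """FK/PK predicate when keys are omitted: source key defaults to `{name}_id`."""
    return f"{source_alias}.{rel_name}_id = {target_alias}.{target_primary_key}"


def render_custom_join(sql, left_alias, right_alias, reverse=False):
    """Fill `{from}` / `{to}` in a custom join predicate.

    `left_alias` and `right_alias` follow traversal order. When the edge is
    walked in reverse, the relationship's source model is on the right, so the
    aliases are swapped before substitution.
    """
    source_alias, target_alias = (
        (right_alias, left_alias) if reverse else (left_alias, right_alias)
    )
    return sql.replace("{from}", source_alias).replace("{to}", target_alias)


custom = "{from}.customer_id = {to}.id AND {to}.is_active"
forward = render_custom_join(custom, "orders_cte", "customers_cte")
backward = render_custom_join(custom, "customers_cte", "orders_cte", reverse=True)
fallback = default_many_to_one_predicate("customer", "orders_cte", "customers_cte", "id")
```

Both traversal directions render the same predicate in this sketch, because `{from}` always refers to the model that declares the relationship.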

docs/runtime-feature-matrix.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ This matrix documents current product support for native Sidemantic projects. It
 | Conversion metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Retention metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Cohort metrics | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
-| Table calculations | Post-query processing | Yes, Rust fixture-covered compile and Rust-only result coverage | No dedicated fixture yet | No dedicated fixture yet |
+| Table calculations | Yes, shared fixture post-query result parity | Yes, shared fixture SQL/window result parity | No dedicated fixture yet | No dedicated fixture yet |
 | Pre-aggregation routing | Yes | Yes, fixture-covered compile | No dedicated fixture yet | No dedicated fixture yet |
 | Semantic SQL rewrite | Yes | Native subset, fixture-covered | Native subset target | Narrow subset |
 | DuckDB execution | Yes | Via ADBC, fixture result parity in CI | Native DuckDB process | No |

docs/rust-runtime.md

Lines changed: 1 addition & 1 deletion
@@ -105,4 +105,4 @@ cd sidemantic-rs && cargo test --test native_fixtures
 
 CI runs these in the `Native Compatibility` job.
 
-The shared fixture suite currently includes executable coverage for basic models, joins, fanout-safe symmetric aggregation, many-to-many joins, parameters in filters, embedded SQL definitions, SQL frontmatter definitions, default time dimensions, segments, derived/ratio metrics, and pre-aggregation routing. Table calculations have Rust-only DuckDB result coverage because Python does not accept `table_calculations` in the native query API yet. `source_uri` is covered as a validation-only load fixture and query compilation rejects it until a concrete table or SQL source is provided.
+The shared fixture suite currently includes executable coverage for basic models, joins, fanout-safe symmetric aggregation, many-to-many joins, parameters in filters, embedded SQL definitions, SQL frontmatter definitions, default time dimensions, segments, derived/ratio metrics, table calculations, and pre-aggregation routing. `source_uri` is covered as a validation-only load fixture and query compilation rejects it until a concrete table or SQL source is provided.

scripts/generate_schema.py

Lines changed: 45 additions & 0 deletions
@@ -2,11 +2,54 @@
 """Generate JSON Schema from Pydantic models for YAML editor support."""
 
 import json
+from copy import deepcopy
 from pathlib import Path
 
 from sidemantic import Dimension, Metric, Model, Parameter, Relationship, Segment
 
 
+def add_native_relationship_aliases(schema: dict) -> dict:
+    """Expose native YAML relationship aliases that map to Python API fields."""
+    properties = schema.setdefault("properties", {})
+
+    if "foreign_key" in properties and "foreign_key_columns" not in properties:
+        foreign_key_columns = deepcopy(properties["foreign_key"])
+        foreign_key_columns["title"] = "Foreign Key Columns"
+        foreign_key_columns["description"] = "Explicit source-column list (alias for foreign_key)"
+        properties["foreign_key_columns"] = foreign_key_columns
+
+    if "primary_key" in properties and "primary_key_columns" not in properties:
+        primary_key_columns = deepcopy(properties["primary_key"])
+        primary_key_columns["title"] = "Primary Key Columns"
+        primary_key_columns["description"] = "Explicit target-column list (alias for primary_key)"
+        properties["primary_key_columns"] = primary_key_columns
+
+    if "sql" not in properties:
+        properties["sql"] = {
+            "anyOf": [{"type": "string"}, {"type": "null"}],
+            "default": None,
+            "description": "Custom join SQL using {from} and {to} runtime placeholders",
+            "title": "Sql",
+        }
+
+    return schema
+
+
+def patch_relationship_schemas(schema: dict) -> None:
+    """Patch every embedded Relationship schema emitted by Pydantic."""
+    if not isinstance(schema, dict):
+        return
+    if schema.get("title") == "Relationship":
+        add_native_relationship_aliases(schema)
+    for value in schema.values():
+        if isinstance(value, dict):
+            patch_relationship_schemas(value)
+        elif isinstance(value, list):
+            for item in value:
+                if isinstance(item, dict):
+                    patch_relationship_schemas(item)
+
+
 def generate_schema() -> dict:
     """Generate JSON Schema for sidemantic YAML files."""
     # Get schemas from pydantic models
@@ -51,6 +94,8 @@ def generate_schema() -> dict:
         },
     }
 
+    patch_relationship_schemas(schema)
+
     return schema
 
 

sidemantic-rs/src/config/loader.rs

Lines changed: 111 additions & 42 deletions
@@ -15,7 +15,7 @@ use crate::error::{Result, SidemanticError};
 
 use super::schema::{CubeConfig, SidemanticConfig, NATIVE_FORMAT_VERSION};
 use super::sql_parser::{
-    parse_sql_definitions, parse_sql_graph_definitions_extended, parse_sql_model,
+    parse_sql_definitions, parse_sql_graph_definitions_extended, parse_sql_models,
 };
 
 #[derive(Debug)]
@@ -226,52 +226,72 @@ fn model_from_sql_frontmatter(frontmatter: serde_yaml::Mapping) -> Result<Model>
 }
 
 fn parse_sql_content(content: &str) -> Result<ParsedConfig> {
-    let has_model_statement = {
-        let upper = content.to_ascii_uppercase();
-        upper.contains("MODEL") && upper.contains("MODEL (")
-    };
-
     let mut models: Vec<Model> = Vec::new();
     let mut top_level_metrics: Vec<Metric> = Vec::new();
     let mut top_level_parameters: Vec<Parameter> = Vec::new();
 
-    if has_model_statement {
-        let model = parse_sql_model(content).map_err(|e| {
-            SidemanticError::Validation(format!("failed to parse SQL model statement: {e}"))
-        })?;
-        let model_metric_names: HashSet<String> = model
-            .metrics
-            .iter()
-            .map(|metric| metric.name.clone())
-            .collect();
-        models.push(model);
-
-        let (sql_metrics, _, sql_parameters, _) = parse_sql_graph_definitions_extended(content)
-            .map_err(|e| {
-                SidemanticError::Validation(format!("failed to parse SQL graph definitions: {e}"))
-            })?;
-        for metric in sql_metrics {
-            if !model_metric_names.contains(&metric.name) {
-                top_level_metrics.push(metric);
+    match parse_sql_models(content) {
+        Ok(parsed_models) => {
+            let model_metric_names: HashSet<String> = parsed_models
+                .iter()
+                .flat_map(|model| model.metrics.iter().map(|metric| metric.name.clone()))
+                .collect();
+            models.extend(parsed_models);
+
+            let graph_definitions = parse_sql_graph_definitions_extended(content);
+            let (sql_metrics, _, sql_parameters, _) = match graph_definitions {
+                Ok(definitions) => definitions,
+                Err(_)
+                    if content
+                        .trim_start()
+                        .to_ascii_lowercase()
+                        .starts_with("model ")
+                        && content.to_ascii_lowercase().contains(" from ") =>
+                {
+                    (Vec::new(), Vec::new(), Vec::new(), Vec::new())
+                }
+                Err(err) => {
+                    return Err(SidemanticError::Validation(format!(
+                        "failed to parse SQL graph definitions: {err}"
+                    )));
+                }
+            };
+            for metric in sql_metrics {
+                if !model_metric_names.contains(&metric.name) {
+                    top_level_metrics.push(metric);
+                }
             }
+            top_level_parameters.extend(sql_parameters);
         }
-        top_level_parameters.extend(sql_parameters);
-    } else {
-        let (frontmatter, sql_body) = parse_sql_frontmatter_and_body(content)?;
-        let (sql_metrics, sql_segments, sql_parameters, sql_preaggs) =
-            parse_sql_graph_definitions_extended(&sql_body).map_err(|e| {
-                SidemanticError::Validation(format!("failed to parse SQL graph definitions: {e}"))
-            })?;
-        top_level_parameters.extend(sql_parameters);
-
-        if let Some(frontmatter) = frontmatter {
-            let mut model = model_from_sql_frontmatter(frontmatter)?;
-            model.metrics.extend(sql_metrics);
-            model.segments.extend(sql_segments);
-            model.pre_aggregations.extend(sql_preaggs);
-            models.push(model);
-        } else {
-            top_level_metrics.extend(sql_metrics);
+        Err(model_err)
+            if content
+                .trim_start()
+                .to_ascii_lowercase()
+                .starts_with("model ") =>
+        {
+            return Err(SidemanticError::Validation(format!(
+                "failed to parse SQL model statement: {model_err}"
+            )));
+        }
+        Err(_) => {
+            let (frontmatter, sql_body) = parse_sql_frontmatter_and_body(content)?;
+            let (sql_metrics, sql_segments, sql_parameters, sql_preaggs) =
+                parse_sql_graph_definitions_extended(&sql_body).map_err(|e| {
+                    SidemanticError::Validation(format!(
+                        "failed to parse SQL graph definitions: {e}"
+                    ))
+                })?;
+            top_level_parameters.extend(sql_parameters);
+
+            if let Some(frontmatter) = frontmatter {
+                let mut model = model_from_sql_frontmatter(frontmatter)?;
+                model.metrics.extend(sql_metrics);
+                model.segments.extend(sql_segments);
+                model.pre_aggregations.extend(sql_preaggs);
+                models.push(model);
+            } else {
+                top_level_metrics.extend(sql_metrics);
+            }
         }
     }
 
@@ -345,10 +365,14 @@ pub fn load_from_sql_string_with_metadata(content: &str) -> Result<LoadedGraphMe
 ///
 /// This function:
 /// 1. Recursively finds all `.yml`/`.yaml`/`.sql` files
-/// 2. Auto-detects format (Sidemantic vs Cube.js)
+/// 2. Auto-detects only native Sidemantic YAML vs Cube YAML for YAML files
 /// 3. Parses and collects all models
 /// 4. Infers relationships from FK naming conventions
 /// 5. Returns a unified SemanticGraph
+///
+/// External formats supported by the Python package (LookML, MetricFlow, Hex,
+/// Rill, Malloy, and similar) must be converted to native YAML/SQL before using
+/// the Rust runtime loader.
 pub fn load_from_directory(dir: impl AsRef<Path>) -> Result<SemanticGraph> {
     Ok(load_from_directory_with_metadata(dir)?.graph)
 }
@@ -1001,7 +1025,9 @@ fn infer_relationships(models: &mut HashMap<String, Model>) {
                 primary_key_columns: Some(target_primary_keys.clone()),
                 through: None,
                 through_foreign_key: None,
+                through_foreign_key_columns: None,
                 related_foreign_key: None,
+                related_foreign_key_columns: None,
                 sql: None,
                 metadata: None,
             },
@@ -1019,7 +1045,9 @@ fn infer_relationships(models: &mut HashMap<String, Model>) {
                 primary_key_columns: Some(target_primary_keys),
                 through: None,
                 through_foreign_key: None,
+                through_foreign_key_columns: None,
                 related_foreign_key: None,
+                related_foreign_key_columns: None,
                 sql: None,
                 metadata: None,
             },
@@ -1218,6 +1246,47 @@ METRIC (
             .contains("Unsupported native Sidemantic format version 2; supported version is 1"));
     }
 
+    #[test]
+    fn test_load_from_sql_string_supports_compact_model_syntax() {
+        let sql = r#"
+model orders from orders (
+    primary key (order_id)
+    status
+    sum(amount) as revenue
+)
+"#;
+
+        let loaded = load_from_sql_string_with_metadata(sql).unwrap();
+        let orders = loaded.graph.get_model("orders").unwrap();
+        assert!(orders.get_dimension("status").is_some());
+        assert!(orders.get_metric("revenue").is_some());
+    }
+
+    #[test]
+    fn test_load_from_sql_string_keeps_multiple_legacy_models_separate() {
+        let sql = r#"
+MODEL (name orders, table orders, primary_key order_id);
+METRIC order_count AS COUNT(*);
+
+MODEL (name customers, table customers, primary_key customer_id);
+METRIC customer_count AS COUNT(*);
+"#;
+
+        let loaded = load_from_sql_string_with_metadata(sql).unwrap();
+
+        let orders = loaded.graph.get_model("orders").unwrap();
+        assert!(orders.get_metric("order_count").is_some());
+        assert!(orders.get_metric("customer_count").is_none());
+
+        let customers = loaded.graph.get_model("customers").unwrap();
+        assert!(customers.get_metric("customer_count").is_some());
+        assert!(customers.get_metric("order_count").is_none());
+        assert_eq!(
+            loaded.model_order,
+            vec!["orders".to_string(), "customers".to_string()]
+        );
+    }
+
     #[test]
     fn test_sql_frontmatter_version_is_not_model_metadata() {
         let sql = r#"

sidemantic-rs/src/config/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -15,5 +15,5 @@ pub use loader::{
 pub use schema::{CubeConfig, ModelConfig, SidemanticConfig};
 pub use sql_parser::{
     parse_sql_definitions, parse_sql_graph_definitions, parse_sql_graph_definitions_extended,
-    parse_sql_model, parse_sql_statement_blocks,
+    parse_sql_model, parse_sql_models, parse_sql_statement_blocks,
 };
