Commit cb53745

feat: semantic layer extension (#37815)
1 parent 9e91ae8 commit cb53745

141 files changed

Lines changed: 18849 additions & 665 deletions

.github/workflows/superset-python-unittest.yml

Lines changed: 1 addition & 0 deletions
@@ -54,6 +54,7 @@ jobs:
           SUPERSET_SECRET_KEY: not-a-secret
         run: |
           pytest --durations-min=0.5 --cov=superset/sql/ ./tests/unit_tests/sql/ --cache-clear --cov-fail-under=100
+          pytest --durations-min=0.5 --cov=superset/semantic_layers/ ./tests/unit_tests/semantic_layers/ --cache-clear --cov-fail-under=100
       - name: Upload code coverage
         uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
         with:

UPDATING.md

Lines changed: 7 additions & 0 deletions
@@ -46,6 +46,13 @@ The Deck.gl MapBox chart's **Opacity**, **Default longitude**, **Default latitud
 
 **To restore fit-to-data behavior:** Open the chart in Explore, clear the **Default longitude**, **Default latitude**, and **Zoom** fields in the Viewport section, and re-save the chart.
 
+### Combined datasource list endpoint
+
+Added a new combined datasource list endpoint at `GET /api/v1/datasource/` to serve datasets and semantic views in one response.
+
+- The endpoint is available to users with at least one of `can_read` on `Dataset` or `SemanticView`.
+- Semantic views are included only when the `SEMANTIC_LAYERS` feature flag is enabled.
+- The endpoint enforces strict `order_column` validation and returns `400` for invalid sort columns.
 ### ClickHouse minimum driver version bump
 
 The minimum required version of `clickhouse-connect` has been raised to `>=0.13.0`. If you are using the ClickHouse connector, please upgrade your `clickhouse-connect` package. The `_mutate_label` workaround that appended hash suffixes to column aliases has also been removed, as it is no longer needed with modern versions of the driver.

docker/pythonpath_dev/superset_config.py

Lines changed: 7 additions & 1 deletion
@@ -105,7 +105,13 @@ class CeleryConfig:
 
 CELERY_CONFIG = CeleryConfig
 
-FEATURE_FLAGS = {"ALERT_REPORTS": True, "DATASET_FOLDERS": True}
+FEATURE_FLAGS = {
+    "ALERT_REPORTS": True,
+    "DATASET_FOLDERS": True,
+    "ENABLE_EXTENSIONS": True,
+    "SEMANTIC_LAYERS": True,
+}
+EXTENSIONS_PATH = "/app/docker/extensions"
 ALERT_REPORTS_NOTIFICATION_DRY_RUN = True
 WEBDRIVER_BASEURL = f"http://superset_app{os.environ.get('SUPERSET_APP_ROOT', '/')}/" # When using docker compose baseurl should be http://superset_nginx{ENV{BASEPATH}}/ # noqa: E501
 # The base URL for the email report hyperlinks.

docs/developer_docs/extensions/contribution-types.md

Lines changed: 49 additions & 0 deletions
@@ -224,3 +224,52 @@ async def analysis_guide(ctx: Context) -> str:
 ```
 
 See [MCP Integration](./mcp) for implementation details.
+
+### Semantic Layers
+
+Extensions can register custom semantic layer implementations that allow Superset to connect to external data modeling frameworks. Each semantic layer defines how to authenticate, discover semantic views (tables/metrics/dimensions), and execute queries against the external system.
+
+```python
+from superset_core.semantic_layers.decorators import semantic_layer
+from superset_core.semantic_layers.layer import SemanticLayer
+
+from my_extension.config import MyConfig
+from my_extension.view import MySemanticView
+
+
+@semantic_layer(
+    id="my_platform",
+    name="My Data Platform",
+    description="Connect to My Data Platform's semantic layer",
+)
+class MySemanticLayer(SemanticLayer[MyConfig, MySemanticView]):
+    configuration_class = MyConfig
+
+    @classmethod
+    def from_configuration(cls, configuration: dict) -> "MySemanticLayer":
+        config = MyConfig.model_validate(configuration)
+        return cls(config)
+
+    @classmethod
+    def get_configuration_schema(cls, configuration=None) -> dict:
+        return MyConfig.model_json_schema()
+
+    @classmethod
+    def get_runtime_schema(cls, configuration=None, runtime_data=None) -> dict:
+        return {"type": "object", "properties": {}}
+
+    def get_semantic_views(self, runtime_configuration: dict) -> set[MySemanticView]:
+        # Return available views from the external platform
+        ...
+
+    def get_semantic_view(self, name: str, additional_configuration: dict) -> MySemanticView:
+        # Return a specific view by name
+        ...
+```
+
+**Note**: The `@semantic_layer` decorator automatically detects context and applies appropriate ID prefixing:
+
+- **Extension context**: ID prefixed as `extensions.{publisher}.{name}.{id}`
+- **Host context**: Original ID used as-is
+
+The decorator registers the class in the semantic layers registry, making it available in the UI for users to create connections. The `configuration_class` should be a Pydantic model that defines the fields needed to connect (credentials, project, database, etc.). Superset uses the model's JSON schema to render the configuration form dynamically.
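
The ID prefixing described in the note above can be sketched as a small registry decorator. This is a minimal illustration, not the `superset_core` implementation: `REGISTRY`, `ExtensionContext`, `resolve_id`, `register_layer`, and the way the context is passed in explicitly are all hypothetical names chosen for the example. Only the `extensions.{publisher}.{name}.{id}` format comes from the documentation in this diff.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical registry: maps the final (possibly prefixed) ID to the class.
REGISTRY: Dict[str, type] = {}


@dataclass(frozen=True)
class ExtensionContext:
    """Hypothetical stand-in for whatever identifies the running extension."""

    publisher: str
    name: str


def resolve_id(layer_id: str, context: Optional[ExtensionContext]) -> str:
    """Prefix the ID in extension context; keep it unchanged in host context."""
    if context is None:
        return layer_id
    return f"extensions.{context.publisher}.{context.name}.{layer_id}"


def register_layer(
    id: str, context: Optional[ExtensionContext] = None
) -> Callable[[type], type]:
    """Hypothetical decorator that stores the class under its resolved ID."""

    def decorator(cls: type) -> type:
        REGISTRY[resolve_id(id, context)] = cls
        return cls

    return decorator


# Extension context: the ID is namespaced by publisher and extension name.
@register_layer("my_platform", context=ExtensionContext("acme", "analytics"))
class ExtensionLayer:
    pass


# Host context: the ID is registered as written.
@register_layer("my_platform")
class HostLayer:
    pass
```

Namespacing the extension ID this way lets an extension and the host both use a short ID such as `my_platform` without colliding in the registry.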

docs/static/feature-flags.json

Lines changed: 6 additions & 0 deletions
@@ -81,6 +81,12 @@
         "lifecycle": "development",
         "description": "Expand nested types in Presto into extra columns/arrays. Experimental, doesn't work with all nested types."
       },
+      {
+        "name": "SEMANTIC_LAYERS",
+        "default": false,
+        "lifecycle": "development",
+        "description": "Enable semantic layers and show semantic views alongside datasets"
+      },
       {
         "name": "TABLE_V2_TIME_COMPARISON_ENABLED",
         "default": false,

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -288,6 +288,7 @@ module = [
     "superset.tags.filters",
     "superset.commands.security.update",
     "superset.commands.security.create",
+    "superset.semantic_layers.api",
 ]
 warn_unused_ignores = false

superset-core/pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -43,6 +43,8 @@ classifiers = [
 ]
 dependencies = [
     "flask-appbuilder>=5.0.2,<6",
+    "isodate>=0.7.0",
+    "pyarrow>=16.0.0",
     "pydantic>=2.8.0",
     "sqlalchemy>=1.4.0,<2.0",
     "sqlalchemy-utils>=0.38.0, <0.43", # expanding lowerbound to work with pydoris
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from typing import Any
+
+from pydantic import BaseModel
+
+
+def build_configuration_schema(
+    config_class: type[BaseModel],
+    configuration: BaseModel | None = None,
+) -> dict[str, Any]:
+    """
+    Build a JSON schema from a Pydantic configuration class.
+
+    Handles generic boilerplate that any semantic layer with dynamic fields needs:
+
+    - Reorders properties to match model field order (Pydantic sorts alphabetically)
+    - When ``configuration`` is None, sets ``enum: []`` on all ``x-dynamic`` properties
+      so the frontend renders them as empty dropdowns
+
+    Semantic layer implementations call this instead of
+    ``model_json_schema()`` directly,
+    then only need to add their own dynamic population logic.
+    """
+    schema = config_class.model_json_schema()
+
+    # Pydantic sorts properties alphabetically; restore model field order
+    field_order = [
+        field.alias or name for name, field in config_class.model_fields.items()
+    ]
+    schema["properties"] = {
+        key: schema["properties"][key]
+        for key in field_order
+        if key in schema["properties"]
+    }
+
+    if configuration is None:
+        for prop_schema in schema["properties"].values():
+            if prop_schema.get("x-dynamic"):
+                prop_schema["enum"] = []
+
+    return schema
+
+
+def check_dependencies(
+    prop_schema: dict[str, Any],
+    configuration: BaseModel,
+) -> bool:
+    """
+    Check whether a dynamic property's dependencies are satisfied.
+
+    Reads the ``x-dependsOn`` list from the property schema and returns ``True``
+    when every referenced attribute on ``configuration`` is truthy.
+    """
+    dependencies = prop_schema.get("x-dependsOn", [])
+    return all(getattr(configuration, dep, None) for dep in dependencies)
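
To see how the `x-dynamic` and `x-dependsOn` extension keys interact, the sketch below applies the same logic to a hand-written schema dict. It avoids importing Pydantic so that it runs standalone: `check_dependencies` is copied from the file above with the `configuration` type loosened to `Any`, while `blank_dynamic_enums`, the `project` and `dataset` property names, and the `SimpleNamespace` configuration objects are hypothetical stand-ins.

```python
from types import SimpleNamespace
from typing import Any, Dict


def blank_dynamic_enums(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror the ``configuration is None`` branch: empty every dynamic enum."""
    for prop_schema in schema["properties"].values():
        if prop_schema.get("x-dynamic"):
            prop_schema["enum"] = []
    return schema


def check_dependencies(prop_schema: Dict[str, Any], configuration: Any) -> bool:
    """Return True when every ``x-dependsOn`` attribute is truthy."""
    dependencies = prop_schema.get("x-dependsOn", [])
    return all(getattr(configuration, dep, None) for dep in dependencies)


# Hypothetical schema: "dataset" is dynamic and depends on "project".
schema = {
    "type": "object",
    "properties": {
        "project": {"type": "string"},
        "dataset": {"type": "string", "x-dynamic": True, "x-dependsOn": ["project"]},
    },
}

blank_dynamic_enums(schema)
dataset_schema = schema["properties"]["dataset"]

# No project chosen yet: the dependency is unmet.
unmet = check_dependencies(dataset_schema, SimpleNamespace(project=""))
# Project chosen: a layer could now populate the dataset options.
met = check_dependencies(dataset_schema, SimpleNamespace(project="sales"))
```

A property without `x-dependsOn` always passes the check, because `all()` over an empty list is `True`.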
Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Semantic layer DAO interfaces for superset-core.
+
+Provides abstract DAO classes for semantic layers and views that define the
+interface contract. Host implementations replace these with concrete classes
+backed by SQLAlchemy during initialization.
+
+Usage:
+    from superset_core.semantic_layers.daos import (
+        AbstractSemanticLayerDAO,
+        AbstractSemanticViewDAO,
+    )
+"""
+
+from __future__ import annotations
+
+from abc import abstractmethod
+from typing import Any, ClassVar
+
+from superset_core.common.daos import BaseDAO
+from superset_core.semantic_layers.models import SemanticLayerModel, SemanticViewModel
+
+
+class AbstractSemanticLayerDAO(BaseDAO[SemanticLayerModel]):
+    """
+    Abstract DAO interface for SemanticLayer.
+
+    Host implementations will replace this class during initialization
+    with a concrete DAO providing actual database access.
+    """
+
+    model_cls: ClassVar[type[Any] | None] = None
+    base_filter = None
+    id_column_name = "uuid"
+    uuid_column_name = "uuid"
+
+    @classmethod
+    @abstractmethod
+    def validate_uniqueness(cls, name: str) -> bool:
+        """
+        Validate that a semantic layer name is unique.
+
+        :param name: Semantic layer name to validate
+        :return: True if the name is unique, False otherwise
+        """
+        ...
+
+    @classmethod
+    @abstractmethod
+    def validate_update_uniqueness(cls, layer_uuid: str, name: str) -> bool:
+        """
+        Validate that a semantic layer name is unique for an update operation,
+        excluding the layer being updated.
+
+        :param layer_uuid: UUID of the semantic layer being updated
+        :param name: New name to validate
+        :return: True if the name is unique, False otherwise
+        """
+        ...
+
+    @classmethod
+    @abstractmethod
+    def find_by_name(cls, name: str) -> SemanticLayerModel | None:
+        """
+        Find a semantic layer by name.
+
+        :param name: Semantic layer name
+        :return: SemanticLayerModel instance or None
+        """
+        ...
+
+    @classmethod
+    @abstractmethod
+    def get_semantic_views(cls, layer_uuid: str) -> list[SemanticViewModel]:
+        """
+        Get all semantic views associated with a semantic layer.
+
+        :param layer_uuid: UUID of the semantic layer
+        :return: List of SemanticViewModel instances
+        """
+        ...
+
+
+class AbstractSemanticViewDAO(BaseDAO[SemanticViewModel]):
+    """
+    Abstract DAO interface for SemanticView.
+
+    Host implementations will replace this class during initialization
+    with a concrete DAO providing actual database access.
+    """
+
+    model_cls: ClassVar[type[Any] | None] = None
+    base_filter = None
+    id_column_name = "id"
+    uuid_column_name = "uuid"
+
+    @classmethod
+    @abstractmethod
+    def validate_uniqueness(
+        cls,
+        name: str,
+        layer_uuid: str,
+        configuration: dict[str, Any],
+    ) -> bool:
+        """
+        Validate that a semantic view is unique within a semantic layer.
+
+        Uniqueness is determined by the combination of name, layer UUID, and
+        configuration.
+
+        :param name: View name
+        :param layer_uuid: UUID of the parent semantic layer
+        :param configuration: Configuration dict to compare
+        :return: True if unique, False otherwise
+        """
+        ...
+
+    @classmethod
+    @abstractmethod
+    def validate_update_uniqueness(
+        cls,
+        view_uuid: str,
+        name: str,
+        layer_uuid: str,
+        configuration: dict[str, Any],
+    ) -> bool:
+        """
+        Validate that a semantic view is unique within a semantic layer for an
+        update operation, excluding the view being updated.
+
+        :param view_uuid: UUID of the view being updated
+        :param name: New name to validate
+        :param layer_uuid: UUID of the parent semantic layer
+        :param configuration: Configuration dict to compare
+        :return: True if unique, False otherwise
+        """
+        ...
+
+    @classmethod
+    @abstractmethod
+    def find_by_name(cls, name: str, layer_uuid: str) -> SemanticViewModel | None:
+        """
+        Find a semantic view by name within a semantic layer.
+
+        :param name: View name
+        :param layer_uuid: UUID of the parent semantic layer
+        :return: SemanticViewModel instance or None
+        """
+        ...
+
+
+__all__ = ["AbstractSemanticLayerDAO", "AbstractSemanticViewDAO"]
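
The difference between `validate_uniqueness` and `validate_update_uniqueness` is easiest to see with a concrete implementation. The sketch below is a hypothetical in-memory stand-in, not the SQLAlchemy-backed DAO that the host provides: `Layer`, `InMemoryLayerDAO`, and its `_rows` store are invented for illustration, and the class deliberately does not subclass `BaseDAO`, so it runs without `superset_core` installed.

```python
from dataclasses import dataclass
from typing import ClassVar, List, Optional


@dataclass
class Layer:
    """Hypothetical row holding only the fields the checks need."""

    uuid: str
    name: str


class InMemoryLayerDAO:
    """Hypothetical DAO that keeps rows in a class-level list."""

    _rows: ClassVar[List[Layer]] = []

    @classmethod
    def find_by_name(cls, name: str) -> Optional[Layer]:
        return next((row for row in cls._rows if row.name == name), None)

    @classmethod
    def validate_uniqueness(cls, name: str) -> bool:
        # Create path: any existing row with this name is a conflict.
        return cls.find_by_name(name) is None

    @classmethod
    def validate_update_uniqueness(cls, layer_uuid: str, name: str) -> bool:
        # Update path: ignore the row being updated so it can keep its name.
        return not any(
            row.name == name and row.uuid != layer_uuid for row in cls._rows
        )


InMemoryLayerDAO._rows = [Layer("u1", "sales"), Layer("u2", "finance")]
```

With these two rows, renaming layer `u1` to `sales` passes the update check because the only match is the layer itself, while renaming it to `finance` fails because that name belongs to `u2`.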
