Commit f37fdef

Replace theme with machine tags in design
ModelKey loses theme and type fields; gains name (from entry point key) and class_name (entry point value). Theme becomes overture:theme=buildings declared in [project].keywords rather than parsed from entry point naming. Introduces machine tag format [namespace:]key[=value] with tags_by_key/tags_by_namespace helpers in system. CLI replaces --theme with generic --group-by <key> that works for any structured tag dimension. Adds codegen path generation appendix: feature models use tags_by_key for theme directory, supplementary types use schema_root tag + module prefix stripping.
1 parent 40ade46 commit f37fdef

1 file changed

Lines changed: 167 additions & 43 deletions

docs/designs/tags.md

@@ -5,46 +5,65 @@ string labels declared by package authors and derived by tag providers. Tags
55
become the filtering and grouping mechanism for model discovery, driven by
66
package authors rather than central coordination.
77

8-
Theme remains a first-class structural field. It maps to data partitioning and
9-
entry point naming, distinct from tags which are descriptive metadata.
8+
Theme becomes a tag (`overture:theme=buildings`), not a structural field.
9+
`system` provides generic key-based grouping without understanding what
10+
"theme" means. Any package can declare theme tags (or any other structured
11+
tag) without special support in the discovery layer.
1012

11-
## Phase 1: Basic Tagging
13+
## Phase 1: Structured Tags
1214

1315
Replace namespace with tags, move discovery to `system`, update CLI.
1416

17+
### Tag Format
18+
19+
Tags are strings following the pattern `[namespace:]key[=value]`:
20+
21+
- **Plain**: `overture`, `draft`
22+
- **Namespaced**: `system:extension` -- `:` separates the namespace (ownership/reservation) from the key
23+
- **Namespaced k/v**: `overture:theme=buildings`, `codegen:schema_root=overture.schema`
24+
25+
`:` signals ownership/reservation. `=` signals a dimension with a value
26+
(groupable via `--group-by`). One level of each -- no nested colons or
27+
multiple `=` signs.
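
A minimal sketch of how the `[namespace:]key[=value]` pattern could be parsed and validated. `ParsedTag` and `parse_tag` are hypothetical names, not part of the design, and rejecting `:` inside a value is an assumption drawn from the "one level of each" rule:

```python
from __future__ import annotations

from typing import NamedTuple


class ParsedTag(NamedTuple):
    """Hypothetical container for the three parts of a tag."""

    namespace: str | None
    key: str
    value: str | None


def parse_tag(tag: str) -> ParsedTag:
    """Split a tag of the form [namespace:]key[=value] into its parts."""
    # Split off the value first, so the namespace split only sees the head.
    head, eq, value = tag.partition("=")
    namespace, colon, key = head.rpartition(":")

    # One level of each separator: no nested colons, no repeated '='.
    if ":" in namespace or ":" in value or "=" in value:
        raise ValueError(f"nested ':' or repeated '=' in tag: {tag!r}")
    # A separator must have something on both sides.
    if not key or (colon and not namespace) or (eq and not value):
        raise ValueError(f"empty namespace, key, or value in tag: {tag!r}")

    return ParsedTag(namespace or None, key, value if eq else None)
```

With this sketch, `parse_tag("overture:theme=buildings")` yields namespace `overture`, key `theme`, and value `buildings`, while a plain tag such as `draft` has only a key.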
28+
1529
### Data Model
1630

17-
`ModelKey` drops `namespace`, gains `tags`:
31+
`ModelKey` drops `namespace` and `theme`, gains `name`:
1832

1933
```python
2034
@dataclass(frozen=True, slots=True)
2135
class ModelKey:
22-
theme: str | None
23-
type: str
24-
tags: frozenset[str]
25-
class_name: str
36+
name: str # friendly name from entry point key
37+
class_name: str # entry point value (module:Class)
38+
tags: frozenset[str] # plain and structured tags
2639
```
2740

41+
No `namespace` field, no `theme` field. Both are tags.
42+
43+
`name` is the entry point key (before any `#` suffix). It serves as the
44+
model's friendly identifier in CLI output and codegen path generation.
45+
2846
### Entry Point Format
2947

30-
Names change from `namespace:theme:type` to `theme:type` (or just `type` for
31-
non-themed models), with optional `#tag1,tag2` suffix. The group remains
48+
Entry point keys are `name` or `name#tag1,tag2`. No `theme:type` prefix --
49+
theme is a package-level keyword. The entry point group remains
3250
`overture.models`.
3351

3452
```toml
3553
[project]
36-
keywords = ["some-org-tag"]
54+
keywords = ["overture:theme=buildings"]
3755

3856
[project.entry-points."overture.models"]
39-
"buildings:building" = "overture.schema.buildings:Building"
40-
"buildings:building_part#draft" = "overture.schema.buildings:BuildingPart"
57+
building = "overture.schema.buildings:Building"
58+
"building_part#draft" = "overture.schema.buildings:BuildingPart"
4159
```
4260

4361
A model's tags are the union of:
4462

4563
- `[project].keywords` from the distribution metadata (read via
4664
`entry_point.dist.metadata["Keywords"]`)
4765
- Per-model `#` tags from the entry point name
66+
- Tag providers (Phase 2)
4867

4968
Note: `[project].keywords` is also used for PyPI search, so schema tags and
5069
PyPI keywords share a namespace. This was considered and accepted -- in
@@ -54,9 +73,35 @@ needed.
5473

5574
### Tag Prefix Reservation
5675

57-
The `system:` prefix is reserved for tag providers. Discovery rejects
58-
author-declared tags (keywords and `#`) that start with `system:`. Static
59-
sources containing `system:*` tags produce an error (or warning + discard).
76+
The `system:` prefix is reserved for tag providers registered by
77+
`overture-schema-system`. Discovery rejects author-declared tags (keywords
78+
and `#`) that start with `system:`. Static sources containing `system:*` tags
79+
produce an error (or warning + discard).
80+
81+
All other prefixes are convention-based with no enforcement.
82+
83+
### Tag Parsing Helpers
84+
85+
`system` provides utilities for working with structured tags:
86+
87+
```python
88+
def tags_by_key(tags: frozenset[str], key: str) -> set[str]:
89+
"""Extract values for k/v tags with the given key.
90+
91+
tags_by_key(frozenset({"overture:theme=buildings", "overture", "draft"}), "overture:theme")
92+
-> {"buildings"}
93+
"""
94+
95+
def tags_by_namespace(tags: frozenset[str], namespace: str) -> set[str]:
96+
"""Extract tag bodies within a namespace.
97+
98+
tags_by_namespace(frozenset({"system:extension", "overture"}), "system")
99+
-> {"extension"}
100+
"""
101+
```
102+
103+
`tags_by_key` is the primitive that CLI `--group-by` and codegen path
104+
generation build on. Both live in `system` alongside `ModelKey`.
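
The design gives only signatures and docstring examples for these helpers. One possible implementation, assuming simple prefix matching on the raw tag strings (the actual implementation may differ):

```python
def tags_by_key(tags: frozenset[str], key: str) -> set[str]:
    """Return the values of all k/v tags whose [namespace:]key equals `key`."""
    prefix = f"{key}="
    return {tag[len(prefix):] for tag in tags if tag.startswith(prefix)}


def tags_by_namespace(tags: frozenset[str], namespace: str) -> set[str]:
    """Return everything after `namespace:` for tags in that namespace."""
    prefix = f"{namespace}:"
    return {tag[len(prefix):] for tag in tags if tag.startswith(prefix)}
```

Under this reading, the "body" returned by `tags_by_namespace` keeps any `=value` part, so `overture:theme=buildings` in namespace `overture` yields `theme=buildings`.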
60105

61106
### Discovery
62107

@@ -72,51 +117,61 @@ intersect the filter set are included. When `None`, all models are returned.
72117
Implementation:
73118

74119
1. Iterate `overture.models` entry points
75-
2. Split name on `#` to separate `theme:type` from per-model tags
76-
3. Split the prefix on `:` -- one part = type only (theme is `None`), two
77-
parts = theme + type
78-
4. Load the model class via `entry_point.load()`
79-
5. Read `entry_point.dist.metadata["Keywords"]` for package-level tags
80-
6. Union package tags and per-model tags, rejecting any `system:*` tags
81-
7. Build `ModelKey` with `frozenset` tags
120+
2. Split name on `#` to separate `name` from per-model tags
121+
3. Load the model class via `entry_point.load()`
122+
4. Read `entry_point.dist.metadata["Keywords"]` for package-level tags
123+
5. Union package tags and per-model tags, rejecting any `system:*` tags
124+
6. Run tag providers (Phase 2)
125+
7. Build `ModelKey(name=name, class_name=entry_point.value, tags=frozenset(tags))`
82126

83127
### CLI
84128

85-
`--namespace` and `--overture-types` are removed. `--tag` (repeatable) replaces
86-
them. `--theme` and `--type` remain.
129+
`--namespace`, `--overture-types`, and `--theme` are removed, replaced by:
130+
131+
- `--tag <tag>` (repeatable, OR within tags, AND across other dimensions)
132+
- `--group-by <key>` -- group output by values of matching `[ns:]key=*` tags
87133

88134
Repeated `--tag` flags use OR: `--tag foo --tag bar` matches models with
89-
either tag. Repeated `--theme` flags also use OR. Filters compose as AND
90-
across dimensions: `--tag foo --theme buildings` takes the OR result from tags
91-
and intersects it with the OR result from themes.
135+
either tag.
92136

93-
`list-types` groups by theme and displays tags per model:
137+
`list-types` with `--group-by` groups by the specified tag key and displays
138+
tags per model:
94139

95140
```
141+
$ overture-schema list-types --group-by overture:theme
142+
96143
buildings
97144
building overture
98145
building_part overture, draft
146+
99147
places
100148
place overture
101-
transportation
102-
connector overture
103-
segment overture
104-
(unthemed)
149+
150+
(ungrouped)
105151
sources overture
106152
```
107153

154+
Models without a matching tag appear in "(ungrouped)".
155+
156+
`--group-by` is generic: `--group-by acme:category` works if packages use
157+
`acme:category=widgets` tags. Default behavior (no `--group-by`): flat list
158+
with tags column.
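
A rough sketch of the filtering and grouping semantics described in this section, using a plain mapping from model name to tags. `filter_by_tags` and `group_by_key` are hypothetical names, and listing a model under every value it declares for the key is an assumption the design does not spell out:

```python
from collections import defaultdict

UNGROUPED = "(ungrouped)"


def filter_by_tags(
    models: dict[str, frozenset[str]], wanted: set[str]
) -> dict[str, frozenset[str]]:
    """Keep models carrying at least one wanted tag (OR); no filter keeps all."""
    if not wanted:
        return dict(models)
    return {name: tags for name, tags in models.items() if tags & wanted}


def group_by_key(models: dict[str, frozenset[str]], key: str) -> dict[str, list[str]]:
    """Group model names by the values of their `key=value` tags."""
    prefix = f"{key}="
    groups: dict[str, list[str]] = defaultdict(list)
    for name, tags in sorted(models.items()):
        values = sorted(tag[len(prefix):] for tag in tags if tag.startswith(prefix))
        # Models without a matching tag fall into the "(ungrouped)" bucket.
        for value in values or [UNGROUPED]:
            groups[value].append(name)
    return dict(groups)
```

Because `group_by_key` only looks for the `key=` prefix, `--group-by acme:category` needs no extra code path beyond passing a different key.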
159+
108160
### Migration
109161

110-
Existing `namespace:theme:type` entry points become `theme:type` (or `type`).
111-
The `namespace` field is removed from `ModelKey`. The `overture` namespace
112-
does not need replacement as a keyword or `#` tag -- Phase 2 introduces a tag
113-
provider in `core` that derives the `overture` tag from `OvertureFeature`
114-
inheritance. The `annex` namespace has no equivalent; annex packages either
115-
declare their own keywords or rely on tag providers.
162+
Existing `namespace:theme:type` entry points become `name` (just the type
163+
name). The `namespace` and `theme` fields are removed from `ModelKey`. Theme
164+
moves to `overture:theme=X` in `[project].keywords` on each theme package.
165+
166+
The `overture` namespace does not need replacement as a keyword or `#` tag --
167+
Phase 2 introduces a tag provider in `core` that derives the `overture` tag
168+
from `OvertureFeature` inheritance. The `annex` namespace has no equivalent;
169+
annex packages either declare their own keywords or rely on tag providers.
116170

117171
### What Moves
118172

119173
- `discover_models`, `ModelKey` move to `overture.schema.system.discovery`
174+
- Tag parsing helpers (`tags_by_key`, `tags_by_namespace`) live in `system`
120175
- CLI imports from `system` directly
121176

122177
---
@@ -194,6 +249,10 @@ def overture_provider(
194249
return tags
195250
```
196251

252+
This provider does NOT add `overture:theme=X` -- theme comes from package
253+
keywords. The provider adds only the flat `overture` tag based on
254+
inheritance.
255+
197256
Theme packages do not need `"overture"` in `[project].keywords`. The tag is
198257
derived from `OvertureFeature` inheritance by this provider.
199258

@@ -246,9 +305,9 @@ def extension_provider(
246305
```
247306

248307
Package authors register extensions as regular `overture.models` entry points.
249-
Extensions follow the same naming rules as other models: `theme:type` for
250-
themed extensions, or just `type` for unthemed ones. The `system:extension`
251-
tag is derived automatically; authors never declare it.
308+
Extensions follow the same naming rules as other models: `name` or
309+
`name#tag1,tag2`. The `system:extension` tag is derived automatically; authors
310+
never declare it.
252311

253312
### Extension Discovery Helper
254313

@@ -277,6 +336,8 @@ The CLI treats `system:extension` as a presentation hint:
277336
they extend. Core models show a compact cross-reference to available extensions.
278337

279338
```
339+
$ overture-schema list-types --group-by overture:theme
340+
280341
buildings
281342
building overture
282343
+ capacity, operating-hours
@@ -297,7 +358,7 @@ extensions
297358
after the model's own fields.
298359

299360
```
300-
building (buildings) overture
361+
building overture, overture:theme=buildings
301362
302363
Fields:
303364
geometry Geometry
@@ -376,3 +437,66 @@ is TBD.
376437
- **Organizational policy**: a company publishes their own manifest package
377438
with their own approved set (including approved extensions).
378439
- **CLI filtering**: `--tag approved` shows only certified models.
440+
441+
---
442+
443+
## Appendix: Codegen Path Generation
444+
445+
Two strategies depending on whether a type has an entry point.
446+
447+
### Feature Models (have entry points and tags)
448+
449+
`tags_by_key(key.tags, "overture:theme")` provides the theme directory
450+
(omitted if absent). `to_snake_case(spec.name)` provides the subdirectory and
451+
filename.
452+
453+
```
454+
{theme}/{slug}/{slug}.md # with theme tag
455+
{slug}/{slug}.md # without theme tag
456+
```
457+
458+
No module path involved for features.
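
The two path shapes above could be produced along these lines. `feature_doc_path` is a hypothetical name, the `to_snake_case` shown here is a simple stand-in for the project's own helper, and rejecting models with more than one theme tag is an assumption:

```python
import re
from pathlib import PurePosixPath

THEME_PREFIX = "overture:theme="


def to_snake_case(name: str) -> str:
    """Stand-in converter: 'BuildingPart' -> 'building_part'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).replace("-", "_").lower()


def feature_doc_path(spec_name: str, tags: frozenset[str]) -> PurePosixPath:
    """Return {theme}/{slug}/{slug}.md, or {slug}/{slug}.md without a theme tag."""
    slug = to_snake_case(spec_name)
    themes = sorted(
        tag[len(THEME_PREFIX):] for tag in tags if tag.startswith(THEME_PREFIX)
    )
    if len(themes) > 1:
        raise ValueError(f"ambiguous theme tags for {spec_name!r}: {themes}")
    # `themes` is empty or a single element, so it is either omitted or leads.
    return PurePosixPath(*themes, slug, f"{slug}.md")
```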
459+
460+
### Supplementary Types (no entry points)
461+
462+
Module prefix stripping replaces theme-tag matching for path generation.
463+
464+
**`schema_root` resolution** (checked in order):
465+
466+
1. **Explicit tag**: `codegen:schema_root=overture.schema` in package keywords
467+
2. **Tag provider**: `core` registers a provider that adds
468+
`codegen:schema_root=overture.schema` to models from Overture packages
469+
3. **Heuristic**: compute common prefix of module paths across the entry
470+
point group (the `compute_schema_root()` fallback)
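
A sketch of this resolution order, assuming that by the time it runs both explicit keywords and provider-derived tags are already merged into one tag set, so steps 1 and 2 reduce to a single lookup. `resolve_schema_root` and `common_module_prefix` are hypothetical names; the latter only approximates what `compute_schema_root()` is described as doing:

```python
def common_module_prefix(module_paths: list[str]) -> str:
    """Longest shared dotted prefix, compared component by component."""
    prefix: list[str] = []
    for parts in zip(*(path.split(".") for path in module_paths)):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return ".".join(prefix)


def resolve_schema_root(tags: frozenset[str], module_paths: list[str]) -> str:
    """Prefer a codegen:schema_root tag; otherwise fall back to the heuristic."""
    marker = "codegen:schema_root="
    roots = {tag[len(marker):] for tag in tags if tag.startswith(marker)}
    if len(roots) > 1:
        raise ValueError(f"conflicting schema_root tags: {sorted(roots)}")
    if roots:
        return roots.pop()
    # Step 3: no tag present, so infer the root from the module paths.
    return common_module_prefix(module_paths)
```

Comparing whole components rather than characters keeps `overture.schema` and `overture.schemas` from sharing a misleading `overture.schema` prefix.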
471+
472+
**Path algorithm** for any type:
473+
474+
1. Resolve `schema_root` for the type's distribution
475+
2. Get `__module__`, strip `{schema_root}.` prefix
476+
3. Walk remaining components, keep only packages
477+
(`hasattr(mod, '__path__')`) -- drops file-level modules
478+
4. Result is the directory path
479+
480+
| `__module__` | Root | After strip | Pkg-filtered | Path |
481+
|---|---|---|---|---|
482+
| `o.s.divisions.division.models` | `overture.schema` | `divisions.division.models` | `divisions.division` | `divisions/division/` |
483+
| `acme.widgets.places.models` | `acme.widgets` | `places.models` | `places` | `places/` |
484+
| `acme.widgets.places.restaurants.types` | `acme.widgets` | `places.restaurants.types` | `places.restaurants` | `places/restaurants/` |
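
The path algorithm above can be sketched as follows. `directory_for` and `is_package` are hypothetical names; the package check is injectable so the logic can be exercised without the real packages installed:

```python
import importlib
from typing import Callable


def is_package(module_name: str) -> bool:
    """A module is a package if it has a __path__ attribute."""
    return hasattr(importlib.import_module(module_name), "__path__")


def directory_for(
    module: str, schema_root: str, is_pkg: Callable[[str], bool] = is_package
) -> str:
    """Strip the schema root, then keep only the leading package components."""
    prefix = f"{schema_root}."
    if not module.startswith(prefix):
        raise ValueError(f"{module!r} is not under {schema_root!r}")
    kept: list[str] = []
    current = schema_root
    for part in module[len(prefix):].split("."):
        current = f"{current}.{part}"
        # A file-level module cannot contain packages, so stop at the first one.
        if not is_pkg(current):
            break
        kept.append(part)
    return "".join(f"{part}/" for part in kept)
```

For a standard-library illustration, `directory_for("xml.etree.ElementTree", "xml")` gives `etree/`, since `xml.etree` is a package and `ElementTree` is a file-level module.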
485+
486+
Then the existing nesting logic decides: a type referenced by a single feature
487+
nests under that feature; a type referenced by multiple features stays at the theme/intermediate level.
488+
489+
### Core/System Types
490+
491+
Core and system types use the same `schema_root` + package-filter algorithm. With
492+
`schema_root=overture.schema`, a type in `overture.schema.core.cartography`
493+
yields path `core/` (if `cartography` is a file) or `core/cartography/`
494+
(if promoted to a package).
495+
496+
The hardcoded `_SUBSYSTEM_MAP` exists because file-level modules lose
497+
path significance when filtered. Promoting meaningful modules (like
498+
`cartography.py`) to packages eliminates the map -- a code organization
499+
change noted as follow-up work.
500+
501+
`compute_schema_root()` and `_SCHEMA_PREFIX` are replaced by the
502+
`schema_root` tag.
