| 1 | +# Tags: Extensible Model Grouping via Entry Points
| 2 | + |
| 3 | +Replace the hardcoded `namespace` concept (`"overture"`, `"annex"`) with tags -- |
| 4 | +string labels declared by package authors and derived by tag providers. Tags |
| 5 | +become the filtering and grouping mechanism for model discovery, driven by |
| 6 | +package authors rather than central coordination. |
| 7 | + |
| 8 | +Theme remains a first-class structural field. It maps to data partitioning and
| 9 | +entry point naming, distinct from tags, which are descriptive metadata.
| 10 | + |
| 11 | +## Phase 1: Basic Tagging |
| 12 | + |
| 13 | +Replace namespace with tags, move discovery to `system`, and update the CLI.
| 14 | + |
| 15 | +### Data Model |
| 16 | + |
| 17 | +`ModelKey` drops `namespace`, gains `tags`: |
| 18 | + |
| 19 | +```python |
| 20 | +@dataclass(frozen=True, slots=True) |
| 21 | +class ModelKey: |
| 22 | + theme: str | None |
| 23 | + type: str |
| 24 | + tags: frozenset[str] |
| 25 | + class_name: str |
| 26 | +``` |
| 27 | + |
| 28 | +### Entry Point Format |
| 29 | + |
| 30 | +Names change from `namespace:theme:type` to `theme:type` (or just `type` for
| 31 | +non-themed models), with an optional `#tag1,tag2` suffix. The group remains
| 32 | +`overture.models`.
| 33 | + |
| 34 | +```toml |
| 35 | +[project] |
| 36 | +keywords = ["some-org-tag"] |
| 37 | + |
| 38 | +[project.entry-points."overture.models"] |
| 39 | +"buildings:building" = "overture.schema.buildings:Building" |
| 40 | +"buildings:building_part#draft" = "overture.schema.buildings:BuildingPart" |
| 41 | +``` |
| 42 | + |
| 43 | +A model's tags are the union of: |
| 44 | + |
| 45 | +- `[project].keywords` from the distribution metadata (read via |
| 46 | + `entry_point.dist.metadata["Keywords"]`) |
| 47 | +- Per-model `#` tags from the entry point name |
| 48 | + |
| 49 | +Note: `[project].keywords` is also used for PyPI search, so schema tags and |
| 50 | +PyPI keywords share a namespace. This was considered and accepted -- in |
| 51 | +practice, schema-relevant keywords and PyPI discoverability keywords overlap |
| 52 | +naturally, and the `#` per-model mechanism covers cases where finer control is |
| 53 | +needed. |
| 54 | + |
| 55 | +### Tag Prefix Reservation |
| 56 | + |
| 57 | +The `system:` prefix is reserved for tag providers. Discovery rejects |
| 58 | +author-declared tags (keywords and `#`) that start with `system:`. Static |
| 59 | +sources containing `system:*` tags produce an error (or warning + discard). |
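
The reservation rule above leaves the failure mode open (error, or warning plus discard). A minimal sketch of both options follows. The function name `check_author_tags` and the `strict` flag are hypothetical and only illustrate the two behaviours.

```python
import warnings

RESERVED_PREFIX = "system:"


def check_author_tags(tags: set[str], source: str, strict: bool = True) -> set[str]:
    """Validate author-declared tags (keywords and `#` suffixes).

    `source` only labels the message, e.g. a package or entry point name.
    """
    reserved = {t for t in tags if t.startswith(RESERVED_PREFIX)}
    if not reserved:
        return set(tags)
    message = f"{source}: author-declared tags use reserved prefix: {sorted(reserved)}"
    if strict:
        # Option 1 from the text: treat the declaration as an error.
        raise ValueError(message)
    # Option 2 from the text: warn and drop the offending tags.
    warnings.warn(message, stacklevel=2)
    return set(tags) - reserved
```

Which mode becomes the default is a policy choice the design has not settled; the sketch defaults to the stricter one.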
| 60 | + |
| 61 | +### Discovery |
| 62 | + |
| 63 | +`discover_models` and `ModelKey` move to `overture.schema.system.discovery`: |
| 64 | + |
| 65 | +```python |
| 66 | +def discover_models(tags: set[str] | None = None) -> dict[ModelKey, type[BaseModel]]: |
| 67 | +``` |
| 68 | + |
| 69 | +When `tags` is provided, any-match semantics apply: models whose effective tags |
| 70 | +intersect the filter set are included. When `None`, all models are returned. |
| 71 | + |
| 72 | +Implementation: |
| 73 | + |
| 74 | +1. Iterate `overture.models` entry points |
| 75 | +2. Split name on `#` to separate `theme:type` from per-model tags |
| 76 | +3. Split the prefix on `:` -- one part = type only (theme is `None`), two |
| 77 | + parts = theme + type |
| 78 | +4. Load the model class via `entry_point.load()` |
| 79 | +5. Read `entry_point.dist.metadata["Keywords"]` for package-level tags |
| 80 | +6. Union package tags and per-model tags, rejecting any `system:*` tags |
| 81 | +7. Build `ModelKey` with `frozenset` tags |
| 82 | + |
| 83 | +### CLI |
| 84 | + |
| 85 | +`--namespace` and `--overture-types` are removed. `--tag` (repeatable) replaces |
| 86 | +them. `--theme` and `--type` remain. |
| 87 | + |
| 88 | +Repeated `--tag` flags use OR: `--tag foo --tag bar` matches models with |
| 89 | +either tag. Repeated `--theme` flags also use OR. Filters compose as AND |
| 90 | +across dimensions: `--tag foo --theme buildings` takes the OR result from tags |
| 91 | +and intersects it with the OR result from themes. |
| 92 | + |
| 93 | +`list-types` groups by theme and displays tags per model: |
| 94 | + |
| 95 | +``` |
| 96 | +buildings |
| 97 | + building overture |
| 98 | + building_part overture, draft |
| 99 | +places |
| 100 | + place overture |
| 101 | +transportation |
| 102 | + connector overture |
| 103 | + segment overture |
| 104 | +(unthemed) |
| 105 | + sources overture |
| 106 | +``` |
| 107 | + |
| 108 | +### Migration |
| 109 | + |
| 110 | +Existing `namespace:theme:type` entry points become `theme:type` (or `type`). |
| 111 | +The `namespace` field is removed from `ModelKey`. The `overture` namespace |
| 112 | +does not need replacement as a keyword or `#` tag -- Phase 2 introduces a tag |
| 113 | +provider in `core` that derives the `overture` tag from `OvertureFeature` |
| 114 | +inheritance. The `annex` namespace has no equivalent; annex packages either |
| 115 | +declare their own keywords or rely on tag providers. |
| 116 | + |
| 117 | +### What Moves |
| 118 | + |
| 119 | +- `discover_models`, `ModelKey` move to `overture.schema.system.discovery` |
| 120 | +- CLI imports from `system` directly |
| 121 | + |
| 122 | +--- |
| 123 | + |
| 124 | +## Phase 2: Tag Providers |
| 125 | + |
| 126 | +Derive tags from model introspection. The `overture` tag (currently a hardcoded
| 127 | +namespace) becomes a derived tag produced by a provider in `core`. |
| 128 | + |
| 129 | +### Provider Mechanism |
| 130 | + |
| 131 | +A new entry point group `overture.tag_providers` registers callables. The entry |
| 132 | +point key is informational (describes the provider's purpose) and not |
| 133 | +interpreted by discovery. |
| 134 | + |
| 135 | +Provider signature: |
| 136 | + |
| 137 | +```python |
| 138 | +(model_class: type[BaseModel], key: ModelKey, tags: set[str]) -> set[str] |
| 139 | +``` |
| 140 | + |
| 141 | +Providers receive the accumulated tag set and return a tag set (either the
| 142 | +same object mutated or a new set). Tags accumulate across providers: each
| 143 | +provider sees the static tags plus the result of all prior providers. The
| 144 | +caller passes each provider a copy of the current set, keeps the original as
| 145 | +a baseline, and diffs that baseline against the return value to detect
| 146 | +additions and removals. Providers can remove static tags (from keywords and
| 147 | +`#`) and tags added by earlier providers.
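
A minimal sketch of the chaining and diffing described above, assuming providers are plain callables with the stated signature. `run_providers` is a hypothetical name. The model class and key are passed through untouched, so any objects work for illustration.

```python
from typing import Callable

Provider = Callable[[type, object, set[str]], set[str]]


def run_providers(
    model_class: type,
    key: object,
    static_tags: set[str],
    providers: list[Provider],
) -> tuple[set[str], list[tuple[set[str], set[str]]]]:
    """Chain providers; return final tags plus (added, removed) per provider."""
    current = set(static_tags)
    changes = []
    for provider in providers:
        baseline = set(current)  # kept by the caller, never handed out
        # The provider gets its own copy, so in-place mutation cannot
        # disturb the baseline used for the diff.
        result = set(provider(model_class, key, set(current)))
        changes.append((result - baseline, baseline - result))
        current = result  # the next provider sees this provider's output
    return current, changes
```

Diffing against a baseline the provider never receives is what makes the "same object mutated or a new set" contract safe: both styles produce the same diff.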
| 148 | + |
| 149 | +**Unresolved: provider ordering.** Since providers chain, execution order |
| 150 | +affects results. Leading candidate: alphabetical by entry point name with |
| 151 | +rc.d-style numeric prefixes (`10_extensions`, `50_overture`, `90_approved`). |
| 152 | +Since the entry point key is informational, providers self-register their |
| 153 | +ordering. Published guidance would define what the priority ranges mean (e.g., |
| 154 | +0-19 for structural tags, 20-49 for derived tags, 50-79 for classification, |
| 155 | +80-99 for approval/policy). Ranges are illustrative -- actual boundaries TBD. |
| 156 | + |
| 157 | +Providers run after static tag resolution, before filtering. Only tag |
| 158 | +providers registered by `overture-schema-system` may add `system:`-prefixed |
| 159 | +tags. Discovery enforces this by checking `entry_point.dist.name` before |
| 160 | +accepting `system:` additions. This is a convention boundary, not a security |
| 161 | +mechanism -- any package could name itself `overture-schema-system`. It |
| 162 | +prevents accidental `system:` claims from third-party providers and |
| 163 | +ecosystem-specific packages (like `core`). |
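
The distribution-name check could look like the sketch below. `accept_result` is a hypothetical helper that takes the provider's distribution name plus the tag sets before and after the provider ran. Silently dropping disallowed additions is an assumption; raising or warning would fit the text equally well.

```python
SYSTEM_DIST = "overture-schema-system"
RESERVED_PREFIX = "system:"


def accept_result(dist_name: str, before: set[str], after: set[str]) -> set[str]:
    """Return the provider's output with disallowed `system:` additions removed."""
    added = after - before
    reserved_added = {t for t in added if t.startswith(RESERVED_PREFIX)}
    if dist_name == SYSTEM_DIST:
        return set(after)  # the system package may add reserved tags
    # Any other distribution keeps its ordinary changes but loses
    # newly added reserved tags.
    return set(after) - reserved_added
```

Only additions are checked here, since the text restricts who may add `system:` tags and says nothing about removals.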
| 164 | + |
| 165 | +### Two-Phase Discovery |
| 166 | + |
| 167 | +With tag providers, `discover_models` becomes: |
| 168 | + |
| 169 | +1. **Load phase**: load models, resolve static tags (keywords + `#`) |
| 170 | +2. **Enrich phase**: load all `overture.tag_providers` entry points, run each |
| 171 | + provider (passing a copy of the current tags), diff input/output to detect |
| 172 | + changes, enforce `system:` prefix rules |
| 173 | +3. Freeze final `set[str]` tags into `frozenset[str]` on `ModelKey`, apply |
| 174 | + filter, return |
| 175 | + |
| 176 | +### The `overture` Tag Provider |
| 177 | + |
| 178 | +Lives in `core`, registered as an entry point: |
| 179 | + |
| 180 | +```toml |
| 181 | +# packages/overture-schema-core/pyproject.toml |
| 182 | +[project.entry-points."overture.tag_providers"] |
| 183 | +overture = "overture.schema.core.tags:overture_provider" |
| 184 | +``` |
| 185 | + |
| 186 | +```python |
| 187 | +from overture.schema.core import OvertureFeature |
| 188 | + |
| 189 | +def overture_provider( |
| 190 | + model_class: type[BaseModel], key: ModelKey, tags: set[str] |
| 191 | +) -> set[str]: |
| 192 | + if isinstance(model_class, type) and issubclass(model_class, OvertureFeature): |
| 193 | + tags.add("overture") |
| 194 | + return tags |
| 195 | +``` |
| 196 | + |
| 197 | +Theme packages do not need `"overture"` in `[project].keywords`. The tag is |
| 198 | +derived from `OvertureFeature` inheritance by this provider. |
| 199 | + |
| 200 | +--- |
| 201 | + |
| 202 | +## Phase 3: Extension Support |
| 203 | + |
| 204 | +Layer extension registration and discovery onto the tagging system. The |
| 205 | +extension mechanism itself (how extensions modify or compose with base models) |
| 206 | +is TBD and designed separately. This phase builds on Roel's proposal but that |
| 207 | +design is still subject to change. What follows covers registration, discovery, |
| 208 | +and CLI display only. |
| 209 | + |
| 210 | +### Extension Primitives |
| 211 | + |
| 212 | +The `@extends` decorator and `Extends()` annotation move from `core` to
| 213 | +`overture.schema.system`. These declare that a model augments one or more base |
| 214 | +models: |
| 215 | + |
| 216 | +```python |
| 217 | +@extends(Building, Place) |
| 218 | +class OperatingHours(BaseModel): |
| 219 | + hours: ... |
| 220 | + timezone: ... |
| 221 | +``` |
| 222 | + |
| 223 | +Or as an `Annotated` type alias carrying `Extends()`:
| 224 | + |
| 225 | +```python |
| 226 | +OperatingHours = Annotated[OperatingHoursModel, Extends(Building, Place)] |
| 227 | +``` |
| 228 | + |
| 229 | +### Extension Tag Provider |
| 230 | + |
| 231 | +Lives in `system`, registered as an entry point: |
| 232 | + |
| 233 | +```toml |
| 234 | +# packages/overture-schema-system/pyproject.toml |
| 235 | +[project.entry-points."overture.tag_providers"] |
| 236 | +extensions = "overture.schema.system.tags:extension_provider" |
| 237 | +``` |
| 238 | + |
| 239 | +```python |
| 240 | +def extension_provider( |
| 241 | + model_class: type[BaseModel], key: ModelKey, tags: set[str] |
| 242 | +) -> set[str]: |
| 243 | + if extends_classes(model_class): |
| 244 | + tags.add("system:extension") |
| 245 | + return tags |
| 246 | +``` |
| 247 | + |
| 248 | +Package authors register extensions as regular `overture.models` entry points. |
| 249 | +Extensions follow the same naming rules as other models: `theme:type` for |
| 250 | +themed extensions, or just `type` for unthemed ones. The `system:extension` |
| 251 | +tag is derived automatically; authors never declare it. |
| 252 | + |
| 253 | +### Extension Discovery Helper |
| 254 | + |
| 255 | +Extensions declare which base models they augment (via `@extends` / |
| 256 | +`Extends()`), but callers typically need the reverse: given a base model, which |
| 257 | +extensions apply to it? A convenience function in `system` builds this reverse |
| 258 | +mapping: |
| 259 | + |
| 260 | +```python |
| 261 | +def extension_graph( |
| 262 | + models: dict[ModelKey, type[BaseModel]], |
| 263 | +) -> dict[type[BaseModel], list[type[BaseModel]]]: |
| 264 | + """Map base models to the extensions that apply to them.""" |
| 265 | +``` |
| 266 | + |
| 267 | +Filters `models` for the `system:extension` tag, calls `extends_classes()` on |
| 268 | +each to get its declared base models, and inverts the relationship. The CLI |
| 269 | +uses this to show registered extensions alongside the models they extend |
| 270 | +without reimplementing the traversal. |
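
A sketch of the inversion, under loud assumptions: the extension mechanism is still TBD, so `extends` and `extends_classes` below are toy stand-ins that record base classes on a class attribute. `ModelKey` is trimmed to the fields the helper reads, and plain classes replace `BaseModel` subclasses to keep the example self-contained.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelKey:
    """Trimmed stand-in: only the fields this sketch needs."""

    type: str
    tags: frozenset[str]


def extends(*bases: type):
    """Toy decorator: remember which base models a class augments."""
    def decorate(cls: type) -> type:
        cls.__extends__ = bases
        return cls
    return decorate


def extends_classes(cls: type) -> tuple[type, ...]:
    """Return the bases declared directly on `cls` (empty if none)."""
    return tuple(vars(cls).get("__extends__", ()))


def extension_graph(models: dict[ModelKey, type]) -> dict[type, list[type]]:
    """Map base models to the extensions that apply to them."""
    graph: dict[type, list[type]] = defaultdict(list)
    for key, model in models.items():
        if "system:extension" not in key.tags:
            continue  # only derived extension models contribute edges
        for base in extends_classes(model):
            graph[base].append(model)
    return dict(graph)
```

Base models without extensions are simply absent from the result, so callers would use `graph.get(base, [])` when rendering.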
| 271 | + |
| 272 | +### CLI Display |
| 273 | + |
| 274 | +The CLI treats `system:extension` as a presentation hint: |
| 275 | + |
| 276 | +**`list-types`**: extensions appear in a separate section listing which models |
| 277 | +they extend. Core models show a compact cross-reference to available extensions. |
| 278 | + |
| 279 | +``` |
| 280 | +buildings |
| 281 | + building overture |
| 282 | + + capacity, operating-hours |
| 283 | + building_part overture, draft
| 284 | +
| 285 | +places |
| 286 | + place overture |
| 287 | + + operating-hours
| 288 | +
| 289 | +extensions |
| 290 | + capacity system:extension |
| 291 | + extends: building |
| 292 | + operating-hours system:extension |
| 293 | + extends: building, place |
| 294 | +``` |
| 295 | + |
| 296 | +**`describe-type` on a core model**: includes a "Registered extensions" section |
| 297 | +after the model's own fields. |
| 298 | + |
| 299 | +``` |
| 300 | +building (buildings) overture
| 301 | +
| 302 | + Fields: |
| 303 | + geometry Geometry |
| 304 | + subtype BuildingSubtype | None |
| 305 | + ...
| 306 | +
| 307 | + Registered extensions: |
| 308 | + capacity max_occupancy, floor_area, ... |
| 309 | + operating-hours hours, timezone, ... |
| 310 | +``` |
| 311 | + |
| 312 | +**`describe-type` on an extension**: shows which models it extends and full |
| 313 | +field definitions. |
| 314 | + |
| 315 | +--- |
| 316 | + |
| 317 | +## Phase 4: Manifest-Driven Approval (Sketch) |
| 318 | + |
| 319 | +A tag provider that checks models against a curated manifest, enabling |
| 320 | +organizations to certify which models (and extensions) they endorse. |
| 321 | + |
| 322 | +### Concept |
| 323 | + |
| 324 | +A tag provider reads a manifest of approved entries and adds an approval tag |
| 325 | +to matching models. This lets an organization certify models without requiring |
| 326 | +changes to the model packages themselves. |
| 327 | + |
| 328 | +### Shape |
| 329 | + |
| 330 | +The manifest lives in a dedicated package (e.g., |
| 331 | +`overture-schema-official-extensions`) that depends on nothing but `system`. |
| 332 | +It registers a tag provider: |
| 333 | + |
| 334 | +```toml |
| 335 | +# packages/overture-schema-official-extensions/pyproject.toml |
| 336 | +[project.entry-points."overture.tag_providers"] |
| 337 | +approved = "overture.schema.official_extensions:approved_provider" |
| 338 | +``` |
| 339 | + |
| 340 | +The manifest can match against class names (entry point values), entry point
| 341 | +names, and/or package names with optional version specifiers. Package names
| 342 | +are less likely to collide than class names, making them a stronger
| 343 | +identifier for third-party extensions.
| 344 | + |
| 345 | +```python |
| 346 | +APPROVED = { |
| 347 | + # by class name |
| 348 | + "overture.schema.buildings:Building", |
| 349 | + "overture.schema.buildings:BuildingPart", |
| 350 | + # by package name (with optional version specifier) |
| 351 | + "acme-schema-parcels>=1.0", |
| 352 | +} |
| 353 | + |
| 354 | +def approved_provider( |
| 355 | + model_class: type[BaseModel], key: ModelKey, tags: set[str] |
| 356 | +) -> set[str]: |
| 357 | + if _matches_manifest(key): |
| 358 | + tags.add("approved") |
| 359 | + return tags |
| 360 | +``` |
| 361 | + |
| 362 | +Note: `approved` is an unprefixed tag (not `system:approved`) since this |
| 363 | +provider lives outside `system`. |
| 364 | + |
| 365 | +Note: `_matches_manifest` needs more than `ModelKey` to match against package |
| 366 | +names. `ModelKey` carries `class_name` (the entry point value) but not the |
| 367 | +distribution/package name. The provider will need access to additional context |
| 368 | +(e.g., the entry point object itself, or an enriched key). The right mechanism |
| 369 | +is TBD. |
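
One possible shape for that extra context, offered only as a sketch: a hypothetical `EntryContext` carrying the distribution name and version next to the entry point value. The version handling is deliberately simplistic (only `>=` with dotted integer versions). A real implementation would more likely rely on the `packaging` library's requirement parsing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryContext:
    """Hypothetical context a provider might receive alongside the key."""

    class_name: str  # entry point value, e.g. "pkg.module:Class"
    dist_name: str
    dist_version: str


def _version_tuple(version: str) -> tuple[int, ...]:
    # Simplification: dotted integers only, e.g. "1.2.0".
    return tuple(int(part) for part in version.split("."))


def matches_manifest(ctx: EntryContext, approved: set[str]) -> bool:
    """Match by entry point value, or by package name with optional '>='."""
    for entry in approved:
        if ":" in entry:  # class-name entries contain a module separator
            if entry == ctx.class_name:
                return True
            continue
        name, sep, minimum = entry.partition(">=")
        if name.strip() != ctx.dist_name:
            continue
        if not sep:
            return True  # bare package name: any version is approved
        if _version_tuple(ctx.dist_version) >= _version_tuple(minimum.strip()):
            return True
    return False
```

Whether this context is passed as a fourth provider argument, folded into an enriched key, or obtained some other way remains the open question noted above.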
| 370 | + |
| 371 | +### Use Cases |
| 372 | + |
| 373 | +- **Overture release certification**: only models in the manifest are tagged |
| 374 | + `approved` for a given release. The manifest package is versioned alongside |
| 375 | + the release. |
| 376 | +- **Organizational policy**: a company publishes their own manifest package |
| 377 | + with their own approved set (including approved extensions). |
| 378 | +- **CLI filtering**: `--tag approved` shows only certified models. |