Commit 224a8f7

refactor(codegen): reorganize flat layout into sub-packages
Move modules into three sub-packages matching the architecture layers:

- extraction/ (14 modules): type analysis, specs, extractors, constraints
- layout/ (2 modules): module layout, type collection
- markdown/ (6 modules + templates): pipeline, renderer, type formatting, links, paths, reverse references

Three modules renamed to drop redundant prefixes:

- field_constraint_description → extraction/field_constraints
- model_constraint_description → extraction/model_constraints
- example_loader → extraction/examples

Templates flattened from templates/markdown/ to markdown/templates/.
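
For code that still imports from the old flat layout, the relocations named in this commit can be captured in a small lookup table. The following is a sketch only: `RELOCATED` and `new_module_path` are hypothetical names that do not exist in the package, and the table lists only modules whose new location is visible on this page.

```python
# Hypothetical migration helper, not part of overture-schema-codegen.
PACKAGE = "overture.schema.codegen"

# Old flat module name -> new location relative to PACKAGE.
# Limited to modules whose new path appears in this commit view.
RELOCATED = {
    "type_analyzer": "extraction.type_analyzer",
    "specs": "extraction.specs",
    "model_extraction": "extraction.model_extraction",
    "union_extraction": "extraction.union_extraction",
    "module_layout": "layout.module_layout",
    "markdown_pipeline": "markdown.pipeline",
    # Renamed as well as moved:
    "field_constraint_description": "extraction.field_constraints",
    "model_constraint_description": "extraction.model_constraints",
    "example_loader": "extraction.examples",
}


def new_module_path(old_dotted: str) -> str:
    """Return the post-refactor dotted path; unknown modules pass through."""
    prefix = PACKAGE + "."
    if not old_dotted.startswith(prefix):
        return old_dotted
    name = old_dotted[len(prefix):]
    return prefix + RELOCATED.get(name, name)


print(new_module_path("overture.schema.codegen.example_loader"))
```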
1 parent 1132e48 commit 224a8f7

59 files changed

Lines changed: 225 additions & 187 deletions

packages/overture-schema-codegen/README.md

Lines changed: 2 additions & 2 deletions
@@ -67,13 +67,13 @@ module structure. Link computation and reverse references enable cross-page navi
 Jinja2 templates for feature pages (with field tables, constraint sections, and
 examples), enum pages, NewType pages, and aggregate primitive/geometry reference pages.

-`markdown_pipeline.py` orchestrates the full pipeline without I/O, returning
+`markdown/pipeline.py` orchestrates the full pipeline without I/O, returning
 `list[RenderedPage]`. The CLI writes files to disk with Docusaurus frontmatter.

 ## Programmatic use

 ```python
-from overture.schema.codegen.type_analyzer import analyze_type, TypeKind
+from overture.schema.codegen.extraction.type_analyzer import analyze_type, TypeKind

 info = analyze_type(some_annotation)
 assert info.kind == TypeKind.PRIMITIVE

packages/overture-schema-codegen/docs/design.md

Lines changed: 22 additions & 22 deletions
@@ -60,7 +60,7 @@ Extraction TypeInfo, FieldSpec, ModelSpec, EnumSpec, ...
 Discovery discover_models() from overture-schema-core
 ```

-`markdown_pipeline.py` orchestrates the pipeline without I/O: it expands feature trees,
+`markdown/pipeline.py` orchestrates the pipeline without I/O: it expands feature trees,
 collects supplementary types, builds placement registries, computes reverse references,
 and calls renderers -- returning `RenderedPage` objects. The CLI (`cli.py`) is a thin
 Click wrapper that calls `generate_markdown_pages()` and writes files to disk.
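
The split described in this hunk, a pure pipeline that returns page objects and a thin caller that writes them, can be illustrated with a toy version. This is a sketch under stated assumptions: `Page`, `generate_pages`, and `write_pages` are hypothetical stand-ins, and the real `RenderedPage` fields and `generate_markdown_pages()` signature are not shown in this diff.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Page:
    """Hypothetical stand-in for RenderedPage; the real fields are not shown here."""
    relative_path: Path
    content: str


def generate_pages(titles: list[str]) -> list[Page]:
    """Pure step: builds pages in memory and never touches the filesystem."""
    return [Page(Path(f"{title.lower()}.md"), f"# {title}\n") for title in titles]


def write_pages(pages: list[Page], output_dir: Path) -> None:
    """I/O step: the only function that writes to disk."""
    for page in pages:
        target = output_dir / page.relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(page.content, encoding="utf-8")


pages = generate_pages(["Building", "Place"])
with tempfile.TemporaryDirectory() as tmp:
    write_pages(pages, Path(tmp))
    written = sorted(p.name for p in Path(tmp).iterdir())
```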
@@ -74,26 +74,26 @@ graph TD
 DM -->|"dict[ModelKey, type]"| EX

 subgraph Extraction
-EX["type_analyzer / extractors"]
+EX["extraction/type_analyzer / extractors"]
 EX -->|"ModelSpec, UnionSpec"| TREE["expand_model_tree()"]
 end

 TREE -->|"FeatureSpec[]"| OL

 subgraph "Output Layout"
-OL["type_collection"]
-OL -->|"SupplementarySpec{}"| PA["path_assignment"]
-PA -->|"dict[str, Path]"| LC["link_computation"]
-RR["reverse_references"]
+OL["layout/type_collection"]
+OL -->|"SupplementarySpec{}"| PA["markdown/path_assignment"]
+PA -->|"dict[str, Path]"| LC["markdown/link_computation"]
+RR["markdown/reverse_references"]
 end

 subgraph Rendering
-R["markdown_renderer"]
-TR["type_registry"] -.->|"type name resolution"| R
+R["markdown/renderer"]
+TR["extraction/type_registry"] -.->|"type name resolution"| R
 end

 subgraph Orchestration
-MP["markdown_pipeline"]
+MP["markdown/pipeline"]
 end

 OL --> MP
@@ -139,12 +139,12 @@ NewType active at that depth.

 Extraction is split by entity kind:

-- `model_extraction.py`: Pydantic model -> `ModelSpec` (fields in MRO-aware
+- `extraction/model_extraction.py`: Pydantic model -> `ModelSpec` (fields in MRO-aware
 documentation order, alias-resolved names, model-level constraints)
-- `enum_extraction.py`: Enum class -> `EnumSpec`
-- `newtype_extraction.py`: NewType -> `NewTypeSpec`
-- `union_extraction.py`: Discriminated union alias -> `UnionSpec`
-- `primitive_extraction.py`: Numeric primitives -> `PrimitiveSpec`
+- `extraction/enum_extraction.py`: Enum class -> `EnumSpec`
+- `extraction/newtype_extraction.py`: NewType -> `NewTypeSpec`
+- `extraction/union_extraction.py`: Discriminated union alias -> `UnionSpec`
+- `extraction/primitive_extraction.py`: Numeric primitives -> `PrimitiveSpec`

 Each calls `analyze_type()` for field types. Tree expansion (`expand_model_tree()`)
 walks MODEL-kind fields to populate nested model references, with a shared cache and
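
The hunk cuts off before saying what else the walk does, so the sketch below illustrates only the shared-cache idea: a nested model referenced from several parents is built once and reused. `Node`, `expand`, and the toy `fields` mapping are hypothetical; the real `expand_model_tree()` signature is not shown in this diff. Registering a node in the cache before recursing also means a self-referencing model would terminate in this toy version, though how the real function treats cycles is not confirmed here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a spec with nested model references."""
    name: str
    children: list["Node"] = field(default_factory=list)


def expand(name: str, fields: dict[str, list[str]], cache: dict[str, Node]) -> Node:
    """Build the node for `name`, reusing any node already in `cache`."""
    if name in cache:
        return cache[name]
    node = Node(name)
    cache[name] = node  # register before recursing so nested references reuse it
    node.children = [expand(child, fields, cache) for child in fields.get(name, [])]
    return node


# Toy input: model name -> names of its model-typed fields (all hypothetical).
fields = {"Building": ["Sources"], "Place": ["Sources"], "Sources": []}
cache: dict[str, Node] = {}
building = expand("Building", fields, cache)
place = expand("Place", fields, cache)
```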
@@ -212,7 +212,7 @@ syntax. Extraction and the type registry carry no presentation logic.

 ### Type registry

-`type_registry.py` maps type names to per-target string representations via
+`extraction/type_registry.py` maps type names to per-target string representations via
 `TypeMapping`. `format_type_string()` wraps the resolved name with list/optional
 qualifiers. `is_semantic_newtype()` distinguishes NewTypes that deserve their own
 identity (like `FeatureVersion` wrapping `int32`) from pass-through aliases to
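
A rough picture of a registry with one column per output target, plus a formatter that wraps the resolved name in list/optional qualifiers. Everything here is assumed for illustration: the `TypeMapping` fields, the registry contents, the `list<...>` and `(optional)` spellings, and the function name `format_type_string_sketch` (chosen so it is not mistaken for the real `format_type_string()`, whose signature this diff does not show).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeMapping:
    """One attribute per output target (both columns are hypothetical)."""
    markdown: str
    arrow: str


# Hypothetical registry contents.
REGISTRY = {
    "int32": TypeMapping(markdown="int32", arrow="int32"),
    "str": TypeMapping(markdown="string", arrow="string"),
}


def format_type_string_sketch(
    name: str,
    target: str = "markdown",
    *,
    is_list: bool = False,
    is_optional: bool = False,
) -> str:
    """Resolve `name` for `target`, then apply list/optional qualifiers."""
    mapping = REGISTRY.get(name)
    # Unregistered names pass through unchanged.
    resolved = getattr(mapping, target) if mapping is not None else name
    if is_list:
        resolved = f"list<{resolved}>"
    if is_optional:
        resolved = f"{resolved} (optional)"
    return resolved


print(format_type_string_sketch("str", is_list=True, is_optional=True))
```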
@@ -223,12 +223,12 @@ registered primitives.
 Jinja2 templates for feature, enum, NewType, primitives, and geometry pages.
 `render_feature()` expands MODEL-kind fields inline with dot-notation (e.g.,
 `sources[].dataset`), stopping at cycle boundaries. `format_type()` in
-`markdown_type_format.py` converts `TypeInfo` into link-aware display strings using
+`markdown/type_format.py` converts `TypeInfo` into link-aware display strings using
 `LinkContext`.

 ### Constraint prose

-`field_constraint_description.py` and `model_constraint_description.py` convert
+`extraction/field_constraints.py` and `extraction/model_constraints.py` convert
 constraint objects into human-readable descriptions. Field constraints produce inline
 text. Model constraints produce section-level descriptions and per-field notes, with
 consolidation for related conditional constraints (`require_if` / `forbid_if` grouped by
@@ -249,13 +249,13 @@ rather than being split into dot-notation rows. The pipeline computes `dict_path
 ## Extension Points

 **Adding a new output target** (Arrow schemas next, PySpark expressions after): Add a
-column to `TypeMapping` in `type_registry.py` for type-name resolution. Write a new
-renderer module that consumes specs and the type registry. The extraction layer and
+column to `TypeMapping` in `extraction/type_registry.py` for type-name resolution. Write
+a new renderer module that consumes specs and the type registry. The extraction layer and
 output layout are target-independent.

-**Adding a new type kind**: Add a variant to `TypeKind` in `type_analyzer.py`. Handle it
-in the terminal classification of `analyze_type()`. Add an extraction function and spec
-dataclass if needed. Update renderers to handle the new kind.
+**Adding a new type kind**: Add a variant to `TypeKind` in `extraction/type_analyzer.py`.
+Handle it in the terminal classification of `analyze_type()`. Add an extraction function
+and spec dataclass if needed. Update renderers to handle the new kind.

 **Adding a new constraint type**: The iterative unwrapper collects it automatically (any
 `Annotated` metadata becomes a `ConstraintSource`). Add a case to
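
The "Adding a new type kind" steps above end with updating the renderers. One way to make a forgotten renderer fail loudly is sketched below. The enum is a hypothetical subset: `PRIMITIVE`, `MODEL`, and `UNION` appear elsewhere in this diff, while `DURATION` is invented here to play the newly added kind. The `RENDERERS` table and both functions are likewise assumptions, not the package's actual dispatch mechanism.

```python
from enum import Enum, auto


class TypeKind(Enum):
    """Hypothetical subset; the real enum lives in extraction/type_analyzer.py."""
    PRIMITIVE = auto()
    MODEL = auto()
    UNION = auto()
    DURATION = auto()  # invented for this sketch: the newly added kind


# Hypothetical renderer table: one handler per kind.
RENDERERS = {
    TypeKind.PRIMITIVE: lambda name: f"`{name}`",
    TypeKind.MODEL: lambda name: f"[{name}]({name.lower()}.md)",
    TypeKind.UNION: lambda name: f"one of {name}",
}


def unhandled_kinds() -> set[TypeKind]:
    """Kinds that exist in the enum but have no renderer yet."""
    return set(TypeKind) - set(RENDERERS)


def render(kind: TypeKind, name: str) -> str:
    """Dispatch to the handler for `kind`, failing loudly if none exists."""
    handler = RENDERERS.get(kind)
    if handler is None:
        raise NotImplementedError(f"no renderer for {kind.name}")
    return handler(name)


print(sorted(kind.name for kind in unhandled_kinds()))
```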

packages/overture-schema-codegen/docs/walkthrough.md

Lines changed: 10 additions & 10 deletions
@@ -71,7 +71,7 @@ unions. From that point forward both model features and union features satisfy t

 Two modules with no internal dependencies. Both serve multiple layers.

-### case_conversion.py
+### extraction/case_conversion.py

 Converts PascalCase to snake_case with two compiled regexes. `_ACRONYM_BOUNDARY` inserts
 an underscore between an uppercase run and a capitalized word start: `HTMLParser`
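
A two-regex conversion of the kind described in this hunk could look like the following. The patterns are an assumption, since the diff shows neither regex: `_ACRONYM_BOUNDARY` reuses the name from the text with a guessed pattern, while `_WORD_BOUNDARY` and `to_snake_case` are hypothetical names.

```python
import re

# Uppercase run followed by a capitalized word start: "HTMLParser" -> "HTML_Parser".
# Pattern is a guess; the real one is not shown in this diff.
_ACRONYM_BOUNDARY = re.compile(r"([A-Z]+)([A-Z][a-z])")

# Lowercase letter or digit followed by an uppercase letter: "HexColor" -> "Hex_Color".
# Hypothetical name and pattern.
_WORD_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def to_snake_case(name: str) -> str:
    """Insert underscores at both kinds of boundary, then lowercase."""
    step = _ACRONYM_BOUNDARY.sub(r"\1_\2", name)
    return _WORD_BOUNDARY.sub(r"\1_\2", step).lower()


print(to_snake_case("HTMLParser"))
```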
@@ -87,7 +87,7 @@ the system passes through this function.
 'hex_color.md'
 ```

-### docstring.py
+### extraction/docstring.py

 Distinguishes author-written docstrings from auto-generated ones. Both `Enum` and
 `NewType` produce default docstrings that vary across Python versions. Rather than
@@ -202,7 +202,7 @@ handle them directly.

 ## 4. Data structures

-`specs.py` defines the vocabulary shared between extraction and rendering. Every spec is
+`extraction/specs.py` defines the vocabulary shared between extraction and rendering. Every spec is
 a dataclass with no methods beyond field access and, in `UnionSpec`'s case, one cached
 property.

@@ -240,7 +240,7 @@ individual ones.

 ### Classification functions

-Three functions at the bottom of `specs.py` classify discovery results. `is_model_class`
+Three functions at the bottom of `extraction/specs.py` classify discovery results. `is_model_class`
 is a `TypeGuard` that checks `isinstance(obj, type) and issubclass(obj, BaseModel)`.
 `is_union_alias` calls `analyze_type` and checks for `UNION` kind -- the only place
 outside the type analyzer that touches Python type annotations. `filter_model_classes`
@@ -377,7 +377,7 @@ Two modules convert constraint objects into human-readable text.

 ### Field constraints

-`field_constraint_description.py` pattern-matches constraint types. `Interval` renders
+`extraction/field_constraints.py` pattern-matches constraint types. `Interval` renders
 as `lower <= x <= upper` using Unicode comparison operators. Single-bound constraints
 (`Ge`, `Gt`, `Le`, `Lt`) render as `>= value` or `< value`. Length constraints
 (`MinLen`, `MaxLen`) render as plain prose (e.g. "Minimum length: 1"). `GeometryTypeConstraint` lists
@@ -396,7 +396,7 @@ docstring, class name, and pattern. Otherwise it delegates to

 ### Model constraints

-`model_constraint_description.py` handles model-level constraints from decorators.
+`extraction/model_constraints.py` handles model-level constraints from decorators.
 `analyze_model_constraints` returns two things in one pass: a list of section-level
 descriptions and a dict mapping field names to the constraint descriptions that
 reference them.
@@ -504,7 +504,7 @@ provenance rather than direct field reference.

 ## 13. Markdown type formatting

-`markdown_type_format.py` converts `TypeInfo` into display strings for markdown output.
+`markdown/type_format.py` converts `TypeInfo` into display strings for markdown output.

 `format_type` handles the full range of field types. Single-value Literals render as
 `"value"` in backticks. Semantic NewTypes and enums/models get markdown links via
@@ -530,11 +530,11 @@ function uses `source_type.__name__` rather than `base_type` for link resolution

 ## 14. Markdown rendering

-`markdown_renderer.py` is the template driver.
+`markdown/renderer.py` is the template driver.

 ### Templates

-Six Jinja2 templates in `templates/markdown/`. `feature.md.jinja2` renders a field table
+Six Jinja2 templates in `markdown/templates/`. `feature.md.jinja2` renders a field table
 with Name, Type, and Description columns, an optional Constraints section, an optional
 Examples section, and a "Used By" partial. `enum.md.jinja2` renders a bullet list of
 values. `newtype.md.jinja2` shows underlying type and constraints with provenance links.
@@ -629,7 +629,7 @@ skip rather than failing the pipeline.

 ### The pipeline

-`generate_markdown_pages` in `markdown_pipeline.py` is the "main" function. It takes
+`generate_markdown_pages` in `markdown/pipeline.py` is the "main" function. It takes
 feature specs and a schema root, returns rendered pages without touching the filesystem.
 Eight steps:

packages/overture-schema-codegen/src/overture/schema/codegen/cli.py

Lines changed: 9 additions & 9 deletions
@@ -8,20 +8,20 @@

 from overture.schema.core.discovery import discover_models

-from .markdown_pipeline import generate_markdown_pages
-from .model_extraction import extract_model
-from .module_layout import (
+from .extraction.model_extraction import extract_model
+from .extraction.specs import (
+    FeatureSpec,
+    is_model_class,
+    is_union_alias,
+)
+from .extraction.union_extraction import extract_union
+from .layout.module_layout import (
     OUTPUT_ROOT,
     compute_schema_root,
     entry_point_class,
     entry_point_module,
 )
-from .specs import (
-    FeatureSpec,
-    is_model_class,
-    is_union_alias,
-)
-from .union_extraction import extract_union
+from .markdown.pipeline import generate_markdown_pages

 log = logging.getLogger(__name__)

packages/overture-schema-codegen/src/overture/schema/codegen/extraction/__init__.py

Whitespace-only changes.

packages/overture-schema-codegen/src/overture/schema/codegen/case_conversion.py renamed to packages/overture-schema-codegen/src/overture/schema/codegen/extraction/case_conversion.py

File renamed without changes.

packages/overture-schema-codegen/src/overture/schema/codegen/docstring.py renamed to packages/overture-schema-codegen/src/overture/schema/codegen/extraction/docstring.py

File renamed without changes.

packages/overture-schema-codegen/src/overture/schema/codegen/enum_extraction.py renamed to packages/overture-schema-codegen/src/overture/schema/codegen/extraction/enum_extraction.py

File renamed without changes.

packages/overture-schema-codegen/src/overture/schema/codegen/example_loader.py renamed to packages/overture-schema-codegen/src/overture/schema/codegen/extraction/examples.py

File renamed without changes.

packages/overture-schema-codegen/src/overture/schema/codegen/field_constraint_description.py renamed to packages/overture-schema-codegen/src/overture/schema/codegen/extraction/field_constraints.py

File renamed without changes.
