Commit 7015a0a

committed
dev notes
1 parent 71c0f4d commit 7015a0a

2 files changed

Lines changed: 461 additions & 0 deletions

docs/dimension-resolution.md

Lines changed: 254 additions & 0 deletions
@@ -0,0 +1,254 @@
# Dimension Resolution

## Current Problem

When loading components with array fields that depend on dimensions (e.g., TDIS with a `perlen` array of shape `(nper,)`), dimension resolution fails:

```
ValueError: Couldn't resolve dims: ['nper']
```

This happens because `_resolve_dimensions()` tries to find `nper` on `self_` during attrs initialization, but `nper` hasn't been set yet.

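The failure mode can be reproduced without attrs. The following is a minimal sketch, not the project's code: `Partial`, `resolve_dims`, and `init_fields` are hypothetical names, and plain lists stand in for arrays. It only assumes that the array field is processed while `nper` is still missing from the instance.

```python
class Partial:
    """Stand-in for an instance whose fields are still being assigned."""


def resolve_dims(instance, dims):
    """Look up each dimension on the instance; fail if any is not set yet."""
    missing = [d for d in dims if not hasattr(instance, d)]
    if missing:
        raise ValueError(f"Couldn't resolve dims: {missing}")
    return tuple(getattr(instance, d) for d in dims)


def init_fields(values, order):
    """Assign fields one at a time, resolving dims when the array is reached."""
    instance = Partial()
    for name in order:
        if name == "perlen":
            # The array field needs nper at this exact moment.
            (nper,) = resolve_dims(instance, ("nper",))
            if len(values[name]) != nper:
                raise ValueError(f"perlen: expected length {nper}")
        setattr(instance, name, values[name])
    return instance


values = {"nper": 3, "perlen": [1.0, 2.0, 3.0]}

# Works: nper is already on the instance when perlen is processed.
ok = init_fields(values, order=["nper", "perlen"])

# Fails: perlen is processed before nper has been assigned.
try:
    init_fields(values, order=["perlen", "nper"])
except ValueError as err:
    message = str(err)
```

The same data succeeds or fails depending only on processing order, which is why the sections below focus on making dimensions available before arrays are structured.
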
## Planned Architecture (Issue #167)

The data model refactor introduces explicit dimension handling via protocols:

### DimensionProvider Protocol

Components that define dimensions implement this:

```python
from typing import Protocol


class DimensionProvider(Protocol):
    def get_dimensions(self) -> dict[str, int]: ...
    def get_dimension_scope(self, dim_name: str) -> str | type: ...
```

- **TDIS** provides `nper` at global/simulation scope
- **DIS/DISV/DISU** provide `nlay`, `nrow`, `ncol`, `nodes` at model scope

### DimensionRegistry Protocol

Containers that aggregate dimensions:

```python
class DimensionRegistry(Protocol):
    def register_dimension_provider(self, provider: DimensionProvider): ...
    def resolve_dimension(self, dim_name: str) -> int | None: ...
    def get_all_dimensions(self) -> dict[str, int]: ...
```

- **Simulation** registry handles global scope (time dimensions)
- **Model** registry handles model scope (grid dimensions)
- Resolution walks up parent chain: package → model → simulation

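To make the parent-chain lookup concrete, here is a minimal sketch of one way a registry could satisfy the protocol above. `Registry` and `FixedDims` are hypothetical names introduced for illustration, and scope handling (`get_dimension_scope`) is left out.

```python
from typing import Optional


class Registry:
    """Hypothetical DimensionRegistry: holds dims and defers to a parent."""

    def __init__(self, parent: Optional["Registry"] = None):
        self.parent = parent
        self._dims = {}

    def register_dimension_provider(self, provider) -> None:
        # Copy the provider's dimensions into this registry's scope.
        self._dims.update(provider.get_dimensions())

    def resolve_dimension(self, dim_name: str) -> Optional[int]:
        # Walk up: this registry first, then each ancestor in turn.
        registry = self
        while registry is not None:
            if dim_name in registry._dims:
                return registry._dims[dim_name]
            registry = registry.parent
        return None

    def get_all_dimensions(self) -> dict:
        # Local dimensions shadow inherited ones with the same name.
        inherited = self.parent.get_all_dimensions() if self.parent else {}
        return {**inherited, **self._dims}


class FixedDims:
    """Hypothetical DimensionProvider backed by keyword arguments."""

    def __init__(self, **dims):
        self._dims = dims

    def get_dimensions(self) -> dict:
        return dict(self._dims)


simulation = Registry()
simulation.register_dimension_provider(FixedDims(nper=3))

model = Registry(parent=simulation)
model.register_dimension_provider(FixedDims(nlay=5, nrow=10, ncol=10))
```

With this layout, `model.resolve_dimension("nper")` finds the value on the simulation, while the simulation itself cannot see grid dimensions, which matches the scoping described in the bullets.
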
### Loading Order

With this design, dimension resolution naturally works if we load in the right order:

```
1. Load Simulation (creates registry)

2. Load TDIS → registers nper with simulation

3. Load Model (creates registry, links to simulation)

4. Load DIS → registers nlay, nrow, ncol with model

5. Load other packages → resolve dims from parent chain
```

## Design Considerations

### Self-Contained Components (TDIS, DIS)

For components that DEFINE their own dimensions, like TDIS:

```python
@xattree
class Tdis(Package):
    nper: int = dim(block="dimensions", default=1)
    perlen: NDArray = array(dims=("nper",), ...)
```

The `nper` field is defined ON the same class that uses it. The issue is attrs field initialization order: `perlen`'s converter runs before `nper` is set.

**Solution**: For self-contained dimension providers, we can:
1. Extract dimensions from parsed data BEFORE calling `__init__`
2. Pass them explicitly to converters
3. Or: defer array validation until after `__init__` completes

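Option 3 (deferring validation until every field is set) can be sketched with the standard library alone. This is an illustration under assumptions, not the xattree implementation: `TdisSketch` is a hypothetical dataclass, and a plain list stands in for the `NDArray` field.

```python
from dataclasses import dataclass, field


@dataclass
class TdisSketch:
    """Hypothetical stand-in for Tdis; a list replaces the NDArray field."""

    # Declared before nper on purpose, to show that order no longer matters.
    perlen: list = field(default_factory=list)
    nper: int = 1

    def __post_init__(self):
        # Runs after every field has been assigned, so nper is available.
        if len(self.perlen) != self.nper:
            raise ValueError(
                f"perlen has length {len(self.perlen)}, expected nper={self.nper}"
            )


tdis = TdisSketch(perlen=[1.0, 2.0, 3.0], nper=3)
```

attrs offers a comparable hook (`__attrs_post_init__`); whether it fits alongside the existing converters is a separate question that this sketch does not answer.
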
### Cross-Component Dimensions

For packages that depend on dimensions from OTHER components:

```python
@xattree
class Npf(Package):
    k: NDArray = array(dims=("nlay", "nrow", "ncol"), ...)  # From DIS
```

**Solution**: These dimensions come from the parent chain:
1. Package has `parent` field pointing to Model
2. Model registry has dimensions from DIS
3. `resolve_dimension("nlay")` walks up to find it

### Partial/Incomplete Simulations

Sometimes we want to load components without their dimension providers:
- Loading a single package for inspection
- Testing with mock data
- Partial simulations

**Solution**: `strict=False` mode that:
- Skips dimension validation
- Accepts arrays of any shape
- Logs warnings instead of raising errors

```python
def structure(data, path, component_type, *, strict=True):
    if not strict:
        # Skip dimension validation, accept any array shapes
        ...
```

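The strict/lenient split might look roughly like the following sketch. `check_length` is a hypothetical helper, arrays are plain lists, and the standard `warnings` module is used as an assumption; the project may prefer `logging` instead.

```python
import warnings


def check_length(name, values, dim_name, dims, *, strict=True):
    """Validate a 1-D array against a named dimension.

    Returns True when the length matches. On a missing dimension or a
    mismatch, raises ValueError if strict, otherwise warns and returns False.
    """
    expected = dims.get(dim_name)
    if expected is None:
        problem = f"Couldn't resolve dims: {[dim_name]}"
    elif len(values) != expected:
        problem = f"{name}: expected length {expected}, got {len(values)}"
    else:
        return True

    if strict:
        raise ValueError(problem)
    warnings.warn(problem)
    return False


# Lenient mode: no dimension provider is loaded, so nper is unknown.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    accepted = check_length("perlen", [1.0, 2.0], "nper", {}, strict=False)
```

In lenient mode the caller keeps the array as-is and can use the return value to mark the component as unvalidated.
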
## Recommended Implementation

### Phase 1: Make Current Loading Work

For now, extract dimensions from parsed data and pass to converters:

```python
def structure(data, path, component_type):
    # Extract dimensions from the data itself
    dims = {}
    if "dimensions" in data:
        dims.update(data["dimensions"])

    # Make dims available during structuring
    # Option A: Thread-local context
    # Option B: Pass via __init__ parameter
    # Option C: Store on a context object passed to converters
```

### Consider: attrs → pydantic Migration

The self-contained bootstrapping problem (TDIS needs `nper` set before `perlen` converter runs) could be solved by migrating from attrs to pydantic.

**Why pydantic helps:**

Pydantic provides explicit control over validation ordering:

```python
import numpy as np
from pydantic import BaseModel, model_validator


class Tdis(BaseModel):
    nper: int = 1
    perlen: list[float]  # Raw data, not yet structured
    nstp: list[int]
    tsmult: list[float]

    @model_validator(mode='after')
    def structure_arrays(self) -> 'Tdis':
        # Runs AFTER all fields are set, so self.nper is available
        self.perlen = np.array(self.perlen).reshape((self.nper,))
        self.nstp = np.array(self.nstp).reshape((self.nper,))
        self.tsmult = np.array(self.tsmult).reshape((self.nper,))
        return self
```

**Key pydantic features:**
- `model_validator(mode='before')` - transform input data before field assignment
- `model_validator(mode='after')` - validate/transform after ALL fields are set
- `field_validator` - per-field validation; fields are validated in definition order, so a validator can see fields defined earlier
- `model_config` - fine-grained control over validation behavior

**With pydantic, the flow becomes:**
1. Set `nper=3` (scalar, no validation needed)
2. Set `perlen=[1.0, 2.0, 3.0]` (raw list, no structuring yet)
3. `model_validator(mode='after')` runs
4. Now `self.nper` is available → structure arrays with correct shape

This cleanly separates field assignment from array structuring, solving the bootstrapping problem without workarounds.

### Phase 2: Explicit Dimension Registration (Issue #167)

Once the refactor is complete:

1. Simulation/Model implement `DimensionRegistry`
2. TDIS/DIS implement `DimensionProvider`
3. Auto-register providers when assigned as children
4. Resolution walks parent chain naturally

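Step 3 (auto-registration on assignment) is the least obvious of these, so here is one possible shape for it. This is a sketch under assumptions: `Container` and `Dims` are hypothetical, and providers are detected by duck typing on `get_dimensions` rather than by any real xattree hook.

```python
class Container:
    """Hypothetical registry that registers providers when they are assigned."""

    def __init__(self, parent=None):
        # Bypass the custom __setattr__ while setting up internal state.
        object.__setattr__(self, "parent", parent)
        object.__setattr__(self, "_dims", {})

    def __setattr__(self, name, value):
        # Duck-typed provider check: anything exposing get_dimensions().
        if callable(getattr(value, "get_dimensions", None)):
            self._dims.update(value.get_dimensions())
        object.__setattr__(self, name, value)

    def resolve_dimension(self, dim_name):
        if dim_name in self._dims:
            return self._dims[dim_name]
        if self.parent is not None:
            return self.parent.resolve_dimension(dim_name)
        return None


class Dims:
    """Hypothetical provider backed by keyword arguments."""

    def __init__(self, **dims):
        self._dims = dims

    def get_dimensions(self):
        return dict(self._dims)


sim = Container()
sim.tdis = Dims(nper=3)  # assignment alone registers nper

model = Container(parent=sim)
model.dis = Dims(nlay=5, nrow=10, ncol=10)  # assignment registers grid dims
```

No explicit `register_dimension_provider` call appears at the call site, which is the point of step 3. The cost is that registration becomes a side effect of attribute assignment, so reassigning or deleting a child would need matching cleanup.
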
### Phase 3: Loading Order Enforcement

For full simulations, enforce loading order:

```python
def load_simulation(path):
    sim = Simulation._load_header(path)  # Just options, no children

    # Load dimension providers first
    sim.tdis = Tdis.load(...)  # Registers nper

    for model_binding in sim.model_bindings:
        model = Model._load_header(...)
        model.dis = Dis.load(...)  # Registers grid dims

        # Now load packages that need dimensions
        for pkg_binding in model.package_bindings:
            pkg = Package.load(...)  # Can resolve dims from parent
            model.add_package(pkg)

        sim.add_model(model)

    return sim
```

## Loading Order: Detect Dimension Providers First

During binding resolution, detect dimension-provider components and load them before consumers:

```python
def _resolve_bindings_in_dict(data, workspace, target_type):
    # NOTE: `binding_fields` and `resolved` are assumed to be set up earlier
    # in this function; that part is omitted from this sketch.

    # Partition bindings into dimension providers vs consumers
    dim_providers = []  # TDIS, DIS, DISV, DISU
    dim_consumers = []  # Everything else

    for field_name, bindings in binding_fields:
        for binding in bindings:
            component_type = _resolve_component_class(binding[0])
            if is_dimension_provider(component_type):
                dim_providers.append((field_name, binding))
            else:
                dim_consumers.append((field_name, binding))

    # Load dimension providers FIRST
    for field_name, binding in dim_providers:
        component = Binding.to_component(binding, workspace)
        resolved[field_name] = component
        # Dimensions now registered with parent

    # Load consumers - they can now resolve dimensions
    for field_name, binding in dim_consumers:
        component = Binding.to_component(binding, workspace)
        resolved[field_name] = component
```

**Detecting dimension providers:**
- Check for `dim()` fields in the component class
- Or maintain a registry: `{Tdis, Dis, Disv, Disu, ...}`
- Or use a marker: `_is_dimension_provider = True`

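Two of these detection strategies can be combined in a few lines. The sketch below is illustrative only: dataclass field metadata stands in for xattree's `dim()` fields, and `TdisSketch`, `NpfSketch`, and `MarkedProvider` are hypothetical classes.

```python
from dataclasses import dataclass, field, fields, is_dataclass


def dim(default=1):
    """Hypothetical stand-in for dim(): tags a field via dataclass metadata."""
    return field(default=default, metadata={"kind": "dim"})


def is_dimension_provider(component_type):
    """True if the class carries the marker or declares any dim-tagged field."""
    if getattr(component_type, "_is_dimension_provider", False):
        return True
    if not is_dataclass(component_type):
        return False
    return any(f.metadata.get("kind") == "dim" for f in fields(component_type))


@dataclass
class TdisSketch:
    nper: int = dim()
    perlen: list = field(default_factory=list)


@dataclass
class NpfSketch:
    k: list = field(default_factory=list)


class MarkedProvider:
    _is_dimension_provider = True
```

One caveat worth checking before relying on field inspection alone: a package that declares its own local dimension with `dim()` would also be classified as a provider, so the explicit registry or marker may turn out to be the more selective test.
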
## Example Flow (Future State)

```
mfsim.nam
├── simulation.tdis → Tdis(nper=3) → registers nper=3
└── gwf.nam
    ├── gwf.dis → Dis(nlay=5, nrow=10, ncol=10) → registers grid dims
    ├── gwf.npf → Npf(k=array(5,10,10)) → resolves from parent
    └── gwf.chd → Chd(spd=array(3,...)) → resolves nper from sim
```

Each package can resolve dimensions by walking up:
- `Npf.parent` → `GwfModel` → has `nlay`, `nrow`, `ncol`
- `GwfModel.parent` → `Simulation` → has `nper`
