| 1 | +# Dimension Resolution |
| 2 | + |
| 3 | +## Current Problem |
| 4 | + |
| 5 | +When loading components whose array fields depend on dimensions (e.g., TDIS with a `perlen` array of shape `(nper,)`), dimension resolution fails: |
| 6 | + |
| 7 | +``` |
| 8 | +ValueError: Couldn't resolve dims: ['nper'] |
| 9 | +``` |
| 10 | + |
| 11 | +This happens because `_resolve_dimensions()` tries to find `nper` on `self_` during attrs initialization, but `nper` hasn't been set yet. |
| 12 | + |
| 13 | +## Planned Architecture (Issue #167) |
| 14 | + |
| 15 | +The data model refactor introduces explicit dimension handling via protocols: |
| 16 | + |
| 17 | +### DimensionProvider Protocol |
| 18 | + |
| 19 | +Components that define dimensions implement this: |
| 20 | + |
| 21 | +```python |
| 22 | +class DimensionProvider(Protocol): |
| 23 | +    def get_dimensions(self) -> dict[str, int]: ... |
| 24 | +    def get_dimension_scope(self, dim_name: str) -> str | type: ... |
| 25 | +``` |
| 26 | + |
| 27 | +- **TDIS** provides `nper` at global/simulation scope |
| 28 | +- **DIS/DISV/DISU** provide `nlay`, `nrow`, `ncol`, `nodes` at model scope |
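A minimal sketch of a class satisfying the provider protocol may make the contract concrete. `TdisSketch` is a hypothetical stand-in, not the real TDIS class, and the scope string `"simulation"` is an assumption about how scopes might be named.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class DimensionProvider(Protocol):
    def get_dimensions(self) -> dict[str, int]: ...
    def get_dimension_scope(self, dim_name: str) -> str | type: ...


class TdisSketch:
    """Hypothetical stand-in for TDIS; it only models the `nper` dimension."""

    def __init__(self, nper: int = 1) -> None:
        self.nper = nper

    def get_dimensions(self) -> dict[str, int]:
        # Report every dimension this component defines.
        return {"nper": self.nper}

    def get_dimension_scope(self, dim_name: str) -> str | type:
        # Assumed scope label; the refactor may use a type instead of a string.
        if dim_name != "nper":
            raise KeyError(dim_name)
        return "simulation"


tdis = TdisSketch(nper=3)
print(isinstance(tdis, DimensionProvider), tdis.get_dimensions())
```

Because the protocol is structural, the provider does not need to inherit from it; `runtime_checkable` only verifies that the methods exist, not their signatures.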
| 29 | + |
| 30 | +### DimensionRegistry Protocol |
| 31 | + |
| 32 | +Containers that aggregate dimensions: |
| 33 | + |
| 34 | +```python |
| 35 | +class DimensionRegistry(Protocol): |
| 36 | +    def register_dimension_provider(self, provider: DimensionProvider): ... |
| 37 | +    def resolve_dimension(self, dim_name: str) -> int | None: ... |
| 38 | +    def get_all_dimensions(self) -> dict[str, int]: ... |
| 39 | +``` |
| 40 | + |
| 41 | +- **Simulation** registry handles global scope (time dimensions) |
| 42 | +- **Model** registry handles model scope (grid dimensions) |
| 43 | +- Resolution walks up parent chain: package → model → simulation |
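The parent-chain lookup described above could be sketched as follows. `RegistrySketch` and `FixedDims` are hypothetical names used for illustration only; the real Simulation and Model classes would carry far more state.

```python
from __future__ import annotations


class RegistrySketch:
    """Hypothetical registry: stores local dimensions and defers to a parent."""

    def __init__(self, parent: RegistrySketch | None = None) -> None:
        self.parent = parent
        self._dims: dict[str, int] = {}

    def register_dimension_provider(self, provider) -> None:
        # Copy the provider's dimensions into this scope.
        self._dims.update(provider.get_dimensions())

    def resolve_dimension(self, dim_name: str) -> int | None:
        # Walk up the chain until some scope knows the dimension.
        node = self
        while node is not None:
            if dim_name in node._dims:
                return node._dims[dim_name]
            node = node.parent
        return None

    def get_all_dimensions(self) -> dict[str, int]:
        # Local dimensions shadow inherited ones with the same name.
        inherited = self.parent.get_all_dimensions() if self.parent else {}
        return {**inherited, **self._dims}


class FixedDims:
    """Hypothetical provider with a fixed set of dimensions."""

    def __init__(self, **dims: int) -> None:
        self._dims = dims

    def get_dimensions(self) -> dict[str, int]:
        return dict(self._dims)


simulation = RegistrySketch()
simulation.register_dimension_provider(FixedDims(nper=3))
model = RegistrySketch(parent=simulation)
model.register_dimension_provider(FixedDims(nlay=5, nrow=10, ncol=10))
print(model.resolve_dimension("nper"), simulation.resolve_dimension("nlay"))
```

Returning `None` for an unknown dimension matches the `int | None` return type in the protocol and leaves the raise-or-warn decision to the caller.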
| 44 | + |
| 45 | +### Loading Order |
| 46 | + |
| 47 | +With this design, dimension resolution naturally works if we load in the right order: |
| 48 | + |
| 49 | +``` |
| 50 | +1. Load Simulation (creates registry) |
| 51 | + ↓ |
| 52 | +2. Load TDIS → registers nper with simulation |
| 53 | + ↓ |
| 54 | +3. Load Model (creates registry, links to simulation) |
| 55 | + ↓ |
| 56 | +4. Load DIS → registers nlay, nrow, ncol with model |
| 57 | + ↓ |
| 58 | +5. Load other packages → resolve dims from parent chain |
| 59 | +``` |
| 60 | + |
| 61 | +## Design Considerations |
| 62 | + |
| 63 | +### Self-Contained Components (TDIS, DIS) |
| 64 | + |
| 65 | +For components that DEFINE their own dimensions, like TDIS: |
| 66 | + |
| 67 | +```python |
| 68 | +@xattree |
| 69 | +class Tdis(Package): |
| 70 | +    nper: int = dim(block="dimensions", default=1) |
| 71 | +    perlen: NDArray = array(dims=("nper",), ...) |
| 72 | +``` |
| 73 | + |
| 74 | +The `nper` field is defined on the same class that uses it. The issue is attrs field initialization order: `perlen`'s converter runs before `nper` is set. |
| 75 | + |
| 76 | +**Solution**: For self-contained dimension providers, we can: |
| 77 | +1. Extract dimensions from parsed data BEFORE calling `__init__` |
| 78 | +2. Pass them explicitly to converters |
| 79 | +3. Or, as an alternative to steps 1 and 2, defer array validation until after `__init__` completes |
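Steps 1 and 2 can be illustrated with a small standard-library sketch. The parsed-data layout (a `"dimensions"` block next to a hypothetical `"perioddata"` block) and the helpers `extract_dims` and `check_shape` are assumptions for illustration, not existing functions.

```python
def extract_dims(data: dict) -> dict:
    """Pull declared dimensions out of parsed input before any object exists."""
    return {name: int(value) for name, value in data.get("dimensions", {}).items()}


def check_shape(name, values, dim_names, dims):
    """Validate a 1-D field against dimensions passed in explicitly."""
    expected = tuple(dims[d] for d in dim_names)
    actual = (len(values),)
    if actual != expected:
        raise ValueError(f"{name}: expected shape {expected}, got {actual}")
    return list(values)


# Hypothetical parsed TDIS input.
parsed = {"dimensions": {"nper": 3}, "perioddata": {"perlen": [1.0, 2.0, 3.0]}}

dims = extract_dims(parsed)  # step 1: runs before __init__
perlen = check_shape("perlen", parsed["perioddata"]["perlen"], ("nper",), dims)  # step 2
print(dims, perlen)
```

The converter never looks at `self`, so the order in which attrs assigns fields no longer matters.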
| 80 | + |
| 81 | +### Cross-Component Dimensions |
| 82 | + |
| 83 | +For packages that depend on dimensions from OTHER components: |
| 84 | + |
| 85 | +```python |
| 86 | +@xattree |
| 87 | +class Npf(Package): |
| 88 | +    k: NDArray = array(dims=("nlay", "nrow", "ncol"), ...)  # From DIS |
| 89 | +``` |
| 90 | + |
| 91 | +**Solution**: These dimensions come from the parent chain: |
| 92 | +1. Package has `parent` field pointing to Model |
| 93 | +2. Model registry has dimensions from DIS |
| 94 | +3. `resolve_dimension("nlay")` walks up to find it |
| 95 | + |
| 96 | +### Partial/Incomplete Simulations |
| 97 | + |
| 98 | +Sometimes we want to load components without their dimension providers: |
| 99 | +- Loading a single package for inspection |
| 100 | +- Testing with mock data |
| 101 | +- Partial simulations |
| 102 | + |
| 103 | +**Solution**: `strict=False` mode that: |
| 104 | +- Skips dimension validation |
| 105 | +- Accepts arrays of any shape |
| 106 | +- Logs warnings instead of raising errors |
| 107 | + |
| 108 | +```python |
| 109 | +def structure(data, path, component_type, *, strict=True): |
| 110 | +    if not strict: |
| 111 | +        # Skip dimension validation, accept any array shapes |
| 112 | +        ... |
| 113 | +``` |
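One possible shape for the lenient path is sketched below. `validate_shape` and the logger name are hypothetical; passing `expected=None` is an assumed convention for "the dimension provider is missing".

```python
import logging

logger = logging.getLogger("dimension_sketch")  # hypothetical logger name


def validate_shape(name, shape, expected, *, strict=True):
    """Return True when shapes match.

    With strict=False, a mismatch or an unresolved expectation (None)
    is logged as a warning and the array is accepted as-is.
    """
    if expected is None or tuple(shape) != tuple(expected):
        message = f"{name}: shape {tuple(shape)} does not match expected {expected}"
        if strict:
            raise ValueError(message)
        logger.warning(message)
        return False
    return True


ok = validate_shape("k", (5, 10, 10), (5, 10, 10))
lenient = validate_shape("k", (5, 10), None, strict=False)
print(ok, lenient)
```

Returning a boolean instead of silently passing lets callers record which arrays were accepted without validation.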
| 114 | + |
| 115 | +## Recommended Implementation |
| 116 | + |
| 117 | +### Phase 1: Make Current Loading Work |
| 118 | + |
| 119 | +For now, extract dimensions from the parsed data and pass them to converters: |
| 120 | + |
| 121 | +```python |
| 122 | +def structure(data, path, component_type): |
| 123 | +    # Extract dimensions from the data itself |
| 124 | +    dims = {} |
| 125 | +    if "dimensions" in data: |
| 126 | +        dims.update(data["dimensions"]) |
| 127 | + |
| 128 | +    # Make dims available during structuring |
| 129 | +    # Option A: Thread-local context |
| 130 | +    # Option B: Pass via __init__ parameter |
| 131 | +    # Option C: Store on a context object passed to converters |
| 132 | +``` |
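Option A might look like the sketch below. It uses `contextvars` from the standard library rather than `threading.local`, which is a substitution on my part; `dimension_context`, `lookup_dim`, and `perlen_converter` are hypothetical names.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Dimensions visible to converters while a component is being structured.
_ACTIVE_DIMS: ContextVar = ContextVar("active_dims", default=None)


@contextmanager
def dimension_context(dims):
    """Expose `dims` to converters for the duration of the with-block."""
    token = _ACTIVE_DIMS.set(dict(dims))
    try:
        yield
    finally:
        _ACTIVE_DIMS.reset(token)


def lookup_dim(name):
    dims = _ACTIVE_DIMS.get() or {}
    if name not in dims:
        raise ValueError(f"Couldn't resolve dims: [{name!r}]")
    return dims[name]


def perlen_converter(values):
    """Hypothetical converter that checks its length against `nper`."""
    values = list(values)
    nper = lookup_dim("nper")
    if len(values) != nper:
        raise ValueError(f"perlen: expected {nper} values, got {len(values)}")
    return values


data = {"dimensions": {"nper": 2}, "perlen": [1.0, 2.0]}
with dimension_context(data["dimensions"]):
    perlen = perlen_converter(data["perlen"])
print(perlen)
```

The context is reset on exit even if structuring fails, so dimensions from one component cannot leak into the next load.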
| 133 | + |
| 134 | +### Consider: attrs → pydantic Migration |
| 135 | + |
| 136 | +The self-contained bootstrapping problem (TDIS needs `nper` set before `perlen` converter runs) could be solved by migrating from attrs to pydantic. |
| 137 | + |
| 138 | +**Why pydantic helps:** |
| 139 | + |
| 140 | +Pydantic provides explicit control over validation ordering: |
| 141 | + |
| 142 | +```python |
| 143 | +import numpy as np |
| 144 | +from pydantic import BaseModel, model_validator |
| 145 | + |
| 146 | +class Tdis(BaseModel): |
| 147 | +    nper: int = 1 |
| 148 | +    perlen: list[float]  # Raw data, not yet structured |
| 149 | +    nstp: list[int] |
| 150 | +    tsmult: list[float] |
| 151 | +    @model_validator(mode='after') |
| 152 | +    def structure_arrays(self) -> 'Tdis': |
| 153 | +        # Runs AFTER all fields are set, so self.nper is available |
| 154 | +        self.perlen = np.array(self.perlen).reshape((self.nper,)) |
| 155 | +        self.nstp = np.array(self.nstp).reshape((self.nper,)) |
| 156 | +        self.tsmult = np.array(self.tsmult).reshape((self.nper,)) |
| 157 | +        return self |
| 158 | +``` |
| 159 | + |
| 160 | +**Key pydantic features:** |
| 161 | +- `model_validator(mode='before')` - transform input data before field assignment |
| 162 | +- `model_validator(mode='after')` - validate/transform after ALL fields are set |
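| 163 | +- `field_validator` - per-field validation; validators run in field definition order, and later ones can read already-validated fields via `info.data` |
| 164 | +- `model_config` - model-wide settings such as `validate_assignment` and `arbitrary_types_allowed` (the latter is needed for fields typed as `np.ndarray`) |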
| 165 | + |
| 166 | +**With pydantic, the flow becomes:** |
| 167 | +1. Set `nper=3` (scalar, no structuring needed) |
| 168 | +2. Set `perlen=[1.0, 2.0, 3.0]` (raw list, no structuring yet) |
| 169 | +3. `model_validator(mode='after')` runs |
| 170 | +4. Now `self.nper` is available → structure arrays with correct shape |
| 171 | + |
| 172 | +This cleanly separates field assignment from array structuring, solving the bootstrapping problem without workarounds. |
| 173 | + |
| 174 | +### Phase 2: Explicit Dimension Registration (Issue #167) |
| 175 | + |
| 176 | +Once the refactor is complete: |
| 177 | + |
| 178 | +1. Simulation/Model implement `DimensionRegistry` |
| 179 | +2. TDIS/DIS implement `DimensionProvider` |
| 180 | +3. Auto-register providers when assigned as children |
| 181 | +4. Resolution walks parent chain naturally |
| 182 | + |
| 183 | +### Phase 3: Loading Order Enforcement |
| 184 | + |
| 185 | +For full simulations, enforce loading order: |
| 186 | + |
| 187 | +```python |
| 188 | +def load_simulation(path): |
| 189 | +    sim = Simulation._load_header(path)  # Just options, no children |
| 190 | + |
| 191 | +    # Load dimension providers first |
| 192 | +    sim.tdis = Tdis.load(...)  # Registers nper |
| 193 | + |
| 194 | +    for model_binding in sim.model_bindings: |
| 195 | +        model = Model._load_header(...) |
| 196 | +        model.dis = Dis.load(...)  # Registers grid dims |
| 197 | + |
| 198 | +        # Now load packages that need dimensions |
| 199 | +        for pkg_binding in model.package_bindings: |
| 200 | +            pkg = Package.load(...)  # Can resolve dims from parent |
| 201 | +            model.add_package(pkg) |
| 202 | + |
| 203 | +        sim.add_model(model) |
| 204 | +``` |
| 205 | + |
| 206 | +## Loading Order: Detect Dimension Providers First |
| 207 | + |
| 208 | +During binding resolution, detect dimension-provider components and load them before consumers: |
| 209 | + |
| 210 | +```python |
| 211 | +def _resolve_bindings_in_dict(data, workspace, target_type): |
| 212 | +    # Partition bindings into dimension providers vs consumers |
| 213 | +    dim_providers = []  # TDIS, DIS, DISV, DISU |
| 214 | +    dim_consumers = []  # Everything else |
| 215 | +    resolved = {}  # Maps field name to loaded component |
| 216 | +    for field_name, bindings in binding_fields: |
| 217 | +        for binding in bindings: |
| 218 | +            component_type = _resolve_component_class(binding[0]) |
| 219 | +            if is_dimension_provider(component_type): |
| 220 | +                dim_providers.append((field_name, binding)) |
| 221 | +            else: |
| 222 | +                dim_consumers.append((field_name, binding)) |
| 223 | + |
| 224 | +    # Load dimension providers FIRST |
| 225 | +    for field_name, binding in dim_providers: |
| 226 | +        component = Binding.to_component(binding, workspace) |
| 227 | +        resolved[field_name] = component |
| 228 | +        # Dimensions now registered with parent |
| 229 | + |
| 230 | +    # Load consumers - they can now resolve dimensions |
| 231 | +    for field_name, binding in dim_consumers: |
| 232 | +        component = Binding.to_component(binding, workspace) |
| 233 | +        resolved[field_name] = component |
| 234 | +``` |
| 235 | + |
| 236 | +**Detecting dimension providers:** |
| 237 | +- Check for `dim()` fields in the component class |
| 238 | +- Or maintain a registry: `{Tdis, Dis, Disv, Disu, ...}` |
| 239 | +- Or use a marker: `_is_dimension_provider = True` |
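Two of the three detection strategies (the explicit registry and the marker attribute) can be sketched without touching the field machinery. All class names here are hypothetical placeholders, and checking for `dim()` fields is left out because it depends on how the field metadata is exposed.

```python
class TdisSketch:
    _is_dimension_provider = True  # marker strategy


class DisSketch:
    pass  # detected through the explicit registry below


class NpfSketch:
    pass  # plain consumer


# Registry strategy: a fixed set of known provider types.
PROVIDER_TYPES = frozenset({DisSketch})


def is_dimension_provider(component_type: type) -> bool:
    if component_type in PROVIDER_TYPES:
        return True
    return bool(getattr(component_type, "_is_dimension_provider", False))


def providers_first(component_types):
    """Stable sort: providers keep their relative order and precede consumers."""
    return sorted(component_types, key=lambda t: not is_dimension_provider(t))


load_order = providers_first([NpfSketch, DisSketch, TdisSketch])
print([t.__name__ for t in load_order])
```

A stable sort keeps the file's original ordering within each group, which helps when the relative order of consumers carries meaning.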
| 240 | + |
| 241 | +## Example Flow (Future State) |
| 242 | + |
| 243 | +``` |
| 244 | +mfsim.nam |
| 245 | +├── simulation.tdis → Tdis(nper=3) → registers nper=3 |
| 246 | +└── gwf.nam |
| 247 | +    ├── gwf.dis → Dis(nlay=5, nrow=10, ncol=10) → registers grid dims |
| 248 | +    ├── gwf.npf → Npf(k=array(5,10,10)) → resolves from parent |
| 249 | +    └── gwf.chd → Chd(spd=array(3,...)) → resolves nper from sim |
| 250 | +``` |
| 251 | + |
| 252 | +Each package can resolve dimensions by walking up: |
| 253 | +- `Npf.parent` → `GwfModel` → has `nlay`, `nrow`, `ncol` |
| 254 | +- `GwfModel.parent` → `Simulation` → has `nper` |