Skip to content

Commit dcef5d4

Browse files
committed
feat: mf6 input file loading
1 parent 7698c7a commit dcef5d4

184 files changed

Lines changed: 5625 additions & 1510 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

docs/dev/array-converter-design.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,9 @@
11
# Array Converter Design Document
22

3+
> **Note**: The architectural overview has been consolidated into the main design document. See the **IO > Input > Conversion** section in [`sdd.md`](./sdd.md) for the current high-level design. This document remains as a detailed implementation specification.
4+
5+
---
6+
37
## Overview
48

59
Design for refactoring `flopy4.mf6.converter.structure.structure_array()` to support multiple sparse/dense array formats from flopy 3.x while returning xarray DataArrays with proper metadata.

docs/dev/construction_patterns.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
```python
2+
ic = Ic(strt=...)
3+
gwf = Gwf(dis=Dis(...))
4+
gwf.packages['ic'] = ic # parent auto-set
5+
```

docs/dev/grammar-issues.md

Lines changed: 265 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,265 @@
1+
# Typed Grammar Issues
2+
3+
This document catalogs issues with generated typed grammars that cause fallback to the basic parser.
4+
5+
## Overview
6+
7+
The typed parser uses auto-generated Lark grammars for each MF6 component type (e.g., `gwf-nam.lark`, `gwf-chd.lark`, `gwf-disv.lark`). When these grammars have errors or are incomplete, parsing falls back to the basic (untyped) parser.
8+
9+
The basic parser returns blocks as flat lists of lines, requiring the mapper layer to use heuristic workarounds. The typed parser with `__default__()` returns properly structured dicts, eliminating the need for workarounds.
10+
11+
## Issue 1: Rule Name Conflicts
12+
13+
**Component**: GWF (model-level name file)
14+
**File**: `flopy4/mf6/codec/reader/grammar/generated/gwf-nam.lark`
15+
**Error**: `Rule 'list' defined more than once`
16+
17+
### Root Cause
18+
19+
The grammar imports a `list` rule from the typed module (for structured list/recarray data), then tries to define its own `list` rule for the LIST filename option:
20+
21+
```lark
22+
%import typed.list -> list # Line 10: imports 'list' rule
23+
...
24+
list: "list"i string # Line 29: field rule for LIST option
25+
```
26+
27+
Lark doesn't allow duplicate rule names, so the grammar fails to compile.
28+
29+
### Why This Happens
30+
31+
The grammar generator creates a rule for each field using the field's name. When a field name conflicts with an imported rule name (`list`, `record`, `array`, etc.), we get a collision.
32+
33+
### Fix Options
34+
35+
**Option A**: Grammar generator should detect conflicts and rename field rules
36+
```lark
37+
list_option: "list"i string # Renamed to avoid conflict
38+
```
39+
40+
**Option B**: Grammar generator should use a consistent prefix for field rules
41+
```lark
42+
field_list: "list"i string
43+
field_newton: "newton"i "under_relaxation"i
44+
```
45+
46+
**Option C**: Don't import unused rules (if `list` structured data isn't used in this file)
47+
48+
### Impact
49+
50+
- GWF model-level files always fall back to basic parser
51+
- Affects every simulation (all have GWF name files)
52+
- Lower priority since name files are simple (just file listings)
53+
54+
### Workaround
55+
56+
The basic parser handles name files fine, just less cleanly. Not urgent.
57+
58+
---
59+
60+
## Issue 2: Missing Keywords in Grammar
61+
62+
**Component**: NPF (Node Property Flow package)
63+
**File**: `flopy4/mf6/codec/reader/grammar/generated/gwf-npf.lark`
64+
**Error**: `Unexpected token Token('__ANON_2', 'wetfct') at line 3, column 10`
65+
66+
### Root Cause
67+
68+
The NPF file contains a `wetfct` keyword that's not defined in the generated grammar. This suggests:
69+
1. The grammar generator missed a field from the definition file
70+
2. The definition file is incomplete
71+
3. The keyword is deprecated or from a different MF6 version
72+
73+
### Example File Content
74+
75+
```
76+
BEGIN options
77+
SAVE_FLOWS
78+
wetfct 0.1
79+
...
80+
END options
81+
```
82+
83+
The grammar likely has:
84+
```lark
85+
options_fields: (save_flows | ... )*
86+
# Missing: wetfct rule
87+
```
88+
89+
### Fix
90+
91+
Check the NPF definition file (DFN) and ensure all valid options are included in the grammar generation process. Add:
92+
```lark
93+
wetfct: "wetfct"i double
94+
```
95+
96+
### Impact
97+
98+
- NPF files with `wetfct` option fall back to basic parser
99+
- Affects models using wetting/drying functionality
100+
- Medium priority (common in certain model types)
101+
102+
---
103+
104+
## Issue 3: Strict Whitespace Handling
105+
106+
**Component**: DISV (Vertex discretization)
107+
**File**: `flopy4/mf6/codec/reader/grammar/generated/gwf-disv.lark`
108+
**Error**: `Unexpected token Token('NEWLINE', '\n') at line 8, column 10. Expected one of: END, NVERT, NCPL, NLAY`
109+
110+
### Root Cause
111+
112+
The grammar expects dimension fields to appear consecutively without extra whitespace:
113+
114+
```lark
115+
dimensions_fields: (nlay | ncpl | nvert)*
116+
```
117+
118+
But MF6 files often have blank lines or extra spacing:
119+
```
120+
BEGIN dimensions
121+
NLAY 1
122+
NCPL 121
123+
# <-- This blank line causes the error
124+
NVERT 148
125+
END dimensions
126+
```
127+
128+
### Why This Happens
129+
130+
The grammar correctly ignores whitespace within lines (`%ignore WS`) and comments (`%ignore SH_COMMENT`), but doesn't handle optional newlines between field definitions gracefully.
131+
132+
The `*` operator means "zero or more fields", but the parser gets confused when it sees a NEWLINE after fields but before END.
133+
134+
### Fix Options
135+
136+
**Option A**: Make the grammar more permissive about newlines
137+
```lark
138+
dimensions_fields: (NEWLINE* (nlay | ncpl | nvert) NEWLINE*)*
139+
```
140+
141+
**Option B**: Add explicit newline handling in the grammar
142+
```lark
143+
dimensions_fields: (dimension_field)*
144+
dimension_field: (nlay | ncpl | nvert) NEWLINE+
145+
```
146+
147+
**Option C**: Preprocess files to remove blank lines (not recommended)
148+
149+
### Impact
150+
151+
- DISV models always fall back to basic parser
152+
- Affects all vertex discretization models
153+
- **High priority** - common model type
154+
155+
### Workaround
156+
157+
The basic parser handles DISV files, but loses the clean dict structure benefits. The mapper workarounds compensate.
158+
159+
---
160+
161+
## Issue 4: OC Print Format Keywords
162+
163+
**Component**: OC (Output Control)
164+
**File**: `flopy4/mf6/codec/reader/grammar/generated/gwf-oc.lark`
165+
**Error**: `Unexpected token Token('__ANON_2', 'COLUMNS') at line 5, column 23`
166+
167+
### Root Cause
168+
169+
The OC file has print format keywords that aren't in the grammar:
170+
171+
```
172+
BEGIN options
173+
HEAD PRINT_FORMAT COLUMNS 10 WIDTH 15 DIGITS 6 GENERAL
174+
...
175+
END options
176+
```
177+
178+
The grammar likely only has:
179+
```lark
180+
head: "head"i "fileout"i word
181+
```
182+
183+
But not:
184+
```lark
185+
head_format: "head"i "print_format"i "columns"i integer "width"i integer ...
186+
```
187+
188+
### Fix
189+
190+
The grammar needs to handle both HEAD FILEOUT and HEAD PRINT_FORMAT variants:
191+
```lark
192+
head: head_fileout | head_format
193+
head_fileout: "head"i "fileout"i word
194+
head_format: "head"i "print_format"i format_spec
195+
format_spec: ("columns"i integer)? ("width"i integer)? ...
196+
```
197+
198+
### Impact
199+
200+
- OC files with print format options fall back to basic parser
201+
- Common in models with detailed output control
202+
- Medium priority
203+
204+
---
205+
206+
## Statistics
207+
208+
### Fallback Frequency
209+
210+
To measure fallback frequency, search for warnings in test output:
211+
```python
212+
grep "Typed parsing failed for" test_output.txt | sort | uniq -c
213+
```
214+
215+
Expected common failures:
216+
- **Gwf**: Every simulation (name file has `list` conflict)
217+
- **Disv**: Most vertex grid models (whitespace issue)
218+
- **Npf**: Some models (missing keywords)
219+
- **Oc**: Some models (format keywords)
220+
221+
### Success Rate
222+
223+
Components with **working** typed grammars:
224+
- CHD (Constant Head)
225+
- DRN (Drain)
226+
- WEL (Well)
227+
- RIV (River)
228+
- GHB (General Head Boundary)
229+
- RCH (Recharge)
230+
- Most other boundary packages
231+
232+
These benefit from the `__default__()` structured output immediately.
233+
234+
---
235+
236+
## TODO
237+
238+
1. Fix high-priority grammar issues:
239+
- DISV whitespace handling (affects many models)
240+
- Rule name conflicts in GWF/GWT/GWE name files
241+
2. Validate generated grammars: add CI check that all grammars compile
242+
3. Document known limitations: which keywords/options aren't supported
243+
4. Consider whether basic parser can structure blocks as dicts too?
244+
245+
## Testing Strategy
246+
247+
### Verify Typed Parsing Works
248+
249+
```python
250+
from flopy4.mf6.codec.reader import loads
251+
from flopy4.mf6.gwf.chd import Chd
252+
253+
# Should NOT see fallback warning
254+
data = loads(chd_file_content, component_type=Chd)
255+
assert isinstance(data['dimensions'], dict) # Clean structure
256+
```
257+
258+
### Verify Fallback Still Works
259+
260+
```python
261+
# Even if typed parsing fails, basic parser should work
262+
data = loads(disv_file_content, component_type=Disv)
263+
# Will see fallback warning, but still get data
264+
assert 'dimensions' in data
265+
```

docs/dev/map.md

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,28 @@
11
# FloPy 4 development roadmap
22

3-
## Demo
3+
## Milestones
44

5-
Showcase a limited set of core functionality, such as:
5+
### Demo
6+
7+
Limited set of core functionality, such as:
68

79
- object model, data model, user-facing APIs
810
- IO framework and ASCII file loading/writing
911
- constructing, running, modifying simulations
1012

11-
Design and implementation are provisional. Implementation may take shortcuts, e.g. components hand-written instead of generated from the DFN specification. Demonstration is a guided tour with guardrails.
13+
Design and implementation are provisional. Implementation may take shortcuts, e.g. components hand-written instead of generated from the DFN specification.
1214

1315
Release to demo participants via `pip install` from github URL.
1416

15-
## MVP
17+
### MVP
1618

17-
Support all core functionality, with components generated from the DFN spec. Prioritize functionality over performance/polish.
19+
All core functionality. Components generated from the DFN spec. Functionality over performance/polish.
1820

1921
Release to initial USGS and Deltares testers via `pip install ` from github URL. Begin alpha versioning.
2022

21-
## MMP
23+
### MMP
2224

23-
Minimum marketable product implements all core and most peripheral functionality, and may involve:
25+
All core and most peripheral functionality, and may involve:
2426

2527
- Achieving rough feature-parity with 3.x
2628
- Adopting features from e.g. `imod-python`
@@ -33,8 +35,8 @@ Minimum marketable product implements all core and most peripheral functionality
3335

3436
Release to wider test audience at USGS and Deltares via `pip install` from github URL. Begin beta versioning.
3537

36-
## GA
38+
### GA
3739

38-
Production-ready product achieves feature-parity with 3.x, integrates with the existing repository and becomes generally available via standard channels (PyPI, Conda).
40+
Feature-parity with 3.x. Integrated into existing repository. Generally available via standard channels (PyPI, Conda).
3941

4042
3.x enters maintenance-only mode for a limited time after which support will be dropped and all effort moved to 4.x.

0 commit comments

Comments
 (0)