feat(GeoDataFrame support): geopandas support for shapefile exporting by jlarsen-usgs · Pull Request #2671 · modflowpy/flopy

jlarsen-usgs · 2025-12-16T00:51:08Z

This PR supersedes the draft PR #2573 and continues the cleanup of legacy shapefile exporting code and integration of geopandas GeoDataFrame support.

codecov · 2025-12-23T01:38:48Z

Codecov Report

❌ Patch coverage is 57.22965% with 352 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.3%. Comparing base (556c088) to head (7bc98cd).
⚠️ Report is 109 commits behind head on develop.

Files with missing lines	Patch %	Lines
flopy/export/shapefile_utils.py	40.7%	45 Missing ⚠️
flopy/plot/plotutil.py	2.5%	38 Missing ⚠️
flopy/mf6/data/mfdatalist.py	40.9%	36 Missing ⚠️
flopy/mf6/data/mfdataplist.py	50.0%	30 Missing ⚠️
flopy/mf6/data/mfdataarray.py	62.6%	28 Missing ⚠️
flopy/utils/binaryfile/__init__.py	7.4%	25 Missing ⚠️
flopy/modflow/mfsfr2.py	54.7%	24 Missing ⚠️
flopy/utils/util_array.py	67.7%	19 Missing ⚠️
flopy/utils/datafile.py	32.0%	17 Missing ⚠️
flopy/mf6/mfpackage.py	48.3%	16 Missing ⚠️
... and 12 more

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #2671      +/-   ##
===========================================
+ Coverage     55.5%    72.3%   +16.7%     
===========================================
  Files          644      667      +23     
  Lines       124135   130120    +5985     
===========================================
+ Hits         68947    94127   +25180     
+ Misses       55188    35993   -19195

Files with missing lines	Coverage Δ
flopy/discretization/grid.py	`74.3% <100.0%> (-1.7%)`	⬇️
flopy/mf6/data/mfdatautil.py	`60.6% <100.0%> (-17.3%)`	⬇️
flopy/utils/gridgen.py	`75.8% <100.0%> (-11.3%)`	⬇️
flopy/mf6/mfmodel.py	`57.5% <88.2%> (-23.4%)`	⬇️
flopy/discretization/structuredgrid.py	`53.7% <81.2%> (+6.2%)`	⬆️
flopy/mbase.py	`70.2% <82.3%> (-2.6%)`	⬇️
flopy/utils/geospatial_utils.py	`91.5% <40.0%> (-1.6%)`	⬇️
flopy/utils/particletrackfile.py	`95.6% <94.8%> (-0.4%)`	⬇️
flopy/discretization/unstructuredgrid.py	`81.0% <73.6%> (-0.5%)`	⬇️
flopy/pakbase.py	`82.4% <73.6%> (-2.0%)`	⬇️
... and 15 more

... and 538 files with indirect coverage changes

jlarsen-usgs · 2025-12-23T17:49:47Z

@wpbonelli it looks like the smoke tests are failing on test_mf2005_copy and also failed on the prior commit to FloPy. I can't currently reproduce the error on Windows. Any insight into what might be causing this?

wpbonelli

Great to be able to lean on geopandas for shapefile export. I guess we should look at using xarray for netcdf export similarly.

General comment re: deprecations: we usually put an expected removal version in the deprecation warning, which I think is typically ~2 minor releases ahead?

wpbonelli · 2025-12-26T13:34:55Z

+    @property
+    def geo_dataframe(self):
+        """
+        Returns a geopandas GeoDataFrame of the model grid


I think the running convention is to add deprecation notes to docstrings as well as the warning in the code itself
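A minimal sketch of that convention, assuming a stand-in `GridSketch` class and a placeholder `REMOVAL_VERSION` (both hypothetical, not flopy's actual code or release plan). The deprecated property carries a note in its docstring and emits a `DeprecationWarning` that names the expected removal version:

```python
import warnings

# Placeholder only: the real removal version is decided by the maintainers.
REMOVAL_VERSION = "X.Y"


class GridSketch:
    """Hypothetical stand-in for a grid class (not flopy's Grid)."""

    def to_geodataframe(self):
        # Placeholder payload; a real method would build a GeoDataFrame.
        return {"geometry": []}

    @property
    def geo_dataframe(self):
        """
        Returns a GeoDataFrame of the model grid.

        .. deprecated::
            Use ``to_geodataframe()`` instead; this property is expected
            to be removed in a future release.
        """
        warnings.warn(
            "geo_dataframe is deprecated and will be removed in version "
            f"{REMOVAL_VERSION}; use to_geodataframe() instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return self.to_geodataframe()
```

Using `stacklevel=2` makes the warning point at the user's call site rather than at the library internals, which is usually what a user needs in order to find the code to update.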

wpbonelli · 2025-12-26T13:36:58Z

    filename : str or PathLike
        path of the shapefile to write
-    mg : flopy.discretization.grid.Grid object
+    modelgrid : flopy.discretization.grid.Grid object


Suggest keeping and deprecating mg (just making it optional) and introducing modelgrid alongside it to avoid breakage. Or, since this function will go away soon, is it worth renaming this parameter at all?

Changed back to mg and added deprecation notes
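For reference, the alternative raised in the review (keep the old keyword working while introducing the new one) could look roughly like the sketch below. The function name `write_shapefile_sketch` and its return value are hypothetical and only illustrate the pattern; this is not the signature that was merged.

```python
import warnings


def write_shapefile_sketch(filename, mg=None, modelgrid=None):
    """Hypothetical writer that accepts `mg` (old) and `modelgrid` (new)."""
    if mg is not None:
        if modelgrid is not None:
            raise TypeError("pass either 'mg' or 'modelgrid', not both")
        warnings.warn(
            "'mg' is deprecated; use 'modelgrid' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        modelgrid = mg
    if modelgrid is None:
        raise TypeError("a model grid is required")
    # A real implementation would write the file here; the sketch returns
    # the resolved arguments so the behaviour is easy to inspect.
    return filename, modelgrid
```

Keeping `mg` in its original position means existing positional calls continue to work, at the cost of a warning until callers switch to the new keyword.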

wpbonelli · 2025-12-26T13:37:44Z


 def model_attributes_to_shapefile(
-    path: Union[str, PathLike],
+    filename: Union[str, PathLike],


same comment as mg/modelgrid above

reverted naming convention and added deprecation notes

wpbonelli · 2025-12-26T13:38:12Z

    Parameters
    ----------
-    shpname : str or PathLike
+    filename : str or PathLike


same comment as above

reverted naming convention and added deprecation notes

wpbonelli · 2025-12-26T13:38:19Z

    geoms,
-    shpname: Union[str, PathLike] = "recarray.shp",
-    mg=None,
+    filename: Union[str, PathLike] = "recarray.shp",


reverted naming convention and added deprecation notes

wpbonelli · 2025-12-26T13:47:07Z

+
+        return gdf
+
    def export(self, f, **kwargs):


I guess we would ideally deprecate export() now that the geodataframe approach is the recommended way to export to shapefile, but since export() also handles netcdf we have to wait until there is a similar netcdf export capability via e.g. xarray? Something that works the same way as the geopandas approach in this PR, like to_xarray().to_netcdf()?

In the long term I think we should deprecate export() for methods like .to_netcdf(), .to_vtk(), .to_raster() to be explicit and follow similar conventions as pandas, geopandas, etc...

wpbonelli · 2025-12-26T14:02:36Z

        self.repeating = True
        self.empty_keys = {}

+    def to_geodataframe(self, gdf=None, kper=0, sparse=False, truncate_attrs=False, **kwargs):


What's the use case you have in mind for sparse=False? Since stress period data is sparse by default (at least in 3.x) it seems reasonable to return a sparse geodataframe too, either exclusively or at least by default?

Kind of like what you're doing in shapefile_export_example.py for WEL.

I guess I'm wondering when someone calling to_geodataframe() on stress period data would want one containing the whole grid, rather than just the relevant boundary conditions. In that case, couldn't they get the grid geodataframe and .merge() it manually with the SPD frame, or pass the grid frame into the gdf parameter here?

Performance wise, it seems expensive to always do the full grid then dropna().

sparse=False is useful when exporting from higher level objects like gwf.to_geodataframe(), from multiple individual packages into the same geodataframe, and if a user is exporting multiple stress periods with different boundary conditions from the same package. Because the geodataframe is constructed in the modelgrid object, it is created as a single layer representation of the whole grid and then we filter after the fact. For operations like gwf.to_geodataframe() there is a single construction of geometries and the geodataframe is then passed through to each package/datatype export.

Additionally, sparse operates a little differently depending on whether it's used on a TransientList or an MFList-like object. On a TransientList the dataframe and attributes are constructed using a full grid representation and then filtered after all stress-period data is added. This is a much simpler operation than trying to reconcile multiple sparse dataframes from many stress periods into a single dataframe.

We could cache a copy of the polygons after creation to improve performance; however, we don't currently give an option on the grid to construct a GeoDataFrame with only certain cells. We could add a nodes= parameter or kwarg to the grid .to_geodataframe() method which would only construct geometries for those nodes, however this will most likely be used in limited cases like exporting data from a single stress period or exporting SFR/LAK/UZF packagedata.
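The "build the full grid, add every stress period, then filter" approach described in this reply can be illustrated with a small standard-library sketch. Plain dictionaries stand in for GeoDataFrame rows, and the function name, column naming, and input layout are assumptions made for illustration, not flopy's implementation:

```python
import math


def full_grid_then_filter(nnodes, spd_by_kper, name="q"):
    """Build one row per node, add one column per stress period, then
    drop the rows that never received a value."""
    rows = [{"node": n} for n in range(nnodes)]
    columns = []
    for kper, records in sorted(spd_by_kper.items()):
        col = f"{name}_{kper}"
        columns.append(col)
        for row in rows:
            # Nodes without a boundary condition in this period get NaN.
            row[col] = records.get(row["node"], math.nan)
    # The sparse result keeps a node if any stress period touched it.
    return [r for r in rows if not all(math.isnan(r[c]) for c in columns)]


# Two stress periods with different active wells (node -> rate).
spd = {0: {1: -100.0}, 1: {1: -50.0, 4: -25.0}}
sparse_rows = full_grid_then_filter(6, spd)
```

Because every period writes into the same full-grid table, periods with different boundary cells need no reconciliation step; the single filter at the end produces the sparse result.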

wpbonelli · 2025-12-26T14:06:43Z


    @property
-    def geo_dataframe(self):
+    def geodataframe(self):


why to_geodataframe... elsewhere and not here, other than the fact the old property didn't have to_...?

suggest also deprecating the old one to avoid breakage?

Two comments here:

geodataframe was used to keep the convention consistent with other properties on the GeoSpatialUtil class (e.g., shapely, geojson, points, etc...).

I did not add a deprecation here because this is a low-level object that generally shouldn't be called by end users. This serves as a "catch all" in methods that handle vector geospatial data and converts it to the type that the function was originally written for. I can keep the old method around and add a deprecation warning in the off chance someone is using this method to convert vector data outside of the intended use case.

wpbonelli · 2025-12-26T14:13:25Z

+            optional attribute name, default uses util2d name
+        forgive : bool
+            optional flag to continue if data shape not compatible with GeoDataFrame
+        truncate_attrs : bool


I think GeoDataFrame.to_file() already truncates for you when writing to shapefile, is this parameter necessary? It seems like having our own API for this would be useful if we let the user customize the truncation but otherwise redundant?

The truncate_attrs flag is definitely useful. to_file() does truncate the attributes, but it just cuts off the end and can potentially write a dbf file with multiple attributes that have identical names. truncate_attrs maintains the way we previously constructed attributes in flopy by removing text from the middle of the attribute name and keeping the layer and stress period number.

got it- didn't realize this. thanks!
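To make the difference concrete, here is a hedged sketch comparing plain end truncation with a middle-trimming rule that preserves a trailing numeric suffix. The 10-character limit is the DBF field-name limit; `naive_truncate`, `shorten_middle`, and the example attribute names are hypothetical and do not reproduce flopy's actual naming rules:

```python
import re


def naive_truncate(name, width=10):
    """Cut off the end of the name, as a plain length cap would."""
    return name[:width]


def shorten_middle(name, width=10):
    """Keep a trailing numeric suffix (e.g. layer or stress period)
    and trim characters from the text in front of it."""
    if len(name) <= width:
        return name
    match = re.search(r"(?:_?\d+)+$", name)
    suffix = match.group(0) if match else ""
    stem = name[: len(name) - len(suffix)]
    keep = width - len(suffix)
    if keep <= 0:
        # Degenerate case: the suffix alone exceeds the limit.
        return suffix[-width:]
    return stem[:keep] + suffix


names = ["hydraulic_conductivity_1", "hydraulic_conductivity_2"]
naive = [naive_truncate(n) for n in names]
shortened = [shorten_middle(n) for n in names]
```

With these inputs, end truncation yields the same 10-character name for both attributes, while the middle-trimming rule keeps them distinct by retaining the `_1` and `_2` suffixes.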

wpbonelli · 2025-12-26T14:15:18Z

+geoms = GeoSpatialCollection(geoms, "Point")
+gdf = gpd.GeoDataFrame.from_features(geoms)
+
+for col in list(welldata):


instead of looping to add columns, could this just be

gdf = gpd.GeoDataFrame(welldata, geometry=geoms)

updated this, need to use:

geoms = GeoSpatialCollection(geoms, "Point").shape
gdf = gpd.GeoDataFrame(welldata, geometry=geoms)

for this type of GeoDataFrame construction because it expects an arraylike object

christianlangevin

I took a quick look through this PR. Really nice, @jlarsen-usgs. Excited to see it rolled out.

@jlarsen-usgs

Improve the shared face finding algorithm for HFB (Horizontal Flow Barrier) plotting. Factor out @jlarsen-usgs's index-based approach from #2671 into reusable function(s) and use it instead of floating point comparisons as before. Should be faster.

I moved the new face-related functions to a new module faceutil.py instead of plotutil.py as they are used not only for plotting but for export, and may be useful in other cases. I considered putting them in gridutil.py but that is used by the grid classes, while the face functions use the grid classes, so it would have created a circular import.

I also changed the name of is_vertical_barrier -> is_vertical and inverted the semantics so it works on faces, not barriers (a face between horizontally adjacent cells is a vertical polygon, and a face between vertically adjacent cells is a horizontal polygon, but we call the former a "horizontal" flow barrier and the latter a "vertical" flow barrier). I figure writing these utils in terms of faces for the general case makes sense, and just interpreting the results for the specific case of HFBs.

This PR builds on

- #2671's shared face finding approach
- #2680 which gives all grid types a consistent get_node() method
- #2681 which adds an ihc init parameter/attribute to UnstructuredGrid

Some ad hoc cell ID -> node number conversion utilities are removed and get_node() used instead. I am classifying all this a refactor though it introduces a new utility module, as the new utils are in service of the refactor.
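As a rough illustration of why an index-based approach avoids floating point comparisons, the sketch below finds shared cell edges in plan view by matching vertex indices rather than coordinates. The input layout (a mapping from cell number to an ordered list of vertex indices) and the function name `shared_edges` are assumptions for this example; it is not the faceutil.py implementation:

```python
from itertools import combinations


def shared_edges(cell_vertices):
    """Map each pair of neighbouring cells to the edge they share,
    identified purely by vertex indices."""
    edge_to_cells = {}
    for cell, verts in cell_vertices.items():
        n = len(verts)
        for i in range(n):
            # An edge is an unordered pair of consecutive vertex indices.
            edge = frozenset((verts[i], verts[(i + 1) % n]))
            edge_to_cells.setdefault(edge, []).append(cell)
    shared = {}
    for edge, cells in edge_to_cells.items():
        for a, b in combinations(sorted(cells), 2):
            shared[(a, b)] = tuple(sorted(edge))
    return shared


# A 2 x 2 grid of quads on a 3 x 3 lattice of vertices numbered 0..8.
cells = {
    0: [0, 1, 4, 3],
    1: [1, 2, 5, 4],
    2: [3, 4, 7, 6],
    3: [4, 5, 8, 7],
}
faces = shared_edges(cells)
```

Since two cells share an edge exactly when they reference the same two vertex indices, the lookup is a dictionary operation and needs no coordinate tolerance.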

jlarsen-usgs added 22 commits December 8, 2025 14:00

merge work and refactor "geo_dataframe" methods to "to_geodataframe" across flopy

1074446

Begin scattering geopandas support and removing pyshp support from export utilities

8e4f5ae

Work on deprecating old shapefile exporting code and removing pyshp support

021602a

Scatter shapefile attribute truncation options throughout `to_geodataframe()` routines

9b5e3f0

More shapefile and export utilities work to cleanup legacy shapefile code

5338125

update "Grid" import for shapefile_utils.py

86e28b5

woof

58a20f7

fix typo

4b2395f

Remove circular import

a301620

Updates for tests

9412c47

move grid_line_geodataframe call to individual Grid objects

beeeb6a

updates for test_export.py

51fb7a0

woof

60b2cb3

cleanup errors in legacy code

3906504

update shapefile_utils for to_geodataframe

e3301da

update modelgrid_examples.py notebook example

49d8353

update shapefile_export_example.py for to_geodataframe() functionality

805afee

update legacy sfr exporting

d5bd349

update shapefile_feature_examples.py

0cd534c

woof

965340a

Merge branch 'modflowpy:develop' into develop

ded7eaa

refactor .mg to .modelgrid for consistency

957269d

jlarsen-usgs added 3 commits December 23, 2025 07:43

Update test_get_destination_data for geopandas exporting

d115145

woof

c260033

update test_gridgen.py fix unbound error in MFList export

c2b6ec5

jlarsen-usgs marked this pull request as ready for review December 23, 2025 17:35

jlarsen-usgs requested a review from wpbonelli December 23, 2025 17:50

update test_copy.py for numpy immutable deepcopy reference

7d79f96

woof

f0e9211

wpbonelli reviewed Dec 26, 2025

jlarsen-usgs added 4 commits January 13, 2026 11:51

PR review updates

612767c

Merge branch 'develop' into conflict_resolution

6d7b501

woof

ae06aba

fix variable name error

1fec868

wpbonelli approved these changes Jan 13, 2026

christianlangevin approved these changes Jan 13, 2026

wpbonelli mentioned this pull request Jan 16, 2026

refactor(plot): better shared face finding for HFB plotting #2682

Merged

jlarsen-usgs added 4 commits January 20, 2026 14:38

standardize call signature and change sparse, truncate_attrs to `full_grid` and `shorten_attr`

3c212c9

Merge branch 'develop' into conflict_resolution

43f0493

Fix test_export.py, add "active" column to geodataframe

740dc60

Merge branch 'conflict_resolution' of https://github.com/jdlarsen-UA/flopy into conflict_resolution

7bc98cd

jlarsen-usgs merged commit ae1533f into modflowpy:develop Jan 21, 2026
20 checks passed

This was referenced Jan 29, 2026

enhancement: refactor shapefile exporting to use geopandas #2527

Closed

feature: universal get_dataframe() method #1969

Closed
