Skip to content

Fix silent CRS mishandling in find_geometry_from_coord and geom_prop#615

Open
martinguthrie93 wants to merge 3 commits into
CSHS-CWRA:mainfrom
martinguthrie93:fix-gis-crs-alignment
Open

Fix silent CRS mishandling in find_geometry_from_coord and geom_prop#615
martinguthrie93 wants to merge 3 commits into
CSHS-CWRA:mainfrom
martinguthrie93:fix-gis-crs-alignment

Conversation

@martinguthrie93

Copy link
Copy Markdown

Overview

Fixes two silent CRS-handling defects in the geospatial utilities. In both cases the code assumes a specific CRS, produces plausible-looking output when that assumption is violated, and raises nothing — so the error can propagate all the way into simulated streamflow.

Closes #614.

Changes

find_geometry_from_coord (ravenpy/utilities/geo.py)

Previously the function built a Point(lon, lat) and called GeoDataFrame.contains directly. contains compares raw coordinates and does not reconcile CRSs, so a geographic query point never matched a routing product stored in a projected CRS (metres) — the lookup raised a misleading "not in any geometry" error or matched the wrong feature.

  • The point is now reprojected into df.crs before the point-in-polygon test.
  • A new optional point_crs argument (default 4326) lets callers pass a point expressed in any CRS.
  • When df has no CRS, a warning is emitted and the previous behavior (assume matching CRS) is preserved.
  • This is a no-op when df is already in WGS84, so existing workflows (e.g. upstream_from_coords) are unaffected.

geom_prop (ravenpy/utilities/analysis.py)

area, perimeter, and the Gravelius shape index are only physically meaningful in a projected, equal-area CRS. The previous guard was inverted: it warned only when the centroid fell outside ±180/±90 (the projected, safe case) and stayed silent for decimal-degree geometries (the risky case).

  • A new optional crs argument is used to detect a geographic CRS via pyproj.
  • Without an explicit crs, a centroid-based heuristic flags likely decimal-degree geometries.
  • When the geometry is geographic, a clear warning explains that the returned metrics are in angular units and should be recomputed after reprojecting to an equal-area CRS.
  • Returned numeric values are unchanged, so this is fully backward compatible.

Tests

Added TestCrsAlignment in tests/test_geo_utilities.py:

  • find_geometry_from_coord locates the correct feature for a projected GeoDataFrame, for a geographic GeoDataFrame, and for a point supplied in a custom point_crs; and still raises for a point genuinely outside all geometries.
  • geom_prop warns for a geographic CRS (while leaving numeric output unchanged) and stays silent for a projected CRS.

Backward compatibility

Both public signatures gain only optional keyword arguments with defaults that reproduce the prior behavior. Existing numeric results (including the values asserted in test_geo_utilities.py) are unchanged.

Notes for reviewers

I was unable to run the full GIS test suite locally (my environment lacks part of the RavenPy runtime stack, and the tests rely on the networked yangtze data fixture). I validated the new logic and every new test assertion against geopandas/pyproj/shapely in isolation. CI should exercise the full suite.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Welcome, new contributor!

It appears that this is your first Pull Request. To give credit where it's due, we ask that you add your information to the AUTHORS.rst and .zenodo.json:

  • The relevant author information has been added to AUTHORS.rst and .zenodo.json.

Please make sure you've read our contributing guide. We look forward to reviewing your Pull Request shortly ✨

@martinguthrie93 martinguthrie93 force-pushed the fix-gis-crs-alignment branch from 2b37fc0 to d45c3d2 Compare July 3, 2026 12:08
@github-actions github-actions Bot added the docs label Jul 3, 2026
@martinguthrie93

Copy link
Copy Markdown
Author

Note on the failing CI checks

Both currently-failing checks are pre-existing/environmental and are not introduced by this PR. This PR only touches src/ravenpy/utilities/geo.py, src/ravenpy/utilities/analysis.py, tests/test_geo_utilities.py, and the changelog/author files.

1. Code linting (3.14) — 9 errors, all in files this PR does not touch

make lint runs ruff check src/ravenpy tests and reports 9 property-docstring-starts-with-verb violations, all in pre-existing @property docstrings:

  • src/ravenpy/config/commands.py:884, :1104
  • src/ravenpy/config/rvs.py:327
  • src/ravenpy/ravenpy.py:94, :133, :138, :146, :154, :164

This rule is newly enforced because the lint environment resolved ruff==0.15.20 (the pin is ruff>=0.15.9) with preview = true, which now flags long-standing """Return ...""" property docstrings. The three source/test files changed here pass ruff check cleanly.

2. Test Notebooks (Anaconda, …) — broken notebook environment

The cascade of NameErrors (e.g. name 'get_file' is not defined, name 'forecast_sims' is not defined) all stem from the first data-loading cell of each data-fetching notebook failing. The root-cause exceptions are environmental:

  • ImportError: The 'read_file' function requires the 'pyogrio' or 'fiona' package (no vector backend installed)
  • ImportError: libjxl.so.0.11: cannot open shared object file / dlopen(...) on macOS (broken native library)

The failure set is identical on ubuntu-latest and macos-latest, which points to the environment rather than the code. The only notebook that exercises a function changed here is Distributed_hydrological_modelling.ipynb (via upstream_from_coords), and its data is in EPSG:4326, so the new point-reprojection path is a no-op and behavior is unchanged.

This PR's substantive checks pass: pre-commit.ci, CodeQL, Analyze (python), dependency-review, and the Read the Docs build.

Happy to rebase once main's lint/notebook environment is addressed, or to fix the 9 property-docstring-starts-with-verb warnings in a separate PR if that is preferred.

@Zeitsperre

Copy link
Copy Markdown
Member

Feel free to rebase; the changes needed were addressed in #613

HoHo and others added 3 commits July 3, 2026 18:50
`find_geometry_from_coord` reprojected nothing before calling
`GeoDataFrame.contains`, so a geographic query point silently failed to
match (or matched the wrong feature) against a projected routing product.
It now reprojects the point into the GeoDataFrame's CRS and accepts an
optional `point_crs`.

`geom_prop` computed area/perimeter/gravelius regardless of CRS and its
warning fired for the safe (projected) case instead of the risky
(decimal-degree) one. It now accepts an optional `crs` and warns when the
geometry is in a geographic CRS. Returned values are unchanged.

Both changes are backward compatible and covered by new regression tests.

Closes CSHS-CWRA#614

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@martinguthrie93 martinguthrie93 force-pushed the fix-gis-crs-alignment branch from 2f96402 to 0d8397d Compare July 3, 2026 16:52
@martinguthrie93

Copy link
Copy Markdown
Author

Thanks @Zeitsperre! Rebased onto the latest main (which now includes #613). With ruff 0.15.20 locally, ruff check src/ravenpy tests and the rest of make lint (flake8, numpydoc, codespell) now pass, and the two changed source files pick up the standardized ModuleNotFoundError import guards from #613. The CRS fixes and their regression tests are unchanged.

@Zeitsperre

Zeitsperre commented Jul 3, 2026

Copy link
Copy Markdown
Member

Looking good! There may still be some issues with a few builds that the changes I made may have uncovered (some GDAL libraries not loading properly, maybe). Will likely look into this Monday.

@martinguthrie93

Copy link
Copy Markdown
Author

Rebase is in and Code linting is now green. ✅

The remaining Test Notebooks (Anaconda, …) failure is a broken conda environment, not something in this PR. It also reproduces on main itself: the notebooks.yml run on the 2026-07-03 push failed identically, while the scheduled runs through 2026-06-29 all passed — so the environment regressed between those dates.

Precise root cause (same on ubuntu and macOS): both fiona and pyogrio fail to load their compiled extensions because libjxl.so.0.11 / libjxl.0.11.dylib is missing from the solved environment:

Library not loaded: @rpath/libjxl.0.11.dylib  (fiona/_env…so, pyogrio/_ogr…so)
ImportError: pyogrio: libjxl.so.0.11: cannot open shared object file
ImportError: The 'read_file' function requires the 'pyogrio' or 'fiona' package

With no vector backend, every notebook's first geopandas.read_file cell raises, which cascades into the 71 NameErrors (forecast_sims, get_file, etc.). It looks like a conda-forge libjxl SONAME bump (0.11 → newer) that the pinned fiona/pyogrio/GDAL builds were compiled against; pinning libjxl (or refreshing those builds) in the notebook environment on main should clear it.

Nothing in this PR touches that import path — the only notebook exercising a changed function is Distributed_hydrological_modelling.ipynb via upstream_from_coords, whose data is in EPSG:4326, so the new reprojection branch is a no-op. Happy to open a separate PR pinning libjxl for the notebook environment if that would help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Geospatial helpers silently mishandle CRS: find_geometry_from_coord and geom_prop

2 participants