feat(gis): add spatial/GIS query support with GeoParquet output #97
Open
Force-pushed from 43234f9 to a82cba4
- Detect geometry columns in PostgreSQL/PostGIS with SRID metadata
- Fetch geometry data as WKB using ST_AsBinary()
- Write GeoParquet 1.1.0 metadata with CRS information
- Register geodatafusion spatial functions (st_area, st_distance, etc.)
- Add spatial type support for MySQL, Snowflake, and DuckDB backends
- Add GIS integration tests
Parse "geo" metadata from uploaded GeoParquet files and pass geometry column info to the StreamingParquetWriter so output datasets maintain GeoParquet 1.1.0 metadata.
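For reviewers unfamiliar with the format, here is a minimal sketch of the file-level "geo" metadata value that a GeoParquet 1.1.0 writer attaches. It is not the code in this PR: `GeoColumn` and `geo_metadata_json` are hypothetical names, the JSON is assembled by hand only to keep the example dependency-free, column names are assumed to need no JSON escaping, and the optional `crs` field (PROJJSON) is left out.

```rust
/// Hypothetical description of one geometry column (not the PR's actual type).
struct GeoColumn {
    name: String,
    /// e.g. ["Point"]; an empty list means the geometry types are unknown.
    geometry_types: Vec<String>,
}

/// Build the JSON stored under the "geo" key of the Parquet file metadata.
/// Returns None when there are no geometry columns, so plain datasets
/// stay plain Parquet.
fn geo_metadata_json(columns: &[GeoColumn]) -> Option<String> {
    // Assumption: the first geometry column is treated as the primary one.
    let primary = columns.first()?;

    let cols: Vec<String> = columns
        .iter()
        .map(|c| {
            let types: Vec<String> = c
                .geometry_types
                .iter()
                .map(|t| format!("\"{}\"", t))
                .collect();
            // "encoding" and "geometry_types" are the required per-column fields.
            format!(
                "\"{}\":{{\"encoding\":\"WKB\",\"geometry_types\":[{}]}}",
                c.name,
                types.join(",")
            )
        })
        .collect();

    Some(format!(
        "{{\"version\":\"1.1.0\",\"primary_column\":\"{}\",\"columns\":{{{}}}}}",
        primary.name,
        cols.join(",")
    ))
}
```

A real implementation would more likely serialize a typed struct with a JSON library and carry the CRS through from the source SRID.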
Use kartoza/postgis:16-3.4 which supports arm64/amd64. Add retry logic for container startup, fix tokio runtime requirements, and use only geodatafusion-supported spatial functions.
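The retry idea for container startup can be sketched as below. This is an assumption-laden illustration, not the test harness in this PR: `retry` is a hypothetical helper, it blocks the current thread with `std::thread::sleep`, and the actual tests run on a tokio runtime where an async sleep would be used instead.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `op` until it succeeds or `max_attempts` attempts have been made.
/// `op` receives the 1-based attempt number; the last error is returned
/// if every attempt fails.
fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the most recent error.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait a fixed delay and try again.
            Err(_) => {
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

A fixed delay keeps the sketch short; exponential backoff is a common alternative when the container can take a widely varying time to accept connections.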
Extend explicit column definitions to support geometry types with SRID and geometry_type metadata. Geometry columns are stored as WKB binary with GeoParquet metadata for spatial query support.
Load spatial extension in discover_tables_sync and fetch_table_to_channel. Add graceful fallback in build_fetch_query when ST_AsBinary is unavailable (bundled crate limitation).
Remove unused parse_geometry_type_params function. Use both udt_name (for geometry detection) and data_type (for type mapping) in discover_tables, matching the approach already used in fetch_table.
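Both fields are needed because PostgreSQL's information_schema.columns reports PostGIS columns with data_type = 'USER-DEFINED', and only udt_name carries 'geometry' or 'geography'. A minimal sketch of that split follows; `ColumnKind` and `classify_column` are hypothetical names for illustration, not the identifiers used in discover_tables.

```rust
#[derive(Debug, PartialEq)]
enum ColumnKind {
    Geometry,
    Geography,
    /// Non-spatial column; carries the SQL data_type for ordinary type mapping.
    Other(String),
}

/// Use udt_name to detect spatial columns and fall back to data_type
/// for everything else.
fn classify_column(data_type: &str, udt_name: &str) -> ColumnKind {
    match udt_name.to_ascii_lowercase().as_str() {
        "geometry" => ColumnKind::Geometry,
        "geography" => ColumnKind::Geography,
        _ => ColumnKind::Other(data_type.to_ascii_lowercase()),
    }
}
```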
Previously, when a table had spatial columns but ST_AsBinary was unavailable (bundled crate limitation), build_fetch_query silently fell back to SELECT *. This produced DuckDB's internal geometry format instead of WKB, but downstream code (GeoParquet writer) would incorrectly treat it as WKB, causing data corruption. Now returns an error with a clear message when spatial columns are detected but the spatial extension isn't functional.
When the information_schema query returns zero rows (for example, because of a case mismatch or missing permissions), column_exprs is empty, which produced invalid SQL such as SELECT FROM .... Now falls back to SELECT * when the column list is empty.
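The two query-building fixes described above can be sketched together as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `ColumnInfo`, the `spatial_available` flag, and the error text are hypothetical, and identifier quoting is reduced to plain double quotes.

```rust
/// Hypothetical column description; field names are illustrative.
struct ColumnInfo {
    name: String,
    is_geometry: bool,
}

/// Build the SELECT used to fetch a table.
/// - Empty column list: fall back to SELECT * instead of emitting "SELECT FROM ...".
/// - Geometry columns without a working spatial extension: return an error
///   rather than silently fetching a non-WKB internal format.
fn build_fetch_query(
    table: &str,
    columns: &[ColumnInfo],
    spatial_available: bool,
) -> Result<String, String> {
    if columns.is_empty() {
        return Ok(format!("SELECT * FROM {}", table));
    }

    let has_geometry = columns.iter().any(|c| c.is_geometry);
    if has_geometry && !spatial_available {
        return Err(format!(
            "table {} has spatial columns but ST_AsBinary is unavailable",
            table
        ));
    }

    let exprs: Vec<String> = columns
        .iter()
        .map(|c| {
            if c.is_geometry {
                // Convert to WKB so the GeoParquet writer receives what it expects.
                format!("ST_AsBinary(\"{0}\") AS \"{0}\"", c.name)
            } else {
                format!("\"{}\"", c.name)
            }
        })
        .collect();

    Ok(format!("SELECT {} FROM {}", exprs.join(", "), table))
}
```

Returning an error in the spatial case trades availability for correctness: a failed fetch is visible to the caller, whereas mislabelled geometry bytes would only surface later as corrupt output.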
Force-pushed from a82cba4 to ab4e778
Summary
Closes #86