| 1 | +--- |
| 2 | +name: hotdata-geospatial |
| 3 | +description: Use this skill only when the user is working with geospatial data in Hotdata (PostGIS-style SQL like ST_* functions, geometry/WKB, bbox filtering, point-in-polygon, distance/area, lat/lon, spatial joins, “geospatial”, “GIS”, “PostGIS”). Do not load this skill for non-geospatial SQL or general Hotdata usage. |
| 4 | +version: 0.1.14 |
| 5 | +--- |
| 6 | + |
| 7 | +# Hotdata Geospatial Skill |
| 8 | + |
| 9 | +Use this skill when working with geospatial data in Hotdata. Hotdata supports a subset of PostGIS-style functions using **PostgreSQL dialect SQL**. This reference is dataset-agnostic — apply it to any table with geometry columns. |
| 10 | + |
| 11 | +--- |
| 12 | + |
| 13 | +## Geometry Columns |
| 14 | + |
| 15 | +Most geospatial datasets in Hotdata carry one or both of: |
| 16 | + |
| 17 | +| Column | Type | Description | |
| 18 | +|---|---|---| |
| 19 | +| `wkb_geometry` | `Binary` | WKB-encoded geometry (polygon, point, multipolygon, etc.) | |
| 20 | +| `wkb_geometry_bbox` | `Struct` | Precomputed bounding box with fields `xmin`, `ymin`, `xmax`, `ymax` (Float32) | |
| 21 | + |
| 22 | +**Always parse `wkb_geometry` with `ST_GeomFromWKB()` before using it in any spatial function:** |
| 23 | + |
| 24 | +```sql |
| 25 | +ST_GeomFromWKB(wkb_geometry) |
| 26 | +``` |
| 27 | + |
| 28 | +**Access `wkb_geometry_bbox` fields with bracket notation** (dot access is not supported): |
| 29 | + |
| 30 | +```sql |
| 31 | +wkb_geometry_bbox['xmin'] -- ✓ works |
| 32 | +(wkb_geometry_bbox).xmin -- ✗ not supported |
| 33 | +``` |
| 34 | + |
| 35 | +Discover geometry columns with: |
| 36 | + |
| 37 | +``` |
| 38 | +hotdata tables list --connection-id <id> |
| 39 | +``` |
| 40 | + |
| 41 | +--- |
| 42 | + |
| 43 | +## Supported Functions |
| 44 | + |
| 45 | +### Input / Construction |
| 46 | + |
| 47 | +| Function | Example | |
| 48 | +|---|---| |
| 49 | +| `ST_GeomFromWKB(col)` | `ST_GeomFromWKB(wkb_geometry)` | |
| 50 | +| `ST_GeomFromText(wkt)` | `ST_GeomFromText('POLYGON((...))')` | |
| 51 | +| `ST_MakePoint(lon, lat)` | `ST_MakePoint(-122.27, 37.80)` | |
| 52 | + |
| 53 | +### Output |
| 54 | + |
| 55 | +| Function | Example | |
| 56 | +|---|---| |
| 57 | +| `ST_AsText(geom)` | `ST_AsText(ST_GeomFromWKB(wkb_geometry))` → WKT string | |
| 58 | +| `ST_AsBinary(geom)` | `ST_AsBinary(ST_GeomFromWKB(wkb_geometry))` → WKB binary | |
| 59 | + |
| 60 | +### Accessors / Inspection |
| 61 | + |
| 62 | +| Function | Returns | |
| 63 | +|---|---| |
| 64 | +| `ST_GeometryType(geom)` | e.g. `ST_Polygon`, `ST_MultiPolygon`, `ST_Point` | |
| 65 | +| `ST_IsValid(geom)` | boolean | |
| 66 | +| `ST_NumPoints(geom)` | integer | |
| 67 | +| `ST_NPoints(geom)` | integer (alias for ST_NumPoints) | |
| 68 | +| `ST_X(point)` | longitude (float) | |
| 69 | +| `ST_Y(point)` | latitude (float) | |
| 70 | +| `ST_Centroid(geom)` | point geometry | |
| 71 | + |
| 72 | +### Measurement |
| 73 | + |
| 74 | +| Function | Unit | Notes | |
| 75 | +|---|---|---| |
| 76 | +| `ST_Area(geom)` | degrees² | Multiply by `111000 * 111000` for m², then `* 10.7639` for ft² | |
| 77 | +| `ST_Length(geom)` | degrees | Multiply by `111000` for approximate meters | |
| 78 | +| `ST_Distance(geom_a, geom_b)` | degrees | Multiply by `111000` for approximate meters | |
| 79 | + |
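| 80 | +> **No meter-native measurements:** `::geography` cast is not supported. All measurements are in decimal degrees. The conversion factor ~111,000 m/degree is accurate for north-south distances at any latitude; a degree of longitude is shorter by a factor of cos(latitude), so the same factor overestimates east-west distances by roughly 25% at 37°N, and the error grows toward the poles. |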
| 81 | + |
| 82 | +### Spatial Relationships |
| 83 | + |
| 84 | +All return `boolean`: |
| 85 | + |
| 86 | +| Function | Meaning | |
| 87 | +|---|---| |
| 88 | +| `ST_Within(a, b)` | `a` is completely inside `b` | |
| 89 | +| `ST_Contains(a, b)` | `a` contains `b` | |
| 90 | +| `ST_Covers(a, b)` | `a` covers `b` (includes boundary) | |
| 91 | +| `ST_CoveredBy(a, b)` | `a` is covered by `b` | |
| 92 | +| `ST_Intersects(a, b)` | geometries share any space | |
| 93 | +| `ST_Overlaps(a, b)` | geometries overlap (same dimension) | |
| 94 | +| `ST_Touches(a, b)` | share boundary only, no interior overlap | |
| 95 | +| `ST_Crosses(a, b)` | geometries cross (different dimensions) | |
| 96 | +| `ST_Disjoint(a, b)` | geometries share no space | |
| 97 | +| `ST_Equals(a, b)` | geometries are spatially identical | |
| 98 | + |
| 99 | +### Processing / Geometry Operations |
| 100 | + |
| 101 | +| Function | Notes | |
| 102 | +|---|---| |
| 103 | +| `ST_ConvexHull(geom)` | Returns convex hull polygon | |
| 104 | +| `ST_Simplify(geom, tolerance)` | Douglas-Peucker simplification; tolerance in degrees | |
| 105 | +| `ST_OrientedEnvelope(geom)` | Minimum oriented bounding box | |
| 106 | + |
| 107 | +--- |
| 108 | + |
| 109 | +## Not Supported |
| 110 | + |
| 111 | +| Category | Not Supported | Workaround | |
| 112 | +|---|---|---| |
| 113 | +| Output | `ST_AsGeoJSON`, `ST_AsEWKT` | Use `ST_AsText`; parse WKT client-side | |
| 114 | +| Cast | `::geography` | Multiply degrees by ~111,000 for meters | |
| 115 | +| Input | `ST_MakeEnvelope`, `ST_GeomFromGeoJSON`, `ST_MakeLine` | Use `ST_GeomFromText('POLYGON(...)')` for envelopes | |
| 116 | +| Accessors | `ST_SRID`, `ST_IsEmpty`, `ST_NumGeometries`, `ST_GeometryN`, `ST_ExteriorRing`, `ST_PointN`, `ST_StartPoint`, `ST_EndPoint` | — | |
| 117 | +| Measurement | `ST_Perimeter`, `ST_MaxDistance` | — | |
| 118 | +| Relationships | `ST_DWithin` | Use `ST_Within` + `ST_GeomFromText('POLYGON(...)')` | |
| 119 | +| Processing | `ST_Buffer`, `ST_Envelope`, `ST_Boundary`, `ST_Union`, `ST_Intersection`, `ST_Difference`, `ST_SymDifference`, `ST_Collect`, `ST_ClosestPoint`, `ST_Snap`, `ST_BoundingDiagonal`, `ST_Expand` | Use `ST_OrientedEnvelope` instead of `ST_Envelope` | |
| 120 | +| Projection | `ST_Transform`, `ST_SetSRID`, `ST_FlipCoordinates` | — | |
| 121 | + |
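The `ST_AsGeoJSON` workaround in the table relies on parsing WKT on the client. A minimal Python sketch of that step, assuming `ST_AsText` returns standard 2D WKT and handling only `POINT` and `POLYGON`. The name `wkt_to_geojson` is hypothetical and not part of Hotdata; a dedicated geometry library would be a sturdier choice for other geometry types.

```python
import re


def _coords(text):
    """Turn 'x y, x y, ...' into [[x, y], ...] (first two ordinates only)."""
    return [
        [float(v) for v in pair.split()[:2]]
        for pair in text.split(",")
        if pair.strip()
    ]


def wkt_to_geojson(wkt):
    """Convert a 2D POINT or POLYGON WKT string to a GeoJSON-style dict."""
    head, _, body = wkt.strip().partition("(")
    kind = head.strip().upper()
    if kind == "POINT":
        return {"type": "Point", "coordinates": _coords(body.rstrip(") "))[0]}
    if kind == "POLYGON":
        # Each ring is one parenthesised group inside the outer parentheses.
        rings = re.findall(r"\(([^()]*)\)", body)
        return {"type": "Polygon", "coordinates": [_coords(r) for r in rings]}
    raise ValueError(f"unsupported geometry type: {kind!r}")
```

Calling `wkt_to_geojson` on each `wkt` value returned by an `ST_AsText` query yields dictionaries that can be serialized with `json.dumps`. Multi-part geometries raise `ValueError` in this sketch rather than being silently mangled.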
| 122 | +--- |
| 123 | + |
| 124 | +## Common Patterns |
| 125 | + |
| 126 | +### Check geometry types in a table |
| 127 | + |
| 128 | +```sql |
| 129 | +SELECT ST_GeometryType(ST_GeomFromWKB(wkb_geometry)) AS geom_type, COUNT(*) |
| 130 | +FROM <table> |
| 131 | +WHERE wkb_geometry IS NOT NULL |
| 132 | +GROUP BY 1 |
| 133 | +``` |
| 134 | + |
| 135 | +### Bounding box filter (replaces ST_MakeEnvelope / ST_DWithin) |
| 136 | + |
| 137 | +Use `ST_GeomFromText` with a closed WKT polygon ring: |
| 138 | + |
| 139 | +```sql |
| 140 | +WHERE ST_Within( |
| 141 | + ST_Centroid(ST_GeomFromWKB(wkb_geometry)), |
| 142 | + ST_GeomFromText('POLYGON((minLon minLat, maxLon minLat, maxLon maxLat, minLon maxLat, minLon minLat))') |
| 143 | +) |
| 144 | +``` |
| 145 | + |
| 146 | +**Vertex order:** `(minLon minLat, maxLon minLat, maxLon maxLat, minLon maxLat, minLon minLat)` — close the ring by repeating the first point. |
| 147 | + |
| 148 | +**Faster alternative** using the precomputed bbox struct (no WKB parsing): |
| 149 | + |
| 150 | +```sql |
| 151 | +WHERE wkb_geometry_bbox['xmin'] >= <minLon> |
| 152 | + AND wkb_geometry_bbox['xmax'] <= <maxLon> |
| 153 | + AND wkb_geometry_bbox['ymin'] >= <minLat> |
| 154 | + AND wkb_geometry_bbox['ymax'] <= <maxLat> |
| 155 | +``` |
| 156 | + |
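| 157 | +The two filters are not equivalent: the bbox version keeps only features whose entire bounding box lies inside the search box, while the `ST_Within` version keeps features whose centroid does. Use the bbox approach for large tables where WKB parsing is expensive; use `ST_Within` when you need the centroid-in-polygon test. |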
| 158 | + |
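Building the polygon ring by hand is easy to get wrong, so a client-side helper may be useful. The sketch below is one way to generate the WKT string in the vertex order described above, and to approximate a search radius in meters as a lon/lat box (a rough stand-in for `ST_DWithin`). The function names and the 111,000 m/degree constant with a cos(latitude) correction are assumptions of this sketch, not Hotdata features.

```python
import math

METERS_PER_DEGREE = 111_000  # approximation used throughout this reference


def bbox_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon ring for the given bounding box."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first vertex to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


def radius_bbox(lon, lat, radius_m):
    """Approximate a radius in meters as (min_lon, min_lat, max_lon, max_lat)."""
    dlat = radius_m / METERS_PER_DEGREE
    # A degree of longitude covers less ground away from the equator.
    dlon = radius_m / (METERS_PER_DEGREE * math.cos(math.radians(lat)))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)
```

For example, `bbox_wkt(*radius_bbox(-122.27, 37.80, 500))` produces a string that can be passed to `ST_GeomFromText` in the `ST_Within` filter. The box is square rather than circular, so it admits some features slightly farther than the radius, and the longitude correction breaks down close to the poles.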
| 159 | +### Point-in-polygon test |
| 160 | + |
| 161 | +```sql |
| 162 | +SELECT * |
| 163 | +FROM <table> |
| 164 | +WHERE ST_Contains( |
| 165 | + ST_GeomFromWKB(wkb_geometry), |
| 166 | + ST_MakePoint(<lon>, <lat>) |
| 167 | +) |
| 168 | +``` |
| 169 | + |
| 170 | +### Nearest neighbors (closest N features to a point) |
| 171 | + |
| 172 | +```sql |
| 173 | +SELECT |
| 174 | + <id_col>, |
| 175 | + ST_Distance( |
| 176 | + ST_Centroid(ST_GeomFromWKB(wkb_geometry)), |
| 177 | + ST_MakePoint(<lon>, <lat>) |
| 178 | + ) * 111000 AS dist_meters |
| 179 | +FROM <table> |
| 180 | +WHERE wkb_geometry IS NOT NULL |
| 181 | +ORDER BY dist_meters |
| 182 | +LIMIT 10 |
| 183 | +``` |
| 184 | + |
| 185 | +### Distance between two known points |
| 186 | + |
| 187 | +```sql |
| 188 | +SELECT |
| 189 | + ST_Distance(ST_MakePoint(<lon1>, <lat1>), ST_MakePoint(<lon2>, <lat2>)) * 111000 AS dist_meters, |
| 190 | + ST_Distance(ST_MakePoint(<lon1>, <lat1>), ST_MakePoint(<lon2>, <lat2>)) * 69.0 AS dist_miles |
| 191 | +``` |
| 192 | + |
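When both points are already known on the client, a great-circle distance avoids the degree-to-meter approximation altogether. The sketch below uses the standard haversine formula; it runs outside Hotdata, and the function name and the mean Earth radius of 6,371 km are assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (assumed spherical Earth)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

Comparing `haversine_m(lon1, lat1, lon2, lat2)` with the `dist_meters` column from the query above gives a quick sense of how far the flat 111,000 factor drifts for a given pair of points.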
| 193 | +### Area of polygon features |
| 194 | + |
| 195 | +```sql |
| 196 | +SELECT |
| 197 | + <id_col>, |
| 198 | + ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 AS area_sqm, |
| 199 | + ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 * 10.7639 AS area_sqft, |
| 200 | + ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 / 4047 AS area_acres |
| 201 | +FROM <table> |
| 202 | +WHERE wkb_geometry IS NOT NULL |
| 203 | +``` |
| 204 | + |
| 205 | +### Centroid coordinates |
| 206 | + |
| 207 | +```sql |
| 208 | +SELECT |
| 209 | + <id_col>, |
| 210 | + ST_X(ST_Centroid(ST_GeomFromWKB(wkb_geometry))) AS lon, |
| 211 | + ST_Y(ST_Centroid(ST_GeomFromWKB(wkb_geometry))) AS lat |
| 212 | +FROM <table> |
| 213 | +WHERE wkb_geometry IS NOT NULL |
| 214 | +``` |
| 215 | + |
| 216 | +### Convert to WKT for export or inspection |
| 217 | + |
| 218 | +```sql |
| 219 | +SELECT <id_col>, ST_AsText(ST_GeomFromWKB(wkb_geometry)) AS wkt |
| 220 | +FROM <table> |
| 221 | +WHERE wkb_geometry IS NOT NULL |
| 222 | +LIMIT 10 |
| 223 | +``` |
| 224 | + |
| 225 | +### Simplify geometry for faster rendering |
| 226 | + |
| 227 | +```sql |
| 228 | +SELECT <id_col>, ST_AsText(ST_Simplify(ST_GeomFromWKB(wkb_geometry), 0.0001)) AS simplified_wkt |
| 229 | +FROM <table> |
| 230 | +WHERE wkb_geometry IS NOT NULL |
| 231 | +``` |
| 232 | + |
| 233 | +Tolerance is in degrees; `0.0001` is roughly 11 m north-south. Increase it for coarser simplification, decrease it for finer. |
| 234 | + |
| 235 | +--- |
| 236 | + |
| 237 | +## Unit Conversion Reference |
| 238 | + |
| 239 | +| To get | Multiply degrees (degrees² for area) by | |
| 240 | +|---|---| |
| 241 | +| Meters (distance) | × 111,000 | |
| 242 | +| Kilometers (distance) | × 111 | |
| 243 | +| Miles (distance) | × 69.0 | |
| 244 | +| Feet (distance) | × 364,173 | |
| 245 | +| m² (area) | × 111,000² = × 12,321,000,000 | |
| 246 | +| ft² (area) | × 111,000² × 10.7639 | |
| 247 | +| Acres (area) | × 111,000² ÷ 4,047 | |
| 248 | + |
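| 249 | +> These conversions use the length of one degree of latitude (~111,000 m), which varies by only about 1% between the equator and the poles. A degree of longitude is shorter by a factor of cos(latitude), about 0.80 at 37°N and 0.50 at 60°, so east-west distances and areas are overestimated. Treat the results as approximations; accuracy decreases significantly above 60°N or below 60°S. |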
| 250 | + |
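The table's factors can be adjusted for latitude on the client. A minimal sketch, assuming small offsets and a locally flat Earth: it scales the east-west component by cos(latitude) before combining it with the north-south component. The function names are hypothetical, and the 111,000 constant is the same approximation used in the table.

```python
import math

METERS_PER_DEGREE = 111_000  # same approximation as the table above


def degrees_to_meters(dlon, dlat, lat):
    """Approximate ground distance for a small lon/lat offset near latitude `lat`."""
    dx = dlon * METERS_PER_DEGREE * math.cos(math.radians(lat))  # east-west
    dy = dlat * METERS_PER_DEGREE  # north-south
    return math.hypot(dx, dy)


def area_scale(lat):
    """Factor that turns an area in degrees² into m² near latitude `lat`."""
    return METERS_PER_DEGREE ** 2 * math.cos(math.radians(lat))
```

At the equator `area_scale(0)` reproduces the table's 12,321,000,000, while at 37°N it is about 20% smaller, which is the size of the correction the flat factor leaves out for areas.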
| 251 | +--- |
| 252 | + |
| 253 | +## Workflow: Exploring a New Geospatial Dataset |
| 254 | + |
| 255 | +1. **Check for geometry columns:** |
| 256 | + ``` |
| 257 | + hotdata tables list --connection-id <id> |
| 258 | + ``` |
| 259 | + Look for `Binary` (WKB) or `Struct` (bbox) typed columns. |
| 260 | + |
| 261 | +2. **Verify geometry types:** |
| 262 | + ```sql |
| 263 | + SELECT ST_GeometryType(ST_GeomFromWKB(wkb_geometry)) AS type, COUNT(*) |
| 264 | + FROM <table> WHERE wkb_geometry IS NOT NULL GROUP BY 1 |
| 265 | + ``` |
| 266 | + |
| 267 | +3. **Check coverage (bounding box of entire dataset):** |
| 268 | + ```sql |
| 269 | + SELECT |
| 270 | + MIN(wkb_geometry_bbox['xmin']) AS min_lon, |
| 271 | + MIN(wkb_geometry_bbox['ymin']) AS min_lat, |
| 272 | + MAX(wkb_geometry_bbox['xmax']) AS max_lon, |
| 273 | + MAX(wkb_geometry_bbox['ymax']) AS max_lat |
| 274 | + FROM <table> |
| 275 | + WHERE wkb_geometry_bbox IS NOT NULL |
| 276 | + ``` |
| 277 | + |
| 278 | +4. **Sample WKT to understand geometry structure:** |
| 279 | + ```sql |
| 280 | + SELECT ST_AsText(ST_GeomFromWKB(wkb_geometry)) FROM <table> |
| 281 | + WHERE wkb_geometry IS NOT NULL LIMIT 3 |
| 282 | + ``` |