Commit c90bbf4

Merge pull request #68 from hotdata-dev/feat/geospatial-skill
feat(skills): geospatial skill, multi-skill install, and auto-update
2 parents 27dcdb0 + 2070f5e commit c90bbf4

4 files changed

Lines changed: 508 additions & 95 deletions

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -47,6 +47,7 @@ pre-release-hook = ["git-cliff", "-o", "CHANGELOG.md", "--tag", "v{{version}}" ]
 publish = false
 pre-release-replacements = [
   { file = "skills/hotdata/SKILL.md", search = "^version: .+", replace = "version: {{version}}", exactly = 1 },
+  { file = "skills/hotdata-geospatial/SKILL.md", search = "^version: .+", replace = "version: {{version}}", exactly = 1 },
   { file = "README.md", search = "version-[0-9.]+-blue", replace = "version-{{version}}-blue", exactly = 1 },
 ]


skills/hotdata-geospatial/SKILL.md

Lines changed: 282 additions & 0 deletions
@@ -0,0 +1,282 @@
---
name: hotdata-geospatial
description: Use this skill only when the user is working with geospatial data in Hotdata (PostGIS-style SQL like ST_* functions, geometry/WKB, bbox filtering, point-in-polygon, distance/area, lat/lon, spatial joins, “geospatial”, “GIS”, “PostGIS”). Do not load this skill for non-geospatial SQL or general Hotdata usage.
version: 0.1.14
---

# Hotdata Geospatial Skill

Use this skill when working with geospatial data in Hotdata. Hotdata supports a subset of PostGIS-style functions using **PostgreSQL dialect SQL**. This reference is dataset-agnostic: apply it to any table with geometry columns.

---

## Geometry Columns

Most geospatial datasets in Hotdata carry one or both of:

| Column | Type | Description |
|---|---|---|
| `wkb_geometry` | `Binary` | WKB-encoded geometry (polygon, point, multipolygon, etc.) |
| `wkb_geometry_bbox` | `Struct` | Precomputed bounding box with fields `xmin`, `ymin`, `xmax`, `ymax` (Float32) |

**Always parse `wkb_geometry` with `ST_GeomFromWKB()` before using it in any spatial function:**

```sql
ST_GeomFromWKB(wkb_geometry)
```

**Access `wkb_geometry_bbox` fields with bracket notation** (dot access is not supported):

```sql
wkb_geometry_bbox['xmin']   -- ✓ works
(wkb_geometry_bbox).xmin    -- ✗ not supported
```

Discover geometry columns with the CLI (a shell command, not SQL):

```
hotdata tables list --connection-id <id>
```

---

## Supported Functions

### Input / Construction

| Function | Example |
|---|---|
| `ST_GeomFromWKB(col)` | `ST_GeomFromWKB(wkb_geometry)` |
| `ST_GeomFromText(wkt)` | `ST_GeomFromText('POLYGON((...))')` |
| `ST_MakePoint(lon, lat)` | `ST_MakePoint(-122.27, 37.80)` |

### Output

| Function | Example |
|---|---|
| `ST_AsText(geom)` | `ST_AsText(ST_GeomFromWKB(wkb_geometry))` → WKT string |
| `ST_AsBinary(geom)` | `ST_AsBinary(ST_GeomFromWKB(wkb_geometry))` → WKB binary |

### Accessors / Inspection

| Function | Returns |
|---|---|
| `ST_GeometryType(geom)` | e.g. `ST_Polygon`, `ST_MultiPolygon`, `ST_Point` |
| `ST_IsValid(geom)` | boolean |
| `ST_NumPoints(geom)` | integer |
| `ST_NPoints(geom)` | integer (alias for ST_NumPoints) |
| `ST_X(point)` | longitude (float) |
| `ST_Y(point)` | latitude (float) |
| `ST_Centroid(geom)` | point geometry |

### Measurement

| Function | Unit | Notes |
|---|---|---|
| `ST_Area(geom)` | degrees² | Multiply by `111000 * 111000` for approximate m², then `* 10.7639` for ft² |
| `ST_Length(geom)` | degrees | Multiply by `111000` for approximate meters |
| `ST_Distance(geom_a, geom_b)` | degrees | Multiply by `111000` for approximate meters |

> **No meter-native measurements:** `::geography` cast is not supported. All measurements are in decimal degrees. The conversion factor ~111,000 m/degree matches one degree of latitude; one degree of longitude spans about 111,000 × cos(latitude) m, so east-west distances and areas are overestimated away from the equator (by about 25% at 37° and 2× at 60°), and the error grows toward the poles.

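Because every measurement comes back in planar degrees, one way to sanity-check a converted distance is to compute a great-circle distance client-side. The Rust sketch below uses the haversine formula with an assumed spherical Earth radius of 6,371,000 m. The name `haversine_m` is illustrative and is not part of Hotdata or its CLI.

```rust
/// Mean Earth radius in meters (spherical approximation; an assumption of this sketch).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in meters between two points given as lon/lat in degrees.
pub fn haversine_m(lon1: f64, lat1: f64, lon2: f64, lat2: f64) -> f64 {
    let phi1 = lat1.to_radians();
    let phi2 = lat2.to_radians();
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();

    // Haversine term; clamp the square root to 1.0 to guard against rounding error.
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);
    2.0 * EARTH_RADIUS_M * a.sqrt().min(1.0).asin()
}
```

Comparing `haversine_m(lon1, lat1, lon2, lat2)` with `ST_Distance(...) * 111000` for the same pair of points gives a rough sense of how far the flat-degree approximation drifts at a given latitude.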
### Spatial Relationships

All return `boolean`:

| Function | Meaning |
|---|---|
| `ST_Within(a, b)` | `a` is completely inside `b` |
| `ST_Contains(a, b)` | `a` contains `b` |
| `ST_Covers(a, b)` | `a` covers `b` (includes boundary) |
| `ST_CoveredBy(a, b)` | `a` is covered by `b` |
| `ST_Intersects(a, b)` | geometries share any space |
| `ST_Overlaps(a, b)` | geometries overlap (same dimension) |
| `ST_Touches(a, b)` | share boundary only, no interior overlap |
| `ST_Crosses(a, b)` | geometries cross (different dimensions) |
| `ST_Disjoint(a, b)` | geometries share no space |
| `ST_Equals(a, b)` | geometries are spatially identical |

### Processing / Geometry Operations

| Function | Notes |
|---|---|
| `ST_ConvexHull(geom)` | Returns convex hull polygon |
| `ST_Simplify(geom, tolerance)` | Douglas-Peucker simplification; tolerance in degrees |
| `ST_OrientedEnvelope(geom)` | Minimum oriented bounding box |

---

## Not Supported

| Category | Not Supported | Workaround |
|---|---|---|
| Output | `ST_AsGeoJSON`, `ST_AsEWKT` | Use `ST_AsText`; parse WKT client-side |
| Cast | `::geography` | Multiply degrees by ~111,000 for meters |
| Input | `ST_MakeEnvelope`, `ST_GeomFromGeoJSON`, `ST_MakeLine` | Use `ST_GeomFromText('POLYGON(...)')` for envelopes |
| Accessors | `ST_SRID`, `ST_IsEmpty`, `ST_NumGeometries`, `ST_GeometryN`, `ST_ExteriorRing`, `ST_PointN`, `ST_StartPoint`, `ST_EndPoint` | |
| Measurement | `ST_Perimeter`, `ST_MaxDistance` | |
| Relationships | `ST_DWithin` | Use `ST_Within` + `ST_GeomFromText('POLYGON(...)')` |
| Processing | `ST_Buffer`, `ST_Envelope`, `ST_Boundary`, `ST_Union`, `ST_Intersection`, `ST_Difference`, `ST_SymDifference`, `ST_Collect`, `ST_ClosestPoint`, `ST_Snap`, `ST_BoundingDiagonal`, `ST_Expand` | Use `ST_OrientedEnvelope` instead of `ST_Envelope` |
| Projection | `ST_Transform`, `ST_SetSRID`, `ST_FlipCoordinates` | |

---

## Common Patterns

### Check geometry types in a table

```sql
SELECT ST_GeometryType(ST_GeomFromWKB(wkb_geometry)) AS geom_type, COUNT(*)
FROM <table>
WHERE wkb_geometry IS NOT NULL
GROUP BY 1
```

### Bounding box filter (replaces ST_MakeEnvelope / ST_DWithin)

Use `ST_GeomFromText` with a closed WKT polygon ring:

```sql
WHERE ST_Within(
  ST_Centroid(ST_GeomFromWKB(wkb_geometry)),
  ST_GeomFromText('POLYGON((minLon minLat, maxLon minLat, maxLon maxLat, minLon maxLat, minLon minLat))')
)
```

**Vertex order:** `(minLon minLat, maxLon minLat, maxLon maxLat, minLon maxLat, minLon minLat)`; close the ring by repeating the first point.
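
When the envelope is assembled in client code, a small helper can keep the vertex order and ring closure consistent. This is a minimal Rust sketch; `bbox_to_wkt` is a hypothetical name, not a Hotdata function, and it assumes the inputs are already valid lon/lat values with `min <= max`.

```rust
/// Build a closed WKT polygon ring for a lon/lat bounding box.
/// Vertex order: (min min, max min, max max, min max, min min).
pub fn bbox_to_wkt(min_lon: f64, min_lat: f64, max_lon: f64, max_lat: f64) -> String {
    // Positional arguments: {0}=min_lon, {1}=min_lat, {2}=max_lon, {3}=max_lat.
    format!(
        "POLYGON(({0} {1}, {2} {1}, {2} {3}, {0} {3}, {0} {1}))",
        min_lon, min_lat, max_lon, max_lat
    )
}
```

The resulting string can be passed as the argument to `ST_GeomFromText` in the filter above.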

**Faster alternative** using the precomputed bbox struct (no WKB parsing):

```sql
WHERE wkb_geometry_bbox['xmin'] >= <minLon>
  AND wkb_geometry_bbox['xmax'] <= <maxLon>
  AND wkb_geometry_bbox['ymin'] >= <minLat>
  AND wkb_geometry_bbox['ymax'] <= <maxLat>
```

Use the bbox approach for large tables where WKB parsing is expensive; use `ST_Within` when you need centroid-in-polygon precision.
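
Note that the bbox predicate keeps only features whose whole bounding box lies inside the query window, so features straddling the window edge are dropped even if their centroid is inside. The Rust sketch below mirrors that predicate and contrasts it with a looser intersection test. The `Bbox` type and its methods are illustrative assumptions, with `f32` fields chosen to match the Float32 struct fields.

```rust
/// Axis-aligned bounding box in lon/lat degrees (illustrative type, not a Hotdata API).
#[derive(Debug, Clone, Copy)]
pub struct Bbox {
    pub xmin: f32,
    pub ymin: f32,
    pub xmax: f32,
    pub ymax: f32,
}

impl Bbox {
    /// Same logic as the SQL filter: true only if `self` lies entirely inside `window`.
    pub fn within(&self, window: &Bbox) -> bool {
        self.xmin >= window.xmin
            && self.xmax <= window.xmax
            && self.ymin >= window.ymin
            && self.ymax <= window.ymax
    }

    /// Looser test: true if the two boxes share any area or edge.
    pub fn intersects(&self, window: &Bbox) -> bool {
        self.xmin <= window.xmax
            && self.xmax >= window.xmin
            && self.ymin <= window.ymax
            && self.ymax >= window.ymin
    }
}
```

If straddling features should be kept, the SQL comparisons can be flipped in the same way as `intersects` (for example `xmin <= <maxLon> AND xmax >= <minLon>`).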

### Point-in-polygon test

```sql
SELECT *
FROM <table>
WHERE ST_Contains(
  ST_GeomFromWKB(wkb_geometry),
  ST_MakePoint(<lon>, <lat>)
)
```

### Nearest neighbors (closest N features to a point)

```sql
SELECT
  <id_col>,
  ST_Distance(
    ST_Centroid(ST_GeomFromWKB(wkb_geometry)),
    ST_MakePoint(<lon>, <lat>)
  ) * 111000 AS dist_meters
FROM <table>
WHERE wkb_geometry IS NOT NULL
ORDER BY dist_meters
LIMIT 10
```

### Distance between two known points

```sql
SELECT
  ST_Distance(ST_MakePoint(<lon1>, <lat1>), ST_MakePoint(<lon2>, <lat2>)) * 111000 AS dist_meters,
  ST_Distance(ST_MakePoint(<lon1>, <lat1>), ST_MakePoint(<lon2>, <lat2>)) * 69.0 AS dist_miles
```

### Area of polygon features

```sql
SELECT
  <id_col>,
  ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 AS area_sqm,
  ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 * 10.7639 AS area_sqft,
  ST_Area(ST_GeomFromWKB(wkb_geometry)) * 111000 * 111000 / 4047 AS area_acres
FROM <table>
WHERE wkb_geometry IS NOT NULL
```

### Centroid coordinates

```sql
SELECT
  <id_col>,
  ST_X(ST_Centroid(ST_GeomFromWKB(wkb_geometry))) AS lon,
  ST_Y(ST_Centroid(ST_GeomFromWKB(wkb_geometry))) AS lat
FROM <table>
WHERE wkb_geometry IS NOT NULL
```

### Convert to WKT for export or inspection

```sql
SELECT <id_col>, ST_AsText(ST_GeomFromWKB(wkb_geometry)) AS wkt
FROM <table>
WHERE wkb_geometry IS NOT NULL
LIMIT 10
```

### Simplify geometry for faster rendering

```sql
SELECT <id_col>, ST_AsText(ST_Simplify(ST_GeomFromWKB(wkb_geometry), 0.0001)) AS simplified_wkt
FROM <table>
WHERE wkb_geometry IS NOT NULL
```

Tolerance is in degrees (0.0001° is roughly 11 m north-south). Increase for coarser simplification, decrease for finer.

---

## Unit Conversion Reference

| To get | Multiply degrees (degrees² for area) by |
|---|---|
| Meters (distance) | × 111,000 |
| Kilometers (distance) | × 111 |
| Miles (distance) | × 69.0 |
| Feet (distance) | × 364,173 |
| m² (area) | × 111,000² = × 12,321,000,000 |
| ft² (area) | × 111,000² × 10.7639 |
| Acres (area) | × 111,000² ÷ 4,047 |

> These conversions are approximations based on the length of one degree of latitude (~111 km). A degree of longitude spans about 111 km × cos(latitude), so east-west distances and areas are overestimated away from the equator, and accuracy decreases significantly above 60°N or below 60°S.

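For client-side post-processing, the table's factors can be collected into a few helpers. This is a minimal Rust sketch under the same flat-degree assumption as the table; the constant and function names are illustrative, and `FEET_PER_METER` (3.28084) is the standard factor implied by the 364,173 ft/degree entry.

```rust
/// Approximate meters per degree, as used throughout the table.
pub const METERS_PER_DEGREE: f64 = 111_000.0;
/// International feet per meter.
pub const FEET_PER_METER: f64 = 3.28084;
/// Square feet per square meter.
pub const SQFT_PER_SQM: f64 = 10.7639;
/// Square meters per acre (rounded, as in the table).
pub const SQM_PER_ACRE: f64 = 4_047.0;

/// Distance in degrees to approximate meters.
pub fn degrees_to_meters(deg: f64) -> f64 {
    deg * METERS_PER_DEGREE
}

/// Distance in degrees to approximate feet.
pub fn degrees_to_feet(deg: f64) -> f64 {
    degrees_to_meters(deg) * FEET_PER_METER
}

/// Area in square degrees to approximate square meters.
pub fn sq_degrees_to_sqm(sq_deg: f64) -> f64 {
    sq_deg * METERS_PER_DEGREE * METERS_PER_DEGREE
}

/// Area in square degrees to approximate square feet.
pub fn sq_degrees_to_sqft(sq_deg: f64) -> f64 {
    sq_degrees_to_sqm(sq_deg) * SQFT_PER_SQM
}

/// Area in square degrees to approximate acres.
pub fn sq_degrees_to_acres(sq_deg: f64) -> f64 {
    sq_degrees_to_sqm(sq_deg) / SQM_PER_ACRE
}
```

Keeping the factors in one place makes it easier to swap in a latitude-aware correction later without touching every query result handler.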
---

## Workflow: Exploring a New Geospatial Dataset

1. **Check for geometry columns:**
   ```
   hotdata tables list --connection-id <id>
   ```
   Look for `Binary` (WKB) or `Struct` (bbox) typed columns.

2. **Verify geometry types:**
   ```sql
   SELECT ST_GeometryType(ST_GeomFromWKB(wkb_geometry)) AS type, COUNT(*)
   FROM <table> WHERE wkb_geometry IS NOT NULL GROUP BY 1
   ```

3. **Check coverage (bounding box of entire dataset):**
   ```sql
   SELECT
     MIN(wkb_geometry_bbox['xmin']) AS min_lon,
     MIN(wkb_geometry_bbox['ymin']) AS min_lat,
     MAX(wkb_geometry_bbox['xmax']) AS max_lon,
     MAX(wkb_geometry_bbox['ymax']) AS max_lat
   FROM <table>
   WHERE wkb_geometry_bbox IS NOT NULL
   ```

4. **Sample WKT to understand geometry structure:**
   ```sql
   SELECT ST_AsText(ST_GeomFromWKB(wkb_geometry)) FROM <table>
   WHERE wkb_geometry IS NOT NULL LIMIT 3
   ```

src/main.rs

Lines changed: 6 additions & 0 deletions
@@ -92,6 +92,12 @@ fn main() {
         util::set_debug(true);
     }

+    let skip_skill_auto_update =
+        cli.command.is_none() || matches!(&cli.command, Some(Commands::Skills { .. }));
+    if !skip_skill_auto_update {
+        skill::maybe_auto_update_after_cli_upgrade();
+    }
+
     match cli.command {
         None => {
             use clap::CommandFactory;
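
The gating added in this hunk skips the skill auto-update when no subcommand is given or when the subcommand is `Skills`. The standalone Rust sketch below isolates that condition so it can be exercised on its own. Only `Commands::Skills` appears in the diff; the `Query` variant, the field names, and the `should_auto_update` helper are hypothetical stand-ins for illustration.

```rust
/// Simplified stand-in for the CLI's command enum.
/// Only `Skills` is taken from the diff; `Query` and all fields are hypothetical.
#[allow(dead_code)]
enum Commands {
    Skills { action: String },
    Query { sql: String },
}

/// Returns true when the auto-update hook would run:
/// a subcommand is present and it is not a `Skills` command.
fn should_auto_update(command: &Option<Commands>) -> bool {
    let skip = command.is_none() || matches!(command, Some(Commands::Skills { .. }));
    !skip
}
```

For example, `should_auto_update(&None)` is `false`, while a non-skills command such as the hypothetical `Commands::Query` yields `true`.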
