Commit 4a45f5d

feat: rename datasets command to views
Renames the `hotdata datasets` CLI command to `hotdata views` with a new `src/views.rs` module. The command and all user-facing terminology (help text, output messages, SQL prefix `views.`, skill docs) now use "view" / "views". Server-side API paths remain unchanged (`/datasets`).

- Add `src/views.rs` (renamed from deleted `datasets.rs`)
- Add `Views` / `ViewsCommands` to `command.rs`
- Wire dispatch in `main.rs`
- Update README, SKILL.md, WORKFLOWS.md, DATA_MODEL.template.md, MODEL_BUILD.md across hotdata and hotdata-analytics skills
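The split described in the commit message (user-facing "views", server-side `/datasets`) can be illustrated with a minimal Rust sketch. It is not the code from `src/views.rs` or `command.rs`, which this page does not show: the variant set, `api_path`, and the item path shape `/datasets/<id>` are assumptions, and any version prefix on the path is omitted.

```rust
/// Hypothetical subset of the subcommands; the real `ViewsCommands`
/// in `command.rs` is not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
pub enum ViewsCommands {
    List,
    Get { view_id: String },
    Create,
    Update { view_id: String },
}

/// User-facing SQL prefix after the rename.
pub const SQL_PREFIX: &str = "views";

/// Server-side collection path, which the commit message says is unchanged.
const API_COLLECTION: &str = "/datasets";

/// Maps a CLI subcommand to the server path it would call.
/// The per-item path shape is an assumption made for illustration.
pub fn api_path(cmd: &ViewsCommands) -> String {
    match cmd {
        ViewsCommands::List | ViewsCommands::Create => API_COLLECTION.to_string(),
        ViewsCommands::Get { view_id } | ViewsCommands::Update { view_id } => {
            format!("{}/{}", API_COLLECTION, view_id)
        }
    }
}
```

Keeping the path constant separate from the SQL prefix means the user-facing name can change without touching the server contract.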
1 parent ebda8ff commit 4a45f5d

10 files changed

Lines changed: 651 additions & 121 deletions

README.md

Lines changed: 11 additions & 16 deletions
@@ -67,7 +67,7 @@ API key priority (lowest to highest): config file → `HOTDATA_API_KEY` env var
6767
| `connections` | `list`, `create`, `refresh`, `new` | Manage connections |
6868
| `databases` | `list`, `create`, `delete`, `tables` | Managed databases (create and load tables via parquet) |
6969
| `tables` | `list` | List tables and columns |
70-
| `datasets` | `list`, `create`, `update` | Manage uploaded datasets |
70+
| `views` | `list`, `create`, `update`, `refresh` | Manage SQL-derived views |
7171
| `context` | `list`, `show`, `pull`, `push` | Workspace Markdown context (e.g. data model `DATAMODEL`) via the context API |
7272
| `query` | | Execute a SQL query |
7373
| `queries` | `list` | Inspect query run history |
@@ -146,7 +146,7 @@ hotdata databases tables delete <database> <table> [--schema public]
146146

147147
- `create` registers a managed connection (`source_type: managed`) with no external credentials. Use `--table` to declare tables up front (required before `tables load` on the current API).
148148
- `tables load` uploads a **parquet** file (or uses a staged `upload_id` from `POST /v1/files`) and publishes it as the table generation (`replace` mode).
149-
- For CSV/JSON uploads without a managed database, use `hotdata datasets create` instead (`datasets.main.*`).
149+
- For SQL-query materializations without a managed database, use `hotdata views create` instead (`views.main.*`).
150150

151151
Example:
152152

@@ -167,24 +167,19 @@ hotdata tables list [--workspace-id <id>] [--connection-id <id>] [--schema <patt
167167
- `--schema` and `--table` support SQL `%` wildcard patterns.
168168
- Tables are displayed as `<connection>.<schema>.<table>` — use this format in SQL queries.
169169

170-
## Datasets
170+
## Views
171171

172172
```sh
173-
hotdata datasets list [--workspace-id <id>] [--limit <n>] [--offset <n>] [--format table|json|yaml]
174-
hotdata datasets <dataset_id> [--workspace-id <id>] [--format table|json|yaml]
175-
hotdata datasets create --file data.csv [--label "My Dataset"] [--table-name my_dataset]
176-
hotdata datasets create --sql "SELECT ..." --label "My Dataset"
177-
hotdata datasets create --url "https://example.com/data.parquet" --label "My Dataset"
178-
hotdata datasets update <dataset_id> [--label "New Label"] [--table-name new_table]
179-
hotdata datasets refresh <dataset_id> [--workspace-id <id>] [--async]
173+
hotdata views list [--workspace-id <id>] [--limit <n>] [--offset <n>] [--output table|json|yaml]
174+
hotdata views <view_id> [--workspace-id <id>] [--output table|json|yaml]
175+
hotdata views create --name my_view [--description "My View"] (--sql "SELECT ..." | --query-id <id>)
176+
hotdata views update <view_id> [--description "New Label"] [--name new_table]
177+
hotdata views refresh <view_id> [--workspace-id <id>] [--async]
180178
```
181179

182-
- Datasets are queryable as `datasets.main.<table_name>`.
183-
- `--file`, `--sql`, `--query-id`, and `--url` are mutually exclusive.
184-
- `--url` imports data directly from a URL (supports csv, json, parquet).
185-
- Format is auto-detected from file extension or content.
186-
- Piped stdin is supported: `cat data.csv | hotdata datasets create --label "My Dataset"`
187-
- `refresh` re-runs the dataset's source (URL fetch or saved query) and creates a new version. Not supported for upload-source datasets.
180+
- Views are queryable as `views.main.<name>`.
181+
- `--sql` and `--query-id` are mutually exclusive; exactly one is required for `create`.
182+
- `refresh` re-runs the view's source query and creates a new version.
188183
- `--async` submits the refresh as a background job and returns a job ID; poll with `hotdata jobs <job_id>`.
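The `--sql` / `--query-id` rule in the list above (mutually exclusive, exactly one required for `create`) could be checked roughly as follows. This is a sketch under assumptions: `ViewSource` and `resolve_source` are hypothetical names, and the actual CLI may enforce the rule through its argument parser instead.

```rust
/// Hypothetical representation of where a view's rows come from.
#[derive(Debug, PartialEq)]
pub enum ViewSource {
    Sql(String),
    QueryId(String),
}

/// Accepts exactly one of the two optional flag values and rejects
/// both the "neither" and the "both" case.
pub fn resolve_source(
    sql: Option<String>,
    query_id: Option<String>,
) -> Result<ViewSource, String> {
    match (sql, query_id) {
        (Some(s), None) => Ok(ViewSource::Sql(s)),
        (None, Some(q)) => Ok(ViewSource::QueryId(q)),
        (Some(_), Some(_)) => Err("--sql and --query-id are mutually exclusive".to_string()),
        (None, None) => Err("one of --sql or --query-id is required".to_string()),
    }
}
```

Matching on the tuple makes all four flag combinations explicit, so the compiler flags any case that is left unhandled.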
189184

190185
## Workspace context

skills/hotdata-analytics/SKILL.md

Lines changed: 8 additions & 8 deletions
@@ -1,14 +1,14 @@
11
---
22
name: hotdata-analytics
3-
description: Use this skill when the user wants OLAP-style SQL analytics in Hotdata — aggregations, GROUP BY, JOINs, reporting, exploratory queries, query run history, stored results, or materialized follow-up tables (Chain via datasets or managed databases). Activate for "analyze", "aggregate", "rollup", "pivot", "report", "metrics", "GROUP BY", "query history", "past queries", "query runs", "stored results", "materialize", "chain", "intermediate table", or sorted indexes for filters/range scans. Do not load for BM25/vector search or geospatial SQL — use hotdata-search or hotdata-geospatial. Requires the core hotdata skill for connections, tables, datasets, and auth.
3+
description: Use this skill when the user wants OLAP-style SQL analytics in Hotdata — aggregations, GROUP BY, JOINs, reporting, exploratory queries, query run history, stored results, or materialized follow-up tables (Chain via views or managed databases). Activate for "analyze", "aggregate", "rollup", "pivot", "report", "metrics", "GROUP BY", "query history", "past queries", "query runs", "stored results", "materialize", "chain", "intermediate table", or sorted indexes for filters/range scans. Do not load for BM25/vector search or geospatial SQL — use hotdata-search or hotdata-geospatial. Requires the core hotdata skill for connections, tables, views, and auth.
44
version: 0.3.2
55
---
66

77
# Hotdata Analytics Skill
88

99
**OLAP-style analytics** in Hotdata: PostgreSQL-dialect SQL, query execution, run history, stored results, **Chain** materializations, and **sorted** indexes for filters and joins.
1010

11-
**Prerequisites:** Authenticate, workspace, and catalog discovery via the **`hotdata`** skill (`connections`, `tables`, `datasets`, `databases`).
11+
**Prerequisites:** Authenticate, workspace, and catalog discovery via the **`hotdata`** skill (`connections`, `tables`, `views`, `databases`).
1212

1313
**Related skills:** **`hotdata-search`** (BM25, vector, retrieval indexes), **`hotdata-geospatial`** (spatial SQL).
1414

@@ -23,7 +23,7 @@ hotdata query status <query_run_id> [--output table|json|csv]
2323

2424
- **PostgreSQL dialect.** Quote mixed-case identifiers: `"CustomerName"`.
2525
- Use **`hotdata tables list`** for schema discovery — not `information_schema` via `query`.
26-
- Fully qualified names: `<connection>.<schema>.<table>`, `datasets.<schema>.<table>`, `<database>.<schema>.<table>`.
26+
- Fully qualified names: `<connection>.<schema>.<table>`, `views.<schema>.<table>`, `<database>.<schema>.<table>`.
2727
- Long-running queries may return `query_run_id` → poll with **`query status`** (exit `2` = still running). Do not re-run identical heavy SQL while polling.
2828
- For **workspace-wide** joins and naming, load **context:DATAMODEL** when listed (`hotdata context list` → `show DATAMODEL`) — see **`hotdata`** skill.
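The polling guidance above (exit code `2` means the run is still in progress, and the query should not be re-submitted) can be sketched as a small loop. The function name, the attempt limit, and the closure-based design are assumptions; `check` stands in for running `hotdata query status <query_run_id>` and returning its exit code.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Exit code documented above for a query run that has not finished yet.
const STILL_RUNNING: i32 = 2;

/// Calls `check` until it returns something other than `STILL_RUNNING`
/// or `max_attempts` is exhausted. Returns the final exit code, or
/// `None` if the run was still in progress after the last attempt.
pub fn poll_until_done<F>(mut check: F, max_attempts: u32, delay: Duration) -> Option<i32>
where
    F: FnMut() -> i32,
{
    for attempt in 0..max_attempts {
        let code = check();
        if code != STILL_RUNNING {
            return Some(code);
        }
        // Wait between attempts, but not after the final one.
        if attempt + 1 < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

In a real caller the closure might wrap `std::process::Command::new("hotdata")` with the `query status` arguments and read the exit code from the returned status. Passing the check in as a closure keeps the loop testable without the CLI installed.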
2929

@@ -82,8 +82,8 @@ hotdata results <result_id> [--workspace-id <workspace_id>] [--output table|json
8282
2. **Materialize** (pick one)
8383

8484
```bash
85-
hotdata datasets create --name chain_slice [--description "chain slice"] --sql "SELECT ..."
86-
hotdata datasets create --name chain_from_saved [--description "from saved"] --query-id <query_id>
85+
hotdata views create --name chain_slice --description "chain slice" --sql "SELECT ..."
86+
hotdata views create --name chain_from_saved --description "from saved" --query-id <query_id>
8787
```
8888

8989
Or managed parquet:
@@ -94,10 +94,10 @@ hotdata results <result_id> [--workspace-id <workspace_id>] [--output table|json
9494
hotdata databases tables load slice --file ./slice.parquet
9595
```
9696

97-
3. **Chain query** — use printed **`full_name`** or `datasets list` **FULL NAME** column:
97+
3. **Chain query** — use printed **`full_name`** or `views list` **FULL NAME** column:
9898

9999
```bash
100-
hotdata query "SELECT * FROM datasets.main.chain_slice WHERE ..."
100+
hotdata query "SELECT * FROM views.main.chain_slice WHERE ..."
101101
hotdata query "SELECT * FROM analytics.public.slice WHERE ..."
102102
```
103103

@@ -122,4 +122,4 @@ List and delete use the same `hotdata indexes` commands as in the search skill;
122122

123123
## Sandboxes and chains
124124

125-
Sandbox datasets use **`datasets.<sandbox_id>.<table>`**, not `datasets.main`. Run queries with active sandbox config or `hotdata sandbox <id> run hotdata query "..."`. See **`hotdata`** skill **Sandboxes**.
125+
Sandbox views use **`views.<sandbox_id>.<table>`**, not `views.main`. Run queries with active sandbox config or `hotdata sandbox <id> run hotdata query "..."`. See **`hotdata`** skill **Sandboxes**.
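The naming rule for sandbox views can be written as a one-line helper. This is only an illustration with a hypothetical function name; the `full_name` printed by the CLI should be preferred over a name built by hand.

```rust
/// Builds the SQL name for a view: `views.main.<name>` by default,
/// `views.<sandbox_id>.<name>` when a sandbox id is given.
pub fn view_full_name(name: &str, sandbox_id: Option<&str>) -> String {
    let schema = sandbox_id.unwrap_or("main");
    format!("views.{}.{}", schema, name)
}
```

For example, `view_full_name("chain_slice", None)` yields `views.main.chain_slice`, matching the chain query shown earlier in this file.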

skills/hotdata-analytics/references/WORKFLOWS.md

Lines changed: 12 additions & 12 deletions
@@ -2,7 +2,7 @@
22

33
OLAP-style SQL, **History** (query runs and stored results), and **Chain** (materialized follow-ups). Requires **`hotdata`** for auth, workspaces, and catalog commands.
44

5-
**Related:** **`hotdata-search`** for BM25/vector indexes and `hotdata search`; **`hotdata`** [WORKFLOWS.md](../../hotdata/references/WORKFLOWS.md) for datasets vs managed databases.
5+
**Related:** **`hotdata-search`** for BM25/vector indexes and `hotdata search`; **`hotdata`** [WORKFLOWS.md](../../hotdata/references/WORKFLOWS.md) for views vs managed databases.
66

77
---
88

@@ -66,11 +66,11 @@ hotdata query "SELECT ..."
6666

6767
Land a smaller table — pick one:
6868

69-
**Datasets** (CSV/JSON/URL/SQL snapshot → `datasets.<schema>.<table>`):
69+
**Views** (SQL snapshot → `views.<schema>.<table>`):
7070

7171
```bash
72-
hotdata datasets create --label "chain revenue slice" --sql "SELECT ..." [--table-name chain_revenue_slice]
73-
hotdata datasets create --label "from saved" --query-id <query_id> [--table-name ...]
72+
hotdata views create --name chain_revenue_slice --description "chain revenue slice" --sql "SELECT ..."
73+
hotdata views create --name chain_from_saved --description "from saved" --query-id <query_id>
7474
```
7575

7676
**Managed database** (parquet → `<database>.<schema>.<table>`):
@@ -80,17 +80,17 @@ hotdata databases create --name chain_db --table revenue_slice
8080
hotdata databases tables load chain_db revenue_slice --file ./revenue_slice.parquet
8181
```
8282

83-
Note the printed **`full_name`** (e.g. `datasets.main.chain_revenue_slice` or `chain_db.public.revenue_slice`). For datasets, **`FULL NAME`** from `datasets list` is authoritative.
83+
Note the printed **`full_name`** (e.g. `views.main.chain_revenue_slice` or `chain_db.public.revenue_slice`). For views, **`FULL NAME`** from `views list` is authoritative.
8484

8585
### 3. Chain query
8686

87-
Query using that name — do not hardcode `datasets.main` if the schema segment is a sandbox id:
87+
Query using that name — do not hardcode `views.main` if the schema segment is a sandbox id:
8888

8989
```bash
90-
hotdata datasets list
91-
hotdata query "SELECT * FROM datasets.main.chain_revenue_slice WHERE ..."
90+
hotdata views list
91+
hotdata query "SELECT * FROM views.main.chain_revenue_slice WHERE ..."
9292
# Sandbox example (use actual full_name from create or list):
93-
# hotdata query "SELECT * FROM datasets.s_ufmblmvq.chain_revenue_slice WHERE ..."
93+
# hotdata query "SELECT * FROM views.s_ufmblmvq.chain_revenue_slice WHERE ..."
9494
# Managed database:
9595
# hotdata query "SELECT * FROM chain_db.public.revenue_slice WHERE ..."
9696
```
@@ -99,18 +99,18 @@ hotdata query "SELECT * FROM datasets.main.chain_revenue_slice WHERE ..."
9999

100100
For **sandbox-scoped** chain tables:
101101

102-
- Qualified name is **`datasets.<sandbox_id>.<table>`**, not `datasets.main`.
102+
- Qualified name is **`views.<sandbox_id>.<table>`**, not `views.main`.
103103
- Run queries with **active sandbox** in config (`hotdata sandbox set`) **or** inside **`hotdata sandbox <sandbox_id> run hotdata query "…"`**.
104104
- Without sandbox context, you may get **access denied** on sandbox-only tables.
105105

106106
### Naming and documentation
107107

108108
- Prefer predictable `--table-name` values: `chain_<topic>_<YYYYMMDD>`.
109-
- Record long-lived chains in **context:DATAMODEL → Derived tables (Chain)** with the **full** SQL name you use (`datasets.…` or `database.schema.table`).
109+
- Record long-lived chains in **context:DATAMODEL → Derived tables (Chain)** with the **full** SQL name you use (`views.…` or `database.schema.table`).
110110
- Promote join/grain findings to **context:DATAMODEL** when they should outlive the sandbox (**`hotdata`** skill).
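The `chain_<topic>_<YYYYMMDD>` pattern from the list above can be produced with a small formatter. The function name is hypothetical, and the date is passed as plain numbers to keep the sketch free of external crates; it does not validate that the date is real.

```rust
/// Formats a chain table name as `chain_<topic>_<YYYYMMDD>`,
/// zero-padding the month and day to two digits.
pub fn chain_table_name(topic: &str, year: u16, month: u8, day: u8) -> String {
    format!("chain_{}_{:04}{:02}{:02}", topic, year, month, day)
}
```

The result could be passed to `hotdata views create --name`, as in the `create` examples in this file.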
111111

112112
### Guardrails
113113

114114
- Materialize when the base scan is large and the follow-up runs many times.
115115
- Keep Chain tables focused; avoid wide `SELECT *` materializations when a narrow projection suffices.
116-
- For upload format choice (datasets vs databases), see **`hotdata`** WORKFLOWS — [Datasets vs managed databases](../../hotdata/references/WORKFLOWS.md#datasets-vs-managed-databases).
116+
- For source format choice (views vs databases), see **`hotdata`** WORKFLOWS — [Views vs managed databases](../../hotdata/references/WORKFLOWS.md#views-vs-managed-databases).
