Commit d43343d

Merge pull request #82 from hotdata-dev/feat/databases-managed-cli
Add managed databases CLI
2 parents 14aeaa6 + e129fcb commit d43343d

6 files changed

Lines changed: 1194 additions & 6 deletions

README.md

Lines changed: 29 additions & 0 deletions
@@ -65,6 +65,7 @@ API key priority (lowest to highest): config file → `HOTDATA_API_KEY` env var
 | `auth` | `login`, `status`, `logout` | `login` or bare `auth` opens browser login; `status` / `logout` manage the saved profile |
 | `workspaces` | `list`, `set` | Manage workspaces |
 | `connections` | `list`, `create`, `refresh`, `new` | Manage connections |
+| `databases` | `list`, `create`, `delete`, `tables` | Managed databases (create and load tables via parquet) |
 | `tables` | `list` | List tables and columns |
 | `datasets` | `list`, `create`, `update` | Manage uploaded datasets |
 | `context` | `list`, `show`, `pull`, `push` | Workspace Markdown context (e.g. data model `DATAMODEL`) via the context API |
@@ -127,6 +128,34 @@ hotdata connections create list <type_name> --format json
 hotdata connections create --name "my-conn" --type postgres --config '{"host":"...","port":5432,...}'
 ```

+## Databases
+
+Managed databases are Hotdata-owned catalogs you create and populate yourself (no remote source to sync). Query them with SQL as `database_name.schema.table` — the database name is the connection name.
+
+```sh
+hotdata databases list [-w <id>] [-o table|json|yaml]
+hotdata databases create --name <name> [--table <table> ...] [--schema public] [-o table|json|yaml]
+hotdata databases <name_or_id> [-o table|json|yaml]
+hotdata databases delete <name_or_id>
+
+hotdata databases tables list <database> [--schema <name>] [-o table|json|yaml]
+hotdata databases tables load <database> <table> --file ./data.parquet [--schema public]
+hotdata databases tables load <database> <table> --upload-id <id> [--schema public]
+hotdata databases tables delete <database> <table> [--schema public]
+```
+
+- `create` registers a managed connection (`source_type: managed`) with no external credentials. Use `--table` to declare tables up front (required before `tables load` on the current API).
+- `tables load` uploads a **parquet** file (or uses a staged `upload_id` from `POST /v1/files`) and publishes it as the table's new generation (`replace` mode).
+- For CSV/JSON uploads without a managed database, use `hotdata datasets create` instead (`datasets.main.*`).
+
+Example:
+
+```sh
+hotdata databases create --name sales --table orders
+hotdata databases tables load sales orders --file ./orders.parquet
+hotdata query "SELECT count(*) FROM sales.public.orders"
+```
+
 ## Tables

 ```sh
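
The naming rule in the new README section (`database_name.schema.table`, with `public` as the default schema) can be sketched in Rust as follows. `qualified_table` and `DEFAULT_SCHEMA` are hypothetical names used only for illustration; they are not part of the hotdata CLI, and the sketch assumes identifiers that need no SQL quoting.

```rust
/// Schema the CLI falls back to when `--schema` is omitted, per the README text.
const DEFAULT_SCHEMA: &str = "public";

/// Hypothetical helper: build the SQL name `<database>.<schema>.<table>`
/// for a table in a managed database. No identifier quoting is applied.
fn qualified_table(database: &str, schema: Option<&str>, table: &str) -> String {
    format!("{}.{}.{}", database, schema.unwrap_or(DEFAULT_SCHEMA), table)
}

fn main() {
    // Mirrors the README example: database `sales`, table `orders`.
    let name = qualified_table("sales", None, "orders");
    println!("SELECT count(*) FROM {}", name);
}
```

Passing `Some("analytics")` as the schema would yield `sales.analytics.orders`, which corresponds to loading with `--schema analytics`.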

skills/hotdata/SKILL.md

Lines changed: 58 additions & 3 deletions
@@ -1,6 +1,6 @@
 ---
 name: hotdata
-description: Use this skill when the user wants to run hotdata CLI commands, query the Hotdata API, list workspaces, list connections, create connections, list tables, manage datasets, execute SQL queries, inspect query run history, search tables, manage indexes, manage sandboxes, manage workspace context and stored docs such as context:DATAMODEL via the context API (`hotdata context`), install or update the bundled agent skills (`hotdata skills`), generate shell completions (`hotdata completions`), or interact with the hotdata service. Activate when the user says "run hotdata", "query hotdata", "list workspaces", "list connections", "create a connection", "list tables", "list datasets", "create a dataset", "upload a dataset", "execute a query", "search a table", "list indexes", "create an index", "list query runs", "list past queries", "query history", "list sandboxes", "create a sandbox", "run a sandbox", "workspace context", "pull context", "push context", "data model", "context:DATAMODEL", or asks you to use the hotdata CLI.
+description: Use this skill when the user wants to run hotdata CLI commands, query the Hotdata API, list workspaces, list connections, create connections, list or create managed databases, load parquet into database tables, list tables, manage datasets, execute SQL queries, inspect query run history, search tables, manage indexes, manage sandboxes, manage workspace context and stored docs such as context:DATAMODEL via the context API (`hotdata context`), install or update the bundled agent skills (`hotdata skills`), generate shell completions (`hotdata completions`), or interact with the hotdata service. Activate when the user says "run hotdata", "query hotdata", "list workspaces", "list connections", "create a connection", "list databases", "create a database", "managed database", "load parquet", "list tables", "list datasets", "create a dataset", "upload a dataset", "execute a query", "search a table", "list indexes", "create an index", "list query runs", "list past queries", "query history", "list sandboxes", "create a sandbox", "run a sandbox", "workspace context", "pull context", "push context", "data model", "context:DATAMODEL", or asks you to use the hotdata CLI.
 version: 0.2.2
 ---

@@ -73,14 +73,14 @@ These are **patterns** built from the commands below—not separate CLI subcomma

 - **Model (`context:DATAMODEL`)** — The **shared** Markdown semantic map of the workspace (entities, keys, joins across connections). **Store and read it only via workspace context** (`hotdata context list`, then `hotdata context show DATAMODEL` **only when listed**, `context push DATAMODEL`); refresh using `connections`, `connections refresh`, `tables list`, and `datasets list`. For a **deep** pass (connector enrichment, indexes, per-table detail), see [references/MODEL_BUILD.md](references/MODEL_BUILD.md). Contrast **analysis modeling** in sandboxes or chat (see [Analysis modeling vs context:DATAMODEL](#analysis-modeling-vs-contextdatamodel)).
 - **History** — Inspect prior activity via `hotdata queries list` (query runs) and `hotdata results list` / `results <id>` (row data).
-- **Chain** — Follow-ups via **`datasets create`** then `query` against `datasets.<schema>.<table>`.
+- **Chain** — Follow-ups via **`datasets create`** then `query` against `datasets.<schema>.<table>`, or via **`databases create`** + **`databases tables load`** (parquet) then `query` against `<database>.<schema>.<table>`.
 - **Indexes** — Review SQL and schema, compare to existing indexes, create **sorted**, **bm25**, or **vector** indexes when it clearly helps; see [references/WORKFLOWS.md](references/WORKFLOWS.md#indexes).

 Full step-by-step procedures: [references/WORKFLOWS.md](references/WORKFLOWS.md).

 ## Available Commands

-Top-level subcommands (each detailed below): **`auth`**, **`datasets`**, **`query`**, **`workspaces`**, **`connections`**, **`tables`**, **`skills`**, **`results`**, **`jobs`**, **`indexes`**, **`embedding-providers`**, **`search`**, **`queries`**, **`sandbox`**, **`context`**, **`completions`**.
+Top-level subcommands (each detailed below): **`auth`**, **`datasets`**, **`query`**, **`workspaces`**, **`connections`**, **`databases`**, **`tables`**, **`skills`**, **`results`**, **`jobs`**, **`indexes`**, **`embedding-providers`**, **`search`**, **`queries`**, **`sandbox`**, **`context`**, **`completions`**.

 Global CLI options: **`--api-key`**, **`-v` / `--version`**, **`-h` / `--help`**. Hidden developer flag: **`--debug`** (verbose HTTP logs).

@@ -167,6 +167,43 @@ hotdata connections create \
 - Fields with `"type": "array"` must be JSON arrays (e.g. `"spreadsheet_ids": ["abc", "def"]`).
 - Nested `oneOf` fields must be a JSON object including a `"type"` discriminator field matching the chosen variant's `const` value.

+### Managed databases (`databases`)
+
+**Managed databases** are Hotdata-owned catalogs (`source_type: managed`) you create and populate yourself—no remote source to sync. Query them in SQL as **`<database_name>.<schema>.<table>`** (the database name is the connection name). Prefer **`hotdata databases`** over **`hotdata connections create --type managed`** for this workflow.
+
+**Parquet vs datasets:** `databases tables load` accepts **parquet only**. For CSV/JSON uploads without a managed database, use **`hotdata datasets create`**.
+
+**Declare tables at create time:** On the current API, each table must be declared with **`--table`** when creating the database before **`tables load`** will succeed. If a load fails with a *not declared* error, recreate the database with **`--table`** for that table, or declare it afterwards once the API supports that.
+
+```
+hotdata databases list [--workspace-id <workspace_id>] [--output table|json|yaml]
+hotdata databases create --name <name> [--table <table> ...] [--schema public] [--workspace-id <workspace_id>] [--output table|json|yaml]
+hotdata databases <name_or_id> [--workspace-id <workspace_id>] [--output table|json|yaml]
+hotdata databases delete <name_or_id> [--workspace-id <workspace_id>]
+
+hotdata databases tables list <database> [--schema <name>] [--workspace-id <workspace_id>] [--output table|json|yaml]
+hotdata databases tables load <database> <table> --file ./data.parquet [--schema public] [--workspace-id <workspace_id>]
+hotdata databases tables load <database> <table> --upload-id <id> [--schema public] [--workspace-id <workspace_id>]
+hotdata databases tables delete <database> <table> [--schema public] [--workspace-id <workspace_id>]
+```
+
+- `list` — managed databases only (filters `source_type: managed` from connections).
+- `create` — registers a managed connection with optional `config.schemas[].tables[]` from repeated **`--table`**. Default schema is **`public`**.
+- `<name_or_id>` — inspect one database (name, id, table counts, SQL prefix hint).
+- `delete` — removes the managed database and its tables.
+- `tables list` — tables with `TABLE` (`<database>.<schema>.<table>`), `SYNCED`, `LAST_SYNC` (via `information_schema`).
+- `tables load` — uploads a local **parquet** file (or uses **`--upload-id`** from a prior `POST /v1/files` staging) and publishes with **`replace`** mode. **`--file`** and **`--upload-id`** are mutually exclusive.
+- `tables delete` — drops a table from the managed database.
+- Resolving by **name** or **connection id** works for all subcommands that take `<database>` or `<name_or_id>`. Non-managed connections error with a hint to use **`hotdata connections`**.
+
+Example:
+
+```
+hotdata databases create --name sales --table orders
+hotdata databases tables load sales orders --file ./orders.parquet
+hotdata query "SELECT count(*) FROM sales.public.orders"
+```
+
 ### List Tables and Columns
 ```
 hotdata tables list [--workspace-id <workspace_id>] [--connection-id <connection_id>] [--schema <pattern>] [--table <pattern>] [--limit <int>] [--cursor <cursor>] [--output table|json|yaml]
@@ -499,6 +536,24 @@ Use a sandbox to explore tables and capture **analysis-oriented** notes in sandb
 hotdata query "SELECT \"CustomerName\" FROM datasets.main.my_csv LIMIT 10"
 ```

+## Workflow: Creating a managed database (parquet)
+
+1. Create the database and declare tables up front:
+```
+hotdata databases create --name mydb --table events --table users
+```
+2. Load parquet into each table:
+```
+hotdata databases tables load mydb events --file ./events.parquet
+```
+3. Confirm tables and query:
+```
+hotdata databases tables list mydb
+hotdata query "SELECT * FROM mydb.public.events LIMIT 10"
+```
+
+For CSV/JSON file uploads, use **`hotdata datasets create`** instead.
+
 ## Workflow: Creating a Connection

 1. List available connection types:
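
The managed-database workflow added to SKILL.md (declare every table at create time, load one parquet file per table, then list) extends naturally to several tables. A minimal Rust sketch that only assembles the command lines is shown here. `workflow_commands` is a hypothetical helper, the `./users.parquet` path is an illustrative assumption, and arguments are not shell-quoted, so the sketch assumes names and paths without spaces.

```rust
/// Hypothetical helper: produce the CLI invocations for the workflow.
/// Each entry in `tables` pairs a table name with its local parquet path.
/// Only flags that appear in the documented usage are emitted.
fn workflow_commands(database: &str, tables: &[(&str, &str)]) -> Vec<String> {
    // Step 1: one `create` call that declares every table up front.
    let mut create = format!("hotdata databases create --name {}", database);
    for (table, _) in tables {
        create.push_str(&format!(" --table {}", table));
    }
    let mut commands = vec![create];

    // Step 2: one `tables load` call per declared table.
    for (table, path) in tables {
        commands.push(format!(
            "hotdata databases tables load {} {} --file {}",
            database, table, path
        ));
    }

    // Step 3: confirm what was loaded.
    commands.push(format!("hotdata databases tables list {}", database));
    commands
}

fn main() {
    let tables = [("events", "./events.parquet"), ("users", "./users.parquet")];
    for command in workflow_commands("mydb", &tables) {
        println!("{}", command);
    }
}
```

Generating the `create` call from the same list that drives the loads keeps the declared tables and the loaded tables in step, which matters because loading an undeclared table fails on the current API.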

src/command.rs

Lines changed: 109 additions & 0 deletions
@@ -69,6 +69,23 @@ pub enum Commands {
         command: Option<ConnectionsCommands>,
     },

+    /// Managed databases you create and populate with tables (parquet uploads)
+    Databases {
+        /// Database name or connection ID (omit to use a subcommand)
+        name_or_id: Option<String>,
+
+        /// Workspace ID (defaults to first workspace from login)
+        #[arg(long, short = 'w', global = true)]
+        workspace_id: Option<String>,
+
+        /// Output format
+        #[arg(long = "output", short = 'o', default_value = "table", value_parser = ["table", "json", "yaml"])]
+        output: String,
+
+        #[command(subcommand)]
+        command: Option<DatabasesCommands>,
+    },
+
     /// Manage tables in a workspace
     Tables {
         #[command(subcommand)]
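
Because the `Databases` variant makes both `name_or_id` and `command` optional, the handler has to decide what each combination means. The handler itself is not part of this excerpt, so the following is only a sketch under assumptions: `Sub` is a cut-down stand-in for `DatabasesCommands`, `Action` and `resolve_action` are hypothetical names, and both the precedence (subcommand first) and the fallback to help are guesses rather than confirmed behavior.

```rust
/// Stand-in for the clap-derived `DatabasesCommands`, reduced to two variants.
#[derive(Debug, PartialEq)]
enum Sub {
    List,
    Delete(String),
}

/// What the handler would do next (hypothetical type).
#[derive(Debug, PartialEq)]
enum Action {
    Run(Sub),
    Inspect(String),
    ShowHelp,
}

/// Assumed precedence: an explicit subcommand wins, a bare positional means
/// "inspect this database", and no input at all falls back to help.
fn resolve_action(name_or_id: Option<String>, command: Option<Sub>) -> Action {
    match (command, name_or_id) {
        (Some(sub), _) => Action::Run(sub),
        (None, Some(target)) => Action::Inspect(target),
        (None, None) => Action::ShowHelp,
    }
}

fn main() {
    println!("{:?}", resolve_action(Some("sales".to_string()), None));
    println!("{:?}", resolve_action(None, Some(Sub::List)));
    println!("{:?}", resolve_action(None, Some(Sub::Delete("sales".to_string()))));
    println!("{:?}", resolve_action(None, None));
}
```

Folding the two optional fields into a single enum keeps the ambiguous cases in one `match`, so a reviewer can see at a glance how `hotdata databases <name_or_id>` is told apart from `hotdata databases list`.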
@@ -515,6 +532,98 @@ pub enum ConnectionsCreateCommands {
     },
 }

+#[derive(Subcommand)]
+pub enum DatabasesCommands {
+    /// List managed databases in the workspace
+    List {
+        /// Output format
+        #[arg(long = "output", short = 'o', default_value = "table", value_parser = ["table", "json", "yaml"])]
+        output: String,
+    },
+
+    /// Create a new managed database
+    Create {
+        /// Database name (used as the connection name in SQL: `name.schema.table`)
+        #[arg(long)]
+        name: String,
+
+        /// Schema for tables declared at create time (default: public)
+        #[arg(long, default_value = "public")]
+        schema: String,
+
+        /// Table to declare up front (repeatable). Required before load on current API.
+        #[arg(long = "table")]
+        tables: Vec<String>,
+
+        /// Output format
+        #[arg(long = "output", short = 'o', default_value = "table", value_parser = ["table", "json", "yaml"])]
+        output: String,
+    },
+
+    /// Delete a managed database and its tables
+    Delete {
+        /// Database name or connection ID
+        name_or_id: String,
+    },
+
+    /// Manage tables inside a managed database
+    Tables {
+        #[command(subcommand)]
+        command: DatabaseTablesCommands,
+    },
+}
+
+#[derive(Subcommand)]
+pub enum DatabaseTablesCommands {
+    /// List tables in a managed database
+    List {
+        /// Database name or connection ID
+        database: String,
+
+        /// Filter by schema name
+        #[arg(long)]
+        schema: Option<String>,
+
+        /// Output format
+        #[arg(long = "output", short = 'o', default_value = "table", value_parser = ["table", "json", "yaml"])]
+        output: String,
+    },
+
+    /// Load a parquet file into a table (creates or replaces the table)
+    Load {
+        /// Database name or connection ID
+        database: String,
+
+        /// Table name
+        table: String,
+
+        /// Schema name (default: public)
+        #[arg(long, default_value = "public")]
+        schema: String,
+
+        /// Path to a local parquet file to upload and load
+        #[arg(long, conflicts_with = "upload_id")]
+        file: Option<String>,
+
+        /// Use a previously staged upload ID from `POST /v1/files` instead of uploading
+        #[arg(long)]
+        upload_id: Option<String>,
+    },
+
+    /// Delete a table from a managed database
+    Delete {
+        /// Database name or connection ID
+        database: String,
+
+        /// Table name
+        table: String,
+
+        /// Schema name (default: public)
+        #[arg(long, default_value = "public")]
+        schema: String,
+    },
+}
+
 #[derive(Subcommand)]
 pub enum ConnectionsCommands {
     /// Interactively create a new connection
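
In the `Load` variant, `conflicts_with = "upload_id"` rejects passing both flags, but both fields are `Option`, so the argument definitions alone still accept a call with neither. Whether the handler already checks for this is not visible in the excerpt. A minimal sketch of such a check, with the hypothetical names `LoadSource` and `resolve_load_source` and illustrative error messages, might look like this:

```rust
/// Where the parquet data for `tables load` comes from (hypothetical type).
#[derive(Debug, PartialEq)]
enum LoadSource {
    /// Local file to upload first.
    File(String),
    /// Upload already staged via `POST /v1/files`.
    UploadId(String),
}

/// Collapse the two optional flags into exactly one source, or explain
/// what is wrong. The error wording is illustrative, not the CLI's output.
fn resolve_load_source(
    file: Option<String>,
    upload_id: Option<String>,
) -> Result<LoadSource, String> {
    match (file, upload_id) {
        (Some(path), None) => Ok(LoadSource::File(path)),
        (None, Some(id)) => Ok(LoadSource::UploadId(id)),
        (None, None) => Err("provide either --file or --upload-id".to_string()),
        // clap's `conflicts_with` should already reject this case; it is
        // kept so the function is total without relying on the parser.
        (Some(_), Some(_)) => Err("--file and --upload-id are mutually exclusive".to_string()),
    }
}

fn main() {
    println!("{:?}", resolve_load_source(Some("./orders.parquet".to_string()), None));
    println!("{:?}", resolve_load_source(None, None));
}
```

An alternative would be to express the same constraint declaratively with a required clap `ArgGroup`, which moves the "at least one" check into argument parsing.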
