You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
docs: update README and skills to reflect new CLI syntax
- databases load: explicit --catalog/--schema/--table flags (no more dot-notation)
- databases list: note * marker on active database
- databases set/unset: documented
- indexes create: --catalog option for managed databases (in addition to --connection-id)
- search: --type and --column are now optional (inferred from indexes)
- workflows: updated examples throughout
-`create` registers a managed connection with no external credentials. `--name` is a human-readable display name; `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`) and must be `[a-z_][a-z0-9_]*`. Use `--table` to declare tables up front (required before `tables load` on the current API).
154
+
-`create` registers a managed connection with no external credentials. `--name` is a human-readable display name; `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`) and must be `[a-z_][a-z0-9_]*`.
155
+
-`set` / `unset` — save or clear the active database. All `databases tables` and `context` commands default to it. The active database is marked with `*` in `databases list`.
156
+
-`load` (top-level shorthand) — loads a parquet file into `--catalog.--schema.--table`. If the table was not declared at create time, the CLI automatically deletes and recreates the database with the table declared, then retries the load.
150
157
-`tables load` uploads a **parquet** file (or uses a staged `upload_id` from `POST /v1/files`) and publishes it as the table generation (`replace` mode).
151
-
-`run` mints a database-scoped JWT and execs `<cmd>` with `HOTDATA_DATABASE_TOKEN`, `HOTDATA_DATABASE_REFRESH_TOKEN`, `HOTDATA_DATABASE`, `HOTDATA_WORKSPACE`, and `HOTDATA_API_URL` injected into its environment. Pass a database id (group-positional `<id>` like `sandbox run`, or `--database <id>`) to scope an existing database; omit both to auto-create a scratch one using `--name` / `--schema` / `--table` / `--expires-at`. Useful for launching an agent or child process whose API access is restricted to a single database.
158
+
-`run` mints a database-scoped JWT and execs `<cmd>` with `HOTDATA_DATABASE_TOKEN`, `HOTDATA_DATABASE_REFRESH_TOKEN`, `HOTDATA_DATABASE`, `HOTDATA_WORKSPACE`, and `HOTDATA_API_URL` injected into its environment.
152
159
- For CSV/JSON uploads without a managed database, use `hotdata datasets create` instead (`datasets.main.*`).
`--type`is **required**— no default. Pass either `vector` (similarity search via the index's embedding provider) or `bm25` (full-text search). Both run entirely server-side.
243
+
Both run entirely server-side. `--type`and `--column` are **optional**when the table has exactly one search index — they are inferred automatically. Pass them explicitly when multiple indexes exist.
237
244
238
245
```sh
239
246
# BM25 full-text search (requires a BM25 index on the column)
-**`--type vector`** — pass your query as **plain text**, name the **source text column** (e.g. `title`). The server embeds the query at the same time, using the same provider that auto-embedded the column when the index was built — so distance metric, model, and dimensions all match automatically. No `OPENAI_API_KEY`, no client-side embedding, no need to know about the auto-generated `_embedding` column. Generated SQL: `vector_distance(col, 'query')` server-side.
Indexes attach to either a connection-table (`--connection-id` + `--schema` + `--table`) or a dataset (`--dataset-id`). The two scopes are mutually exclusive.
256
263
257
264
```sh
258
-
# Connection-table scope
265
+
# Managed database scope (catalog alias resolves via active database)
Copy file name to clipboardExpand all lines: skills/hotdata-search/SKILL.md
+15-10Lines changed: 15 additions & 10 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -16,15 +16,15 @@ Retrieval workloads in Hotdata: **BM25 full-text**, **vector similarity**, and t
16
16
17
17
## Search CLI
18
18
19
-
`--type`is **required**: `bm25` or `vector`. Both run server-side.
19
+
Both run server-side. `--type`and `--column` are **optional** when the table has exactly one search index — they are inferred automatically. Specify them when multiple indexes exist.
|**`bm25`**| Server generates `bm25_search(table, col, 'text')`. Results sort by score (descending). |
34
34
|**`vector`**| Pass plain-text query; name the **source text column** (e.g. `title`). Server embeds using the same provider/metric/dimensions as the index. SQL uses `vector_distance(col, 'text')`. Results sort by distance (ascending). |
35
35
36
+
-**Inference:** when `--type` or `--column` are omitted, the CLI fetches the table's indexes and selects the only BM25/vector index. If multiple exist, you must specify both flags.
36
37
-**No vector index, or custom embedding model?** Use raw SQL via `hotdata query` (e.g. `cosine_distance(col, [<vec>])`). The removed `--model` / stdin-vector paths hardcoded `l2_distance` and are not supported.
37
38
-**Before search:** create the right index (`indexes create --type bm25` or `--type vector`). See [references/INDEXES.md](references/INDEXES.md).
38
39
- Default `--limit` is 10.
@@ -48,15 +49,19 @@ Indexes attach to a **connection table** (`--connection-id` + `--schema` + `--ta
-`list` — all managed databases in the workspace. Active database is marked with `*`.
206
208
-`create` — creates a new managed database. `--name` is an optional human-readable display name. `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`); must be `[a-z_][a-z0-9_]*`. `--expires-at` accepts relative durations (`24h`, `7d`, `90m`) or an RFC 3339 timestamp; omitting means no expiry. Repeat `--table` to declare tables up front.
207
209
-`set` — saves `<id_or_name>` as the active database. Subsequent `databases tables` and `context` commands use it automatically.
210
+
-`unset` — clears the active database from config.
208
211
-`<id_or_name>` — inspect one database (id, catalog, name, expires_at).
209
212
-`delete` — removes the managed database; clears the active-database config if it matched.
210
-
-`load`— shorthand with dot notation (`database.table` or `database.schema.table`). Schema defaults to `public`.
213
+
-`load`(top-level shorthand) — loads parquet into `--catalog.--schema.--table`. Accepts `--file`, `--url`, or `--upload-id`. If the table was not declared at create time, the CLI automatically deletes and recreates the database with the table declared, then retries the load.
211
214
-`tables list` — lists tables with `TABLE` (`<catalog>.<schema>.<table>`), `SYNCED`, `LAST_SYNC`. Uses active database when `--database` is omitted.
212
215
-`tables load` — uploads a local parquet file (`--file`), a remote parquet URL (`--url`), or a pre-staged upload (`--upload-id`) and publishes with **replace** mode.
213
216
-`tables delete` — drops a table from the managed database.
0 commit comments