* feat(databases): explicit flags for load, unset command, and star in list
- Replace dot-notation positional <TARGET> on `databases load` with
explicit --catalog, --schema, --table flags
- Add `databases unset` to clear the active database from config
- Show * marker on the active database in `databases list`
- Remove parse_db_target and its tests (no longer needed)
* feat(databases): resolve database by catalog alias; auto-declare table on load
- try_resolve_database now falls through to match by default_catalog when
neither id nor name match, so --catalog works as a lookup key everywhere
- databases load auto-recovers from "not declared": deletes the empty
database, recreates it with the table declared, then retries the load
- Add default_catalog to DatabaseSummary so the list response can be
matched without a per-row fetch
* fix(indexes,search): resolve catalog aliases in connection lookup; fix duplicate score column
- resolve_connection_id falls back to managed database catalog lookup so
`airbnb4.listings[description]` works in indexes create and search
- BM25 search no longer appends 'score' when --select already includes it
* refactor(indexes): replace dot/bracket notation with explicit --catalog/--schema/--table/--column flags
Removes the positional `connection.table[col1,col2]` target argument and
parse_index_target helper. All index creation now uses named flags,
consistent with databases load and search.
* fix: prefer active database connection when resolving catalog name
* docs: update README and skills to reflect new CLI syntax
- databases load: explicit --catalog/--schema/--table flags (no more dot-notation)
- databases list: note * marker on active database
- databases set/unset: documented
- indexes create: --catalog option for managed databases (in addition to --connection-id)
- search: --type and --column are now optional (inferred from indexes)
- workflows: updated examples throughout
* fix: address PR review feedback
- auto-declare: collect existing tables before delete+recreate so they
are preserved in the new database; also pass expires_at through
- databases create hint: update to new --catalog/--table flag syntax
- api.rs: fix workspace_id() doc comment placement
* fix: warn before auto-declare when existing tables have synced data
When a 'not declared' error triggers delete+recreate, check if any
existing tables are synced. If so, show a yellow warning listing the
tables whose data will be lost and prompt for y/N confirmation.
In non-interactive mode (CI, piped stdin, --no-input) the command
errors out with a clear message instead of silently destroying data.
* docs: update hotdata-analytics skill to new databases load and --column flag syntax
---------
Co-authored-by: Eddie A Tejeda <669988+eddietejeda@users.noreply.github.com>
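
The lookup order described for `try_resolve_database` above (id, then name, then `default_catalog`) can be sketched roughly as follows. This is an illustration only, not the code from this PR: the struct fields other than `default_catalog`, the function signature, and the id-before-name ordering are assumptions, and the later "prefer active database" fix is not modeled.

```rust
/// Hypothetical, trimmed-down stand-in for the CLI's `DatabaseSummary`.
/// Only `default_catalog` is named in the commit message; the other
/// fields are assumed for the sake of the example.
#[derive(Debug, Clone, PartialEq)]
struct DatabaseSummary {
    id: String,
    name: Option<String>,
    default_catalog: Option<String>,
}

/// Resolve `key` against an already-fetched list of databases.
/// Tries an exact id match, then a name match, and only then falls
/// through to the catalog alias. Returns `None` when nothing matches.
fn try_resolve_database<'a>(
    dbs: &'a [DatabaseSummary],
    key: &str,
) -> Option<&'a DatabaseSummary> {
    dbs.iter()
        .find(|d| d.id.as_str() == key)
        .or_else(|| dbs.iter().find(|d| d.name.as_deref() == Some(key)))
        .or_else(|| {
            dbs.iter()
                .find(|d| d.default_catalog.as_deref() == Some(key))
        })
}
```

Because the catalog match is tried last in this sketch, a database whose name equals another database's catalog alias wins the lookup. Carrying `default_catalog` on the summary is what lets the fallback run against the list response without a per-row fetch.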
-- `create` registers a managed connection with no external credentials. `--name` is a human-readable display name; `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`) and must be `[a-z_][a-z0-9_]*`. Use `--table` to declare tables up front (required before `tables load` on the current API).
+- `create` registers a managed connection with no external credentials. `--name` is a human-readable display name; `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`) and must be `[a-z_][a-z0-9_]*`.
+- `set` / `unset` — save or clear the active database. All `databases tables` and `context` commands default to it. The active database is marked with `*` in `databases list`.
+- `load` (top-level shorthand) — loads a parquet file into `--catalog.--schema.--table`. If the table was not declared at create time, the CLI automatically deletes and recreates the database with the table declared, then retries the load.
- `tables load` uploads a **parquet** file (or uses a staged `upload_id` from `POST /v1/files`) and publishes it as the table generation (`replace` mode).
-- `run` mints a database-scoped JWT and execs `<cmd>` with `HOTDATA_DATABASE_TOKEN`, `HOTDATA_DATABASE_REFRESH_TOKEN`, `HOTDATA_DATABASE`, `HOTDATA_WORKSPACE`, and `HOTDATA_API_URL` injected into its environment. Pass a database id (group-positional `<id>` like `sandbox run`, or `--database <id>`) to scope an existing database; omit both to auto-create a scratch one using `--name` / `--schema` / `--table` / `--expires-at`. Useful for launching an agent or child process whose API access is restricted to a single database.
+- `run` mints a database-scoped JWT and execs `<cmd>` with `HOTDATA_DATABASE_TOKEN`, `HOTDATA_DATABASE_REFRESH_TOKEN`, `HOTDATA_DATABASE`, `HOTDATA_WORKSPACE`, and `HOTDATA_API_URL` injected into its environment.
- For CSV/JSON uploads without a managed database, use `hotdata datasets create` instead (`datasets.main.*`).
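
The `--catalog` constraint quoted above (`[a-z_][a-z0-9_]*`) is simple enough to check without a regex crate. The following is a minimal sketch of such a check; the function name `is_valid_catalog` is hypothetical and this is not necessarily how the CLI or the API validates the alias.

```rust
/// Returns true when `alias` matches `[a-z_][a-z0-9_]*`:
/// a lowercase ASCII letter or underscore first, followed by any number
/// of lowercase ASCII letters, digits, or underscores.
/// (Hypothetical helper; the real validation may live server-side.)
fn is_valid_catalog(alias: &str) -> bool {
    let mut chars = alias.chars();
    // The first character must exist and may not be a digit.
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() || c == '_' => {}
        _ => return false,
    }
    // Every remaining character must be in [a-z0-9_].
    chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}
```

Under this rule `airbnb4` and `_tmp` are accepted, while `4airbnb`, `Airbnb`, `my-db`, and the empty string are rejected.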
-`--type` is **required** — no default. Pass either `vector` (similarity search via the index's embedding provider) or `bm25` (full-text search). Both run entirely server-side.
+Both run entirely server-side. `--type` and `--column` are **optional** when the table has exactly one search index — they are inferred automatically. Pass them explicitly when multiple indexes exist.
```sh
# BM25 full-text search (requires a BM25 index on the column)
- **`--type vector`** — pass your query as **plain text**, name the **source text column** (e.g. `title`). The server embeds the query at the same time, using the same provider that auto-embedded the column when the index was built — so distance metric, model, and dimensions all match automatically. No `OPENAI_API_KEY`, no client-side embedding, no need to know about the auto-generated `_embedding` column. Generated SQL: `vector_distance(col, 'query')` server-side.
Indexes attach to either a connection-table (`--connection-id` + `--schema` + `--table`) or a dataset (`--dataset-id`). The two scopes are mutually exclusive.
```sh
-# Connection-table scope
+# Managed database scope (catalog alias resolves via active database)
Note the printed **`full_name`** (e.g. `datasets.main.chain_revenue_slice` or `chain_db.public.revenue_slice`). For datasets, **`FULL NAME`** from `datasets list` is authoritative.
skills/hotdata-search/SKILL.md: 15 additions & 10 deletions
@@ -16,15 +16,15 @@ Retrieval workloads in Hotdata: **BM25 full-text**, **vector similarity**, and t
## Search CLI
-`--type` is **required**: `bm25` or `vector`. Both run server-side.
+Both run server-side. `--type` and `--column` are **optional** when the table has exactly one search index — they are inferred automatically. Specify them when multiple indexes exist.
| **`bm25`** | Server generates `bm25_search(table, col, 'text')`. Results sort by score (descending). |
| **`vector`** | Pass plain-text query; name the **source text column** (e.g. `title`). Server embeds using the same provider/metric/dimensions as the index. SQL uses `vector_distance(col, 'text')`. Results sort by distance (ascending). |

+- **Inference:** when `--type` or `--column` are omitted, the CLI fetches the table's indexes and selects the only BM25/vector index. If multiple exist, you must specify both flags.
- **No vector index, or custom embedding model?** Use raw SQL via `hotdata query` (e.g. `cosine_distance(col, [<vec>])`). The removed `--model` / stdin-vector paths hardcoded `l2_distance` and are not supported.
- **Before search:** create the right index (`indexes create --type bm25` or `--type vector`). See [references/INDEXES.md](references/INDEXES.md).
- Default `--limit` is 10.
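
The inference rule in the list above can be read as a small decision function. The sketch below takes the strict reading (infer only when the table has exactly one search index, otherwise require both flags); the `SearchIndex` type, the function name, and the error strings are all hypothetical, and the PR's actual handling of partially specified flags may differ.

```rust
/// Hypothetical description of one search index on a table.
#[derive(Debug, Clone, PartialEq)]
struct SearchIndex {
    kind: String,   // "bm25" or "vector"
    column: String, // indexed source column
}

/// Decide which (type, column) pair a search should use.
/// Explicit flags are taken as given; this sketch does not check them
/// against the indexes that actually exist.
fn infer_search_target(
    indexes: &[SearchIndex],
    kind: Option<&str>,
    column: Option<&str>,
) -> Result<(String, String), String> {
    // Both flags supplied: nothing to infer.
    if let (Some(k), Some(c)) = (kind, column) {
        return Ok((k.to_string(), c.to_string()));
    }
    match indexes {
        [] => Err("no search index on this table".to_string()),
        // Exactly one index: fill in whichever flag was omitted.
        [only] => Ok((
            kind.unwrap_or(only.kind.as_str()).to_string(),
            column.unwrap_or(only.column.as_str()).to_string(),
        )),
        // Ambiguous: the caller has to name both.
        _ => Err("multiple search indexes; pass both --type and --column".to_string()),
    }
}
```

With a single BM25 index on `description`, omitting both flags resolves to `("bm25", "description")`; with two indexes on the table, the same call returns an error instead of guessing.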
@@ -48,15 +49,19 @@ Indexes attach to a **connection table** (`--connection-id` + `--schema` + `--ta
- `list` — all managed databases in the workspace. Active database is marked with `*`.
- `create` — creates a new managed database. `--name` is an optional human-readable display name. `--catalog` sets the SQL alias used in queries (`SELECT … FROM <catalog>.schema.table`); must be `[a-z_][a-z0-9_]*`. `--expires-at` accepts relative durations (`24h`, `7d`, `90m`) or an RFC 3339 timestamp; omitting means no expiry. Repeat `--table` to declare tables up front.
- `set` — saves `<id_or_name>` as the active database. Subsequent `databases tables` and `context` commands use it automatically.
+- `unset` — clears the active database from config.
- `<id_or_name>` — inspect one database (id, catalog, name, expires_at).
- `delete` — removes the managed database; clears the active-database config if it matched.
-- `load` — shorthand with dot notation (`database.table` or `database.schema.table`). Schema defaults to `public`.
+- `load` (top-level shorthand) — loads parquet into `--catalog.--schema.--table`. Accepts `--file`, `--url`, or `--upload-id`. If the table was not declared at create time, the CLI automatically deletes and recreates the database with the table declared, then retries the load.
- `tables list` — lists tables with `TABLE` (`<catalog>.<schema>.<table>`), `SYNCED`, `LAST_SYNC`. Uses active database when `--database` is omitted.
- `tables load` — uploads a local parquet file (`--file`), a remote parquet URL (`--url`), or a pre-staged upload (`--upload-id`) and publishes with **replace** mode.
- `tables delete` — drops a table from the managed database.
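
The relative form of `--expires-at` mentioned under `create` (`24h`, `7d`, `90m`) could be parsed along the following lines. This is a sketch under stated assumptions: only the `m`, `h`, and `d` suffixes shown in the docs are handled, the function name is hypothetical, and the RFC 3339 branch is left out because it would need a date-time crate.

```rust
/// Parse a relative duration such as `90m`, `24h`, or `7d` into seconds.
/// Returns `None` for anything else (including RFC 3339 timestamps,
/// which a real implementation would try next).
fn parse_relative_duration(input: &str) -> Option<u64> {
    // The unit is the final character; `?` handles the empty string.
    let unit = input.chars().last()?;
    let seconds_per_unit: u64 = match unit {
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        _ => return None,
    };
    // The unit is a single ASCII byte, so slicing it off is safe.
    let digits = &input[..input.len() - 1];
    // Require at least one digit and nothing but digits (no sign, no spaces).
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // `checked_mul` guards against overflow on absurdly large inputs.
    digits.parse::<u64>().ok()?.checked_mul(seconds_per_unit)
}
```

A caller would add the returned number of seconds to the current time to obtain the expiry; a `None` result is the cue to fall back to timestamp parsing or report an error.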