feat(ux): explicit --catalog/--table flags and datasets → views rename#114
feat(ux): explicit --catalog/--table flags and datasets → views rename#114eddietejeda wants to merge 10 commits into
Conversation
Removing the datasets CLI surface for now. Deletes datasets.rs entirely and strips the Datasets variant from Commands, DatasetsCommands enum, the dispatch block in main.rs, and the stale cross-reference in the databases error message.
Renames the `hotdata datasets` CLI command to `hotdata views` with a new `src/views.rs` module. The command and all user-facing terminology (help text, output messages, SQL prefix `views.`, skill docs) now use "view" / "views". Server-side API paths remain unchanged (`/datasets`). - Add `src/views.rs` (renamed from deleted `datasets.rs`) - Add `Views` / `ViewsCommands` to `command.rs` - Wire dispatch in `main.rs` - Update README, SKILL.md, WORKFLOWS.md, DATA_MODEL.template.md, MODEL_BUILD.md across hotdata and hotdata-analytics skills
…--table flags Removes the `connection.schema.table` and `airbnb.listings[col1,col2]` positional/dot-notation patterns across load, search, and indexes create. All three commands now use named flags for discoverability and consistency. Also adds --name (SQL catalog alias) to databases create, resolves databases by name instead of description, and updates list/create output accordingly. Deletes parse_db_target, parse_index_target, and resolve_connection_id which were only needed to destructure the old dot-notation strings.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Review
Blocking Issues
src/views.rs:194—hotdata views <id>hardcodesviews.main.{table}and ignoresViewDetail.schema_name. For a sandbox-scoped view the printedfull_nameis wrong, and the SKILL/README docs in this PR specifically instruct users to copyfull_nameintoFROMclauses. The neighboringUpdateResponsecomment in the same file already warns that synthesizing"main"is incorrect whenschema_name == sandbox_id— that exact mistake is reintroduced here.
Action Required
- Use
v.schema_nameinviews::get'sfull_nameoutput (suggestion posted inline).
Other notes (non-blocking)
- README
## Searchand## Indexessections (untouched by this PR but now stale):--table <connection.schema.table>andindexes create --connection-id …no longer match the CLI, which now expects--catalog/--schema/--tableforsearchandindexes create. Worth updating in this PR since it owns the UX change. src/command.rs—Set { id_or_description }doc still says "Database id or description", andDelete { name_or_id }says "Database name or connection ID". Sincetry_resolve_databasenow matches byname(and error messages say "id or name"), these doc strings drift from the actual behavior.
- Fix views.rs:194 to use v.schema_name instead of hardcoded "main" - Update command.rs doc strings: Set uses "id or name", Delete uses "name or ID" - Update README Search and Indexes sections to use --catalog/--table flags - Update hotdata skill: databases load/create syntax, fix views->datasets ref - Update hotdata-search skill: search and indexes create use --catalog/--table
| #### Create a view | ||
| ``` | ||
| hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | ||
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
There was a problem hiding this comment.
nit: (not blocking) This command example uses smart/curly quotes (“ ”) instead of straight ASCII quotes. The pre-PR line used ". Since these SKILL docs are designed for agents to copy commands verbatim, curly quotes will break shell parsing when pasted. Same issue on line 287 (hotdata query “SELECT * FROM views.main.my_view LIMIT 10”) and in references/WORKFLOWS.md lines 104 and 113.
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | |
| hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
…tor queries - Add `default_catalog` field to `Database` struct (returned by API) - Make `--catalog` required on `search` and `indexes create` commands instead of falling back to the active database config - Use `db.default_catalog` as the vector table prefix so the query resolves against the correct catalog alias, not the resolved name
There was a problem hiding this comment.
Blocking issue from the prior review (src/views.rs hardcoding views.main) is resolved — get() now prints views.{schema_name}.{table_name}, and update() correctly gates on the optional schema_name. New --catalog/--schema/--table routing for search and indexes create is correct (catalog.unwrap() is guarded by clap's required_unless_present).
Two non-blocking nits from cycle 1 remain open on their existing threads (curly quotes in copy-paste command blocks in SKILL.md/WORKFLOWS.md; the --catalog default load hint when --name is omitted). Neither blocks merge.
bm25_search passes its table reference as a string literal to the server, which now rewrites catalog aliases (e.g. "default") to the real managed connection name before execution. Use db.default_catalog as the prefix for both BM25 and vector table refs so both search types route through the same catalog alias resolution path.
| #### Create a view | ||
| ``` | ||
| hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | ||
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
There was a problem hiding this comment.
nit: (not blocking) Still using smart/curly quotes (“ ”) instead of straight ASCII quotes here and on line 287. Same issue in references/WORKFLOWS.md lines 104 and 113. Since these SKILL docs are meant for agents to copy commands verbatim, curly quotes break shell parsing when pasted.
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | |
| hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
| concat!( | ||
| "Load a table:\n", | ||
| " hotdata databases load --file <path.parquet> {}.<table_name>\n", | ||
| " hotdata databases load --catalog {} --table <table_name> --file <path.parquet>\n", |
There was a problem hiding this comment.
nit: (not blocking) When --name is omitted (result.name == None), catalog falls back to the literal "default", so the hint prints hotdata databases load --catalog default --table .... Copied verbatim, tables_load → resolve_database(&api, "default") fails (no database with id or name 'default') since the new DB has no such name/id. The DB was just saved as current a few lines up, so the working form is hotdata databases load --table <name> --file ... (no --catalog). Consider printing the no-catalog form when result.name.is_none().
There was a problem hiding this comment.
Prior blocking issue (views.rs full_name hardcoding main) is resolved — it now uses v.schema_name. Two prior nits remain unaddressed (curly quotes in SKILL.md/WORKFLOWS.md command examples, and the --catalog default create hint); re-flagged inline. Neither is blocking. Approving.
| #### Create a view | ||
| ``` | ||
| hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | ||
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
There was a problem hiding this comment.
nit: (not blocking) Smart/curly quotes (“ ”) are still here and on line 287, plus references/WORKFLOWS.md lines 104 and 113. Since these docs are copied verbatim by agents, curly quotes break shell parsing when pasted. Replace with straight ASCII quotes:
| hotdata views create --name <table_name> [--description “My View”] (--sql “SELECT ...” | --query-id <saved_query_id>) [--workspace-id <workspace_id>] | |
| hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>] |
There was a problem hiding this comment.
Prior blocking issue resolved: views.rs get() now derives full_name from schema_name instead of hardcoding views.main. Remaining items are non-blocking nits (curly quotes in copy-paste command examples; the --catalog default create hint when --name is omitted). Approving.
…es default_catalog
There was a problem hiding this comment.
Cycle 2: The blocking full_name issue in src/views.rs is resolved — all output paths (create, list, get, update) now use schema_name instead of hardcoding main, so sandbox-scoped views print the correct qualified name.
The remaining nits (curly/smart quotes in SKILL.md lines 264/287 and references/WORKFLOWS.md lines 104/113, plus the --catalog default load hint in databases.rs) are non-blocking and already tracked in open threads from the prior cycle — worth a quick fix before merge since the docs are copy-pasted verbatim by agents, but not gating.
Summary
connection.schema.tableandairbnb.listings[col1,col2]positional/dot-notation patterns acrossdatabases load,search, andindexes createwith explicit--catalog,--schema, and--tablenamed flags--name(SQL catalog alias) todatabases create; resolves databases bynameinstead ofdescription; updateslistandcreateoutput to show name andexpires_atparse_db_target,parse_index_target, andresolve_connection_id— only existed to destructure the old dot-notation stringsdefault_catalogfield to theDatabasestruct (returned by API); uses it as the table prefix for bothbm25_searchandvector_searchSQL so catalog alias resolution on the server handles thedefault→__db_*translation (pairs with runtimedb BM25 catalog alias rewrite)--catalogrequired onsearchandindexes create(no silent fallback to the active database)Known limitations / pending server work
BM25 search on managed tables —
hotdata search --type bm25 --catalog <name>generatesbm25_search('default.public.<table>', ...). The server'sbm25_searchtable function does not yet resolve thedefaultcatalog alias to the real connection name. Unlike vector ANN (fixed in runtimedb #545, merged 2026-06-03), BM25 alias resolution requires a follow-up runtimedb PR before it works end-to-end on managed databases.Vector search on managed tables — auto-embedding vector indexes (
--type vector) are not supported on managed tables; the embedding pipeline replaces the parquet, conflicting with managed load semantics. Vector search works on connection tables.End-to-end example (SQL query)
Until the BM25 server fix ships, the working end-to-end pattern for managed databases uses
hotdata querydirectly:Test plan
hotdata databases create --name mydb— name appears in output, SQL tip uses--catalog mydbhotdata databases list— NAME column appearshotdata databases load --catalog mydb --table orders --url ...— loads 7535 rows,full_name: mydb.public.ordershotdata indexes create --catalog mydb --table orders --column description --type bm25— index createdhotdata query "SELECT name, description FROM airbnb.public.listings WHERE description ILIKE '%cozy%' LIMIT 10"— returns results from managed databasehotdata search "ocean view apartment" --catalog airbnb --table listings --type bm25 --column description— blocked: BM25 catalog alias rewrite not yet deployed (generatesbm25_search('default.public.listings', ...)→ server returns "Connection 'default' not found")hotdata search --type vectoron managed table — blocked: auto-embedding not supported on managed tables (embedding pipeline conflicts with parquet replace semantics)hotdata databases set mydbthen re-run load/indexes without--catalog— current database fallback works for both