Skip to content

feat(ux): explicit --catalog/--table flags and datasets → views rename#114

Open
eddietejeda wants to merge 10 commits into
mainfrom
feat/catalog-ux-redesign
Open

feat(ux): explicit --catalog/--table flags and datasets → views rename#114
eddietejeda wants to merge 10 commits into
mainfrom
feat/catalog-ux-redesign

Conversation

@eddietejeda
Copy link
Copy Markdown
Contributor

@eddietejeda eddietejeda commented May 28, 2026

Summary

  • Replaces the connection.schema.table and airbnb.listings[col1,col2] positional/dot-notation patterns across databases load, search, and indexes create with explicit --catalog, --schema, and --table named flags
  • Adds --name (SQL catalog alias) to databases create; resolves databases by name instead of description; updates list and create output to show name and expires_at
  • Deletes parse_db_target, parse_index_target, and resolve_connection_id — only existed to destructure the old dot-notation strings
  • Adds default_catalog field to the Database struct (returned by API); uses it as the table prefix for both bm25_search and vector_search SQL so catalog alias resolution on the server handles the default__db_* translation (pairs with runtimedb BM25 catalog alias rewrite)
  • Makes --catalog required on search and indexes create (no silent fallback to the active database)

Known limitations / pending server work

BM25 search on managed tableshotdata search --type bm25 --catalog <name> generates bm25_search('default.public.<table>', ...). The server's bm25_search table function does not yet resolve the default catalog alias to the real connection name. Unlike vector ANN (fixed in runtimedb #545, merged 2026-06-03), BM25 alias resolution requires a follow-up runtimedb PR before it works end-to-end on managed databases.

Vector search on managed tables — auto-embedding vector indexes (--type vector) are not supported on managed tables; the embedding pipeline replaces the parquet, conflicting with managed load semantics. Vector search works on connection tables.

End-to-end example (SQL query)

Until the BM25 server fix ships, the working end-to-end pattern for managed databases uses hotdata query directly:

hotdata databases create --name airbnb

hotdata databases load \
  --catalog airbnb \
  --table listings \
  --url https://hotdata.dev/data/sf-airbnb-listings.parquet

hotdata query "SELECT name, description
  FROM airbnb.public.listings
  WHERE description ILIKE '%cozy%'
    AND description ILIKE '%view%'
  LIMIT 10"

Test plan

  • hotdata databases create --name mydb — name appears in output, SQL tip uses --catalog mydb
  • hotdata databases list — NAME column appears
  • hotdata databases load --catalog mydb --table orders --url ... — loads 7535 rows, full_name: mydb.public.orders
  • hotdata indexes create --catalog mydb --table orders --column description --type bm25 — index created
  • hotdata query "SELECT name, description FROM airbnb.public.listings WHERE description ILIKE '%cozy%' LIMIT 10" — returns results from managed database
  • hotdata search "ocean view apartment" --catalog airbnb --table listings --type bm25 --column description — blocked: BM25 catalog alias rewrite not yet deployed (generates bm25_search('default.public.listings', ...) → server returns "Connection 'default' not found")
  • hotdata search --type vector on managed table — blocked: auto-embedding not supported on managed tables (embedding pipeline conflicts with parquet replace semantics)
  • hotdata databases set mydb then re-run load/indexes without --catalog — current database fallback works for both

Removing the datasets CLI surface for now. Deletes datasets.rs entirely
and strips the Datasets variant from Commands, DatasetsCommands enum,
the dispatch block in main.rs, and the stale cross-reference in the
databases error message.
Renames the `hotdata datasets` CLI command to `hotdata views` with a
new `src/views.rs` module. The command and all user-facing terminology
(help text, output messages, SQL prefix `views.`, skill docs) now use
"view" / "views". Server-side API paths remain unchanged (`/datasets`).

- Add `src/views.rs` (renamed from deleted `datasets.rs`)
- Add `Views` / `ViewsCommands` to `command.rs`
- Wire dispatch in `main.rs`
- Update README, SKILL.md, WORKFLOWS.md, DATA_MODEL.template.md,
  MODEL_BUILD.md across hotdata and hotdata-analytics skills
…--table flags

Removes the `connection.schema.table` and `airbnb.listings[col1,col2]`
positional/dot-notation patterns across load, search, and indexes create.
All three commands now use named flags for discoverability and consistency.

Also adds --name (SQL catalog alias) to databases create, resolves databases
by name instead of description, and updates list/create output accordingly.
Deletes parse_db_target, parse_index_target, and resolve_connection_id which
were only needed to destructure the old dot-notation strings.
@sentry
Copy link
Copy Markdown

sentry Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 3.03030% with 160 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/main.rs 0.00% 99 Missing ⚠️
src/views.rs 0.00% 52 Missing ⚠️
src/databases.rs 50.00% 5 Missing ⚠️
src/indexes.rs 0.00% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

Comment thread src/views.rs Outdated
Comment thread src/databases.rs Outdated
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Blocking Issues

  • src/views.rs:194hotdata views <id> hardcodes views.main.{table} and ignores ViewDetail.schema_name. For a sandbox-scoped view the printed full_name is wrong, and the SKILL/README docs in this PR specifically instruct users to copy full_name into FROM clauses. The neighboring UpdateResponse comment in the same file already warns that synthesizing "main" is incorrect when schema_name == sandbox_id — that exact mistake is reintroduced here.

Action Required

  • Use v.schema_name in views::get's full_name output (suggestion posted inline).

Other notes (non-blocking)

  • README ## Search and ## Indexes sections (untouched by this PR but now stale): --table <connection.schema.table> and indexes create --connection-id … no longer match the CLI, which now expects --catalog/--schema/--table for search and indexes create. Worth updating in this PR since it owns the UX change.
  • src/command.rsSet { id_or_description } doc still says "Database id or description", and Delete { name_or_id } says "Database name or connection ID". Since try_resolve_database now matches by name (and error messages say "id or name"), these doc strings drift from the actual behavior.

- Fix views.rs:194 to use v.schema_name instead of hardcoded "main"
- Update command.rs doc strings: Set uses "id or name", Delete uses "name or ID"
- Update README Search and Indexes sections to use --catalog/--table flags
- Update hotdata skill: databases load/create syntax, fix views->datasets ref
- Update hotdata-search skill: search and indexes create use --catalog/--table
Comment thread skills/hotdata/SKILL.md
#### Create a view
```
hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description My View”] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (not blocking) This command example uses smart/curly quotes (“ ”) instead of straight ASCII quotes. The pre-PR line used ". Since these SKILL docs are designed for agents to copy commands verbatim, curly quotes will break shell parsing when pasted. Same issue on line 287 (hotdata query “SELECT * FROM views.main.my_view LIMIT 10”) and in references/WORKFLOWS.md lines 104 and 113.

Suggested change
hotdata views create --name <table_name> [--description My View] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]

claude[bot]
claude Bot previously approved these changes Jun 3, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior blocking issue (views.rs full_name synthesizing views.main) is resolved — get() now uses v.schema_name. Remaining items are non-blocking nits.

…tor queries

- Add `default_catalog` field to `Database` struct (returned by API)
- Make `--catalog` required on `search` and `indexes create` commands
  instead of falling back to the active database config
- Use `db.default_catalog` as the vector table prefix so the query
  resolves against the correct catalog alias, not the resolved name
claude[bot]
claude Bot previously approved these changes Jun 3, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking issue from the prior review (src/views.rs hardcoding views.main) is resolved — get() now prints views.{schema_name}.{table_name}, and update() correctly gates on the optional schema_name. New --catalog/--schema/--table routing for search and indexes create is correct (catalog.unwrap() is guarded by clap's required_unless_present).

Two non-blocking nits from cycle 1 remain open on their existing threads (curly quotes in copy-paste command blocks in SKILL.md/WORKFLOWS.md; the --catalog default load hint when --name is omitted). Neither blocks merge.

bm25_search passes its table reference as a string literal to the
server, which now rewrites catalog aliases (e.g. "default") to the
real managed connection name before execution. Use db.default_catalog
as the prefix for both BM25 and vector table refs so both search
types route through the same catalog alias resolution path.
Comment thread skills/hotdata/SKILL.md
#### Create a view
```
hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description My View”] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (not blocking) Still using smart/curly quotes (“ ”) instead of straight ASCII quotes here and on line 287. Same issue in references/WORKFLOWS.md lines 104 and 113. Since these SKILL docs are meant for agents to copy commands verbatim, curly quotes break shell parsing when pasted.

Suggested change
hotdata views create --name <table_name> [--description My View] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]

Comment thread src/databases.rs
concat!(
"Load a table:\n",
" hotdata databases load --file <path.parquet> {}.<table_name>\n",
" hotdata databases load --catalog {} --table <table_name> --file <path.parquet>\n",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (not blocking) When --name is omitted (result.name == None), catalog falls back to the literal "default", so the hint prints hotdata databases load --catalog default --table .... Copied verbatim, tables_loadresolve_database(&api, "default") fails (no database with id or name 'default') since the new DB has no such name/id. The DB was just saved as current a few lines up, so the working form is hotdata databases load --table <name> --file ... (no --catalog). Consider printing the no-catalog form when result.name.is_none().

claude[bot]
claude Bot previously approved these changes Jun 3, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior blocking issue (views.rs full_name hardcoding main) is resolved — it now uses v.schema_name. Two prior nits remain unaddressed (curly quotes in SKILL.md/WORKFLOWS.md command examples, and the --catalog default create hint); re-flagged inline. Neither is blocking. Approving.

@eddietejeda eddietejeda changed the title feat(ux): replace dot-notation args with explicit --catalog/--schema/--table flags feat(ux): explicit --catalog/--table flags and datasets → views rename Jun 3, 2026
Comment thread skills/hotdata/SKILL.md
#### Create a view
```
hotdata datasets create --name <table_name> [--description "My Dataset"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description My View”] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (not blocking) Smart/curly quotes (“ ”) are still here and on line 287, plus references/WORKFLOWS.md lines 104 and 113. Since these docs are copied verbatim by agents, curly quotes break shell parsing when pasted. Replace with straight ASCII quotes:

Suggested change
hotdata views create --name <table_name> [--description My View] (--sql SELECT ... | --query-id <saved_query_id>) [--workspace-id <workspace_id>]
hotdata views create --name <table_name> [--description "My View"] (--sql "SELECT ..." | --query-id <saved_query_id>) [--workspace-id <workspace_id>]

claude[bot]
claude Bot previously approved these changes Jun 3, 2026
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior blocking issue resolved: views.rs get() now derives full_name from schema_name instead of hardcoding views.main. Remaining items are non-blocking nits (curly quotes in copy-paste command examples; the --catalog default create hint when --name is omitted). Approving.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cycle 2: The blocking full_name issue in src/views.rs is resolved — all output paths (create, list, get, update) now use schema_name instead of hardcoding main, so sandbox-scoped views print the correct qualified name.

The remaining nits (curly/smart quotes in SKILL.md lines 264/287 and references/WORKFLOWS.md lines 104/113, plus the --catalog default load hint in databases.rs) are non-blocking and already tracked in open threads from the prior cycle — worth a quick fix before merge since the docs are copy-pasted verbatim by agents, but not gating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant