Commit b861934

Add to readme and add error code for pending
1 parent 0e9e71e commit b861934

4 files changed

Lines changed: 47 additions & 20 deletions

README.md

Lines changed: 19 additions & 7 deletions
@@ -138,11 +138,15 @@ hotdata datasets create --url "https://example.com/data.parquet" --label "My Dat
 ## Query
 
 ```sh
-hotdata query "<sql>" [--workspace-id <id>] [--connection <connection_id>] [--format table|json|csv]
+hotdata query "<sql>" [-w <id>] [--connection <connection_id>] [-o table|json|csv]
+hotdata query status <query_run_id> [-o table|json|csv]
 ```
 
-- Default format is `table`, which prints results with row count and execution time.
+- Default output is `table`, which prints results with row count and execution time.
 - Use `--connection` to scope the query to a specific connection.
+- Long-running queries automatically fall back to async execution and return a `query_run_id`.
+- Use `hotdata query status <query_run_id>` to poll for results.
+- Exit codes for `query status`: `0` = succeeded, `1` = failed, `2` = still running (poll again).
 
 ## Saved Queries
 
@@ -163,13 +167,21 @@ hotdata queries run <query_id> [--format table|json|csv]
 ## Search
 
 ```sh
-hotdata search "<query>" --table <connection.schema.table> --column <column> [--select <columns>] [--limit <n>] [--format table|json|csv]
+# BM25 full-text search
+hotdata search "query text" --table <connection.schema.table> --column <column> [--select <columns>] [--limit <n>] [-o table|json|csv]
+
+# Vector search with --model (calls OpenAI to embed the query)
+hotdata search "query text" --table <table> --column <vector_column> --model text-embedding-3-small [--limit <n>]
+
+# Vector search with piped embedding
+echo '[0.1, -0.2, ...]' | hotdata search --table <table> --column <vector_column> [--limit <n>]
 ```
 
-- Full-text search using BM25 across a table column.
-- Requires a BM25 index on the target column (see `indexes create`).
-- Results are ordered by relevance score (descending).
-- `--select` specifies which columns to return (comma-separated, defaults to all). The `score` column is automatically appended when `--select` is used.
+- Without `--model` and with query text: BM25 full-text search. Requires a BM25 index on the target column.
+- With `--model`: generates an embedding via OpenAI and performs vector search using `l2_distance`. Requires `OPENAI_API_KEY` env var.
+- Without query text and with piped stdin: reads a vector (raw JSON array or OpenAI embedding response) and performs vector search.
+- BM25 results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
+- `--select` specifies which columns to return (comma-separated, defaults to all).
 
 ## Indexes
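
The exit codes documented in the Query section of this README change (`0` = succeeded, `1` = failed, `2` = still running) lend themselves to a small polling loop. The following is a minimal sketch, not part of the commit: `wait_for_query`, `POLL_INTERVAL`, and `MAX_POLLS` are hypothetical names chosen for illustration, and the loop assumes `hotdata query status` prints its results itself when it exits with `0`.

```sh
# Poll `hotdata query status` until the query stops reporting "still running".
# wait_for_query, POLL_INTERVAL and MAX_POLLS are illustrative, not hotdata features.
wait_for_query() {
  wq_run_id="$1"
  wq_interval="${POLL_INTERVAL:-2}"   # seconds to wait between polls
  wq_max="${MAX_POLLS:-30}"           # give up after this many polls
  wq_attempt=0

  while [ "$wq_attempt" -lt "$wq_max" ]; do
    wq_attempt=$((wq_attempt + 1))
    if hotdata query status "$wq_run_id"; then
      return 0                        # 0 = succeeded
    else
      wq_rc=$?
    fi
    if [ "$wq_rc" -ne 2 ]; then
      return "$wq_rc"                 # 1 (or anything unexpected) = failed
    fi
    sleep "$wq_interval"              # 2 = still running, wait and poll again
  done

  echo "query $wq_run_id still running after $wq_max polls" >&2
  return 2
}
```

A caller would invoke it as `wait_for_query <query_run_id>`. Returning `2` on timeout keeps the "still running" meaning intact, so an outer script can decide whether to keep waiting.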

skills/hotdata-cli/SKILL.md

Lines changed: 24 additions & 11 deletions
@@ -163,16 +163,21 @@ Use `hotdata datasets <dataset_id>` to look up the `table_name` before writing q
 
 ### Execute SQL Query
 ```
-hotdata query "<sql>" [--workspace-id <workspace_id>] [--connection <connection_id>] [--format table|json|csv]
+hotdata query "<sql>" [-w <workspace_id>] [--connection <connection_id>] [-o table|json|csv]
+hotdata query status <query_run_id> [-o table|json|csv]
 ```
-- Default format is `table`, which prints results with row count and execution time.
+- Default output is `table`, which prints results with row count and execution time.
 - Use `--connection` to scope the query to a specific connection.
 - Use `hotdata tables list` to discover tables and columns — do not query `information_schema` directly.
 - **Always use PostgreSQL dialect SQL.**
+- Long-running queries automatically fall back to async execution and return a `query_run_id`.
+- Use `hotdata query status <query_run_id>` to poll for results.
+- Exit codes for `query status`: `0` = succeeded, `1` = failed, `2` = still running (poll again).
+- **When a query returns a `query_run_id`, use `query status` to poll rather than re-running the query.**
 
 ### Get Query Result
 ```
-hotdata results <result_id> [--workspace-id <workspace_id>] [--format table|json|csv]
+hotdata results <result_id> [-w <workspace_id>] [-o table|json|csv]
 ```
 - Retrieves a previously executed query result by its result ID.
 - Query results include a `result-id` in the footer (e.g. `[result-id: rslt...]`).
@@ -195,23 +200,31 @@ hotdata queries run <query_id> [--format table|json|csv]
 
 ### Search
 ```
-hotdata search "<query>" --table <connection.schema.table> --column <column> [--select <columns>] [--limit <n>] [--format table|json|csv]
+# BM25 full-text search
+hotdata search "query text" --table <connection.schema.table> --column <column> [--select <columns>] [--limit <n>] [-o table|json|csv]
+
+# Vector search with --model (calls OpenAI to embed the query)
+hotdata search "query text" --table <table> --column <vector_column> --model text-embedding-3-small [--limit <n>]
+
+# Vector search with piped embedding
+echo '[0.1, -0.2, ...]' | hotdata search --table <table> --column <vector_column> [--limit <n>]
 ```
-- Full-text search using BM25 across a table column.
-- Requires a BM25 index on the target column (see `indexes create`).
-- Results are ordered by relevance score (descending).
-- `--select` specifies which columns to return (comma-separated, defaults to all). The `score` column is automatically appended when `--select` is used.
+- Without `--model` and with query text: BM25 full-text search. Requires a BM25 index on the target column.
+- With `--model`: generates an embedding via OpenAI and performs vector search using `l2_distance`. Requires `OPENAI_API_KEY` env var. Supported models: `text-embedding-3-small`, `text-embedding-3-large`.
+- Without query text and with piped stdin: reads a vector (raw JSON array or OpenAI embedding response) and performs vector search.
+- BM25 results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
+- `--select` specifies which columns to return (comma-separated, defaults to all).
 - Default limit is 10.
+- **For BM25 search, create a BM25 index on the target column first. For vector search, create a vector index.**
 
 ### Indexes
 ```
-hotdata indexes list --connection-id <id> --schema <schema> --table <table> [--workspace-id <workspace_id>] [--format table|json|yaml]
-hotdata indexes create --connection-id <id> --schema <schema> --table <table> --name <name> --columns <cols> [--type sorted|bm25|vector] [--metric l2|cosine|dot] [--async]
+hotdata indexes list -c <connection_id> --schema <schema> --table <table> [-w <workspace_id>] [-o table|json|yaml]
+hotdata indexes create -c <connection_id> --schema <schema> --table <table> --name <name> --columns <cols> [--type sorted|bm25|vector] [--metric l2|cosine|dot] [--async]
 ```
 - `list` shows indexes on a table with name, type, columns, status, and creation date.
 - `create` creates an index. Use `--type bm25` for full-text search, `--type vector` for vector search (requires `--metric`).
 - `--async` submits index creation as a background job. Use `hotdata jobs <job_id>` to check status.
-- **Before using `hotdata search`, create a BM25 index on the target column.**
 
 ### Jobs
 ```
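
The Search section in this file says a piped vector may be a raw JSON array. As a rough sketch of producing one from a script, the helper below joins its arguments into such an array. `vector_json` is a hypothetical name, not part of the CLI, and it does not check that its inputs are numbers.

```sh
# Build a raw JSON array from the given arguments.
# vector_json is an illustrative helper; inputs are passed through unvalidated.
vector_json() {
  vj_out=""
  for vj_x in "$@"; do
    vj_out="${vj_out:+$vj_out,}$vj_x"   # add a comma only between items
  done
  printf '[%s]\n' "$vj_out"
}

# Intended use, with the same placeholders as the documentation:
#   vector_json 0.1 -0.2 0.3 | hotdata search --table <table> --column <vector_column> --limit 5
```

The vector length would still need to match the dimension of the indexed column, which this helper knows nothing about.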

src/command.rs

Lines changed: 2 additions & 1 deletion
@@ -192,7 +192,8 @@ impl From<ShellChoice> for clap_complete::Shell {
 
 #[derive(Subcommand)]
 pub enum QueryCommands {
-    /// Check the status of a running query and retrieve results
+    /// Check the status of a running query and retrieve results.
+    /// Exit codes: 0 = succeeded, 1 = failed, 2 = still running (poll again)
     Status {
         /// Query run ID
         id: String,

src/query.rs

Lines changed: 2 additions & 1 deletion
@@ -80,7 +80,7 @@ pub fn execute(sql: &str, workspace_id: &str, connection: Option<&str>, format:
         eprintln!("{}", format!("query still running (status: {})", async_resp.status).yellow());
         eprintln!("query_run_id: {}", async_resp.query_run_id);
         eprintln!("{}", format!("Poll with: hotdata query status {}", async_resp.query_run_id).dark_grey());
-        return;
+        std::process::exit(2);
     }
 
     if !status.is_success() {
@@ -134,6 +134,7 @@ pub fn poll(query_run_id: &str, workspace_id: &str, format: &str) {
             eprintln!("{}", format!("query status: {status}").yellow());
             eprintln!("query_run_id: {}", run.id);
             eprintln!("{}", format!("Poll again with: hotdata query status {}", run.id).dark_grey());
+            std::process::exit(2);
         }
     }
 }
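
With this change, `execute` also exits with code `2` when a query falls back to async execution, after printing a plain `query_run_id: <id>` line to stderr. A caller can use that to recover the run ID without re-running the query. The sketch below is one way to do it, under stated assumptions: `extract_run_id` and `submit_query` are hypothetical helper names, the stderr line format is taken from the `eprintln!` calls in the diff, and nothing is assumed to reach stdout in the async branch.

```sh
# Pull the run ID out of the stderr text printed on async fallback.
# Relies on the uncoloured "query_run_id: <id>" line shown in src/query.rs.
extract_run_id() {
  sed -n 's/^query_run_id: //p' | head -n 1
}

# Run a query; on exit code 2, print the run ID on stdout so the caller can poll it.
# submit_query is an illustrative wrapper, not a hotdata subcommand.
submit_query() {
  sq_err=$(mktemp) || return 1
  if hotdata query "$1" 2>"$sq_err"; then
    sq_rc=0
  else
    sq_rc=$?
  fi
  if [ "$sq_rc" -eq 2 ]; then
    extract_run_id <"$sq_err"         # async fallback: emit only the run ID
  else
    cat "$sq_err" >&2                 # otherwise pass stderr through unchanged
  fi
  rm -f "$sq_err"
  return "$sq_rc"
}
```

A script could then write `run_id=$(submit_query "<sql>")` and, when the return code is `2`, pass `$run_id` to `hotdata query status`.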
