docs: rewrite README to lead with managed databases
Focus on create → upload → query → drop workflow as the primary pattern.
Move connections/external sources to a secondary section. Add addressing
summary table (create_table vs query catalog conventions).
-Use [Ibis](https://ibis-project.org/) to query and upload data in your [Hotdata](https://www.hotdata.dev/docs/api-reference) workspace — write Python expressions instead of SQL, get pandas or Arrow results back.
+Use [Ibis](https://ibis-project.org/) to create on-demand databases, upload data, and query with Python expressions — get pandas or Arrow results back without writing SQL.
**Table addressing:** Hotdata organizes data as `connection → schema → table`. In Ibis terms that maps to `catalog → database → table`. With a single connection and schema, defaults are inferred automatically. For multiple connections or schemas, pass `database=(connection_id, schema)` when referencing a table, or set `default_connection` / `default_schema` at connect time.
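The addressing rules in the paragraph above can be sketched in plain Python. This is only an illustration of the described behavior, not the backend's implementation: `resolve_table_address`, its parameters, and the `connections` mapping are hypothetical names, and the sketch assumes a default is inferred only when exactly one connection (or one schema) exists.

```python
from typing import Mapping, Optional, Sequence, Tuple


def resolve_table_address(
    table: str,
    connections: Mapping[str, Sequence[str]],
    database: Optional[Tuple[str, str]] = None,
    default_connection: Optional[str] = None,
    default_schema: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Return (connection, schema, table), i.e. Ibis (catalog, database, table).

    `connections` maps each connection id to its schema names (hypothetical input).
    """
    # An explicit database=(connection_id, schema) always wins.
    if database is not None:
        connection, schema = database
        return connection, schema, table

    # Otherwise fall back to the connect-time default, then to inference.
    connection = default_connection
    if connection is None:
        if len(connections) != 1:
            raise ValueError("ambiguous connection: pass database=(connection_id, schema)")
        connection = next(iter(connections))

    schema = default_schema
    if schema is None:
        schemas = connections.get(connection, ())
        if len(schemas) != 1:
            raise ValueError("ambiguous schema: pass database=(connection_id, schema)")
        schema = schemas[0]

    return connection, schema, table


# Single connection and schema: both defaults are inferred.
print(resolve_table_address("orders", {"my_postgres": ["public"]}))
# ('my_postgres', 'public', 'orders')
```

With more than one connection or schema and no defaults, the sketch raises instead of guessing, which mirrors the requirement to pass `database=(connection_id, schema)` explicitly.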
+Table names must be declared when the database is created — you cannot add new table names later without recreating the database.
+
+### Query
+
+When querying, use `"default"` as the catalog:
+
+```python
+t = con.table("events", database=("default", "public"))
| `con.table(...)` when querying | `("default", schema)` |

 ## Querying

 ### Ibis expressions

 ```python
-t = con.table("orders")
+t = con.table("orders", database=("default", "public"))

-# Filter, select, aggregate — all run as SQL on Hotdata
 summary = (
-    t.filter(t.status == "shipped")
+    t.filter(t.amount > 10)
     .group_by("region")
     .agg(total=t.amount.sum(), n=t.count())
-    .order_by("total", ascending=False)
+    .order_by(ibis.desc("total"))
     .execute()
 )
 ```

-`.execute()` returns a **pandas DataFrame**. Use `.to_pyarrow()` for an Arrow table or `.to_pyarrow_batches()` for a record batch reader.
+`.execute()` returns a **pandas DataFrame**. Use `.to_pyarrow()` for an Arrow table or `.to_pyarrow_batches()` to stream batches without materializing the full result.

 ### Raw SQL

-When you need Hotdata-specific syntax, federated table names, or SQL that Ibis doesn't model:
-
 ```python
-df = con.sql(
-    "SELECT region, SUM(amount) AS total FROM my_conn.public.orders GROUP BY region",
+base = con.sql(
+    'SELECT * FROM "default"."public"."orders"',
     dialect="postgres",
-).execute()
+)
+result = base.filter(base.amount > 10).execute()
 ```

-You can chain Ibis expressions on the result of `con.sql(...)` the same way you would on `con.table(...)`.
+You can chain Ibis expressions on the result of `con.sql(...)`.

-### Discover what's available
+## Connecting to existing sources

-```python
-con.list_catalogs()  # Hotdata connection ids
-con.list_databases(catalog="my_connection")  # schemas for a connection
Managed databases let you upload your own data (pandas DataFrames or PyArrow tables) and query it alongside your other Hotdata connections. They are provisioned on demand and scoped to your workspace.
+If you have existing databases or warehouses connected to your Hotdata workspace (Postgres, Snowflake, BigQuery, etc.), you can query them through the same Ibis connection:

 ```python
-import time
-import ibis
-import pandas as pd
-
 con = ibis.hotdata.connect(
     api_url="https://api.hotdata.dev",
-    token="YOUR_API_TOKEN",
-    workspace_id="ws_…",
+    token="YOUR_API_KEY",
+    workspace_id="ws_...",
+    default_connection="my_postgres",
+    default_schema="public",
 )

-# 1. Create the database and declare which tables you'll upload.
-# Table names must be declared here — uploads to undeclared names are rejected.
-SQL compilation uses Ibis's Postgres dialect as the closest fit. Most common `SELECT` workloads run fine; complex expressions may generate SQL that Hotdata doesn't support — use `con.sql(...)` as a fallback.
+SQL compilation uses Ibis's Postgres dialect. Use `con.sql(...)` as a fallback for expressions that don't compile cleanly.
Set your credentials, then run any example script:

 ```bash
-export HOTDATA_API_KEY=…
-export HOTDATA_WORKSPACE=…
+export HOTDATA_API_KEY=...
+export HOTDATA_WORKSPACE=...
 uv run python examples/01_catalog_introspection.py
 uv run python examples/02_execute_sql.py 'SELECT COUNT(*) AS n FROM tpch.tpch_sf1.customer'
 uv run python examples/03_connect_via_url.py
 uv run python examples/04_ibis_table_workflows.py
 ```

-The examples assume a TPC-H dataset at `tpch.tpch_sf1`. To provision it: create a DuckDB connection in Hotdata, then run `CALL dbgen(sf = 1)` using DuckDB's [tpch extension](https://duckdb.org/docs/extensions/tpch.html).
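The example scripts presumably pick up the two environment variables exported in the shell block. A minimal sketch of how a script might load them, assuming only the variable names `HOTDATA_API_KEY` and `HOTDATA_WORKSPACE` from the README; the helper `load_hotdata_credentials` is hypothetical, and mapping the values to `token` and `workspace_id` is an assumption based on the `ibis.hotdata.connect(...)` call shown in the diff.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = ("HOTDATA_API_KEY", "HOTDATA_WORKSPACE")


def load_hotdata_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read credentials from the environment (hypothetical helper).

    Pass `env` explicitly in tests; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Assumed mapping onto the keyword arguments of the connect call.
    return {
        "token": env["HOTDATA_API_KEY"],
        "workspace_id": env["HOTDATA_WORKSPACE"],
    }
```

Failing early with the names of the missing variables gives a clearer error than a rejected API request later in the script. The returned dict could then be unpacked into the connect call alongside `api_url`, if those keyword names match the installed backend.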