Sourcecode of the paper *Bespoke OLAP: Synthesizing Workload-Specific One-size-fits-one Database Engines*
-The generated Cpp Artifacts of *Bespoke-TPCH* and *Bespoke-CEB* can be found here [https://github.com/DataManagementLab/BespokeOLAP_Artifacts](https://github.com/DataManagementLab/BespokeOLAP_Artifacts).
[▶ Live Runner](https://datamanagementlab.github.io/BespokeOLAP/web-runner/)
-An LLM agent that automatically generates and optimizes custom C++ OLAP query engines for user specified workloads. The agent generates C++ code, compiles it, and iteratively improves performance through sophisticaed optimization loops. Results are tracked in Weights & Biases (wandb).
+The generated C++ artifacts of *Bespoke-TPCH* and *Bespoke-CEB* are available in the [BespokeOLAP_Artifacts](https://github.com/DataManagementLab/BespokeOLAP_Artifacts) repository.
+
+An LLM agent that automatically generates and optimizes custom C++ OLAP query engines for user specified workloads. The agent generates C++ code, compiles it, and iteratively improves performance through sophisticated optimization loops. Results are tracked in Weights & Biases (wandb).
<div align="center">
<figure>
@@ -93,7 +98,7 @@ Place TPC-H or CEB Parquet files in your artifacts directory (default: `/mnt/lab
## Usage
-### 1. Actvate your Python environment
+### 1. Activate your Python environment
```bash
source .venv/bin/activate
@@ -134,7 +139,7 @@ python run_gen_base_impl.py \
Conv name represents: `basef{q_id}-{q_id}v{version}`. For example, `basef1-22v1` is a base implementation generated for TPC-H queries 1 and 22, version 1.
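As an illustration of this naming convention, the following is a minimal sketch of how such a name could be split into its parts. The helper `parse_conv_name` and the `ConvName` type are hypothetical and are not taken from the repository; the pattern assumes that both `q_id` fields and the version are plain decimal integers.

```python
import re
from typing import NamedTuple, Tuple

# Hypothetical pattern for names of the form basef{q_id}-{q_id}v{version}.
# Assumption: all three fields are non-negative decimal integers.
_CONV_NAME = re.compile(r"basef(\d+)-(\d+)v(\d+)")


class ConvName(NamedTuple):
    """Parsed parts of a base-implementation conversation name."""
    q_ids: Tuple[int, int]
    version: int


def parse_conv_name(name: str) -> ConvName:
    """Split a name such as 'basef1-22v1' into query ids and version."""
    match = _CONV_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a base conversation name: {name!r}")
    first, second, version = (int(group) for group in match.groups())
    return ConvName(q_ids=(first, second), version=version)


if __name__ == "__main__":
    print(parse_conv_name("basef1-22v1"))
```

For the example name above, this yields the query ids 1 and 22 and version 1. Names that do not match the pattern raise a `ValueError`, which mirrors the idea that naming conventions are enforced rather than silently accepted.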
### 4. Run the optimization loop
-To run the optimizaiton loop, please specify the wandb run-id of the run producing the base implementation (see 3.).
+To run the optimization loop, please specify the wandb run-id of the run producing the base implementation (see 3.).
The script will automatically look up the final snapshot created at the end of that conversation and load this git snapshot automatically.
I.e. any past run can be loaded as a starting point for the optimization loop, as long as the final snapshot of that run is available in the git cache. This allows you to easily continue and optimize from any past run, or even share runs across machines by sharing the git snapshot cache (see "Remote snapshot cache" below).
Store the wandb run-id in the `run_optim_loop.py` header.
@@ -160,7 +165,7 @@ Conversation names are used to organize and track runs.
They first create separate log-files but also identify traces, snapshots, and metrics in wandb.
Further they reference the queries for which an engine is generated and optimized, as well as the version number for the generated engine.
Hence they have to be unique - this is also enforced by the system.
-Usually naming convensions (conversation name prefixes) are enforced by the scripts.
+Usually naming conventions (conversation name prefixes) are enforced by the scripts.
## Optionally
### Run the agent manually (interactive)
@@ -183,7 +188,7 @@ See [Benchmarking guide](benchmark/README.md) for details and additional example
## CLI Reference
Common arguments shared across entry points:
-(however the recommend to use the prepared scripts/steps listed above, which have the appropriate arguments pre-configured)
+(We recommend using the prepared scripts above, which have the appropriate arguments pre-configured.)