Refactor TPC-H example to a registry-driven model graph #205

Merged: ptomecek merged 1 commit into main from pit/tpch on Apr 30, 2026
@ptomecek ptomecek commented Apr 30, 2026

Reworks ccflow/examples/tpch to be more illustrative of how ccflow is used in practice — a registry of typed providers wired together in YAML — rather than a single context-dispatched data generator and query runner.

Why

The old example had two anti-patterns:

  1. TPCHDataGenerator produced 8 different output schemas depending on TPCHTableContext.table — one model whose return type changed by context.
  2. TPCHQueryRunner dispatched by TPCHQueryContext.query_id, with the table dependencies of each query hidden in a _QUERY_TABLE_MAP side-table rather than expressed as model fields.

Both shapes are convenient but obscure how ccflow is actually used in practice, where each registered model has a fixed output schema and dependencies are explicit Pydantic fields.
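
The contrast between the two shapes can be sketched in a few lines. This is a minimal illustration that uses plain dataclasses as stand-ins for ccflow models, so it is not the code from this PR: `run_query_by_context`, `TableProvider`, and `Query` are hypothetical names, and the table names in the side table are placeholders rather than the real inputs of any TPC-H query.

```python
from dataclasses import dataclass

# Old shape (stand-in): dependencies live in a side table keyed by query id.
# The table names are placeholders, not the real TPC-H query inputs.
_QUERY_TABLE_MAP = {1: ("nation", "region")}


def run_query_by_context(query_id: int) -> dict:
    """Context-dispatched runner: inputs are looked up at call time."""
    return {"query_id": query_id, "inputs": _QUERY_TABLE_MAP[query_id]}


# New shape (stand-in): the discriminator is a field on each instance.
@dataclass(frozen=True)
class TableProvider:
    """One instance per table, so each instance has a single output schema."""
    table: str

    def __call__(self) -> str:
        return self.table  # stand-in for returning the table's data


@dataclass(frozen=True)
class Query:
    """Query id and input providers are fields, visible on the instance."""
    query_id: int
    inputs: tuple

    def __call__(self) -> dict:
        return {"query_id": self.query_id,
                "inputs": tuple(p() for p in self.inputs)}


nation = TableProvider("nation")
region = TableProvider("region")
q1 = Query(query_id=1, inputs=(nation, region))
print(q1())
```

Both shapes produce the same result here, but only in the second can the dependencies of `q1` be read off the instance itself.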

github-actions Bot commented Apr 30, 2026

Test Results

651 tests  ±0   649 ✅ ±0   1m 49s ⏱️ +6s
  1 suites ±0     2 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 003959d. ± Comparison against base commit e2ef462.

♻️ This comment has been updated with latest results.

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 97.43590% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.02%. Comparing base (a16f19b) to head (003959d).
⚠️ Report is 14 commits behind head on main.

Files with missing lines | Patch % | Lines
ccflow/examples/tpch/data_generators.py | 95.00% | 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #205      +/-   ##
==========================================
+ Coverage   95.98%   96.02%   +0.03%     
==========================================
  Files         140      139       -1     
  Lines        9797     9819      +22     
  Branches      568      567       -1     
==========================================
+ Hits         9404     9429      +25     
  Misses        275      275              
+ Partials      118      115       -3     

@ptomecek ptomecek force-pushed the pit/tpch branch 3 times, most recently from ff8b6f4 to b20e614 on April 30, 2026 13:18
Reworks ccflow/examples/tpch as a more illustrative ccflow example: a
registry of typed providers wired together in YAML, instead of a single
context-dispatched data generator and query runner.

The discriminator (table name / query id) becomes a Pydantic field on
each model instance, so each registered provider has a fixed output
schema. A shared TPCHDuckDBBackend (BaseModel) owns the DuckDB
connection and runs dbgen exactly once; every TPCHTableProvider and
TPCHAnswerProvider references it via /tpch/backend, so a single
scale_factor override on the backend flows through to all 22 answer and 8
table providers. TPCHQuery is a generic CallableModel parameterised by
query_id and an explicit tuple of input table providers, replacing the
previous side-table mapping.
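
As a rough sketch of the shared-backend idea described above: several providers hold a reference to one backend object, the expensive generation step runs once, and a single scale_factor setting is visible to all of them. Plain dataclasses stand in for ccflow's BaseModel and CallableModel, no DuckDB connection is opened, and `Backend`, `fetch`, and `generate_calls` are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Stand-in for a shared backend that owns the generated data."""
    scale_factor: float = 0.1
    generate_calls: int = 0  # how many times the stand-in generator ran

    def fetch(self, table: str) -> str:
        # Generate lazily and only once, however many providers call in.
        if self.generate_calls == 0:
            self.generate_calls += 1  # stand-in for running dbgen
        return f"{table}@sf={self.scale_factor}"


@dataclass
class TableProvider:
    """Provider with a fixed table and an explicit backend dependency."""
    table: str
    backend: Backend

    def __call__(self) -> str:
        return self.backend.fetch(self.table)


# One backend instance is shared by every provider.
backend = Backend(scale_factor=1.0)
providers = [TableProvider(t, backend) for t in ("nation", "region", "orders")]
results = [p() for p in providers]
print(results, backend.generate_calls)
```

Because every provider points at the same object, changing `scale_factor` in one place is enough; nothing has to be repeated per provider.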

The bundled config/conf.yaml is heavily commented as a teaching example
of the BaseModel/CallableModel distinction, registry cross-references
via /abs/path strings, and per-instance dependency wiring.
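
To make the idea of cross-references via absolute path strings concrete, here is a toy resolver over a plain dictionary. It is only an assumption-laden sketch of the concept, not ccflow's registry or its resolution logic: `CONFIG` and `resolve` are hypothetical, entries are plain dicts rather than models, and only the `/tpch/backend` path comes from the commit message (the table paths are invented for illustration).

```python
# Hypothetical config: values that start with "/" refer to other entries.
CONFIG = {
    "/tpch/backend": {"scale_factor": 0.1},
    "/tpch/table/nation": {"table": "nation", "backend": "/tpch/backend"},
    "/tpch/table/region": {"table": "region", "backend": "/tpch/backend"},
}


def resolve(config: dict) -> dict:
    """Build each entry once, replacing path strings with the built entry."""
    registry: dict = {}

    def build(path: str) -> dict:
        if path in registry:  # already built: reuse the same object
            return registry[path]
        entry = {}
        for key, value in config[path].items():
            is_ref = isinstance(value, str) and value.startswith("/")
            entry[key] = build(value) if is_ref else value
        registry[path] = entry
        return entry

    for path in config:
        build(path)
    return registry


registry = resolve(CONFIG)

# Both table entries hold the very same backend object, so one override
# on the backend is seen by every entry that references it.
registry["/tpch/backend"]["scale_factor"] = 1.0
print(registry["/tpch/table/region"]["backend"]["scale_factor"])
```

The sketch does not handle cyclic references; it only shows why referencing a shared entry by path yields one object rather than a copy per provider.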

Signed-off-by: Pascal Tomecek <pascal.tomecek@cubistsystematic.com>
@ptomecek ptomecek marked this pull request as ready for review April 30, 2026 14:55
@ptomecek ptomecek merged commit 0d45e4c into main Apr 30, 2026
12 checks passed
@ptomecek ptomecek deleted the pit/tpch branch April 30, 2026 15:55