Commit 9e9c925

feat: add --threads option to parallelize report data fetching
Add --threads CLI option (default 1) to `edr report` and `edr send-report`. When set to >1, independent dbt run-operations are executed concurrently using ThreadPoolExecutor with SubprocessDbtRunner.

dbt's Python API (APIDbtRunner) is not thread-safe due to global mutable state (GLOBAL_FLAGS, adapter FACTORY, etc.), so parallel execution uses SubprocessDbtRunner, which spawns independent dbt processes per call.

The fetching is split into phases:
- Phase 1: 14 independent operations run in parallel
- Phase 2: exposures + test_results (depend on Phase 1)
- Phase 3: lineage (depends on Phase 2)
- Phase 4: pure computation (no dbt calls)

With --threads=14, edr report time is expected to drop from ~3m40s to ~30-40s on adapters with high query latency (e.g. Athena).
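The phased scheduling described above can be illustrated with a minimal sketch. This is not the code from this commit: `run_operation`, `run_phase`, and `fetch_report_data` are hypothetical names, the operation names are placeholders, and each dbt run-operation is simulated with a short sleep instead of a real SubprocessDbtRunner call. It only shows how phases with dependencies might run in order while the operations inside a phase run concurrently.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_operation(name, latency=0.05):
    """Hypothetical stand-in for one dbt run-operation (simulated with sleep)."""
    time.sleep(latency)
    return f"{name}:done"


def run_phase(names, threads):
    """Run all operations of one phase; use a thread pool only when threads > 1."""
    if threads <= 1:
        return {name: run_operation(name) for name in names}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so results line up with names.
        return dict(zip(names, pool.map(run_operation, names)))


def fetch_report_data(threads=1):
    """Run the phases in order; each phase starts only after the previous one ends."""
    results = {}
    # Phase 1: 14 independent operations (placeholder names).
    results.update(run_phase([f"op_{i}" for i in range(14)], threads))
    # Phase 2: depends on Phase 1.
    results.update(run_phase(["exposures", "test_results"], threads))
    # Phase 3: depends on Phase 2.
    results.update(run_phase(["lineage"], threads))
    # Phase 4: pure computation, no simulated dbt call.
    results["summary"] = len(results)
    return results
```

Calling `fetch_report_data(threads=14)` in this sketch corresponds loosely to running with `--threads=14`: the wall time is bounded by the slowest operation in each phase, not by the sum of all operations.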
1 parent e5af7e7 commit 9e9c925

3 files changed

Lines changed: 400 additions & 12 deletions

0 commit comments