perf(python): add host-side optimizations for pre-evaluation, context filtering, and index-based eval by aepfli · Pull Request #72 · open-feature-forking/flagd-evaluator

aepfli · 2026-02-10T20:15:30Z

Summary

Normalize the Python PyO3 bindings to support the same 3 host-side optimizations as Java and Go:
- Pre-evaluation cache: static/disabled flags served directly from a Python dict (~0.3 µs) without calling into the Rust evaluator
- Context key filtering: only serialize keys referenced by targeting rules, plus `$flagd` enrichment and `targetingKey`
- Index-based evaluation: `evaluate_flag_by_index()` with O(1) Vec lookup when both index and required keys are available
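
The order in which the three optimizations apply can be illustrated with a minimal pure-Python sketch. It is not the code in `python/src/lib.rs`: the class `HostSideEvaluator`, the two injected callables, and the `preEvaluated` response field are hypothetical names chosen for illustration. Only `requiredContextKeys` and `flagIndices` appear in the commit messages.

```python
from typing import Any, Callable, Dict, Mapping, Set


class HostSideEvaluator:
    """Hypothetical stand-in for the host-side logic around the Rust evaluator."""

    def __init__(
        self,
        evaluate_by_key: Callable[[str, Mapping[str, Any]], Any],
        evaluate_by_index: Callable[[int, Mapping[str, Any]], Any],
    ) -> None:
        # The two callables stand in for calls into the Rust evaluator.
        self._evaluate_by_key = evaluate_by_key
        self._evaluate_by_index = evaluate_by_index
        self._pre_evaluated: Dict[str, Any] = {}
        self._required_keys: Dict[str, Set[str]] = {}
        self._flag_indices: Dict[str, int] = {}

    def update_state(self, response: Mapping[str, Any]) -> None:
        # "preEvaluated" is an assumed field name; the other two come from
        # the UpdateStateResponse described in the commit messages.
        self._pre_evaluated = dict(response.get("preEvaluated", {}))
        self._required_keys = {
            flag: set(keys)
            for flag, keys in response.get("requiredContextKeys", {}).items()
        }
        self._flag_indices = dict(response.get("flagIndices", {}))

    def evaluate(self, flag_key: str, context: Mapping[str, Any]) -> Any:
        # 1. Static/disabled flags: answer from the dict, no evaluator call.
        cached = self._pre_evaluated.get(flag_key)
        if cached is not None:
            return cached

        # 2. Index-based path: needs both an index and the required keys.
        index = self._flag_indices.get(flag_key)
        required = self._required_keys.get(flag_key)
        if index is not None and required is not None:
            filtered = {k: context[k] for k in required if k in context}
            return self._evaluate_by_index(index, filtered)

        # 3. Fallback: key-based lookup with the full context.
        return self._evaluate_by_key(flag_key, context)
```

Checking `index is not None` rather than truthiness matters here, because the first flag would have index `0`.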

Benchmark Highlights

| Scenario | Time (µs) | Ops/sec |
| --- | --- | --- |
| Static flag (pre-eval cache) | 0.3 | 2,918,057 |
| Targeting match (small ctx) | 1.4 | 700,497 |
| Targeting (large 100+ ctx) | 22.6 | 44,213 |
| Complex targeting (small ctx) | 2.8 | 359,245 |
| Disabled flag (cache) | 0.8 | 1,292,441 |

vs native json-logic-utils (used by current flagd Python provider):

| Scenario | PyO3 native | json-logic-utils | Speedup |
| --- | --- | --- | --- |
| Simple targeting | 1.4 µs | 2.6 µs | 1.9x |
| Simple + small ctx | 1.2 µs | 3.8 µs | 3.2x |
| Simple + large ctx | 21.0 µs | 47.7 µs | 2.3x |
| Complex targeting | 2.8 µs | 10.4 µs | 3.7x |
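
For readers who want to check the arithmetic, the sketch below recomputes the speedup column as baseline time divided by PyO3 time, and converts the ops/sec figures from the first table back into microseconds. All numbers are copied from the two tables above; the helper names are illustrative only. The ops/sec values do not match `1_000_000 / time` exactly, presumably because the times shown are rounded to one decimal place.

```python
# (PyO3 native µs, json-logic-utils µs), copied from the comparison table.
comparison_us = {
    "Simple targeting": (1.4, 2.6),
    "Simple + small ctx": (1.2, 3.8),
    "Simple + large ctx": (21.0, 47.7),
    "Complex targeting": (2.8, 10.4),
}

# Ops/sec figures copied from the benchmark highlights table.
ops_per_sec = {
    "Static flag (pre-eval cache)": 2_918_057,
    "Targeting match (small ctx)": 700_497,
    "Targeting (large 100+ ctx)": 44_213,
    "Complex targeting (small ctx)": 359_245,
    "Disabled flag (cache)": 1_292_441,
}


def speedup(native_us: float, baseline_us: float) -> float:
    """How many times faster the native path is than the baseline."""
    return baseline_us / native_us


def time_us(ops: int) -> float:
    """Mean time per operation in microseconds for a given ops/sec rate."""
    return 1_000_000 / ops


for name, (native, baseline) in comparison_us.items():
    print(f"{name}: {speedup(native, baseline):.1f}x")

for name, ops in ops_per_sec.items():
    print(f"{name}: {time_us(ops):.1f} µs")
```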

Changes

| File | Description |
| --- | --- |
| `python/src/lib.rs` | Add 3 caches to `FlagEvaluator` struct, optimized evaluation pipeline |
| `python/tests/test_optimizations.py` | 15 new tests covering all optimization paths |

Test plan

- All 30 Python tests pass (15 original + 15 new)
- 30/30 benchmarks pass (4 panzi comparison benchmarks skipped; optional dependency)
- 134/134 Rust unit tests pass
- 4/4 Gherkin tests pass
- `cargo fmt` and `cargo clippy -- -D warnings` clean

🤖 Generated with Claude Code

…uation

Reduce targeting flag evaluation latency for large contexts by avoiding unnecessary data transfer across the WASM boundary.

Rust side:
- Walk compiled targeting trees to extract referenced context keys (e.g. {"var": "email"} -> "email") during update_state()
- Return per-flag requiredContextKeys and flagIndices in UpdateStateResponse
- Add evaluate_by_index(u32, ctx_ptr, ctx_len) WASM export for O(1) flag lookup by numeric index instead of string key HashMap lookup
- Add evaluate_flag_pre_enriched() that skips context enrichment when $flagd is already present (host-side enrichment)

Java side:
- Cache requiredContextKeys and flagIndices from updateState() response
- EvaluationContextSerializer.serializeFiltered() serializes only the context keys a targeting rule references, plus $flagd enrichment
- evaluateFlag(EvaluationContext) uses filtered serialization + index-based eval when available, falls back to full serialization gracefully
- evaluateByIndex() calls the new WASM export with O(1) Vec lookup

JMH results (1000+ attribute LayeredEvaluationContext):
- Targeting flags: ~12.8 µs (down from ~167 µs), a 13x improvement
- vs old json-logic-java: 32-34x faster (409 µs -> 12.8 µs)
- Static/disabled flags: ~0.02 µs (pre-evaluated cache, unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
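
The key-extraction step on the Rust side can be illustrated in Python as a recursive walk over a JsonLogic rule. This is a sketch of the idea only, not a port of the Rust code. The function name `extract_required_keys`, keeping only the top-level segment of dotted paths, and returning `None` when a `var` name cannot be determined statically are all assumptions made for this example.

```python
from typing import Any, Optional, Set


def extract_required_keys(rule: Any) -> Optional[Set[str]]:
    """Collect top-level context keys referenced by {"var": ...} nodes.

    Returns None when a var name is not a plain non-empty string, in which
    case a caller would have to fall back to sending the full context.
    """
    keys: Set[str] = set()
    analyzable = True

    def walk(node: Any) -> None:
        nonlocal analyzable
        if isinstance(node, dict):
            for op, args in node.items():
                if op == "var":
                    # Accept {"var": "email"} and {"var": ["email", default]}.
                    name = args[0] if isinstance(args, list) and args else args
                    if isinstance(name, str) and name:
                        # "user.email" needs the whole top-level "user" entry.
                        keys.add(name.split(".", 1)[0])
                    else:
                        analyzable = False
                    if isinstance(args, list):
                        # A default value may itself contain var references.
                        walk(args[1:])
                else:
                    walk(args)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(rule)
    return keys if analyzable else None


rule = {"if": [{"ends_with": [{"var": "email"}, "@example.com"]}, "on", "off"]}
print(extract_required_keys(rule))
```

Returning `None` rather than an incomplete set keeps filtering safe: a rule whose references cannot be enumerated simply skips the optimization.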

… filtering, and index-based eval

Normalize the Python PyO3 bindings to support the same 3 host-side optimizations as Java and Go:

1. Pre-evaluated cache: after update_state(), cache results for static/disabled flags and return them directly without calling into the Rust evaluator.
2. Context key filtering: store requiredContextKeys per flag and build a filtered context containing only the keys referenced by the targeting rule, plus $flagd enrichment and targetingKey.
3. Index-based evaluation: store flagIndices and call evaluate_flag_by_index() when both index and required keys are available, using O(1) Vec lookup instead of HashMap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
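
Step 2 might look roughly like the helper below. The function name `build_filtered_context` and its `now` parameter are hypothetical. The shape of the `$flagd` entry (a `flagKey` and a Unix `timestamp` in seconds) is an assumption based on flagd's documented context enrichment, not something this PR page states.

```python
import time
from typing import Any, Dict, Iterable, Mapping, Optional


def build_filtered_context(
    flag_key: str,
    context: Mapping[str, Any],
    required_keys: Iterable[str],
    now: Optional[int] = None,
) -> Dict[str, Any]:
    """Keep only the keys a targeting rule references, then enrich."""
    # Copy just the referenced keys that the caller actually supplied.
    filtered = {k: context[k] for k in required_keys if k in context}

    # targetingKey is always forwarded when present, even if the rule does
    # not reference it through a var.
    if "targetingKey" in context:
        filtered["targetingKey"] = context["targetingKey"]

    # Host-side enrichment (assumed shape), so the evaluator can skip it.
    filtered["$flagd"] = {
        "flagKey": flag_key,
        "timestamp": int(time.time()) if now is None else now,
    }
    return filtered


large_context = {f"attr{i}": i for i in range(100)}
large_context.update({"email": "a@example.com", "targetingKey": "user-1"})
small = build_filtered_context("my-flag", large_context, {"email"}, now=1_700_000_000)
print(sorted(small))
```

With a 100+ attribute context and a rule that reads one key, only three entries are serialized, which is where the large-context gains in the benchmark tables would come from.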

The WASM tests use a thread-local singleton evaluator. Parallel test execution causes race conditions where one test's update_state overwrites another test's state, producing intermittent failures like test_wasm_evaluate_by_index getting "Static" instead of "TargetingMatch".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Gherkin tests (cucumber) don't support the --test-threads flag. Run lib and integration tests with --test-threads=1 for WASM singleton safety, and Gherkin tests separately without the flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aepfli and others added 5 commits February 10, 2026 19:41

Merge branch 'main' into feat/python-normalization

2166043

aepfli merged commit 2b7a0b9 into main Feb 12, 2026
12 checks passed
