bench(java): add multithreaded JMH benchmarks for concurrent evaluation by aepfli · Pull Request #64 · open-feature-forking/flagd-evaluator

aepfli · 2026-02-10T10:48:18Z

Summary

- Add `ConcurrentFlagEvaluatorBenchmark.java` with JMH benchmarks that test `FlagEvaluator` under concurrent load
- Thread scaling at 1, 2, 4, and 8 threads across simple flags, targeting match/no-match, and mixed workloads
- Read/write contention benchmark (evaluations running concurrently with `updateState()`)
- Comparison of the old json-logic-java resolver against the WASM evaluator under 4-thread concurrency
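The read/write contention scenario can be pictured with a plain-JDK sketch: several reader threads evaluate a flag while one writer thread keeps replacing the state. This is only an illustration of the workload shape. `ContentionSketch`, its snapshot-swapping store, and the flag key `my-flag` are hypothetical stand-ins, not the real `FlagEvaluator` or its `updateState()` implementation, and the sketch does not use JMH.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

final class ContentionSketch {
    // Hypothetical stand-in for evaluator state: an immutable snapshot swapped atomically.
    private final AtomicReference<Map<String, Boolean>> state =
            new AtomicReference<>(Map.of("my-flag", false));

    // Read path: look the flag up in the current snapshot.
    boolean evaluate(String key) {
        return state.get().getOrDefault(key, false);
    }

    // Write path: publish a new snapshot (stand-in for updateState()).
    void updateState(Map<String, Boolean> next) {
        state.set(Map.copyOf(next));
    }

    /** Runs `readers` reader threads against one writer; returns the number of evaluations done. */
    static long run(int readers, int readsPerThread, int writes) {
        ContentionSketch store = new ContentionSketch();
        AtomicLong reads = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[readers + 1];
        for (int r = 0; r < readers; r++) {
            workers[r] = new Thread(() -> {
                await(start);
                for (int i = 0; i < readsPerThread; i++) {
                    store.evaluate("my-flag");
                    reads.incrementAndGet();
                }
            });
        }
        workers[readers] = new Thread(() -> {
            await(start);
            for (int i = 0; i < writes; i++) {
                store.updateState(Map.of("my-flag", i % 2 == 0));
            }
        });
        for (Thread w : workers) {
            w.start();
        }
        start.countDown(); // release readers and writer together
        for (Thread w : workers) {
            join(w);
        }
        return reads.get();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("evaluations completed: " + run(3, 100_000, 1_000));
    }
}
```

The latch makes readers and the writer start together so that reads genuinely overlap with state swaps. A real measurement should still go through JMH, which handles warmup and forking.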

Benchmark Scenarios

| Benchmark | Threads | What it measures |
| --- | --- | --- |
| `concurrentSimpleFlag_*t` | 1/2/4/8 | Baseline concurrent throughput |
| `concurrentTargetingMatch_*t` | 1/2/4/8 | Targeting rule that matches |
| `concurrentTargetingNoMatch_*t` | 1/2/4/8 | Targeting rule, default path |
| `concurrentMixedFlags_*t` | 1/2/4/8 | Random flags + contexts |
| `concurrentWithStateUpdate` | 4 | Read/write contention |
| `oldResolver_ConcurrentSimple` | 4 | Old vs new comparison (simple) |
| `newEvaluator_ConcurrentSimple` | 4 | Old vs new comparison (simple) |
| `oldResolver_ConcurrentTargeting` | 4 | Old vs new comparison (targeting) |
| `newEvaluator_ConcurrentTargeting` | 4 | Old vs new comparison (targeting) |
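To make the thread-scaling rows concrete, here is a minimal plain-JDK sketch of the idea: run the same operation on 1, 2, 4, and 8 threads and compare total throughput. It is not the benchmark code from this PR and does not use JMH. `ThreadScalingSketch`, `runConcurrently`, and the `ConcurrentHashMap` lookup standing in for a flag evaluation are all assumptions made for illustration, and the timings it prints are not comparable to JMH results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadScalingSketch {

    /** Runs `task` opsPerThread times on each of `threads` workers; returns total completed ops. */
    static long runConcurrently(int threads, int opsPerThread, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> worker = () -> {
                long done = 0;
                for (int i = 0; i < opsPerThread; i++) {
                    task.run();
                    done++;
                }
                return done;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(worker));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that worker has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a simple flag evaluation.
        Map<String, Boolean> flags = new ConcurrentHashMap<>(Map.of("simple-flag", true));
        for (int threads : new int[] {1, 2, 4, 8}) {
            long start = System.nanoTime();
            long ops = runConcurrently(threads, 200_000, () -> flags.get("simple-flag"));
            long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1_000_000);
            System.out.println(threads + " threads: " + (ops / elapsedMs) + " ops/ms");
        }
    }
}
```

Without warmup and forking, a hand-rolled loop like this mostly shows the structure of the experiment; the JMH benchmarks in the table are the ones to trust for actual numbers.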

How to run

cd java
./mvnw clean package
java -jar target/benchmarks.jar ConcurrentFlagEvaluatorBenchmark

Test plan

- Code compiles (`mvnw compile test-compile`)
- Run full JMH benchmark suite and verify results are stable
- Verify thread scaling shows expected patterns

Closes #61

🤖 Generated with Claude Code

Remove superseded benchmark files:
- FlagEvaluatorBenchmarkTest (manual System.nanoTime loop, replaced by JMH)
- FlagEvaluatorJmhBenchmark (layered context eval, subsumed by ContextSizeBenchmark)
- PerformanceAnalysisBenchmark (manual phase timing, replaced by JMH)
- ResolverComparisonBenchmark (both main/ and test/ copies, replaced by ComparisonBenchmark)

Move concurrent old-vs-new comparison from ConcurrentFlagEvaluatorBenchmark to ComparisonBenchmark (X4 section) for a clean separation of concerns.

Final benchmark structure:
- ContextSizeBenchmark: E1-E7 evaluation matrix (context size x targeting complexity)
- ConcurrentFlagEvaluatorBenchmark: C1-C6 thread scaling and contention
- ComparisonBenchmark: X1-X4 old vs new (single-threaded + concurrent + context sweep)
- StateManagementBenchmark: S1-S5 state update performance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…#67)

## Summary

- Adds `BENCHMARKS.md` defining a consistent benchmark specification across Rust, Java, and Python
- Covers evaluation scenarios (E1-E11), custom operators (O1-O6), state management (S1-S5), concurrency (C1-C6), and old-vs-new comparison (X1-X3)
- Standardizes context shapes and flag definitions for direct cross-language comparison

## Context

The benchmark PRs (#64, #65, #66) implement subsets of this matrix. This document serves as the reference spec so all three languages converge on the same scenarios and can be compared meaningfully.

## Test plan

- [ ] Review benchmark IDs and scenarios for completeness
- [ ] Verify context/flag definitions match what implementations use

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Move JMH benchmark classes from src/test to src/main so the shade plugin includes them in the benchmarks.jar fat JAR. Also update three wasm-bindgen symbol hashes in WasmRuntime.java to match the current WASM binary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The 3 wbindgen host function hashes were changed to values that don't match the WASM binary built from the current Rust source. CI builds WASM fresh, so the Java code must use the hashes the build produces. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ix (#96)

## Summary

- **WasmRuntime.java** now inspects the WASM module's import section at startup and registers host function handlers by prefix pattern matching, instead of hardcoding 9 function names with wasm-bindgen hash suffixes
- **HOST_FUNCTIONS.md** rewritten to document all 9 WASM imports across 3 modules, with dynamic matching examples for Java, Go, and JavaScript

## Motivation

wasm-bindgen generates function names with hash suffixes (e.g., `__wbg_getTime_ad1e9878a735af08`) that change whenever Rust dependencies or wasm-bindgen versions update. This has broken CI in multiple PRs (#64). By matching imports by prefix (`__wbg_getTime_*`) instead of exact name, the Java integration survives dependency changes without code updates.

## How it works

1. Load the WASM module via `CompiledEvaluator.load()`
2. Iterate `importSection().stream()` to discover all `FunctionImport`s
3. Match each import by module + name prefix to the appropriate handler
4. Register handlers dynamically in the Chicory `Store`

## Test plan

- [x] All 30 Java tests pass (`./mvnw test`)
- [x] WASM binary imports verified via `wasm-objdump`
- [x] No Rust code changes required

Closes #74

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
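The matching step described above (module plus name prefix instead of an exact hashed name) can be sketched without any WASM runtime. The following is a simplified illustration, not the code in `WasmRuntime.java`: `PrefixMatchSketch`, the handler labels, the module name `__wbindgen_placeholder__`, and the second prefix are assumptions, and plain strings stand in for Chicory's import and host function types.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class PrefixMatchSketch {
    // Assumed module name for the illustration; the real imports span 3 modules.
    static final String MODULE = "__wbindgen_placeholder__";

    // Maps "module|namePrefix" to a handler label. A real integration would
    // store host function implementations here instead of strings.
    static final Map<String, String> HANDLERS = new LinkedHashMap<>();
    static {
        HANDLERS.put(MODULE + "|__wbg_getTime_", "getTime");
        HANDLERS.put(MODULE + "|__wbg_exampleOther_", "exampleOther"); // hypothetical entry
    }

    /** Returns the handler label whose module and name prefix match the import, if any. */
    static Optional<String> resolve(String module, String importName) {
        String candidate = module + "|" + importName;
        for (Map.Entry<String, String> entry : HANDLERS.entrySet()) {
            if (candidate.startsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The hash suffix can change between builds; the prefix still matches.
        System.out.println(resolve(MODULE, "__wbg_getTime_ad1e9878a735af08"));
        System.out.println(resolve(MODULE, "__wbg_getTime_0000000000000000"));
        System.out.println(resolve(MODULE, "__wbg_unknown_1234"));
    }
}
```

Because only the stable part of the name is compared, a rebuilt WASM binary with new hash suffixes resolves to the same handlers, which is the property the PR relies on. An unmatched import returns `Optional.empty()`, so the caller can fail fast at startup rather than at the first call.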

bench(java): add multithreaded JMH benchmarks for concurrent evaluation

f80df72

Closes #61 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aepfli mentioned this pull request Feb 10, 2026

docs: add standardized benchmark matrix for cross-language comparison #67

Merged

2 tasks

aepfli and others added 2 commits February 10, 2026 12:02

bench(java): add context size, state management, and comparison bench…

b3d51ff

…marks Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aepfli and others added 6 commits February 10, 2026 13:05

Merge branch 'main' into bench/java-concurrent

ea3c09c

Merge branch 'main' into bench/java-concurrent

af2642e

Merge branch 'main' into bench/java-concurrent

32445f0

Merge branch 'main' into bench/java-concurrent

0826215

aepfli merged commit 78a31ef into main Feb 12, 2026
12 checks passed

This was referenced Feb 12, 2026

refactor(java): eliminate wbindgen hash fragility in WasmRuntime.java #74

Closed

test: add concurrent access functional tests to Java bindings #81

Closed

refactor(java): dynamically match wasm-bindgen host functions by prefix #96

Merged

aepfli merged 9 commits into main from bench/java-concurrent
