bench(java): add multithreaded JMH benchmarks for concurrent evaluation#64

Merged
aepfli merged 9 commits into main from bench/java-concurrent on Feb 12, 2026

@aepfli aepfli commented Feb 10, 2026

Summary

  • Add ConcurrentFlagEvaluatorBenchmark.java with JMH benchmarks that test FlagEvaluator under concurrent load
  • Thread scaling at 1, 2, 4, and 8 threads across simple flags, targeting match/no-match, and mixed workloads
  • Read/write contention benchmark (evaluations concurrent with updateState())
  • Old json-logic-java resolver vs WASM evaluator comparison under 4-thread concurrency

Benchmark Scenarios

| Benchmark | Threads | What it measures |
|---|---|---|
| `concurrentSimpleFlag_*t` | 1/2/4/8 | Baseline concurrent throughput |
| `concurrentTargetingMatch_*t` | 1/2/4/8 | Targeting rule that matches |
| `concurrentTargetingNoMatch_*t` | 1/2/4/8 | Targeting rule, default path |
| `concurrentMixedFlags_*t` | 1/2/4/8 | Random flags + contexts |
| `concurrentWithStateUpdate` | 4 | Read/write contention |
| `oldResolver_ConcurrentSimple` | 4 | Old vs new comparison (simple) |
| `newEvaluator_ConcurrentSimple` | 4 | Old vs new comparison (simple) |
| `oldResolver_ConcurrentTargeting` | 4 | Old vs new comparison (targeting) |
| `newEvaluator_ConcurrentTargeting` | 4 | Old vs new comparison (targeting) |
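
To make the read/write contention scenario concrete, here is a rough sketch of its shape using only `java.util.concurrent`. It is not the JMH benchmark from this PR: `ContentionSketch`, its `evaluate` and `updateState` stand-ins, and the `simple-flag` key are all hypothetical, and the "state" is assumed to be an immutable map swapped atomically. A hand-rolled loop like this is prone to JIT and warmup distortions, which is why the real measurements use JMH.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: reader threads evaluate a flag while one writer swaps the state.
final class ContentionSketch {

    // Stand-in for evaluator state (flag key -> value); not the real FlagEvaluator.
    static final AtomicReference<Map<String, Boolean>> STATE =
            new AtomicReference<>(Map.of("simple-flag", true));

    // Stand-in for a flag evaluation: reads the current immutable snapshot.
    static boolean evaluate(String key) {
        return STATE.get().getOrDefault(key, false);
    }

    // Stand-in for updateState(): replaces the whole snapshot atomically.
    static void updateState(Map<String, Boolean> next) {
        STATE.set(next);
    }

    // Runs `readers` evaluating threads plus one writer for about `millis` ms.
    // Returns the total number of evaluations completed by the readers.
    static long run(int readers, long millis) {
        ExecutorService pool = Executors.newFixedThreadPool(readers + 1);
        AtomicLong evaluations = new AtomicLong();
        CountDownLatch done = new CountDownLatch(readers + 1);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);

        for (int i = 0; i < readers; i++) {
            pool.execute(() -> {
                long local = 0;
                long hits = 0;
                while (deadline - System.nanoTime() > 0) {
                    if (evaluate("simple-flag")) {
                        hits++; // consume the result so the call is not trivially dead code
                    }
                    local++;
                }
                evaluations.addAndGet(local + (hits < 0 ? 1 : 0));
                done.countDown();
            });
        }

        pool.execute(() -> {
            boolean value = false;
            while (deadline - System.nanoTime() > 0) {
                updateState(Map.of("simple-flag", value));
                value = !value;
            }
            done.countDown();
        });

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return evaluations.get();
    }

    public static void main(String[] args) {
        System.out.println("evaluations with 4 readers + 1 writer: " + run(4, 200));
    }
}
```

Comparing the count from `run(4, 200)` against a variant with the writer removed gives a crude sense of how much the updates cost the readers, which is the question the contention benchmark is meant to answer.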

How to run

cd java
./mvnw clean package
java -jar target/benchmarks.jar ConcurrentFlagEvaluatorBenchmark

Test plan

  • Code compiles (mvnw compile test-compile)
  • Run full JMH benchmark suite and verify results are stable
  • Verify thread scaling shows expected patterns

Closes #61

🤖 Generated with Claude Code

Closes #61

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
aepfli and others added 2 commits February 10, 2026 12:02
…marks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove superseded benchmark files:
- FlagEvaluatorBenchmarkTest (manual System.nanoTime loop, replaced by JMH)
- FlagEvaluatorJmhBenchmark (layered context eval, subsumed by ContextSizeBenchmark)
- PerformanceAnalysisBenchmark (manual phase timing, replaced by JMH)
- ResolverComparisonBenchmark (both main/ and test/ copies, replaced by ComparisonBenchmark)

Move concurrent old-vs-new comparison from ConcurrentFlagEvaluatorBenchmark
to ComparisonBenchmark (X4 section) for a clean separation of concerns.

Final benchmark structure:
- ContextSizeBenchmark: E1-E7 evaluation matrix (context size x targeting complexity)
- ConcurrentFlagEvaluatorBenchmark: C1-C6 thread scaling and contention
- ComparisonBenchmark: X1-X4 old vs new (single-threaded + concurrent + context sweep)
- StateManagementBenchmark: S1-S5 state update performance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
aepfli added a commit that referenced this pull request Feb 10, 2026
…#67)

## Summary
- Adds `BENCHMARKS.md` defining a consistent benchmark specification
across Rust, Java, and Python
- Covers evaluation scenarios (E1-E11), custom operators (O1-O6), state
management (S1-S5), concurrency (C1-C6), and old-vs-new comparison
(X1-X3)
- Standardizes context shapes and flag definitions for direct
cross-language comparison

## Context
The benchmark PRs (#64, #65, #66) implement subsets of this matrix. This
document serves as the reference spec so all three languages converge on
the same scenarios and can be compared meaningfully.

## Test plan
- [ ] Review benchmark IDs and scenarios for completeness
- [ ] Verify context/flag definitions match what implementations use

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
aepfli and others added 6 commits February 10, 2026 13:05
Move JMH benchmark classes from src/test to src/main so the shade plugin
includes them in the benchmarks.jar fat JAR. Also update three
wasm-bindgen symbol hashes in WasmRuntime.java to match the current
WASM binary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 3 wbindgen host function hashes were changed to values that don't
match the WASM binary built from the current Rust source. CI builds
WASM fresh, so the Java code must use the hashes the build produces.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aepfli aepfli merged commit 78a31ef into main Feb 12, 2026
12 checks passed
aepfli added a commit that referenced this pull request Feb 12, 2026
…ix (#96)

## Summary
- **WasmRuntime.java** now inspects the WASM module's import section at
startup and registers host function handlers by prefix pattern matching,
instead of hardcoding 9 function names with wasm-bindgen hash suffixes
- **HOST_FUNCTIONS.md** rewritten to document all 9 WASM imports across
3 modules, with dynamic matching examples for Java, Go, and JavaScript

## Motivation
wasm-bindgen generates function names with hash suffixes (e.g.,
`__wbg_getTime_ad1e9878a735af08`) that change whenever Rust dependencies
or wasm-bindgen versions update. This has broken CI in multiple PRs
(#64). By matching imports by prefix (`__wbg_getTime_*`) instead of
exact name, the Java integration survives dependency changes without
code updates.

## How it works
1. Load the WASM module via `CompiledEvaluator.load()`
2. Iterate `importSection().stream()` to discover all `FunctionImport`s
3. Match each import by module + name prefix to the appropriate handler
4. Register handlers dynamically in the Chicory `Store`
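
The four steps above can be sketched without Chicory as follows. This is a minimal illustration, not the code in `WasmRuntime.java`: the `ImportPrefixMatcher` class, its `register` and `resolve` methods, and the string handler labels are hypothetical, and the module name `__wbindgen_placeholder__` is an assumption about what the binary imports from. Only the `__wbg_getTime_` prefix and the sample hashed name come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolve wasm-bindgen imports by name prefix, not exact name.
final class ImportPrefixMatcher {

    // "module::namePrefix" -> handler label. Real code would map to host
    // function implementations instead of plain strings.
    private final Map<String, String> handlersByPrefix = new LinkedHashMap<>();

    // Registers a handler for every import in `module` whose name starts with `namePrefix`.
    void register(String module, String namePrefix, String handler) {
        handlersByPrefix.put(module + "::" + namePrefix, handler);
    }

    // Returns the first registered handler whose module and name prefix match the import.
    Optional<String> resolve(String module, String importName) {
        String key = module + "::" + importName;
        return handlersByPrefix.entrySet().stream()
                .filter(entry -> key.startsWith(entry.getKey()))
                .map(Map.Entry::getValue)
                .findFirst();
    }

    public static void main(String[] args) {
        ImportPrefixMatcher matcher = new ImportPrefixMatcher();
        // Assumed module name; the hash suffix is deliberately left out of the prefix.
        matcher.register("__wbindgen_placeholder__", "__wbg_getTime_", "getTime");

        // The hash suffix can change between builds without breaking the lookup.
        System.out.println(matcher.resolve(
                "__wbindgen_placeholder__", "__wbg_getTime_ad1e9878a735af08"));
    }
}
```

Keeping the trailing underscore in the prefix means a longer, differently named import (for example a hypothetical `__wbg_getTimezoneOffset_0123`) does not match the `getTime` handler by accident. Imports that resolve to an empty `Optional` are a natural place to fail fast at startup.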

## Test plan
- [x] All 30 Java tests pass (`./mvnw test`)
- [x] WASM binary imports verified via `wasm-objdump`
- [x] No Rust code changes required

Closes #74

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

bench(java): add multithreaded JMH benchmarks for concurrent FlagEvaluator access
