
feat(go): add Go package with wazero WASM runtime, instance pool, and optimized parsing #71

Merged: aepfli merged 6 commits into main from feat/go-package on Feb 12, 2026

aepfli commented Feb 10, 2026

Summary

  • Add pure Go package (go/) for evaluating feature flags using the flagd-evaluator WASM module
  • Uses wazero for zero-CGO WebAssembly execution — no native dependencies
  • Implements all 3 host-side optimizations matching the Java implementation:
    • Pre-evaluation cache: static/disabled flags served lock-free via atomic pointer (~22 ns, 0 allocs)
    • Context key filtering: only serialize keys referenced by targeting rules
    • Index-based evaluation: evaluate_by_index for O(1) flag lookup, avoiding string serialization
  • WASM instance pool: N instances (default runtime.NumCPU()) for parallel targeting evaluation
  • Hand-rolled JSON parser: avoids json.Unmarshal reflection overhead (5-7x faster for common result types, 4.6x with metadata)
  • Generation guard: prevents stale flag index race between UpdateState and concurrent evaluations
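
A minimal sketch of the pre-evaluation cache idea, assuming Go 1.19+ for `atomic.Pointer`. The `result`, `snapshot`, and `cache` types are hypothetical stand-ins, not the package's actual types. It only illustrates how an immutable snapshot behind an atomic pointer lets static and disabled flags be served without a lock:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// result is a hypothetical, reduced stand-in for EvaluationResult.
type result struct {
	Value   any
	Variant string
	Reason  string
}

// snapshot is an immutable view that is rebuilt on every state update.
type snapshot struct {
	generation uint64
	preEval    map[string]result // static and disabled flags only
}

// cache publishes snapshots through an atomic pointer.
type cache struct {
	current atomic.Pointer[snapshot]
}

// publish swaps in a new snapshot; readers never observe a partial update.
func (c *cache) publish(s *snapshot) { c.current.Store(s) }

// lookup returns a pre-evaluated result without taking any lock.
func (c *cache) lookup(key string) (result, bool) {
	s := c.current.Load()
	if s == nil {
		return result{}, false
	}
	r, ok := s.preEval[key]
	return r, ok
}

func main() {
	var c cache
	c.publish(&snapshot{generation: 1, preEval: map[string]result{
		"static-flag": {Value: true, Variant: "on", Reason: "STATIC"},
	}})

	r, ok := c.lookup("static-flag")
	fmt.Println(r.Variant, ok) // prints "on true"

	_, ok = c.lookup("targeted-flag")
	fmt.Println(ok) // prints "false": this flag would go on to WASM
}
```

The map is never mutated after `publish`, which is what makes concurrent reads safe in this sketch.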

Architecture

EvaluateFlag("key", ctx):
  1. atomic cache.Load()           → lock-free
  2. check pre-eval cache          → return if hit (no WASM, no lock)
  3. serialize context             → before acquiring instance
  4. inst = <-pool                 → blocks only if all instances busy
  5. generation check              → reload cache if UpdateState raced
  6. WASM evaluate on instance     → parallel across pool
  7. pool <- inst                  → return instance
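
The steps above can be sketched roughly as follows. This is a simplified illustration, not the code in evaluator.go or evaluate.go. The `instance`, `state`, and `evaluator` types are hypothetical, the WASM call is replaced by a placeholder, and the pre-eval cache check (step 2) is omitted:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// instance stands in for one WASM module instance (hypothetical).
type instance struct{ id int }

// evaluate is a placeholder for the call into the WASM export.
func (in *instance) evaluate(flagIndex uint32, ctx []byte) string {
	return fmt.Sprintf("flag#%d on instance %d", flagIndex, in.id)
}

// state is an immutable snapshot stamped with a generation number.
type state struct {
	generation uint64
	indices    map[string]uint32
}

type evaluator struct {
	current atomic.Pointer[state]
	pool    chan *instance // buffered channel used as the instance pool
}

func newEvaluator(size int, s *state) *evaluator {
	e := &evaluator{pool: make(chan *instance, size)}
	for i := 0; i < size; i++ {
		e.pool <- &instance{id: i}
	}
	e.current.Store(s)
	return e
}

func (e *evaluator) evaluateFlag(key string, ctx []byte) (string, error) {
	s := e.current.Load()             // step 1: lock-free snapshot load
	inst := <-e.pool                  // step 4: blocks only if all instances are busy
	defer func() { e.pool <- inst }() // step 7: always return the instance
	// Step 5: if an update raced with us, use the newer snapshot's indices.
	if latest := e.current.Load(); latest.generation != s.generation {
		s = latest
	}
	idx, ok := s.indices[key]
	if !ok {
		return "", fmt.Errorf("flag %q not found", key)
	}
	return inst.evaluate(idx, ctx), nil // step 6
}

func main() {
	e := newEvaluator(2, &state{generation: 1, indices: map[string]uint32{"beta": 0}})
	var wg sync.WaitGroup
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := e.evaluateFlag("beta", []byte(`{"email":"a@example.com"}`)); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("instances back in pool:", len(e.pool)) // prints 2
}
```

Because the channel capacity equals the number of instances, returning an instance never blocks.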

Per-Evaluation Latency (single goroutine)

| Scenario | WASM (this) | Native Go jsonlogic | Delta |
|---|---|---|---|
| Static flag (pre-eval cache) | 22 ns / 0 allocs | 1,280 ns / 12 allocs | 58x faster |
| Targeting, small ctx (5 attrs) | 5.4 µs / 11 allocs | 4.3 µs / 62 allocs | 0.8x |
| Targeting, large ctx (1000+ attrs) | 6.1 µs / 11 allocs | 291 µs / 3,022 allocs | 48x faster |

Throughput Scaling (1000 evals across N goroutines)

Targeting, small context:

| Goroutines | WASM (pool) | Native jsonlogic | Winner |
|---|---|---|---|
| 1 | 5.4 ms | 4.3 ms | Native 1.3x |
| 4 | 2.0 ms | 2.4 ms | WASM 1.2x |
| 16 | 1.6 ms | 3.0 ms | WASM 1.9x |

Targeting, large context (1000+ attrs):

| Goroutines | WASM (pool) | Native jsonlogic | Winner |
|---|---|---|---|
| 1 | 6.1 ms | 291 ms | WASM 48x |
| 4 | 2.3 ms | 217 ms | WASM 94x |
| 16 | 1.8 ms | 203 ms | WASM 113x |

Mixed workload (static + targeting + disabled):

| Goroutines | Before (mutex) | After (pool) | Speedup |
|---|---|---|---|
| 1 | 1.9 ms | 2.1 ms | baseline |
| 4 | 1.9 ms | 0.86 ms | 2.2x |
| 16 | 2.0 ms | 0.74 ms | 2.7x |

JSON Result Parsing (hand-rolled vs json.Unmarshal)

| Result type | json.Unmarshal | Hand-rolled | Speedup |
|---|---|---|---|
| Bool | 809 ns / 7 allocs | 112 ns / 3 allocs | 7.2x |
| String | 838 ns / 9 allocs | 144 ns / 5 allocs | 5.8x |
| Number | 950 ns / 9 allocs | 160 ns / 4 allocs | 5.9x |
| Error | 883 ns / 8 allocs | 138 ns / 4 allocs | 6.4x |
| With metadata | 1,843 ns / 20 allocs | 399 ns / 11 allocs | 4.6x |
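
To illustrate why a hand-rolled parser can beat reflection, here is a deliberately reduced sketch. It is not the parser in parse.go: `evalResult`, `readString`, `parseFast`, and `parse` are hypothetical names, and the fast path only handles compact objects whose values are plain strings without escapes. Falling back to `json.Unmarshal` for anything else is an assumption of this sketch:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// evalResult is a hypothetical, reduced stand-in for EvaluationResult.
type evalResult struct {
	Variant string `json:"variant"`
	Reason  string `json:"reason"`
}

// readString reads a quoted string starting at b[i] and returns its content
// and the index just past the closing quote. Escapes are not supported.
func readString(b []byte, i int) (string, int, bool) {
	if i >= len(b) || b[i] != '"' {
		return "", 0, false
	}
	for j := i + 1; j < len(b); j++ {
		switch b[j] {
		case '\\':
			return "", 0, false // escapes leave the fast path
		case '"':
			return string(b[i+1 : j]), j + 1, true
		}
	}
	return "", 0, false
}

// parseFast scans a compact object such as {"variant":"on","reason":"STATIC"}.
// It reports false for any input outside that narrow subset.
func parseFast(b []byte) (evalResult, bool) {
	var r evalResult
	n := len(b)
	if n < 2 || b[0] != '{' || b[n-1] != '}' {
		return r, false
	}
	for i := 1; i < n-1; {
		key, next, ok := readString(b, i)
		if !ok || next >= n || b[next] != ':' {
			return r, false
		}
		val, end, ok := readString(b, next+1)
		if !ok {
			return r, false
		}
		switch key {
		case "variant":
			r.Variant = val
		case "reason":
			r.Reason = val
		}
		switch {
		case end == n-1: // reached the closing brace
			i = end
		case b[end] == ',' && end+1 < n-1: // another member follows
			i = end + 1
		default:
			return r, false
		}
	}
	return r, true
}

// parse tries the fast path first and falls back to encoding/json.
func parse(b []byte) (evalResult, error) {
	if r, ok := parseFast(b); ok {
		return r, nil
	}
	var r evalResult
	err := json.Unmarshal(b, &r)
	return r, err
}

func main() {
	r, err := parse([]byte(`{"variant":"on","reason":"STATIC"}`))
	fmt.Println(r.Variant, r.Reason, err) // prints "on STATIC <nil>"
}
```

The fast path avoids reflection because the field names are matched with a plain `switch` on a shape known in advance.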

Files

| File | Purpose |
|---|---|
| wasm.go | //go:embed, memory helpers |
| host.go | 9 wazero host functions (3 modules) |
| types.go | EvaluationResult, UpdateStateResult, options (WithPoolSize, WithPermissiveValidation) |
| evaluator.go | Instance pool, UpdateState with parallel instance sync, generation stamping |
| evaluate.go | Lock-free evaluation pipeline, filtered context serialization |
| parse.go | Hand-rolled JSON parser for EvaluationResult (no json.Unmarshal on hot path) |
| evaluator_test.go | 17 integration tests including TestGenerationGuard race test |
| benchmark_test.go | Full matrix (E1-E11, O1-O6, S1-S5, C1-C6) + throughput (T1-T9) |
| comparison_test.go | Build-tagged comparison vs diegoholiveira/jsonlogic/v3 |
| fast_parse_test.go | Parser correctness and benchmark tests |
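
For readers unfamiliar with the options style listed for types.go, a sketch of the functional-options pattern follows. Only the names `WithPoolSize` and `WithPermissiveValidation` and the `runtime.NumCPU()` default come from this PR. The signatures, the `config` struct, and `newConfig` are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"runtime"
)

// config is a hypothetical settings struct; the real package may differ.
type config struct {
	poolSize   int
	permissive bool
}

// Option mutates a config; the signature is assumed, not taken from the PR.
type Option func(*config)

// WithPoolSize sets the number of WASM instances; non-positive values are ignored.
func WithPoolSize(n int) Option {
	return func(c *config) {
		if n > 0 {
			c.poolSize = n
		}
	}
}

// WithPermissiveValidation turns on permissive validation in this sketch.
func WithPermissiveValidation() Option {
	return func(c *config) { c.permissive = true }
}

// newConfig applies options on top of the default of one instance per CPU.
func newConfig(opts ...Option) config {
	c := config{poolSize: runtime.NumCPU()}
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	fmt.Println(newConfig(WithPoolSize(4), WithPermissiveValidation())) // prints "{4 true}"
}
```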

Test plan

  • All 17 integration tests pass (go test -v -race)
  • Generation guard race test passes under -race detector (5 runs)
  • All benchmarks compile and run (go test -bench=. -benchmem)
  • Comparison benchmarks with build tag (go test -tags comparison -bench=.)
  • Concurrent access test with 20 goroutines
  • go build ./... clean
  • Verify WASM binary matches release build

🤖 Generated with Claude Code

aepfli and others added 3 commits February 10, 2026 19:41
…uation

Reduce targeting flag evaluation latency for large contexts by avoiding
unnecessary data transfer across the WASM boundary.

Rust side:
- Walk compiled targeting trees to extract referenced context keys
  (e.g. {"var": "email"} -> "email") during update_state()
- Return per-flag requiredContextKeys and flagIndices in UpdateStateResponse
- Add evaluate_by_index(u32, ctx_ptr, ctx_len) WASM export for O(1) flag
  lookup by numeric index instead of string key HashMap lookup
- Add evaluate_flag_pre_enriched() that skips context enrichment when
  $flagd is already present (host-side enrichment)
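
The key-extraction step described above happens in Rust. For illustration only, the same idea can be sketched in Go (the language of this PR's package) over a decoded JSON Logic rule. `collectVars` and `requiredContextKeys` are hypothetical helper names, and dotted paths or an empty `var` are not given special treatment here:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// collectVars walks a decoded JSON Logic rule and records every key
// referenced through a {"var": ...} operator.
func collectVars(node any, out map[string]struct{}) {
	switch n := node.(type) {
	case map[string]any:
		for op, arg := range n {
			if op != "var" {
				collectVars(arg, out)
				continue
			}
			switch a := arg.(type) {
			case string: // {"var": "email"}
				out[a] = struct{}{}
			case []any: // {"var": ["email", default]}
				if len(a) > 0 {
					if s, ok := a[0].(string); ok {
						out[s] = struct{}{}
					}
					collectVars(a[1:], out) // a default may itself contain rules
				}
			}
		}
	case []any:
		for _, child := range n {
			collectVars(child, out)
		}
	}
}

// requiredContextKeys returns the sorted set of keys a rule references.
func requiredContextKeys(rule string) ([]string, error) {
	var tree any
	if err := json.Unmarshal([]byte(rule), &tree); err != nil {
		return nil, err
	}
	set := map[string]struct{}{}
	collectVars(tree, set)
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	rule := `{"if":[{"ends_with":[{"var":"email"},"@example.com"]},"on",
	          {"==":[{"var":["tier","free"]},"gold"]},"vip","off"]}`
	keys, err := requiredContextKeys(rule)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // prints "[email tier]"
}
```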

Java side:
- Cache requiredContextKeys and flagIndices from updateState() response
- EvaluationContextSerializer.serializeFiltered() serializes only the
  context keys a targeting rule references, plus $flagd enrichment
- evaluateFlag(EvaluationContext) uses filtered serialization + index-based
  eval when available, falls back to full serialization gracefully
- evaluateByIndex() calls the new WASM export with O(1) Vec lookup
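
A Go sketch of the filtered-serialization idea, under stated assumptions: `serializeFiltered` here is a hypothetical function, not the Java method or the Go package's implementation, and the `$flagd` fields `flagKey` and `timestamp` are assumed from flagd's documented context enrichment. The timestamp is passed in so the output is reproducible:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serializeFiltered copies only the required keys out of ctx, adds the
// $flagd enrichment object, and marshals the result.
func serializeFiltered(ctx map[string]any, required []string, flagKey string, unixTime int64) ([]byte, error) {
	filtered := make(map[string]any, len(required)+1)
	for _, k := range required {
		if v, ok := ctx[k]; ok {
			filtered[k] = v
		}
	}
	// Assumed enrichment fields; the real set may differ.
	filtered["$flagd"] = map[string]any{"flagKey": flagKey, "timestamp": unixTime}
	return json.Marshal(filtered)
}

func main() {
	ctx := map[string]any{"email": "a@example.com", "plan": "free", "age": 42}
	out, err := serializeFiltered(ctx, []string{"email"}, "beta", 1700000000)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// prints {"$flagd":{"flagKey":"beta","timestamp":1700000000},"email":"a@example.com"}
}
```

The payload size then depends on the number of keys the rule references rather than on the size of the context, which is consistent with the large-context numbers reported in this PR.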

JMH results (1000+ attribute LayeredEvaluationContext):
- Targeting flags: ~12.8 µs (down from ~167 µs) — 13x improvement
- vs old json-logic-java: 32-34x faster (409 µs -> 12.8 µs)
- Static/disabled flags: ~0.02 µs (pre-evaluated cache, unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pure Go package for evaluating feature flags using the flagd-evaluator
WASM module. Uses wazero for zero-CGO WebAssembly execution.

Implements all 3 host-side optimizations from day 1:
- Pre-evaluation cache for static/disabled flags (~24ns, 0 allocs)
- Context key filtering (only serialize keys referenced by targeting rules)
- Index-based WASM evaluation (O(1) flag lookup via evaluate_by_index)

Includes 14 integration tests mirroring Java's FlagEvaluatorTest, full
BENCHMARKS.md matrix (E1-E11, O1-O6, S1-S5, C1-C6), and comparison
benchmarks against diegoholiveira/jsonlogic/v3 showing 40x speedup on
large contexts due to context filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…benchmarks

Replace single-mutex architecture with a pool of WASM instances for
parallel targeting evaluation. Pre-evaluated (static/disabled) flags are
now served lock-free via atomic cache pointers.

Key changes:
- Instance pool: N WASM instances (default runtime.NumCPU()) evaluate
  targeting flags in parallel instead of serializing on one mutex
- Generation guard: atomic generation counter prevents stale flag index
  usage when UpdateState races with evaluateFlag
- Hand-rolled JSON parser: avoids json.Unmarshal reflection overhead for
  EvaluationResult (5-7x faster for common cases, 4.6x for metadata)
- Throughput benchmarks (T1-T9): 1000 evals across 1/4/16 goroutines
  to expose scaling behavior
- Comparison throughput benchmarks: WASM pool vs native jsonlogic at
  varying concurrency levels

Targeting throughput at 16 goroutines: 5.7ms → 1.7ms (3.4x improvement)
Mixed workload at 16 goroutines: 2.0ms → 0.74ms (2.7x improvement)
Pre-evaluated flags: now fully lock-free (no mutex)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
aepfli changed the title from "feat(go): add Go package with wazero WASM runtime" to "feat(go): add Go package with wazero WASM runtime, instance pool, and optimized parsing" on Feb 11, 2026
aepfli and others added 2 commits February 12, 2026 08:37
WASM tests share a thread-local singleton evaluator, so they must run
sequentially. Gherkin tests use cucumber which doesn't support
--test-threads, so they run separately without the flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Go package has 13+ tests covering evaluation, parsing, concurrency,
and scale scenarios. Add a CI job that runs them using the embedded WASM
binary already committed in the go/ directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of relying on the committed WASM binary, build it from Rust
source in CI. This ensures Go tests always run against the correct
WASM matching the current Rust code.

Refs: #83

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
aepfli merged commit 75a94f2 into main on Feb 12, 2026
14 checks passed