feat(go): add Go package with wazero WASM runtime, instance pool, and optimized parsing by aepfli · Pull Request #71 · open-feature-forking/flagd-evaluator

aepfli · 2026-02-10T20:12:41Z

Summary

- Add a pure Go package (go/) for evaluating feature flags using the flagd-evaluator WASM module
- Uses wazero for zero-CGO WebAssembly execution, with no native dependencies
- Implements all 3 host-side optimizations matching the Java implementation:
  - Pre-evaluation cache: static/disabled flags served lock-free via atomic pointer (~22 ns, 0 allocs)
  - Context key filtering: only serialize keys referenced by targeting rules
  - Index-based evaluation: evaluate_by_index for O(1) flag lookup, avoiding string serialization
- WASM instance pool: N instances (default runtime.NumCPU()) for parallel targeting evaluation
- Hand-rolled JSON parser: avoids json.Unmarshal reflection overhead (5-7x faster for common result types, 4.6x with metadata)
- Generation guard: prevents stale flag index race between UpdateState and concurrent evaluations
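
To make the context key filtering idea concrete, here is a minimal sketch, not the package's actual code. It assumes the host already knows which context keys a flag's targeting rule references; `filterContext` and the sample keys are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterContext returns a copy of ctx that contains only the keys listed in
// required. Keys that are not present in ctx are skipped.
// (Hypothetical helper; the real package may structure this differently.)
func filterContext(ctx map[string]any, required []string) map[string]any {
	out := make(map[string]any, len(required))
	for _, k := range required {
		if v, ok := ctx[k]; ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	ctx := map[string]any{"email": "a@example.com", "plan": "pro", "age": 42}

	// Only "email" is referenced by the (assumed) targeting rule, so only
	// that key is serialized and passed across the WASM boundary.
	filtered := filterContext(ctx, []string{"email", "missing"})
	b, err := json.Marshal(filtered)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

With a 1000+ attribute context, the serialized payload stays proportional to the number of referenced keys rather than to the size of the context, which is consistent with the near-flat WASM latency between the small and large context rows below.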

Architecture

EvaluateFlag("key", ctx):
  1. atomic cache.Load()           → lock-free
  2. check pre-eval cache          → return if hit (no WASM, no lock)
  3. serialize context             → before acquiring instance
  4. inst = <-pool                 → blocks only if all instances busy
  5. generation check              → reload cache if UpdateState raced
  6. WASM evaluate on instance     → parallel across pool
  7. pool <- inst                  → return instance
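
The seven steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in `evaluate.go`: the `snapshot`, `instance`, and `evaluator` types are hypothetical, the WASM call is replaced by a stub, and the generation check assumes that `UpdateState` needs every pooled instance before it publishes a new snapshot. It requires Go 1.19+ for `atomic.Pointer`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshot is an immutable view published by UpdateState (hypothetical shape).
type snapshot struct {
	generation uint64
	preEval    map[string]string // flag key -> cached result for static/disabled flags
	index      map[string]uint32 // flag key -> numeric flag index
}

// instance stands in for one WASM module instance; evalByIndex is a stub.
type instance struct{ id int }

func (in *instance) evalByIndex(idx uint32, ctx []byte) string {
	return fmt.Sprintf("instance %d evaluated flag #%d with ctx %s", in.id, idx, ctx)
}

type evaluator struct {
	cache atomic.Pointer[snapshot]
	pool  chan *instance
}

func newEvaluator(n int, s *snapshot) *evaluator {
	e := &evaluator{pool: make(chan *instance, n)}
	e.cache.Store(s)
	for i := 0; i < n; i++ {
		e.pool <- &instance{id: i}
	}
	return e
}

// evaluateFlag follows the numbered steps; ctx is assumed to be serialized
// by the caller already (step 3), before an instance is taken.
func (e *evaluator) evaluateFlag(key string, ctx []byte) (string, bool) {
	snap := e.cache.Load()              // 1. lock-free load
	if r, ok := snap.preEval[key]; ok { // 2. pre-eval hit: no WASM, no lock
		return r, true
	}
	inst := <-e.pool                  // 4. blocks only if all instances are busy
	defer func() { e.pool <- inst }() // 7. always return the instance

	// 5. generation check: reload if UpdateState ran while we were waiting.
	if cur := e.cache.Load(); cur.generation != snap.generation {
		snap = cur
		if r, ok := snap.preEval[key]; ok {
			return r, true
		}
	}
	idx, ok := snap.index[key]
	if !ok {
		return "", false
	}
	return inst.evalByIndex(idx, ctx), true // 6. evaluate on this instance
}

func main() {
	e := newEvaluator(2, &snapshot{
		generation: 1,
		preEval:    map[string]string{"static-flag": "on"},
		index:      map[string]uint32{"targeted-flag": 0},
	})
	fmt.Println(e.evaluateFlag("static-flag", nil))
	fmt.Println(e.evaluateFlag("targeted-flag", []byte(`{"email":"a@example.com"}`)))
}
```

Using a buffered channel as the pool keeps the blocking behaviour of step 4 simple: a receive blocks only when every instance is checked out, and returning an instance is a single send.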

Per-Evaluation Latency (single goroutine)

| Scenario | WASM (this) | Native Go jsonlogic | Delta |
|---|---|---|---|
| Static flag (pre-eval cache) | 22 ns / 0 allocs | 1,280 ns / 12 allocs | 58x faster |
| Targeting, small ctx (5 attrs) | 5.4 µs / 11 allocs | 4.3 µs / 62 allocs | 0.8x (native 1.3x faster) |
| Targeting, large ctx (1000+ attrs) | 6.1 µs / 11 allocs | 291 µs / 3,022 allocs | 48x faster |

Throughput Scaling (1000 evals across N goroutines)

Targeting, small context:

| Goroutines | WASM (pool) | Native jsonlogic | Winner |
|---|---|---|---|
| 1 | 5.4 ms | 4.3 ms | Native 1.3x |
| 4 | 2.0 ms | 2.4 ms | WASM 1.2x |
| 16 | 1.6 ms | 3.0 ms | WASM 1.9x |

Targeting, large context (1000+ attrs):

| Goroutines | WASM (pool) | Native jsonlogic | Winner |
|---|---|---|---|
| 1 | 6.1 ms | 291 ms | WASM 48x |
| 4 | 2.3 ms | 217 ms | WASM 94x |
| 16 | 1.8 ms | 203 ms | WASM 113x |

Mixed workload (static + targeting + disabled):

| Goroutines | Before (mutex) | After (pool) | Speedup |
|---|---|---|---|
| 1 | 1.9 ms | 2.1 ms | baseline |
| 4 | 1.9 ms | 0.86 ms | 2.2x |
| 16 | 2.0 ms | 0.74 ms | 2.7x |
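
For readers who want to reproduce the shape of these throughput measurements, the following is a rough sketch of splitting a fixed number of evaluations across N goroutines and timing the whole batch. It is an assumption about the method, not the code in `benchmark_test.go`; `runEvals` is a hypothetical helper and the evaluation itself is replaced by a counter increment.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runEvals splits total calls to eval across n goroutines and returns the
// wall-clock time until all of them have finished.
func runEvals(total, n int, eval func()) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for g := 0; g < n; g++ {
		// Spread the remainder so the counts add up to exactly total.
		count := total / n
		if g < total%n {
			count++
		}
		wg.Add(1)
		go func(count int) {
			defer wg.Done()
			for i := 0; i < count; i++ {
				eval()
			}
		}(count)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	var calls atomic.Int64 // stand-in for a real flag evaluation
	for _, n := range []int{1, 4, 16} {
		calls.Store(0)
		elapsed := runEvals(1000, n, func() { calls.Add(1) })
		fmt.Printf("%2d goroutines: %d evals in %v\n", n, calls.Load(), elapsed)
	}
}
```

In a real comparison the `eval` closure would call the evaluator under test, and the measurement would normally be wrapped in a `testing.B` benchmark so that warm-up and repetition are handled by the Go tooling.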

JSON Result Parsing (hand-rolled vs `json.Unmarshal`)

| Result type | `json.Unmarshal` | Hand-rolled | Speedup |
|---|---|---|---|
| Bool | 809 ns / 7 allocs | 112 ns / 3 allocs | 7.2x |
| String | 838 ns / 9 allocs | 144 ns / 5 allocs | 5.8x |
| Number | 950 ns / 9 allocs | 160 ns / 4 allocs | 5.9x |
| Error | 883 ns / 8 allocs | 138 ns / 4 allocs | 6.4x |
| With metadata | 1,843 ns / 20 allocs | 399 ns / 11 allocs | 4.6x |
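
As an illustration of why a hand-rolled parser can beat `json.Unmarshal`, here is a deliberately small sketch. It is not the parser in `parse.go`: the `result` struct, its field names, and the fall-back to `json.Unmarshal` for unrecognized input are assumptions made for the example. The fast path handles only compact, flat objects with a boolean value and unescaped strings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// result is a hypothetical, reduced result shape (boolean value only).
type result struct {
	Value   bool   `json:"value"`
	Variant string `json:"variant"`
	Reason  string `json:"reason"`
}

// readString reads a leading quoted string that contains no escapes and
// returns its contents plus the remaining input.
func readString(s string) (val, rest string, ok bool) {
	if len(s) == 0 || s[0] != '"' {
		return "", "", false
	}
	end := strings.IndexByte(s[1:], '"')
	if end < 0 || strings.IndexByte(s[1:1+end], '\\') >= 0 {
		return "", "", false
	}
	return s[1 : 1+end], s[2+end:], true
}

// parseFast scans input such as {"value":true,"variant":"on","reason":"STATIC"}
// without reflection. It reports ok=false for anything it does not understand.
func parseFast(s string) (r result, ok bool) {
	s = strings.TrimSpace(s)
	if len(s) < 2 || s[0] != '{' || s[len(s)-1] != '}' {
		return result{}, false
	}
	s = s[1 : len(s)-1]
	for len(s) > 0 {
		key, rest, good := readString(s)
		if !good || len(rest) == 0 || rest[0] != ':' {
			return result{}, false
		}
		rest = rest[1:]
		switch key {
		case "value":
			switch {
			case strings.HasPrefix(rest, "true"):
				r.Value, rest = true, rest[4:]
			case strings.HasPrefix(rest, "false"):
				r.Value, rest = false, rest[5:]
			default:
				return result{}, false
			}
		case "variant", "reason":
			var v string
			if v, rest, good = readString(rest); !good {
				return result{}, false
			}
			if key == "variant" {
				r.Variant = v
			} else {
				r.Reason = v
			}
		default:
			return result{}, false
		}
		if len(rest) > 0 {
			// A separator must be followed by another member.
			if rest[0] != ',' || len(rest) == 1 {
				return result{}, false
			}
			rest = rest[1:]
		}
		s = rest
	}
	return r, true
}

// parse tries the fast path first and falls back to encoding/json.
func parse(s string) (result, error) {
	if r, ok := parseFast(s); ok {
		return r, nil
	}
	var r result
	err := json.Unmarshal([]byte(s), &r)
	return r, err
}

func main() {
	r, err := parse(`{"value":true,"variant":"on","reason":"STATIC"}`)
	fmt.Printf("%+v %v\n", r, err)
}
```

The savings come from knowing the output shape in advance: there is no reflection over struct tags and no generic token decoding, only substring slicing on a format the producer controls.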

Files

| File | Purpose |
|---|---|
| `wasm.go` | `//go:embed`, memory helpers |
| `host.go` | 9 wazero host functions (3 modules) |
| `types.go` | `EvaluationResult`, `UpdateStateResult`, options (`WithPoolSize`, `WithPermissiveValidation`) |
| `evaluator.go` | Instance pool, `UpdateState` with parallel instance sync, generation stamping |
| `evaluate.go` | Lock-free evaluation pipeline, filtered context serialization |
| `parse.go` | Hand-rolled JSON parser for `EvaluationResult` (no `json.Unmarshal` on hot path) |
| `evaluator_test.go` | 17 integration tests including `TestGenerationGuard` race test |
| `benchmark_test.go` | Full matrix (E1-E11, O1-O6, S1-S5, C1-C6) + throughput (T1-T9) |
| `comparison_test.go` | Build-tagged comparison vs `diegoholiveira/jsonlogic/v3` |
| `fast_parse_test.go` | Parser correctness and benchmark tests |

Test plan

- All 17 integration tests pass (`go test -v -race`)
- Generation guard race test passes under the `-race` detector (5 runs)
- All benchmarks compile and run (`go test -bench=. -benchmem`)
- Comparison benchmarks with build tag (`go test -tags comparison -bench=.`)
- Concurrent access test with 20 goroutines
- `go build ./...` clean
- Verify WASM binary matches release build

🤖 Generated with Claude Code

…uation

Reduce targeting flag evaluation latency for large contexts by avoiding unnecessary data transfer across the WASM boundary.

Rust side:
- Walk compiled targeting trees to extract referenced context keys (e.g. {"var": "email"} -> "email") during update_state()
- Return per-flag requiredContextKeys and flagIndices in UpdateStateResponse
- Add evaluate_by_index(u32, ctx_ptr, ctx_len) WASM export for O(1) flag lookup by numeric index instead of string key HashMap lookup
- Add evaluate_flag_pre_enriched() that skips context enrichment when $flagd is already present (host-side enrichment)

Java side:
- Cache requiredContextKeys and flagIndices from updateState() response
- EvaluationContextSerializer.serializeFiltered() serializes only the context keys a targeting rule references, plus $flagd enrichment
- evaluateFlag(EvaluationContext) uses filtered serialization + index-based eval when available, falls back to full serialization gracefully
- evaluateByIndex() calls the new WASM export with O(1) Vec lookup

JMH results (1000+ attribute LayeredEvaluationContext):
- Targeting flags: ~12.8 µs (down from ~167 µs), a 13x improvement
- vs old json-logic-java: 32-34x faster (409 µs -> 12.8 µs)
- Static/disabled flags: ~0.02 µs (pre-evaluated cache, unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
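
The key-extraction step described in the commit message above (walking a targeting tree so that {"var": "email"} yields "email") is implemented in Rust in this repository. The sketch below shows the same idea in Go for illustration only. The function names are hypothetical, and reducing dotted paths such as "user.email" to their top-level key is an assumption about what a host-side filter would need, not something the commit message states. It requires Go 1.18+ for `strings.Cut`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// collectVars walks a decoded JSONLogic rule and records every context key
// referenced through a {"var": ...} operation.
func collectVars(node any, keys map[string]struct{}) {
	switch n := node.(type) {
	case map[string]any:
		for op, arg := range n {
			if op == "var" {
				addVar(arg, keys)
				continue
			}
			collectVars(arg, keys)
		}
	case []any:
		for _, child := range n {
			collectVars(child, keys)
		}
	}
}

// addVar handles both {"var": "email"} and {"var": ["email", "fallback"]}.
func addVar(arg any, keys map[string]struct{}) {
	if list, ok := arg.([]any); ok && len(list) > 0 {
		arg = list[0]
	}
	if name, ok := arg.(string); ok && name != "" {
		// Assumption: only the top-level key matters for filtering.
		top, _, _ := strings.Cut(name, ".")
		keys[top] = struct{}{}
	}
}

// requiredKeys parses a JSONLogic rule and returns the referenced context
// keys in sorted order.
func requiredKeys(rule string) ([]string, error) {
	var tree any
	if err := json.Unmarshal([]byte(rule), &tree); err != nil {
		return nil, err
	}
	set := make(map[string]struct{})
	collectVars(tree, set)
	out := make([]string, 0, len(set))
	for k := range set {
		out = append(out, k)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	rule := `{"if":[{"==":[{"var":"email"},"a@example.com"]},"on",{"var":["plan","free"]}]}`
	keys, err := requiredKeys(rule)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```

Doing this walk once per state update, rather than per evaluation, is what lets the host serialize only the referenced keys on the hot path.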

Pure Go package for evaluating feature flags using the flagd-evaluator WASM module. Uses wazero for zero-CGO WebAssembly execution.

Implements all 3 host-side optimizations from day 1:
- Pre-evaluation cache for static/disabled flags (~24ns, 0 allocs)
- Context key filtering (only serialize keys referenced by targeting rules)
- Index-based WASM evaluation (O(1) flag lookup via evaluate_by_index)

Includes 14 integration tests mirroring Java's FlagEvaluatorTest, full BENCHMARKS.md matrix (E1-E11, O1-O6, S1-S5, C1-C6), and comparison benchmarks against diegoholiveira/jsonlogic/v3 showing 40x speedup on large contexts due to context filtering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…benchmarks

Replace single-mutex architecture with a pool of WASM instances for parallel targeting evaluation. Pre-evaluated (static/disabled) flags are now served lock-free via atomic cache pointers.

Key changes:
- Instance pool: N WASM instances (default runtime.NumCPU()) evaluate targeting flags in parallel instead of serializing on one mutex
- Generation guard: atomic generation counter prevents stale flag index usage when UpdateState races with evaluateFlag
- Hand-rolled JSON parser: avoids json.Unmarshal reflection overhead for EvaluationResult (5-7x faster for common cases, 4.6x for metadata)
- Throughput benchmarks (T1-T9): 1000 evals across 1/4/16 goroutines to expose scaling behavior
- Comparison throughput benchmarks: WASM pool vs native jsonlogic at varying concurrency levels

Targeting throughput at 16 goroutines: 5.7ms → 1.7ms (3.4x improvement)
Mixed workload at 16 goroutines: 2.0ms → 0.74ms (2.7x improvement)
Pre-evaluated flags: now fully lock-free (no mutex)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

WASM tests share a thread-local singleton evaluator, so they must run sequentially. Gherkin tests use cucumber, which doesn't support --test-threads, so they run separately without the flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The Go package has 13+ tests covering evaluation, parsing, concurrency, and scale scenarios. Add a CI job that runs them using the embedded WASM binary already committed in the go/ directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Instead of relying on the committed WASM binary, build it from Rust source in CI. This ensures Go tests always run against the correct WASM matching the current Rust code.

Refs: #83
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aepfli and others added 3 commits February 10, 2026 19:41

aepfli changed the title ~~feat(go): add Go package with wazero WASM runtime~~ feat(go): add Go package with wazero WASM runtime, instance pool, and optimized parsing Feb 11, 2026

aepfli and others added 2 commits February 12, 2026 08:37

ci: add Go test job to CI workflow

9d419a1

aepfli mentioned this pull request Feb 12, 2026

refactor(go): determine WASM binary embedding strategy for Go package #83

Closed

8 tasks

aepfli merged commit 75a94f2 into main Feb 12, 2026
14 checks passed

aepfli mentioned this pull request Feb 12, 2026

build(wasm): add release-please WASM update, staleness detection, and Go CI #95

Closed

5 tasks

aepfli merged 6 commits into main from feat/go-package
