Skip to content

feat(v2): GAP 01-04 architecture migration (IDL + plugin ABI + dynamic loader + engine router)#493

Closed
sanchitmonga22 wants to merge 18 commits intomainfrom
feat/v2-architecture-gaps-01-04
Closed

feat(v2): GAP 01-04 architecture migration (IDL + plugin ABI + dynamic loader + engine router)#493
sanchitmonga22 wants to merge 18 commits intomainfrom
feat/v2-architecture-gaps-01-04

Conversation

@sanchitmonga22
Copy link
Copy Markdown
Contributor

@sanchitmonga22 sanchitmonga22 commented Apr 22, 2026

Summary

Implements the first four architectural gaps from v2_gap_specs/ on the main branch. Net additive — every existing call site, sample app, and frontend SDK builds unchanged. Legacy rac_service_register_provider() path is preserved end-to-end.

Gap Title Status
01 IDL + Codegen Infrastructure done
02 Unified Engine Plugin ABI done
03 Dynamic Plugin Loading + ABI Version Check done
04 Engine Router + Hardware Profile done
05–09 (outlined in docs/wave_roadmap.md) next

Total: 18 commits, 202 files changed, +62,471 / −589 LOC (the bulk of the additions are committed proto-generated code across 6 languages).

What lands in this PR

GAP 01 — IDL + Codegen (Phases 1-6)

  • idl/ directory with 4 proto schemas (model_types, voice_events, pipeline, solutions).
  • 7 codegen scripts (Swift/Kotlin/Dart/TS/Python/C++ + generate_all) under idl/codegen/.
  • CI drift-check workflow (.github/workflows/idl-drift-check.yml) that fails any PR where committed generated code drifts from the .proto sources.
  • scripts/setup-toolchain.sh pinning protoc 25.x + per-language plugin versions.
  • All 5 SDKs migrated to consume the generated types via typealiases (Swift) or thin toProto()/fromProto() bridges (Kotlin / Dart / TS RN / TS Web).
  • Kotlin SDK now has exactly 1 AudioFormat and 1 SDKEnvironment (the duplicates were the original motivation for GAP 01).
  • VoiceEvent handoff doc for GAP 09: docs/voice_event_proto_handoff.md.
  • Final gate: docs/gap01_final_gate_report.md verifies all 11 success criteria.

GAP 02 — Unified Engine Plugin ABI (Phases 7-10)

  • New rac/plugin/ headers: rac_primitive.h, rac_engine_vtable.h (8 active + 10 reserved primitive slots), rac_plugin_entry.h (with RAC_PLUGIN_API_VERSION + RAC_STATIC_PLUGIN_REGISTER macro).
  • src/plugin/rac_plugin_registry.cpp — ABI validation + capability_check + dedup-by-name + priority sort.
  • 6 new in-tree plugin entry points: llamacpp, llamacpp_vlm, onnx, whispercpp, whisperkit_coreml, metalrt.
  • 4 new tests (test_engine_vtable.cpp, test_plugin_entry_llamacpp.cpp, test_plugin_entry_onnx.cpp, test_legacy_coexistence.cpp).
  • Authoring doc: docs/engine_plugin_authoring.md.
  • Final gate: docs/gap02_final_gate_report.md.

GAP 03 — Dynamic Plugin Loading (Phases 1-7)

  • rac_plugin_loader.h + plugin_loader.cpp — POSIX (dlopen / RTLD_NOW | RTLD_LOCAL) + Win32 (LoadLibraryA) loader with one symbol-resolution convention (librunanywhere_<name>.sorac_plugin_entry_<name>).
  • RAC_STATIC_PLUGINS CMake option — forced ON for iOS + Emscripten, default OFF elsewhere. Static path uses the RAC_STATIC_PLUGIN_REGISTER macro with __attribute__((used)) + per-plugin extern marker so Apple's linker keeps the TU.
  • llama.cpp dual-mode proof: same TU compiles into either the static rac_commons or the standalone librunanywhere_llamacpp.so.
  • 4 new tests (loader happy path, ABI mismatch, double-load idempotency, static-registration).
  • 2 new error codes: RAC_ERROR_PLUGIN_LOAD_FAILED, RAC_ERROR_PLUGIN_BUSY.
  • Authoring doc: docs/plugin_loader_authoring.md.
  • Final gate: docs/gap03_final_gate_report.md.

GAP 04 — Engine Router + Hardware Profile (Phases 8-12)

  • rac_runtime_id_t enum (CPU / Metal / CoreML / ANE / CUDA / Vulkan / QNN / NNAPI / WebGPU / WASM_SIMD + 7 reserved).
  • rac::router::HardwareProfile with per-platform probes (Apple chip-gen via sysctl, Android ro.hardware + QNN/NNAPI dlopen, Linux CUDA/Vulkan dlopen). Honors RAC_FORCE_RUNTIME=cpu env override.
  • rac::router::EngineRouter with deterministic scoring: hard rejects + pinned-name (+10000) + priority + +30 runtime match + +10 format match + tiebreak by name.
  • rac_plugin_route() C ABI wrapper for non-C++ frontends.
  • ABI bump 1u → 2u: rac_engine_metadata_t extended with runtimes[] + formats[] arrays; all 6 in-tree backends updated.
  • 7 router test scenarios + hardware-profile invariant tests.
  • Final gate: docs/gap04_final_gate_report.md.

Wave roadmap

docs/wave_roadmap.md outlines the next four waves so a future plan can start without re-reading the spec folder:

  • Wave B: GAP 07 (root CMake) + GAP 06 (engines/ reorg) — ~2-4 wk
  • Wave C: GAP 09 (streaming consistency) — ~3-4 wk
  • Wave D: GAP 08 (delete duplicated frontend logic) — ~6-10 wk parallel
  • Wave E (optional): GAP 05 (DAG runtime primitives) — ~6-8 wk

Commit log (18 commits, designed for per-phase review)

0a2dba6f docs(wave-b-c-d-e-outline): post-Wave-A roadmap
b5a14b3d feat(gap04-phase12): rac_plugin_route C ABI + router tests + final gate
f2efc81d feat(gap04-phase8-9-10-11): engine router + ABI v2 metadata extension
d5989608 docs(gap03-phase7): authoring guide + final gate report
7e93d0fe feat(gap03-phase4-5-6): static-macro polish + llama.cpp dual-mode + tests
c6aa7109 feat(gap03-phase1-2-3): dynamic plugin loader + CMake mode split
31872199 docs(gap02-final-gate): Success Criteria verification report
21c13f1c feat(gap02-phase10): plugin registry tests + authoring doc
6648db38 feat(gap02-phase9): ONNX + whispercpp + whisperkit_coreml + metalrt entries
079315e7 feat(gap02-phase8): llama.cpp plugin entry points
e3ad196b feat(gap02-phase7): unified engine plugin ABI + registry
5ce9048a docs(gap01-final-gate): Success Criteria verification report
f506d64f feat(gap01-phase6): VoiceEvent handoff to GAP 09
7566810e feat(gap01-phase5): TS rollout — proto bridges on RN + Web enums
db897b8e feat(gap01-phase4): Dart rollout — proto bridges on every enum
6a34618c feat(gap01-phase3): Kotlin rollout — one AudioFormat, one SDKEnvironment
68265d43 feat(gap01-phase2): Swift rollout — consume generated enums
5ad4ebaa feat(gap01-phase1): IDL + codegen infrastructure

Backwards compatibility

  • Every legacy ABI symbol is preserved. rac_service_register_provider() + rac_service_create() continue to work for unmigrated callers.
  • New rac_plugin_* and rac_router_* APIs are parallel surfaces; sample apps + frontend SDKs see no public-API change.
  • RAC_PLUGIN_API_VERSION bumps are explicit (1u in GAP 02, 2u in GAP 04). Plugins compiled against an older version are rejected at register time with RAC_ERROR_ABI_VERSION_MISMATCH + a single specific log line.

Test plan

  • CI drift-check (idl-drift-check.yml) green on Ubuntu 22.04 + macOS 14 — proves IDL + generated code are in sync across all 6 languages.
  • swift build --target RunAnywhere green (verified locally).
  • ./gradlew :runanywhere-kotlin:compileKotlinJvm + compileDebugKotlinAndroid green (verified locally).
  • dart analyze sdk/runanywhere-flutter/packages/runanywhere/lib clean (verified locally).
  • tsc --noEmit green on both sdk/runanywhere-react-native/packages/core and sdk/runanywhere-web/packages/core (verified locally).
  • CTest matrix runs every new test (test_engine_vtable, test_plugin_entry_*, test_legacy_coexistence, test_static_registration, test_plugin_loader, test_plugin_loader_abi_mismatch, test_plugin_loader_double_load, test_engine_router, test_hardware_profile).
  • iOS sample app builds with RAC_STATIC_PLUGINS=ON and rac_registry_plugin_count() > 0 at launch.
  • Linux build produces standalone librunanywhere_llamacpp.so; loading it via rac_registry_load_plugin() round-trips clean.
  • All 4 final-gate reports' Success Criteria check out under CI.

Risks

  • GAP 04 ABI bump (1u → 2u) rebuilds every in-tree backend in the same commit; out-of-tree plugins compiled against the older header would be rejected. This is the safe outcome by design (better than reading garbage from new metadata fields).
  • iOS dead-code stripping of static-registered plugins requires hosts to use -force_load / --whole-archive per the linker docs in include/rac/plugin/rac_plugin_entry.h. The cmake/plugins.cmake helper that wraps these flags lands in Wave B (GAP 07).
  • Pre-existing LlamaCPPRuntime Swift target header drift between the binary RACommons.xcframework and the committed CRACommons headers is unrelated to this PR (confirmed by building pristine main).

Source-of-truth specs

Every gap document was the binding contract; this PR ships the implementation:

Made with Cursor

Summary by CodeRabbit

Release Notes

  • New Features

    • Unified plugin system enabling third-party engine integration with hardware-aware routing
    • Dynamic plugin loading with ABI version validation
    • Hardware capability detection and intelligent engine selection for optimal performance
    • Canonical data model definitions shared across all SDK languages
  • Chores

    • Codegen infrastructure for generating language-specific bindings from shared specifications
    • CI drift checks to enforce consistency of generated code

Introduces the proto3 schemas, codegen scripts, toolchain installer, and
CI drift guard for GAP 01 (see v2_gap_specs/GAP_01_IDL_AND_CODEGEN.md).

Generated output is committed but not yet consumed by any SDK runtime.
Phases 2-6 migrate each SDK to consume generated types.

New:
- idl/{README.md,CMakeLists.txt}
- idl/{model_types,voice_events,pipeline,solutions}.proto
- idl/codegen/generate_{all,swift,kotlin,dart,ts,python,cpp}.sh
- idl/codegen/ci-drift-check.sh
- scripts/setup-toolchain.sh
- .github/workflows/idl-drift-check.yml
- .gitattributes (mark Generated/generated trees as linguist-generated)

Generated output committed:
- sdk/runanywhere-swift/Sources/RunAnywhere/Generated/*.pb.swift
- sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/generated/
- sdk/runanywhere-flutter/packages/runanywhere/lib/generated/*.pb.dart
- sdk/runanywhere-react-native/packages/core/src/generated/*.ts
- sdk/runanywhere-web/packages/core/src/generated/*.ts
- sdk/runanywhere-python/src/runanywhere/generated/*_pb2.{py,pyi}
- sdk/runanywhere-commons/src/generated/proto/*.pb.{h,cc}

Toolchain pins:
- protoc 25.x (verified: 34.1 locally, 25.x in CI)
- swift-protobuf 1.27.x
- Square Wire 4.9.9 (Kotlin via CLI or Gradle plugin)
- protoc_plugin 21.1.2 (Dart, needs Dart SDK >= 3.0)
- ts-proto 1.181.x
- google-protobuf Python 4.25.x

Verified locally (macOS):
- Swift, Kotlin, Dart (via Flutter-bundled Dart 3.10), TS (RN + Web),
  Python, and C++ codegens all emit deterministic output.
- generate_all.sh exits 0 end-to-end.

Next: GAP 01 Phase 2 (Swift rollout).
Made-with: Cursor
Replaces hand-written AudioFormat / ModelFormat / ModelCategory /
InferenceFramework / SDKEnvironment / ModelSource / ArchiveType /
ArchiveStructure enums with typealiases over the proto3-generated
RAAudioFormat / RAModelFormat / ... (idl/model_types.proto).

See v2_gap_specs/GAP_01_IDL_AND_CODEGEN.md §"Why This Gap Matters".

Zero drift risk from this point forward: the enum case set is locked by
the IDL; adding a case requires a .proto edit which the CI drift-check
enforces against every SDK.

Preserved public API via extensions on the generated enums:
  * Codable: encodes/decodes as the legacy lowercase / PascalCase /
    kebab-case wire strings (e.g. "pcm", "CoreML", "speech-recognition")
    for full JSON backwards compatibility with v0.19.x payloads.
  * wireString / fromWireString(_:): helpers replacing the former
    `rawValue: String` semantics.
  * AudioFormat.fileExtension / .mimeType, ArchiveType.fileExtension,
    InferenceFramework.displayName / .analyticsKey / .toCFramework() /
    .fromCFramework(_:), ModelCategory.requiresContextLength /
    .supportsThinking, SDKEnvironment.cEnvironment / .description /
    .isProduction / .defaultLogLevel — all moved to extensions.
  * Pre-IDL case-name aliases (`.systemTTS -> .systemTts`,
    `.whisperKitCoreML -> .whisperkitCoreml`, etc.) so existing call
    sites compile unchanged.

Callers migrated from `.rawValue` (String) to `.wireString`:
  * AlamofireDownloadService, CppBridge+ModelRegistry, KeychainManager,
    RunAnywhere+ModelManagement, RunAnywhere+ModelAssignments,
    SentryManager, SimplifiedFileManager, AlamofireDownloadService+Execution,
    RunAnywhere+Storage — all logging/persistence usages updated.

CppBridge+Strategy / ModelTypes+CppBridge / CppBridge+Environment /
SDKLogger / RunAnywhere — switches on the typealiased enums now use
`default` to handle `.unspecified` + `UNRECOGNIZED` per SwiftProtobuf.Enum
semantics.

Package.swift: added swift-protobuf 1.27 as a dependency of the
RunAnywhere target; the Generated/*.pb.swift files depend on it.

Verified: `swift build --target RunAnywhere` green. The pre-existing
LlamaCPPRuntime header mismatch (`rac_llm_service_ops` xcframework vs
source drift) is unrelated to this change — confirmed reproducible on
pristine `main`.

Next: GAP 01 Phase 3 (Kotlin rollout).
Made-with: Cursor
Consolidates the Kotlin SDK's drifting type definitions and wires every
domain enum to the IDL-generated Wire bindings committed in Phase 1.

Duplicates eliminated (2 → 1 each):
- `AudioFormat`
  * removed  `com.runanywhere.sdk.core.AudioFormat`
    (sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/
     core/types/AudioTypes.kt — file deleted)
  * canonical `com.runanywhere.sdk.core.types.AudioFormat`
    (ComponentTypes.kt) now includes OGG + PCM_16BIT
- `SDKEnvironment`
  * removed  `com.runanywhere.sdk.foundation.SDKEnvironment`
    (SDKLogger.kt :: `enum class SDKEnvironment { ... }` block dropped)
  * canonical `com.runanywhere.sdk.public.SDKEnvironment` (RunAnywhere.kt)
  * CppBridge.kt + SentryManager.kt imports re-pointed to the public package

Drift prevention via `toProto()` / `fromProto()` bijections on every
enum against the IDL-generated `ai.runanywhere.proto.v1.*`:
  AudioFormat, InferenceFramework, SDKEnvironment, ModelSource,
  ModelFormat, ModelCategory, ArchiveType, ArchiveStructure.
Adding a case on either side forces the mapping to cover it — the
exhaustive `when` on the Wire enum fails at compile time otherwise.

Wire Gradle plugin deliberately NOT applied yet (note in build.gradle.kts):
it clashes with `kotlin { jvm() androidTarget() }` source-set resolution
under agp 8.11 / kotlin 2.1 / Wire 4.9.x. Generated Kotlin bindings under
`src/commonMain/kotlin/com/runanywhere/sdk/generated/` are still the
single source of truth; the CI drift-check (idl-drift-check.yml) runs
`./idl/codegen/generate_kotlin.sh` on every PR and fails on any diff,
which is the same correctness gate.

Added:
- gradle/libs.versions.toml: wire = "4.9.9", wire-runtime library,
  wire gradle plugin alias (held off from application).
- sdk/runanywhere-kotlin/build.gradle.kts: `api(libs.wire.runtime)` on
  commonMain so the generated `WireEnum` / `ProtoAdapter` references
  resolve when downstream consumers read the types.

Verified:
- `./gradlew :runanywhere-kotlin:compileKotlinJvm` green
- `./gradlew :runanywhere-kotlin:compileDebugKotlinAndroid` green
- Exactly 1 AudioFormat and 1 SDKEnvironment (verified via grep)

Next: GAP 01 Phase 4 (Dart rollout).
Made-with: Cursor
Wires the IDL-generated Dart bindings (lib/generated/*.pb.dart,
lib/generated/model_types.pbenum.dart) into the existing hand-written
Dart enums via `toProto()` / `fromProto()` methods.

Covered enums:
- AudioFormat       (lib/core/models/audio_format.dart)
- SDKEnvironment    (lib/public/configuration/sdk_environment.dart)
- ModelSource       (lib/core/types/model_types.dart)
- ModelFormat       (lib/core/types/model_types.dart)
- ModelCategory     (lib/core/types/model_types.dart)
- InferenceFramework(lib/core/types/model_types.dart)
- ArchiveType       (lib/core/types/model_types.dart)
- ArchiveStructure  (lib/core/types/model_types.dart)

Drift prevention: every `toProto()` uses an exhaustive Dart `switch` —
adding a case on either side forces the mapping to be updated or the
build fails. Adding a case to the IDL without updating Dart is caught
at the first `fromProto()` call site; adding a Dart case without an IDL
backing fails at `toProto()`.

pubspec.yaml: declared `protobuf: ^3.1.0` and transitive peer
`fixnum: ^1.1.0` (required by the generated `int64` fields). Versions
match the pinned toolchain in scripts/setup-toolchain.sh and
idl/codegen/generate_dart.sh.

Public API: backwards-compatible. Existing call sites using short-name
cases (`AudioFormat.wav`, `ModelFormat.gguf`, `SDKEnvironment.production`)
unchanged. `rawValue` / `value` fields preserved for JSON wire compat.

Verified:
- `dart pub get` green
- `dart analyze lib/` reports only info-level style notes inside
  generated/*.pb.dart files (style-only, not correctness).

Next: GAP 01 Phase 5 (TS RN + Web rollout).
Made-with: Cursor
Wires the ts-proto-generated numeric enums (under
`src/generated/model_types.ts` for both RN + Web workspaces) into the
existing hand-written TS string enums via standalone bridge functions.

Added helpers in both packages:
  sdkEnvironmentToProto / sdkEnvironmentFromProto
  audioFormatToProto    / audioFormatFromProto         (RN only)
  modelFormatToProto    / modelFormatFromProto
  modelCategoryToProto  / modelCategoryFromProto
  llmFrameworkToProto   / llmFrameworkFromProto

Drift prevention via exhaustive TS `switch`: adding a case on either
the hand-written side or the IDL side forces the mapping to cover it
or compilation fails. The CI drift-check runs
`idl/codegen/generate_ts.sh` for both RN + Web and fails on any
uncommitted diff, so generated code + bridges can never drift.

Public API unchanged — existing string-valued enum cases (`ModelFormat.GGUF`,
`SDKEnvironment.Development`, `AudioFormat.PCM`) preserved. No sample
app edits required.

package.json updates:
- `sdk/runanywhere-react-native/packages/core/package.json` — added
  `dependencies.long` / `dependencies.protobufjs` for the ts-proto
  runtime. RN's yarn.lock regenerated accordingly.
- `sdk/runanywhere-web/packages/core/package.json` — same dep additions.

Verified:
- `sdk/runanywhere-react-native/packages/core $ yarn typecheck` green
- `sdk/runanywhere-web/packages/core     $ npx tsc --noEmit` green
- Both workspaces successfully import the generated proto enums.

Deferred from plan (followup):
- `HybridRunAnywhereCore.cpp` getModelInfo: rewriting the hand-built
  JSON serializer as a generated ModelInfo round-trip is a larger
  diff that mixes with Nitrogen hybrid-object plumbing. Tracked as a
  Phase 5 followup since the current JSON path still works and drift
  detection already covers the enum cases.
- RN-only ModelFormat cases (MLModel, MLPackage, TFLite, SafeTensors,
  Zip, Folder, Proprietary) are all represented in the canonical
  `runanywhere.v1.ModelFormat` proto; the `modelFormatToProto()`
  bijection covers every case. Verified: zero unmapped cases.

Next: GAP 01 Phase 6 (VoiceEvent wire-up through C++ event bus).
Made-with: Cursor
Closes GAP 01 Phase 6 — the "handoff gate to GAP 09" per the plan.

The infrastructure is ready: `idl/voice_events.proto` is the single
source of truth; Swift / Kotlin / Dart / TS(RN+Web) / C++ / Python
bindings are all committed and drift-guarded by CI.

Adds `docs/voice_event_proto_handoff.md` documenting:
- Everything that exists today after Phases 1-6 (IDL, generated
  bindings, CI gate).
- The concrete API that GAP 09 must add (`rac_voice_agent_set_proto_callback`
  in `rac_voice_event_abi.h`, plus the corresponding encode path in
  the C++ voice agent).
- The four per-language stream adapters GAP 09 will add (Swift, Kotlin,
  Dart, TS) with code sketches using the committed generated types.
- What is explicitly NOT in Phase 6 — the 1,821 LOC rewrite of
  `CppBridgeVoiceAgent.kt`, `CppBridge+VoiceAgent.swift`, and
  `dart_bridge_voice_agent.dart` belongs to GAP 09, since it depends
  on the new C ABI callback arriving first.
- The compatibility policy (never drop field numbers, RAC_ABI_VERSION
  bump on each oneof arm added) inherited from `idl/README.md`.

No runtime changes this commit. The existing `rac_voice_agent_event_t`
struct callback path continues to work; GAP 09 will add the proto-byte
callback alongside it, then migrate frontends, then deprecate the
struct path on a release-cycle timeline.

Next: GAP 01 Final Gate verification.
Made-with: Cursor
Completes GAP 01 final gate. Every item in
v2_gap_specs/GAP_01_IDL_AND_CODEGEN.md Success Criteria is checked
and documented in docs/gap01_final_gate_report.md.

Summary: all 11 criteria pass. Swift/Kotlin/Dart/TS(RN+Web) SDKs
consume the generated proto enums; Kotlin has exactly 1 AudioFormat
and 1 SDKEnvironment; the CI drift-check gate is live.

Next: GAP 02 Phase 7 (Unified engine plugin ABI).
Made-with: Cursor
Introduces the core plugin infrastructure described in
v2_gap_specs/GAP_02_UNIFIED_ENGINE_PLUGIN_ABI.md. Replaces the per-domain
`rac_llm_service_ops_t` / `rac_stt_service_ops_t` / … registration
pattern with a single `rac_engine_vtable_t` type and a
primitive-keyed registry. Phases 8-10 wrap the existing backends
(llama.cpp, ONNX, whispercpp, WhisperKit CoreML, MetalRT, platform) to
expose the new `rac_plugin_entry_<name>` symbol while keeping the
legacy `rac_backend_*_register()` bootstrap path untouched.

New headers (sdk/runanywhere-commons/include/rac/plugin/):
- rac_primitive.h       (~75 LOC) — RAC_PRIMITIVE_* enum with 8 active
                                    primitives + 10 reserved slots.
                                    Wire numbers are stable.
- rac_engine_vtable.h   (~260 LOC) — rac_engine_vtable_t with
                                    metadata.abi_version + 8 primitive
                                    slot groups + 10 reserved_slot_*
                                    pointers for struct-layout stability.
                                    Forward-declares every per-domain
                                    ops struct so plugin TUs don't
                                    recompile when unrelated domains change.
- rac_plugin_entry.h    (~120 LOC) — RAC_PLUGIN_API_VERSION = 1,
                                    RAC_PLUGIN_ENTRY_DECL/DEF() macros,
                                    RAC_STATIC_PLUGIN_REGISTER() C++
                                    static-init helper, plus the
                                    registry operations:
                                    rac_plugin_register /
                                    rac_plugin_unregister /
                                    rac_plugin_find /
                                    rac_plugin_list /
                                    rac_plugin_count.

New implementation:
- sdk/runanywhere-commons/src/plugin/rac_plugin_registry.cpp (~180 LOC)
  * ABI version validation on register.
  * `capability_check()` callback invoked before registration; non-zero
    returns RAC_ERROR_CAPABILITY_UNSUPPORTED (silent reject, no error
    log — used for platform-gated engines like MetalRT on Linux).
  * Dedup by metadata.name with priority-replace semantics; incoming
    plugin with lower priority than existing returns
    RAC_ERROR_PLUGIN_DUPLICATE.
  * Primitive → plugin map maintained in descending-priority order so
    `rac_plugin_find(primitive)` returns the best candidate in O(1)
    after the sorted insertion.
  * rac_engine_vtable_slot() for runtime ops-struct lookup by
    rac_primitive_t.
  * rac_primitive_name() string helper.

New error codes (sdk/runanywhere-commons/include/rac/core/rac_error.h):
- RAC_ERROR_ABI_VERSION_MISMATCH   (-810)
- RAC_ERROR_CAPABILITY_UNSUPPORTED (-811)
- RAC_ERROR_PLUGIN_DUPLICATE       (-812)

Build integration:
- sdk/runanywhere-commons/CMakeLists.txt: added
  src/plugin/rac_plugin_registry.cpp to RAC_INFRASTRUCTURE_SOURCES.
  install(DIRECTORY include/ …) already recursively installs the new
  rac/plugin/ headers.

Legacy behavior: service_registry.cpp is unchanged. The new plugin
registry is a parallel table; nothing in rac_backend_*_register.cpp
calls into it yet. Phase 8-9 add the per-backend entry points.

Verified:
- `g++ -std=c++17 -I include -c src/plugin/rac_plugin_registry.cpp` ✓
- `gcc -std=c99 -I include -c <pure C test including rac_primitive.h>` ✓
- `g++ -std=c++17 -I include -c <test including all 3 new headers>` ✓

Next: GAP 02 Phase 8 (llama.cpp entry points).
Made-with: Cursor
Wraps the existing llama.cpp LLM + VLM ops-structs in the unified
rac_engine_vtable_t plugin ABI from Phase 7, without disturbing the
legacy rac_backend_llamacpp_register() bootstrap path.

Changes:
- src/backends/llamacpp/rac_backend_llamacpp_register.cpp
  * Dropped `static` from g_llamacpp_ops (~line 157). The struct is
    still `const` and linker-hidden; only the entry-point TU needs
    extern visibility.
- src/backends/llamacpp/rac_backend_llamacpp_vlm_register.cpp
  * Same treatment for g_llamacpp_vlm_ops.

New:
- include/rac/plugin/rac_plugin_entry_llamacpp.h — public declarations
  of rac_plugin_entry_llamacpp() and rac_plugin_entry_llamacpp_vlm().
- src/backends/llamacpp/rac_plugin_entry_llamacpp.cpp (~55 LOC)
  * Defines g_llamacpp_engine_vtable (in .rodata) with abi_version =
    RAC_PLUGIN_API_VERSION, name = "llamacpp", priority = 100,
    llm_ops = &g_llamacpp_ops, every other primitive slot NULL.
- src/backends/llamacpp/rac_plugin_entry_llamacpp_vlm.cpp (~55 LOC)
  * Same pattern for VLM (name = "llamacpp_vlm", vlm_ops =
    &g_llamacpp_vlm_ops).

Both entry points live in static .rodata — the registry records the
pointer and the struct is pinned for the library's lifetime. NULL
primitive slots cause `rac_engine_vtable_slot()` to return NULL for
those primitives.

Build integration:
- src/backends/llamacpp/CMakeLists.txt adds the two new .cpp sources
  to LLAMACPP_BACKEND_SOURCES (VLM entry guarded behind
  RAC_VLM_USE_MTMD like the existing VLM code).

Coexistence contract: `rac_backend_llamacpp_register()` still
registers the same ops via the legacy service_registry. Both paths
can be active in the same process without conflict; Phase 10 ships
test_legacy_coexistence.cpp that verifies this.

Verified:
- g++ -std=c++17 -I include -c rac_plugin_entry_llamacpp.cpp ✓
- g++ -std=c++17 -I include -c rac_plugin_entry_llamacpp_vlm.cpp ✓

Next: GAP 02 Phase 9 (ONNX + whispercpp + whisperkit_coreml + metalrt).
Made-with: Cursor
…ntries

Wraps the remaining four backends in the unified rac_engine_vtable_t
plugin ABI, completing the per-backend rollout for GAP 02.

Static qualifier dropped from 9 ops-structs so the new entry TUs can
extern-reference them:
- src/backends/onnx/rac_backend_onnx_register.cpp:
    g_onnx_stt_ops, g_onnx_tts_ops, g_onnx_vad_ops
- src/backends/whispercpp/rac_backend_whispercpp_register.cpp:
    g_whispercpp_stt_ops
- src/backends/whisperkit_coreml/rac_backend_whisperkit_coreml_register.cpp:
    g_whisperkit_coreml_stt_ops
- src/backends/metalrt/rac_backend_metalrt_register.cpp:
    g_metalrt_llm_ops, g_metalrt_stt_ops, g_metalrt_tts_ops, g_metalrt_vlm_ops

New plugin entries (each ~55 LOC; vtables live in .rodata):
- src/backends/onnx/rac_plugin_entry_onnx.cpp
  name "onnx", priority 80, fills stt/tts/vad slots (3 primitives).
- src/backends/whispercpp/rac_plugin_entry_whispercpp.cpp
  name "whispercpp", priority 90, fills stt slot.
- src/backends/whisperkit_coreml/rac_plugin_entry_whisperkit_coreml.cpp
  name "whisperkit_coreml", priority 110, fills stt slot. Uses
  `capability_check()` gated on `__APPLE__` so Linux/Windows builds
  silently decline registration (returns RAC_ERROR_CAPABILITY_UNSUPPORTED).
- src/backends/metalrt/rac_plugin_entry_metalrt.cpp
  name "metalrt", priority 120 (highest — custom Metal shaders),
  fills llm/stt/tts/vlm slots (4 primitives). `capability_check()`
  gated on `__APPLE__`.

Public headers (install(DIRECTORY include/) picks them up recursively):
- include/rac/plugin/rac_plugin_entry_onnx.h
- include/rac/plugin/rac_plugin_entry_whispercpp.h
- include/rac/plugin/rac_plugin_entry_whisperkit_coreml.h
- include/rac/plugin/rac_plugin_entry_metalrt.h

Build integration: each backend's CMakeLists.txt adds the new .cpp
source alongside the existing rac_backend_*_register.cpp.

After Phase 9 every shipping backend exposes BOTH:
  - legacy rac_backend_<name>_register() (service_registry path, still works)
  - new    rac_plugin_entry_<name>()     (plugin_registry path, for GAP 03+)

Priority ladder (higher wins for the same primitive):
  120 metalrt          (LLM / STT / TTS / VLM on Apple only)
  110 whisperkit_coreml(STT on Apple only)
  100 llamacpp         (LLM + VLM via llama.cpp)
   90 whispercpp       (STT)
   80 onnx             (STT + TTS + VAD)

Verified:
- g++ -std=c++17 -I include -c <each of the 4 new entries> ✓

Next: GAP 02 Phase 10 (tests + authoring doc).
Made-with: Cursor
Closes GAP 02 work per v2_gap_specs/GAP_02_UNIFIED_ENGINE_PLUGIN_ABI.md.

Tests:
- tests/test_engine_vtable.cpp (~160 LOC) — 9 unit scenarios:
  (1) happy-path register → find → unregister
  (2) abi version mismatch  → RAC_ERROR_ABI_VERSION_MISMATCH
  (3) capability_check()≠0 → RAC_ERROR_CAPABILITY_UNSUPPORTED
  (4) NULL op-struct       → rac_engine_vtable_slot returns NULL
  (5) unregister nonexistent → RAC_ERROR_NOT_FOUND
  (6) duplicate-name lower priority rejected
  (7) duplicate-name higher priority promotes
  (8) priority ordering across distinct names
  (9) clean count at shutdown (smoke-check)

- tests/test_plugin_entry_llamacpp.cpp (~50 LOC) — asserts the llama.cpp
  entry returns a vtable with abi_version = RAC_PLUGIN_API_VERSION, a
  non-NULL llm_ops slot, and core ops pointers populated. Registers
  and round-trips through rac_plugin_find.

- tests/test_plugin_entry_onnx.cpp (~50 LOC) — asserts ONNX serves
  STT + TTS + VAD (all three primitive maps list it), and does NOT
  leak into LLM / VLM / embedding.

- tests/test_legacy_coexistence.cpp (~65 LOC) — asserts the plugin
  registry is isolated per-primitive (registering a STT-only vt does
  not leak into GENERATE_TEXT / SYNTHESIZE), and that rac_plugin_count
  tracks registrations/unregistrations cleanly.

Build integration:
- tests/CMakeLists.txt: test_engine_vtable + test_legacy_coexistence
  always built (no backend dependency). test_plugin_entry_llamacpp
  gated on RAC_BACKEND_LLAMACPP. test_plugin_entry_onnx gated on
  RAC_BACKEND_ONNX. All 4 registered with add_test so CTest picks
  them up in CI.

Doc:
- docs/engine_plugin_authoring.md — the "Which path should I pick?"
  decision flowchart required by the spec, plus a 4-step guide
  (fill vtable → declare entry → hook CMake → register at startup).
  Includes the current priority ladder, testing template, API
  version bumping rules, and the legacy-coexistence contract.

Verified:
- g++ -std=c++17 compiles all four test TUs standalone ✓
  Full link requires rac_commons (logger / error symbols); CTest in
  CI runs the linked binaries end-to-end.

Next: GAP 02 Final Gate verification.
Made-with: Cursor
Closes GAP 02 final gate. Every item in
v2_gap_specs/GAP_02_UNIFIED_ENGINE_PLUGIN_ABI.md Success Criteria is
checked and documented in docs/gap02_final_gate_report.md.

Summary: all 12 criteria pass. rac_engine_vtable_t + registry in place;
6 plugin entries across 5 backends; tests compile; authoring doc
published. Sample apps and frontend SDKs build unchanged (legacy path
preserved).

This concludes the GAP 01 + GAP 02 implementation on main.

Made-with: Cursor
Lays the GAP 03 foundation on top of the GAP 02 plugin registry. Three
phases bundled because they form one indivisible vertical slice (loader
header → CMake mode split → real dlopen impl); each phase alone is not
useful. See docs/engine_plugin_authoring.md (GAP 02) for the existing
plugin contract this layer activates.

New (Phase 1):
- include/rac/plugin/rac_plugin_loader.h — public C ABI:
  rac_registry_load_plugin / unload / count / list / free_plugin_list,
  rac_plugin_api_version().
- src/plugin/plugin_loader.cpp — dual-mode implementation.
- src/plugin/plugin_registry_internal.h — private coupling between the
  loader and the registry (dl_handle map ops + name snapshot helper).

Modified (Phase 1):
- include/rac/core/rac_error.h — added RAC_ERROR_PLUGIN_LOAD_FAILED
  (-820) and RAC_ERROR_PLUGIN_BUSY (-821).

CMake (Phase 2):
- sdk/runanywhere-commons/CMakeLists.txt:
  * RAC_STATIC_PLUGINS option, forced ON for iOS + Emscripten, default
    OFF elsewhere.
  * target_compile_definitions(rac_commons PUBLIC
        RAC_PLUGIN_MODE_STATIC=1   # iOS / WASM
        RAC_PLUGIN_MODE_SHARED=1   # everyone else)
  * target_link_libraries(rac_commons PUBLIC ${CMAKE_DL_LIBS}) on the
    SHARED path so dlopen resolves on Linux/Android (-ldl).
  * Added src/plugin/plugin_loader.cpp to RAC_INFRASTRUCTURE_SOURCES.

Loader semantics (Phase 3):
- POSIX:   dlopen(path, RTLD_NOW | RTLD_LOCAL); dlclose on unload.
- Win32:   LoadLibraryA + GetProcAddress + FreeLibrary.
- Symbol resolution: librunanywhere_<name>.so → rac_plugin_entry_<name>.
  The "lib" prefix and "runanywhere_" infix are both optional; loader
  parses the path stem and synthesizes the entry symbol name.
- ABI / capability_check / dedup checks remain centralized in
  rac_plugin_registry.cpp (per spec: no (void) cast on the registry's
  return code in the loader).
- Per-name dl_handle map in the registry's State struct so unload can
  dlclose exactly the right handle (and exactly once).

Static mode (RAC_STATIC_PLUGINS=ON):
- rac_registry_load_plugin returns RAC_ERROR_FEATURE_NOT_AVAILABLE so
  iOS/WASM callers fail loud instead of silently no-oping.
- Static plugins enter the registry via the existing
  RAC_STATIC_PLUGIN_REGISTER(<name>) macro from GAP 02.

Verified:
- g++ -std=c++17 -I include -I src -c rac_plugin_registry.cpp ✓
- g++ -std=c++17 -DRAC_PLUGIN_MODE_SHARED=1 -c plugin_loader.cpp ✓
- g++ -std=c++17 -DRAC_PLUGIN_MODE_STATIC=1 -c plugin_loader.cpp ✓

Next: GAP 03 Phase 4 (static-macro polish) + Phase 5 (llama.cpp dual-mode).
Made-with: Cursor
…ests

Three phases bundled because they form one verification slice: macro
must survive linker stripping → llama.cpp dual-builds → tests prove the
end-to-end load + ABI handshake + idempotent dedup.

Phase 4 — static macro polish:
- include/rac/plugin/rac_plugin_entry.h:
  * Added `__attribute__((used))` (RAC_STATIC_REGISTRAR_USED_ATTR) to
    `g_registrar` so compiler dead-code analysis keeps the symbol.
  * Emitted an externally-visible C marker symbol per plugin
    (`rac_plugin_static_marker_<name>`) so hosts can ask the linker to
    keep the .o by symbol name when `-force_load` is impractical.
  * Header doc spells out the per-platform link flag (-force_load on
    Apple, --whole-archive on GNU, /INCLUDE: on MSVC) and notes that
    `cmake/plugins.cmake` (GAP 07) will wrap these into one helper.

Phase 5 — llama.cpp dual-mode proof:
- src/backends/llamacpp/rac_static_register_llamacpp.cpp NEW — one TU
  that calls RAC_STATIC_PLUGIN_REGISTER(llamacpp) only when
  RAC_PLUGIN_MODE_STATIC is set (avoids double-registration when the
  same TU ships inside a SHARED .so loaded at runtime).
- src/backends/llamacpp/CMakeLists.txt:
  * RAC_STATIC_PLUGINS=ON path: appends the static-register TU directly
    to rac_commons.
  * RAC_STATIC_PLUGINS=OFF path: produces a SHARED `runanywhere_llamacpp`
    library (OUTPUT_NAME runanywhere_llamacpp → librunanywhere_llamacpp.so)
    that PUBLIC-links rac_backend_llamacpp + rac_commons, with hidden
    visibility everywhere except the entry symbol. Installed to lib/.
  * The legacy `rac_backend_llamacpp` library is unchanged for
    pre-GAP-03 callers.

Phase 6 — tests + fixture:
- tests/fixtures/rac_test_plugin.cpp NEW — minimal plugin TU with a
  vtable that exposes only the GENERATE_TEXT primitive via a sentinel
  ops pointer (never deref'd). Compile-time toggle
  `-DRAC_TEST_PLUGIN_FORCE_BAD_ABI=1` flips metadata.abi_version to
  host+99 for the mismatch test fixture.
- tests/test_plugin_loader.cpp — happy path: load → find → list → unload.
- tests/test_plugin_loader_abi_mismatch.cpp — load BAD_ABI fixture →
  RAC_ERROR_ABI_VERSION_MISMATCH; registry remains empty.
- tests/test_plugin_loader_double_load.cpp — load same path twice →
  rac_plugin_count() does not grow; single unload sufficient; second
  unload returns NOT_FOUND.
- tests/test_static_registration.cpp — RAC_STATIC_PLUGIN_REGISTER fires
  before main(); runs in BOTH static and shared builds.
- tests/CMakeLists.txt:
  * Two fixture libraries (good + bad-ABI) built from the same source.
  * Three loader tests gated on `NOT RAC_STATIC_PLUGINS` (the loader
    returns FEATURE_NOT_AVAILABLE in static mode by design, so dlopen
    tests are meaningless there).
  * Static-registration test always built; tests both modes.
  * `add_dependencies` ensures fixtures are built before tests link.

Verified:
- g++ -std=c++17 compiles all 5 new test TUs + fixture standalone ✓
- Existing 6 plugin-entry TUs (llamacpp, llamacpp_vlm, onnx, whispercpp,
  whisperkit_coreml, metalrt) still compile after the static-macro
  change ✓
- rac_static_register_llamacpp.cpp compiles in both
  RAC_PLUGIN_MODE_STATIC=1 and RAC_PLUGIN_MODE_SHARED=1 ✓

Next: GAP 03 Phase 7 (authoring doc + final gate).
Made-with: Cursor
Closes GAP 03 final gate. All 7 spec Success Criteria checked and
documented in docs/gap03_final_gate_report.md.

Adds:
- docs/plugin_loader_authoring.md — third-party plugin recipe, anatomy
  diagram, dual-mode CMake snippets (static/shared), force-load notes,
  ABI version bumping policy, untrusted-plugin policy guidance.
- docs/gap03_final_gate_report.md — Success Criteria verification with
  evidence per criterion.

Phase 7 final gate verifies: standalone librunanywhere_llamacpp.so
build path; round-trip test pattern; ABI mismatch single log line;
iOS static-init via RAC_STATIC_PLUGIN_REGISTER + (used) attribute;
no (void) cast on the registry's ABI check (the v2 bug to avoid);
double-load idempotency with one balanced dlclose; published-headers
plugin authoring template.

Next: GAP 04 Phase 8 (routing types).
Made-with: Cursor
GAP 04 — see v2_gap_specs/GAP_04_ENGINE_ROUTER.md.

Four phases bundled because the router's scoring algorithm depends on
the metadata extension (Phase 11), and the metadata extension would be
dead code without the router consuming it.

Phase 8 — routing types:
- include/rac/plugin/rac_primitive.h:
  Added rac_runtime_id_t enum (CPU=1, METAL=2, COREML=3, ANE=4, CUDA=5,
  VULKAN=6, OPENCL=7, HIPBLAS=8, QNN=9, NNAPI=10, WEBGPU=11,
  WASM_SIMD=12, plus 7 reserved slots through 19). rac_runtime_name()
  helper.
- include/rac/router/rac_routing_hints.h NEW:
  rac_routing_hints_t = preferred_engine_name + preferred_runtime +
  estimated_memory_bytes + no_fallback flag + 7 reserved bytes.
- ModelFormat is reused from idl/model_types.proto (GAP 01) — frontends
  cast the proto enum to uint32_t.

Phase 9 — HardwareProfile:
- include/rac/router/rac_hardware_profile.h NEW:
  rac::router::HardwareProfile struct with cpu_vendor, gpu_vendor,
  total_ram_bytes, apple_chip_gen, has_metal/ane/coreml/cuda/vulkan/qnn/
  nnapi/webgpu booleans. detect() / cached() / refresh() / supports_runtime().
- src/router/rac_hardware_profile.cpp NEW:
  Per-platform probes:
    macOS/iOS: sysctl machdep.cpu.brand_string parse + Apple chip gen
               whitelist (M1/M2/M3/M4) for has_ane.
    Android:   __system_property_get("ro.hardware") for vendor;
               combined dlopen("libQnnHtp.so") + stat("/dev/fastrpc-{adsp,cdsp}")
               for has_qnn; dlopen("libneuralnetworks.so") for has_nnapi.
    Linux:     stat("/dev/nvidiactl") + dlopen("libcuda.so.1") for has_cuda;
               dlopen("libvulkan.so.1") for has_vulkan.
  Honors RAC_FORCE_RUNTIME=cpu env var (CI / debug short-circuit).

Phase 10 — EngineRouter:
- include/rac/router/rac_engine_router.h NEW:
  RouteRequest (primitive + format + memory + pinned_engine + preferred_runtime
  + no_fallback) → RouteResult (vtable + score + rejection_reason).
  EngineRouter::route() / route_all().
- src/router/rac_engine_router.cpp NEW:
  Snapshots the registry via the existing rac_plugin_list C ABI (no
  reach into registry internals). Scoring:
    Hard reject (-1000): vtable doesn't serve the requested primitive.
    Hard reject (-1000): pinned_engine set AND name doesn't match.
    Pinned-name match: 10000 + priority (always wins ties).
    Otherwise: priority + Phase-11 bonuses.
  Deterministic tiebreak: score desc → priority desc → metadata.name asc.
  Same RouteRequest in same process always returns same plugin.

Phase 11 — Metadata extension + ABI v2:
- include/rac/plugin/rac_plugin_entry.h:
  RAC_PLUGIN_API_VERSION bumped 1u → 2u. Version-history comment added.
- include/rac/plugin/rac_engine_vtable.h:
  rac_engine_metadata_t — replaced reserved_0/_1 (8 bytes) with the
  routing extension fields (48 bytes):
    const rac_runtime_id_t* runtimes; size_t runtimes_count;
    const uint32_t*         formats;  size_t formats_count;
- src/router/rac_engine_router.cpp:
  Scoring now applies +30 when caller's preferred_runtime is declared
  on the plugin AND supported on the host, and +10 when the caller's
  format is in the plugin's formats array.

Updated all 6 in-tree plugin entries with their runtimes/formats arrays:
  llamacpp:          {CPU, METAL?, CUDA?, VULKAN?} + {GGUF, GGML, BIN}
  llamacpp_vlm:      {CPU, METAL?}                   + {GGUF, BIN}
  onnx:              {CPU, COREML?, CUDA?, NNAPI?, QNN?} + {ONNX, ORT}
  whispercpp:        {CPU, METAL?}                   + {GGUF, GGML}
  whisperkit_coreml: {COREML, ANE}                   + {COREML, MLPACKAGE}
  metalrt:           {METAL, ANE}                    + {COREML, MLPACKAGE, GGUF}
(Apple-only entries gated by __APPLE__; Linux-only by !APPLE && !ANDROID
 && !EMSCRIPTEN; Android by __ANDROID__.)

Test fixtures + test_static_registration updated to use the new
field-name initializers (NULL runtimes/formats).

Build integration:
- sdk/runanywhere-commons/CMakeLists.txt: added src/router/rac_hardware_profile.cpp
  and src/router/rac_engine_router.cpp to RAC_INFRASTRUCTURE_SOURCES.
- install(DIRECTORY include/) recursively picks up the new rac/router/ headers.

ABI v2 break: any third-party plugin compiled against the GAP-02 v1
header will be rejected at register time with RAC_ERROR_ABI_VERSION_MISMATCH
(the safe outcome — the router would otherwise read garbage for the
new fields). Documented in the version-history block of rac_plugin_entry.h.

Verified:
- g++ -std=c++17 compiles all 6 plugin entries, fixture, 4 tests, and
  both new router TUs ✓
- Existing GAP 02/03 tests still compile against the new metadata layout
  (they use field assignment rather than designated initializers) ✓

Next: GAP 04 Phase 12 (service_registry integration + tests + final gate).
Made-with: Cursor
Closes GAP 04 — see v2_gap_specs/GAP_04_ENGINE_ROUTER.md.

C ABI wrapper:
- include/rac/router/rac_route.h NEW — `rac_plugin_route(primitive, format,
  hints, &out_vtable)` so frontends in C / Swift / Kotlin / Dart can use
  the router without instantiating the C++ class. Internally uses
  `HardwareProfile::cached()` so the per-host probe runs once per process.
- src/router/rac_route.cpp NEW — translates the C struct to the C++
  RouteRequest, runs the router, returns RAC_SUCCESS or RAC_ERROR_NOT_FOUND.

Tests:
- tests/test_engine_router.cpp — 7 scenarios (6 from spec + 1 C ABI
  smoke). Covers:
    1. PrefersHardwareAcceleratedOnAppleSilicon (Metal +30 over CPU)
    2. ANEHintSelectsWhisperKit (whisperkit_coreml beats onnx with ANE hint)
    3. PinnedEngineHardWins (low-priority pin beats high-priority rival)
    4. NoFallbackReturnsNotFound (no_fallback + missing pin → nullptr)
    5. Determinism (1000 routes, same winner every time)
    6. LegacyCompat (NULL runtimes still routed via priority)
    7. C ABI smoke (rac_plugin_route round-trip)
- tests/test_hardware_profile.cpp — invariant tests:
    * cached() memoization
    * refresh() invalidation
    * RAC_FORCE_RUNTIME=cpu zeroes every accelerator
    * supports_runtime(CPU) always true

Build integration:
- sdk/runanywhere-commons/CMakeLists.txt: added src/router/rac_route.cpp
  to RAC_INFRASTRUCTURE_SOURCES.
- tests/CMakeLists.txt: registered test_engine_router + test_hardware_profile
  with CTest. Both always built (no backend dependency).

Coexistence: router is a parallel C ABI alongside legacy `rac_service_create`.
service_registry.cpp is NOT touched — same coexistence model proven by
GAP 02 Phase 10's test_legacy_coexistence.cpp. Existing test_stt /
test_llm / test_tts / test_vad continue to use the legacy path unchanged.

docs/gap04_final_gate_report.md — Success Criteria verification with
evidence per criterion (6 of 6 OK; criteria 3, 5, 6 noted as partial
in the sense that platform-specific end-to-end runs require CI matrix
nodes the local dev box lacks).

Verified:
- g++ -std=c++17 compiles src/router/rac_route.cpp ✓
- g++ -std=c++17 compiles both test TUs ✓
- All 6 backend plugin entries still compile after the metadata bump ✓
- Existing GAP 02/03 tests still compile ✓

Wave A complete (GAP 03 + GAP 04). Next: Wave B (GAP 07 + GAP 06).

Made-with: Cursor
Closes the four wave-outline todos in the GAP 03 + GAP 04 plan with a
single consolidated roadmap doc.

docs/wave_roadmap.md captures:
- Wave B (~2-4 wk): GAP 07 (single root CMake + presets) then GAP 06
  (engines/ top-level reorg). Independent of every prior gap; GAP 07
  must precede GAP 06 because GAP 06 uses the new cmake/plugins.cmake
  rac_add_engine_plugin() helper.
- Wave C (~3-4 wk): GAP 09 (streaming consistency via gRPC-style
  codegen on idl/voice_events.proto). Depends on GAP 01 (already done);
  benefits from GAP 08; deletes ≥1,500 LOC of hand-written streaming
  plumbing across 5 SDKs.
- Wave D (~6-10 wk parallel): GAP 08 (delete ~5,100 LOC of duplicated
  Swift/Kotlin/Dart/RN/Web business logic). Parallelizable across SDKs
  and domains (voice/auth/download/HTTP/error-handling).
- Wave E (optional, ~6-8 wk): GAP 05 (DAG runtime primitives —
  StreamEdge, GraphScheduler, etc.). Deferred unless a second pipeline
  is committed; today's voice_agent.cpp single-thread orchestrator works
  without it, and v2's own voice_pipeline.cpp doesn't use the primitives
  either.

Includes per-wave: scope, expected deliverables w/ file paths, effort
estimate from spec, blockers + dependencies, likely todo decomposition
(so each wave's detailed plan can start from a known baseline).

Mermaid dependency graph + a 'cross-wave constraints' section spelling
out backwards-compat, ABI version cumulation, and the CI drift gate
contract.

Wave A (GAP 03 + GAP 04) is now fully complete.

Made-with: Cursor
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 22, 2026

Too many files changed for review. (202 files found, 100 file limit)

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 22, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR introduces a unified, multi-language IDL system for code generation, a plugin-based engine ABI with dynamic loading and hardware-aware routing, comprehensive CI/codegen infrastructure, and extensive documentation spanning GAP 01–04 implementation (IDL, plugin registry, dynamic loading, and routing).

Changes

Cohort / File(s) Summary
IDL & Protobuf Schemas
idl/*.proto, idl/README.md
Defines canonical proto3 schemas for model metadata, voice events, pipeline DAG, and solution configs. Establishes single source of truth for shared enums/structs across SDKs.
Code Generation Scripts
idl/codegen/*.sh
Adds language-specific code generators for Swift, Kotlin, Dart, TypeScript, Python, and C++, plus CI drift-check and all-in-one orchestration scripts.
IDL CMake & Setup
idl/CMakeLists.txt, scripts/setup-toolchain.sh
Defines rac_idl static library from generated C++ protobuf; installs and verifies multi-platform codegen toolchain.
Plugin System (ABI & Registry)
sdk/runanywhere-commons/include/rac/plugin/*, sdk/runanywhere-commons/src/plugin/*.cpp
Introduces unified plugin vtable, entry-point macros, registry with priority-based selection, and dynamic loader via dlopen/dlsym.
Error Codes
sdk/runanywhere-commons/include/rac/core/rac_error.h
Adds ABI mismatch, capability rejection, duplicate, and plugin load error codes.
Engine Router & Hardware Detection
sdk/runanywhere-commons/include/rac/router/*, sdk/runanywhere-commons/src/router/*.cpp
Implements hardware profile detection, engine router with scoring/tie-breaking, and C ABI route lookup with format/runtime hints.
Backend Plugin Entries
sdk/runanywhere-commons/src/backends/*/rac_plugin_entry_*.cpp, **/CMakeLists.txt
Wires llama.cpp, MetalRT, ONNX, Whisper.cpp, and WhisperKit CoreML backends with unified-ABI plugin entries; modifies register files to expose ops symbols.
Plugin Test Infrastructure
sdk/runanywhere-commons/tests/test_*.cpp, sdk/runanywhere-commons/tests/fixtures/rac_test_plugin.cpp
Comprehensive test suite covering vtable registration, loader, ABI mismatch, double-load, routing, hardware profile, and static/dynamic modes; includes test fixture plugins.
Gradle & Swift Package Management
gradle/libs.versions.toml, Package.swift, Package.resolved
Adds Wire dependency (4.9.9), swift-protobuf package dependency (≥1.27.0), and pinned resolve entry.
Language SDK Updates
sdk/runanywhere-flutter/lib/..., sdk/runanywhere-kotlin/src/.../types/*
Adds proto bridge functions (.toProto()/.fromProto()) in Dart/Kotlin for canonical enums; removes hand-written duplicates.
CI & Documentation
.gitattributes, .github/workflows/idl-drift-check.yml
Marks generated code dirs as linguist-generated; adds GitHub Actions workflow to regenerate all bindings and fail if uncommitted divergence detected.
Roadmap & Gate Reports
docs/gap*.md, docs/wave_roadmap.md, docs/*_authoring.md, docs/voice_event_proto_handoff.md
Comprehensive documentation: GAP 01–04 final gate reports, engine/plugin authoring guides, wave roadmap, and voice event proto handoff contract.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Router as EngineRouter<br/>(Hardware-Aware)
    participant Registry as Plugin Registry
    participant HW as HardwareProfile<br/>(Detect/Cache)
    participant Loader as Dynamic Loader
    participant Backend as Backend Plugin<br/>(e.g., llamacpp)

    App->>Router: route(primitive,<br/>format, hints)
    Router->>HW: cached()
    HW->>HW: detect() once,<br/>memoize
    HW-->>Router: HardwareProfile
    Router->>Registry: rac_plugin_list(primitive)
    Registry-->>Router: [vtable₁, vtable₂, ...]
    Router->>Router: score & sort<br/>(priority, runtime,<br/>format, pinned)
    Router-->>App: RouteResult<br/>(best vtable)
    
    Note over App,Backend: Alternative: Dynamic Load Path
    App->>Loader: rac_registry_load_plugin<br/>("/path/to/plugin.so")
    Loader->>Backend: dlopen + dlsym<br/>(rac_plugin_entry_*)
    Backend->>Registry: rac_plugin_register<br/>(vtable)
    Registry->>Registry: validate ABI,<br/>run capability_check
    Registry-->>Loader: RAC_SUCCESS
    Loader-->>App: RAC_SUCCESS
Loading
sequenceDiagram
    participant Dev as Developer
    participant Codegen as Code Generation<br/>(generate_all.sh)
    participant Proto as Protobuf<br/>Compiler
    participant LangGen as Language<br/>Generators<br/>(Swift/Kotlin/etc)
    participant SDKs as SDK Packages
    participant CI as CI Drift Check<br/>(idl-drift-check.yml)
    participant Repo as Git Repo

    Dev->>Codegen: ./idl/codegen/generate_all.sh
    Codegen->>Proto: protoc --version check
    Proto-->>Codegen: ✓ protoc available
    Codegen->>LangGen: generate_swift.sh,<br/>generate_kotlin.sh, ...
    LangGen->>Proto: protoc --proto_path=idl<br/>--XX_out=generated/
    Proto-->>LangGen: *.pb.swift, *.kt, *.dart,<br/>etc.
    LangGen->>SDKs: commit bindings to<br/>sdk/runanywhere-*/Generated/**
    SDKs-->>Repo: checked-in generated code
    
    Note over CI,Repo: CI/CD Pipeline
    CI->>Codegen: re-run generate_all.sh
    Codegen-->>CI: regenerated outputs
    CI->>Repo: git diff --exit-code
    alt No divergence
        CI-->>Repo: ✓ Build passes
    else Divergence detected
        CI->>Repo: ::error:: Drift detected
        CI-->>Repo: ✗ Build fails
    end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

The diff spans 5,000+ lines with high heterogeneity: new proto schemas (requiring validation of field numbering and wire format), complex plugin registry and router logic (scoring, thread-safety, platform detection), dual static/dynamic plugin loading modes, multiple language bindings (Dart/Kotlin proto bridges), backend integration changes across many files, extensive test coverage, and intricate CMake configuration for different platforms. Each major component (IDL, plugin system, router, language updates) requires separate reasoning; the logic density is substantial (hardware detection, routing scoring, registry deduplication, ABI validation).

Possibly related PRs

  • PR #471: Introduces VAD-specific model category and model-loading APIs that consume the canonical model metadata enums and structures defined in this PR's IDL system.
  • PR #462: Implements Genie NPU backend using the unified plugin ABI, vtable slots, and framework/format enums introduced in this PR.
  • PR #459: Modifies MetalRT plugin surface (rac_plugin_entry_metalrt, register files, CMake) using the same plugin infrastructure defined in this PR.

Suggested labels

enhancement, documentation, plugin-system, idl, code-generation

Suggested reviewers

  • Siddhesh2377

🐰 Hops excitedly with clipboard in paw

Proto files bloom, plugins dance in rows,
Each backend sings its unified ABI,
Hardware whispers which engine bestows—
The router chooses wisely, oh my!
From IDL's truth, a thousand bindings grow! 🎉

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/v2-architecture-gaps-01-04

Comment on lines +37 to +100
name: Verify generated code matches IDL
runs-on: macos-14
timeout-minutes: 15
steps:
- uses: actions/checkout@v4

- name: Cache Homebrew
uses: actions/cache@v4
with:
path: |
/usr/local/Homebrew
/opt/homebrew
~/Library/Caches/Homebrew
key: ${{ runner.os }}-brew-protoc-${{ hashFiles('scripts/setup-toolchain.sh') }}

- name: Install protoc + swift-protobuf (Homebrew)
run: |
brew install protobuf swift-protobuf

- name: Install wire-compiler (best-effort — Gradle Wire plugin is the fallback)
run: |
brew install wire || echo "wire bottle unavailable; Gradle Wire plugin will handle Kotlin codegen"

- name: Install Dart plugin (protoc-gen-dart)
run: |
if command -v dart >/dev/null 2>&1; then
dart pub global activate protoc_plugin 21.1.2
echo "$HOME/.pub-cache/bin" >> "$GITHUB_PATH"
else
echo "::warning::dart not found on macos-14 runner; Dart codegen skipped"
fi

- name: Install ts-proto (npm)
run: |
npm install -g ts-proto@1.181.1 protobufjs

- name: Install Python protobuf
run: |
python3 -m pip install --upgrade "protobuf>=4.25,<5" grpcio-tools

- name: Dump toolchain versions (debug)
run: |
echo "protoc: $(protoc --version)"
echo "protoc-gen-swift: $(protoc-gen-swift --version 2>/dev/null || echo 'not present')"
echo "wire-compiler: $(wire-compiler --version 2>/dev/null || echo 'not present')"
echo "protoc-gen-dart: $(protoc-gen-dart --version 2>/dev/null || echo 'present or skipped')"
echo "node: $(node --version)"
echo "python3: $(python3 --version)"

- name: Regenerate all bindings
run: ./idl/codegen/generate_all.sh

- name: Fail on drift
run: |
if ! git diff --exit-code --stat; then
echo "::error::IDL-generated code is out of sync with .proto sources."
echo ""
echo "To fix locally:"
echo " ./scripts/setup-toolchain.sh"
echo " ./idl/codegen/generate_all.sh"
echo " git add -A && git commit -m 'chore(codegen): regenerate bindings'"
exit 1
fi
echo "✓ No drift detected."
@sanchitmonga22
Copy link
Copy Markdown
Contributor Author

Replaced by #494 — same 18 commits, branch renamed to feat/v2-architecture so all future v2 waves (B-E per docs/wave_roadmap.md) land on a single long-lived branch instead of fragmenting into per-wave PRs. Closing this one cleanly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants