Fixes ios by shubhammalhotra28 · Pull Request #514 · RunanywhereAI/runanywhere-sdks

shubhammalhotra28 · 2026-06-25T03:49:00Z

Description

Brief description of the changes made.

Type of Change

Bug fix
New feature
Documentation update
Refactoring

Testing

Lint passes locally
Added/updated tests for changes

Platform-Specific Testing (check all that apply)

Swift SDK / iOS Sample:

Tested on iPhone (Simulator or Device)
Tested on iPad / Tablet
Tested on Mac (macOS target)

Kotlin SDK / Android Sample:

Tested on Android Phone (Emulator or Device)
Tested on Android Tablet

Flutter SDK / Flutter Sample:

Tested on iOS
Tested on Android

React Native SDK / React Native Sample:

Tested on iOS
Tested on Android

Playground:

Tested on target platform
Verified no regressions in existing Playground projects

Web SDK / Web Sample:

Tested in Chrome (Desktop)
Tested in Firefox
Tested in Safari
WASM backends load (LlamaCpp + ONNX)
OPFS storage persistence verified (survives page refresh)
Settings persistence verified (localStorage)

Labels

Please add the appropriate label(s):

SDKs:

Swift SDK - Changes to Swift SDK (sdk/runanywhere-swift)
Kotlin SDK - Changes to Kotlin SDK (sdk/runanywhere-kotlin)
Flutter SDK - Changes to Flutter SDK (sdk/runanywhere-flutter)
React Native SDK - Changes to React Native SDK (sdk/runanywhere-react-native)
Web SDK - Changes to Web SDK (sdk/runanywhere-web)
Commons - Changes to shared native code (sdk/runanywhere-commons)

Sample Apps:

iOS Sample - Changes to iOS example app (examples/ios)
Android Sample - Changes to Android example app (examples/android)
Flutter Sample - Changes to Flutter example app (examples/flutter)
React Native Sample - Changes to React Native example app (examples/react-native)
Web Sample - Changes to Web example app (examples/web)

Checklist

Code follows project style guidelines
Self-review completed
Documentation updated (if needed)

Screenshots

Attach relevant UI screenshots for changes (if applicable):

Mobile (Phone)
Tablet / iPad
Desktop / Mac

Note

Medium Risk
Changes span the real-time audio session lifecycle, voice-agent turn processing, and LLM streaming wiring across native and JS. User-visible impact is high, but there are no auth or data-store changes. The RN iOS Pod post-install edits add build-pipeline risk on Xcode upgrades.

Overview
Voice agent and audio — Swift and React Native now run a mic driver while streamVoiceAgent() is active: energy-based utterance segmentation, per-turn submission to the C core, and TTS playback with the mic gated during replies. iOS/RN native layers use a shared .playAndRecord session so capture is not torn down when TTS plays; playback/capture managers skip reconfiguring or deactivating a session the agent owns.
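The energy-based utterance segmentation mentioned above could look roughly like the following. This is a minimal sketch, not the SDK's mic driver: the `UtteranceSegmenter` name, the RMS threshold, and the hangover frame count are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the RunAnywhere SDK.
// Root-mean-square energy of one frame of float PCM samples.
inline float frame_rms(const std::vector<float>& frame) {
    if (frame.empty()) return 0.0f;
    double sum = 0.0;
    for (float s : frame) sum += static_cast<double>(s) * s;
    return static_cast<float>(std::sqrt(sum / frame.size()));
}

class UtteranceSegmenter {
public:
    // threshold: RMS level treated as speech (assumed value, tune per device).
    // hangover_frames: quiet frames required before an utterance is closed.
    UtteranceSegmenter(float threshold, std::size_t hangover_frames)
        : threshold_(threshold), hangover_frames_(hangover_frames) {
        assert(hangover_frames > 0);
    }

    // Feed one frame. Returns true and fills `out` once speech has been
    // followed by `hangover_frames` consecutive quiet frames.
    bool push(const std::vector<float>& frame, std::vector<float>& out) {
        const bool loud = frame_rms(frame) >= threshold_;
        if (loud) {
            in_speech_ = true;
            quiet_run_ = 0;
        } else if (in_speech_) {
            ++quiet_run_;
        }
        if (!in_speech_) return false;  // drop leading silence
        buffer_.insert(buffer_.end(), frame.begin(), frame.end());
        if (quiet_run_ < hangover_frames_) return false;
        out = std::move(buffer_);
        buffer_.clear();
        in_speech_ = false;
        quiet_run_ = 0;
        return true;
    }

private:
    float threshold_;
    std::size_t hangover_frames_;
    bool in_speech_ = false;
    std::size_t quiet_run_ = 0;
    std::vector<float> buffer_;
};
```

A caller would submit each completed utterance as one turn and, per the description above, stop feeding frames while the TTS reply is playing.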

RN SDK — generateStream switches to atomic llmGenerateStreamProto (replacing handle subscription + non-streaming generate), fixing chat streams that never received tokens. VoiceAgentMicDriver, playWav, and Nitro proxy eager resolution support the voice pipeline.

VLM — Llama.cpp backend adds LFM2-VL detection and <|startoftext|> + ChatML prompting; commons VLM streaming strips special tokens (e.g. <|im_end|>) from streamed/display text.
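As a rough illustration of the `<|startoftext|>` + ChatML prompting described above, a single user turn might be assembled as below. The function name and the `image_marker` parameter are hypothetical; the placeholder the backend actually uses for image embeddings is not stated in this PR, so the caller supplies it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the engine's actual code: wraps one user turn in
// ChatML and prefixes the `<|startoftext|>` token named in the PR summary.
// `image_marker` stands in for whatever placeholder the backend expects for
// image embeddings; its value is an assumption supplied by the caller.
inline std::string build_chatml_prompt(const std::string& user_text,
                                       const std::string& image_marker) {
    assert(!user_text.empty() || !image_marker.empty());
    std::string prompt = "<|startoftext|>";
    prompt += "<|im_start|>user\n";
    prompt += image_marker;
    prompt += user_text;
    prompt += "<|im_end|>\n";
    prompt += "<|im_start|>assistant\n";  // leave the assistant turn open
    return prompt;
}
```

The assistant turn is left open so that generation continues from it. The closing `<|im_end|>` the model emits is the kind of special token the commons streaming path strips from display text.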

Examples / tooling — iOS sample drops the Solutions YAML demo (view, generated YAML, sync script, verify step) and wires more voice event arms in the sample ViewModel. RN example Podfile fixes Xcode 26 CocoaPods static XCFramework output lists and avoids marking Copy XCFrameworks always-out-of-date; catalog memoryRequirement values align with real download sizes; model banners fall back to framework. React is pinned to 19.2.3; RN workspace Package.resolved removed.

Reviewed by Cursor Bugbot for commit 6bf113c.

Summary by CodeRabbit

New Features
- Added support for a new vision-language model type, including improved prompt formatting and model detection.
- Introduced microphone-driven voice agent streaming with direct playback of generated audio.
- Added direct WAV playback support and better shared audio-session handling.
Bug Fixes
- Improved voice-session event handling, including clearer error states and more responsive UI updates.
- Cleaned up streamed tokens so special markers no longer appear in output.
- Fixed model status banners to show the correct framework more consistently.
Chores
- Updated example and build setup files, including dependency version adjustments.

coderabbitai · 2026-06-25T03:49:39Z

Walkthrough

The PR adds LFM2-VL support in the llama.cpp VLM path, refines VLM token streaming, rewires voice-agent capture and playback around a new mic driver and turn-event bridge, removes the iOS Solutions demo wiring, and updates React Native build and dependency pins.

Changes

Voice Agent Runtime and Bridge

| Layer / File(s) | Summary |
|---|---|
| Bridge and audio sessions: `sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+ModalityProtoABI.swift`, `sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp`, `sdk/runanywhere-swift/Sources/RunAnywhere/Features/STT/Services/AudioCaptureManager.swift`, `sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift`, `sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift`, `sdk/runanywhere-react-native/packages/core/ios/HybridAudioPlayback.swift` | The voice-turn ABI adds event callbacks, and the shared audio-session helpers switch to configurable ownership and full-duplex reuse. |
| Mic driver loop: `sdk/runanywhere-swift/Sources/RunAnywhere/Features/VoiceAgent/Services/VoiceAgentMicDriver.swift`, `sdk/runanywhere-react-native/packages/core/src/Features/VoiceSession/AudioPlaybackManager.ts` | The new Swift mic driver captures audio, segments utterances, submits turns to the native core, and plays synthesized replies. |
| Stream lifecycle wiring: `sdk/runanywhere-react-native/packages/core/src/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.ts`, `sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift` | The React Native and Swift voice-agent stream extensions start and stop the mic driver alongside the existing stream flow. |
| Example voice event handling: `examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift` | The iOS example view model now updates session state, transcript state, audio level, and error state from new event arms. |

LLM and VLM Runtime

| Layer / File(s) | Summary |
|---|---|
| LLM streaming iterator: `sdk/runanywhere-react-native/packages/core/src/native/NitroModulesGlobalInit.ts`, `sdk/runanywhere-react-native/packages/core/src/Public/Extensions/LLM/RunAnywhere+TextGeneration.ts` | The text-generation extension now pulls `LLMStreamEvent` data from one native callback and finishes or cancels the iterator through `llmCancelProto`. |
| VLM token cleanup: `sdk/runanywhere-commons/src/features/vlm/vlm_module.cpp` | `generated_stream_token_trampoline` now strips special tokens before updating streamed text and token telemetry. |
| LFM2VL prompt and detection: `engines/llamacpp/rac_vlm_llamacpp.cpp` | `LFM2VL` gets a dedicated manual prompt template branch and metadata detection from architecture and name fields. |
| Model catalog memory requirements: `examples/react-native/RunAnywhereAI/src/services/ModelCatalogBootstrap.ts` | The seeded model catalog updates `memoryRequirement` values for several LlamaCPP, VLM, and embedding entries. |
| Model status banner fallback: `examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx`, `examples/react-native/RunAnywhereAI/src/screens/STTScreen.tsx`, `examples/react-native/RunAnywhereAI/src/screens/TTSScreen.tsx` | The model status banner now receives a framework fallback from `currentModel.framework` when `preferredFramework` is unset. |

iOS Solutions Demo Removal

| Layer / File(s) | Summary |
|---|---|
| Navigation and docs cleanup: `examples/ios/RunAnywhereAI/AGENTS.md`, `examples/ios/RunAnywhereAI/RunAnywhereAI/App/ContentView.swift` | The iOS example docs and utility navigation remove Solutions references and add the iOS Voice Keyboard link. |
| Solution script removal: `examples/ios/RunAnywhereAI/scripts/sync-solutions-yamls.sh`, `examples/ios/RunAnywhereAI/scripts/verify.sh` | The solution YAML sync script is deleted and verification no longer invokes its check step. |

Build and Dependency Alignment

| Layer / File(s) | Summary |
|---|---|
| CocoaPods post-install fixes: `examples/react-native/RunAnywhereAI/ios/Podfile` | The Xcode 16 script-phase workaround skips Copy XCFrameworks phases, and the Xcode 26 xcframework rewrite maps static bundles to `lib*.a` paths. |
| Dependency pins and lockfiles: `package.json`, `examples/react-native/RunAnywhereAI/package.json`, `sdk/runanywhere-react-native/package.json`, `examples/react-native/RunAnywhereAI/ios/RunAnywhereAI.xcworkspace/xcshareddata/swiftpm/Package.resolved` | React is pinned to 19.2.3 in the root and example manifests, and the SwiftPM resolved package contents are cleared. |

Sequence Diagram(s)

sequenceDiagram
  participant StreamConsumer
  participant GenerateStream as RunAnywhere+TextGeneration.generateStream
  participant NitroModulesGlobalInit
  participant NativeLLM as native.llmGenerateStreamProto

  StreamConsumer->>GenerateStream: next()
  GenerateStream->>NitroModulesGlobalInit: getNitroModulesProxySync()
  GenerateStream->>NativeLLM: llmGenerateStreamProto(callback)
  NativeLLM-->>GenerateStream: LLMStreamEvent
  GenerateStream-->>StreamConsumer: queued event
  StreamConsumer->>GenerateStream: return()
  GenerateStream->>NativeLLM: llmCancelProto()

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

RunanywhereAI/runanywhere-sdks#284 — Both PRs touch sdk/runanywhere-react-native/packages/core/src/Features/VoiceSession/AudioPlaybackManager.ts and related WAV playback flow.

Suggested reviewers

sanchitmonga22

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 55.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The template is mostly followed, but the required Description section is still placeholder text and does not explain the actual changes. | Replace the placeholder with a concrete summary of the fixes and scope, and add screenshots only if any UI changes are involved. |
| Title check | ❓ Inconclusive | The title is too vague to identify the actual change; it doesn't describe the specific iOS fixes in this PR. | Use a specific title naming the main iOS/RN/Swift fix, such as the voice-agent and audio-session changes. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Null LLM text crash
- Restored the null-safe fallback when copying LLM text for turn lifecycle emission.
✅ Fixed: Turn error ends active UI
- Kept per-turn voice errors as messages without transitioning the active session UI into an error state.

Or push these changes by commenting:

@cursor push d245a33033

Preview (d245a33033)

diff --git a/examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift b/examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
--- a/examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
+++ b/examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
@@ -561,8 +561,6 @@
         case let .error(err):
             logger.error("Voice agent error: \(err.message)")
             errorMessage = err.message
-            sessionState = .error(err.message)
-            currentStatus = "Error"
 
         case let .sessionError(err):
             logger.error("Voice session error: \(err.message)")

diff --git a/sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp b/sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp
--- a/sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp
+++ b/sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp
@@ -397,7 +397,7 @@
         }
         {
             const std::string stt_text(stt.text);
-            const std::string llm_text(llm.text);
+            const std::string llm_text(llm.text ? llm.text : "");
             pending_emits.emplace_back([handle, stt_text, llm_text]() {
                 emit_turn_lifecycle(
                     handle, runanywhere::v1::TURN_LIFECYCLE_EVENT_KIND_AGENT_RESPONSE_COMPLETED,

cursor · 2026-06-25T03:51:46Z

        {
            const std::string stt_text(stt.text);
-            const std::string llm_text(llm.text ? llm.text : "");
+            const std::string llm_text(llm.text);


Null LLM text crash

High Severity

Constructing std::string directly from llm.text removes the prior null guard. If rac_llm_generate succeeds but leaves llm.text null, this undefined behavior can crash during turn lifecycle emission.
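The hazard can be shown in isolation. The sketch below is illustrative only: `FakeLlmResult` and both helper names are hypothetical stand-ins, not the types in `voice_agent_proto_abi.cpp`. It applies the same `ptr ? ptr : ""` fallback that the autofix restores.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copy a C string that may be null into a std::string.
// Passing a null pointer to the std::string(const char*) constructor is
// undefined behavior, so the null case is mapped to an empty string first.
inline std::string to_string_or_empty(const char* text) {
    return std::string(text ? text : "");
}

// Stand-in for a result struct whose text field may be left null on success.
struct FakeLlmResult {
    const char* text;
};

inline std::string copy_llm_text(const FakeLlmResult& llm) {
    std::string out = to_string_or_empty(llm.text);
    // A null input must always produce an empty copy.
    assert(llm.text != nullptr || out.empty());
    return out;
}
```

Mapping null to an empty string keeps turn emission alive. Whether an empty reply should instead be routed to a failure path is a separate design decision.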

cursor · 2026-06-25T03:51:46Z

            logger.error("Voice agent error: \(err.message)")
            errorMessage = err.message
+            sessionState = .error(err.message)
+            currentStatus = "Error"


Turn error ends active UI

Medium Severity

Every .error voice event now sets sessionState to .error, so isActive becomes false while streamVoiceAgent() and the SDK mic driver keep running. A single failed turn can show a dead session in the UI even though capture continues.

coderabbitai

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

sdk/runanywhere-commons/src/features/vlm/vlm_module.cpp (1)

1216-1240: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Avoid truncating cleaned tokens to 511 bytes.

The new scratch buffer caps display_token at 511 bytes. Any longer token is silently truncated before it is appended to ctx->text, counted, and emitted, which corrupts the streamed/output text on that path.

Suggested fix

-    char cleaned[512];
-    const char* display_token = vlm_strip_special_tokens(safe_token, cleaned, sizeof(cleaned));
+    std::string cleaned(std::strlen(safe_token) + 1, '\0');
+    const char* display_token =
+        vlm_strip_special_tokens(safe_token, cleaned.data(), cleaned.size());
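
For comparison, here is a standalone sketch of stripping markers into a `std::string`, so that no fixed-size scratch buffer is involved at all. It is not the real `vlm_strip_special_tokens`: the function below is hypothetical and assumes special tokens always take the form `<|name|>`, which matches the `<|im_end|>` example in the PR overview but may not cover every marker the real function handles.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical alternative to a fixed 512-byte scratch buffer: build the
// cleaned token in a std::string so its length is never capped.
// Assumption: special tokens look like `<|name|>`.
inline std::string strip_special_tokens(const std::string& token) {
    std::string cleaned;
    cleaned.reserve(token.size());
    std::size_t pos = 0;
    while (pos < token.size()) {
        const std::size_t open = token.find("<|", pos);
        if (open == std::string::npos) {
            cleaned.append(token, pos, std::string::npos);
            break;
        }
        const std::size_t close = token.find("|>", open + 2);
        if (close == std::string::npos) {
            // Unterminated marker: keep the remaining text unchanged.
            cleaned.append(token, pos, std::string::npos);
            break;
        }
        cleaned.append(token, pos, open - pos);  // text before the marker
        pos = close + 2;                         // skip past the marker
    }
    // Stripping can only shrink the text, never grow it.
    assert(cleaned.size() <= token.size());
    return cleaned;
}
```

Because the output is never longer than the input, reserving `token.size()` bytes up front is always enough, so even a 2000-byte token passes through untruncated.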

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sdk/runanywhere-commons/src/features/vlm/vlm_module.cpp` around lines 1216 -
1240, Avoid truncating cleaned tokens in the VLM streaming path: the local
scratch buffer used by vlm_strip_special_tokens in the token handling block can
silently cut display_token to 511 bytes before it is appended to ctx->text,
counted, and published. Update the token-cleaning flow in vlm_module.cpp around
the display_token/publish_event logic to preserve the full token content, using
a dynamically sized or sufficiently sized buffer so dispatch_vlm_stream_event
and the generation event receive the complete token text.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/ios/RunAnywhereAI/AGENTS.md`:
- Line 67: Update the navigation table entry for MoreHubView so Voice Keyboard
is clearly marked as iOS-only. In AGENTS.md, adjust the MoreHubView row to
reflect the same platform gating used in the runtime view’s `#if os(iOS)` logic,
so the docs do not imply Voice Keyboard appears on macOS.

In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift`:
- Around line 585-588: The .agentResponseStarted handling in VoiceAgentViewModel
is clearing both assistantResponse and currentTranscript, which removes the
finalized user transcript too early. Update the .agentResponseStarted case to
clear only assistantResponse and leave currentTranscript intact so the .userSaid
transcript remains visible. Use the VoiceAgentViewModel state handling around
.userSaid and .agentResponseStarted to locate the change.

In `@sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp`:
- Line 400: `llm.text` is used in `voice_agent_proto_abi.cpp` without
validation, which can crash before the existing error handling and also
propagate a null pointer into TTS. In the `llm_text` construction and the later
synthesis path, add a guard in the `voice_agent` flow to verify `llm.text` is
non-null and non-empty before creating the `std::string` or calling the
TTS/synthesis helpers, and route invalid responses through the existing failure
path in the same block.

In `@sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift`:
- Around line 143-151: The audio session setup in
HybridAudioCapture.startRecording is incorrectly using audioSession.category as
the only signal for whether to keep full-duplex mode. Add an explicit state flag
in this flow, such as isFullDuplexActive, to track ownership from
activateAudioSession and stopRecording(deactivateSession:), then use that flag
to decide when to apply .record/.measurement. Update the startRecording logic so
plain STT starts reconfigure to measurement mode unless a real voice-agent
session is active, instead of relying on the stale .playAndRecord category.

In `@sdk/runanywhere-react-native/packages/core/ios/HybridAudioPlayback.swift`:
- Around line 182-188: The shared-session reuse check in HybridAudioPlayback’s
audio-session setup is too loose because it keys off AVAudioSession.category
alone, which can remain .playAndRecord after the voice agent has deactivated.
Update the logic around the session configuration path to gate reuse on explicit
active ownership/state from the voice-session controller instead of the stale
category, so the code only skips setCategory(.playback, mode: .default, options:
[.duckOthers]) and setActive(true) when a voice session is truly active. Keep
the ownsSession flag in sync with that explicit state so cleanup still runs
correctly when the agent is no longer holding the session.

In
`@sdk/runanywhere-react-native/packages/core/src/Features/VoiceAgent/VoiceAgentMicDriver.ts`:
- Around line 87-104: The VoiceAgentMicDriver stop/start flow can leave an
in-flight processTurn() from a previous session alive, allowing a stale native
turn to complete after restart and affect the new session. Update
VoiceAgentMicDriver so stop() either awaits or invalidates any outstanding turn
work before returning, and make processTurn()/the turn-completion path check a
session/token or cancellation state that survives start() resetting stopped;
ensure the stale result cannot pass the completion guard and trigger playback
after a new start.
- Around line 74-82: `VoiceAgentMicDriver.start` activates the audio session
before calling `capture.startRecording`, but if recording throws, the session
can stay active because `stop()` may do nothing when recording never started.
Wrap the activation/startRecording sequence in a failure path inside
`VoiceAgentMicDriver` and, on any error from `startRecording`, explicitly clean
up by stopping/deactivating the capture session before rethrowing so the session
is not left active.

In
`@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/LLM/RunAnywhere+TextGeneration.ts`:
- Around line 268-273: Guard the native cancel in the iterator cleanup path so
`return()` only calls `native.llmCancelProto()` when this `LLMStream` still owns
an active generation. Use the stream’s active-state tracking in
`RunAnywhere+TextGeneration` (and its `finish()`/cleanup flow) to skip cancel on
unopened, already-finished, or stale iterators, preventing `return()` from
aborting a newer generation started after the iterator was disposed.

In
`@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.ts`:
- Around line 336-347: The mic startup in VoiceAgentMicDriver is fire-and-forget
and the stream still uses adapter.stream() delegation, which can leave the voice
session hanging on startup failure and is not Hermes-compatible. Update
RunAnywhere+VoiceAgent to await micDriver.start() before entering the stream
consumption path, and replace yield* adapter.stream() in the streaming logic
with an explicit manual iterator.next() loop for async-iterable compatibility.
Make sure the finally block still stops the mic driver after the loop exits.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift`:
- Around line 141-144: Capture the session-ownership decision at playback start
in AudioPlaybackManager by storing the current managesAudioSession value in the
locked State snapshot before calling configureAudioSession(), then use that
stored value during cleanup instead of re-reading the mutable property; update
the start/stop flow in AudioPlaybackManager and its State handling so cleanup
uses the same ownership choice for the whole playback instance.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift`:
- Around line 225-236: The mic-driver task in RunAnywhere+VoiceAgent currently
only logs failures from VoiceAgentMicDriver.run() and lets adapter.stream()
continue waiting, which can hang the session. Update the flow around micTask and
the stream loop so the mic task is raced against streaming, and if it exits
unexpectedly with a non-cancellation error, propagate that failure or finish the
continuation immediately. Use the existing symbols VoiceAgentMicDriver.run(),
micTask, and adapter.stream() to locate the logic and make sure a dead mic
session cannot leave the stream pending.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb7d83eb-8113-45c9-8651-22956b724f3d

📥 Commits

Reviewing files that changed from the base of the PR and between c272a1f and 6bf113c.

⛔ Files ignored due to path filters (2)

examples/ios/RunAnywhereAI/RunAnywhereAI/Generated/SolutionsYaml.swift is excluded by !**/generated/**
yarn.lock is excluded by !**/yarn.lock, !**/*.lock

📒 Files selected for processing (30)

engines/llamacpp/rac_vlm_llamacpp.cpp
examples/ios/RunAnywhereAI/AGENTS.md
examples/ios/RunAnywhereAI/RunAnywhereAI/App/ContentView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Solutions/SolutionsView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
examples/ios/RunAnywhereAI/scripts/sync-solutions-yamls.sh
examples/ios/RunAnywhereAI/scripts/verify.sh
examples/react-native/RunAnywhereAI/ios/Podfile
examples/react-native/RunAnywhereAI/ios/RunAnywhereAI.xcworkspace/xcshareddata/swiftpm/Package.resolved
examples/react-native/RunAnywhereAI/package.json
examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx
examples/react-native/RunAnywhereAI/src/screens/STTScreen.tsx
examples/react-native/RunAnywhereAI/src/screens/TTSScreen.tsx
examples/react-native/RunAnywhereAI/src/services/ModelCatalogBootstrap.ts
package.json
sdk/runanywhere-commons/src/features/vlm/vlm_module.cpp
sdk/runanywhere-commons/src/features/voice_agent/voice_agent_proto_abi.cpp
sdk/runanywhere-react-native/package.json
sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift
sdk/runanywhere-react-native/packages/core/ios/HybridAudioPlayback.swift
sdk/runanywhere-react-native/packages/core/src/Features/VoiceAgent/VoiceAgentMicDriver.ts
sdk/runanywhere-react-native/packages/core/src/Features/VoiceSession/AudioPlaybackManager.ts
sdk/runanywhere-react-native/packages/core/src/Public/Extensions/LLM/RunAnywhere+TextGeneration.ts
sdk/runanywhere-react-native/packages/core/src/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.ts
sdk/runanywhere-react-native/packages/core/src/native/NitroModulesGlobalInit.ts
sdk/runanywhere-swift/Sources/RunAnywhere/Features/STT/Services/AudioCaptureManager.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Features/VoiceAgent/Services/VoiceAgentMicDriver.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+ModalityProtoABI.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift

💤 Files with no reviewable changes (5)

examples/ios/RunAnywhereAI/scripts/verify.sh
examples/ios/RunAnywhereAI/RunAnywhereAI/App/ContentView.swift
examples/react-native/RunAnywhereAI/ios/RunAnywhereAI.xcworkspace/xcshareddata/swiftpm/Package.resolved
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Solutions/SolutionsView.swift
examples/ios/RunAnywhereAI/scripts/sync-solutions-yamls.sh

coderabbitai · 2026-06-25T04:01:01Z

 | 1 | `VisionHubView` | VLM camera |
 | 2 | `VoiceAssistantView` | Full voice agent (STT + LLM + TTS pipeline) |
-| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, Solutions, Voice Keyboard |
+| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, Voice Keyboard |


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Mark Voice Keyboard as iOS-only in the navigation table.

Line 67 currently reads like MoreHubView exposes Voice Keyboard on every platform, but the runtime view gates that entry behind #if os(iOS). The doc is inaccurate for macOS.

✏️ Suggested doc fix

-| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, Voice Keyboard |
+| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, iOS-only Voice Keyboard |

coderabbitai · 2026-06-25T04:01:01Z

+        case .agentResponseStarted:
+            assistantResponse = ""
+            currentTranscript = ""
+


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Don’t clear the finalized user transcript when the response starts.

.userSaid sets currentTranscript, but .agentResponseStarted can arrive right after and erase it before the user sees what was transcribed. Clear only the assistant response here.

Proposed fix

         case .agentResponseStarted:
             assistantResponse = ""
-            currentTranscript = ""

coderabbitai · 2026-06-25T04:01:01Z

        {
            const std::string stt_text(stt.text);
-            const std::string llm_text(llm.text ? llm.text : "");
+            const std::string llm_text(llm.text);


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Validate llm.text before constructing or synthesizing from it.

Line 400 now constructs std::string from llm.text without a null check; if an LLM backend returns success with a null/empty text, this can crash before the error path runs, and Lines 415/417 would also pass the null pointer into TTS.

Proposed fix

         if (rc != RAC_SUCCESS) {
             if (have_lifecycle_llm) {
                 rac::llm::release_lifecycle_llm(&llm_ref);
             }
@@
             error_message = "LLM generation failed";
             goto cleanup_and_return;
         }
+        if (!llm.text || llm.text[0] == '\0') {
+            rac_llm_result_free(&llm);
+            if (have_lifecycle_llm) {
+                rac::llm::release_lifecycle_llm(&llm_ref);
+            }
+            rac_stt_result_free(&stt);
+            if (have_lifecycle_stt) {
+                rac::lifecycle::release_lifecycle_stt(&stt_ref);
+            }
+            pending_emits.emplace_back([handle]() {
+                emit_component_failure(handle, "llm", RAC_ERROR_INVALID_STATE,
+                                       "LLM generation was empty");
+            });
+            error_code = RAC_ERROR_INVALID_STATE;
+            error_message = "LLM generation was empty";
+            rc = error_code;
+            goto cleanup_and_return;
+        }
         {
             const std::string stt_text(stt.text);
             const std::string llm_text(llm.text);

coderabbitai · 2026-06-25T04:01:01Z

+                    // Preserve a full-duplex session if the voice agent already
+                    // configured one (activateAudioSession → .playAndRecord):
+                    // switching back to .record would silence the TTS reply and
+                    // disable the simultaneous playback the agent needs. STT capture
+                    // (no prior activate) falls through to the unprocessed
+                    // .record/.measurement path that gives Whisper the cleanest signal.
+                    if audioSession.category != .playAndRecord {
+                        try audioSession.setCategory(.record, mode: .measurement)
+                    }


🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 15829

🏁 Script executed:

grep -n "SDKLogger\|NSLock\|OSAllocatedUnfairLock" sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 311

🏁 Script executed:

python3 <<'PY'
# According to Apple's AVAudioSession documentation:
# When you call setActive(false), the session is deactivated but the category
# remains unchanged. It does not automatically revert to a default category.
# Therefore, if a full-duplex session was configured with .playAndRecord,
# and then deactivateSession(true) is called, the session becomes inactive
# but retains .playAndRecord as its category.
# The code in configureAndActivateSession (lines 149-151) checks:
#   if audioSession.category != .playAndRecord {
#       try audioSession.setCategory(.record, mode: .measurement)
#   }
# If the previous session was .playAndRecord and was deactivated:
# 1. The category remains .playAndRecord
# 2. The next startRecording() will NOT change the category to .record
# 3. This means the session is activated in .playAndRecord mode, not .record mode
# 4. This could affect audio quality for pure STT capture (measurement mode provides
#    cleaner signal for Whisper by disabling echo cancellation/processing)
# The original review comment is CORRECT:
# - Checking only `category` is insufficient because it persists after deactivation
# - The code should explicitly track full-duplex ownership OR always reset to
#   .record/.measurement when starting a plain STT session
# - Alternatively, use `isSessionActive` or explicit state to track if a voice-agent
#   session is currently active
print("CONFIRMED: The category persists after setActive(false). The review concern is valid.")
print("The code does track state internally (CaptureState in line 42), but does not")
print("expose whether the session is full-duplex vs single-direction.")
print("")
print("Issue: Lines 149-151 rely solely on category check, which can be stale.")
PY

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 467

Don't infer active voice-agent state from category alone.

After stopRecording(deactivateSession: true), the AVAudioSession category persists as .playAndRecord. A subsequent startRecording() will detect this stale category and skip the .record/.measurement configuration, resulting in STT capture using full-duplex session processing instead of the optimized measurement mode.

Introduce an explicit boolean flag (e.g., isFullDuplexActive) to track voice-agent session ownership, or unconditionally apply .record/.measurement on plain recording starts that are not preceded by an active activateAudioSession() call.

if audioSession.category != .playAndRecord {
    try audioSession.setCategory(.record, mode: .measurement)
}
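A minimal sketch of the explicit-flag approach, written in TypeScript only to model the control flow (the real change would live in HybridAudioCapture.swift). `FakeAudioSession` and `CaptureModel` are hypothetical stand-ins, not SDK types; the one behavior taken from the review is that the category survives deactivation.

```typescript
// Hypothetical model: the category persists after setActive(false).
type Category = "record" | "playAndRecord" | "playback";

class FakeAudioSession {
  category: Category = "playback";
  active = false;

  setCategory(category: Category): void {
    this.category = category;
  }

  setActive(active: boolean): void {
    // Deactivation leaves `category` untouched, as the review describes.
    this.active = active;
  }
}

class CaptureModel {
  // Explicit ownership flag (the review suggests a name like isFullDuplexActive).
  private isFullDuplexActive = false;
  private readonly session: FakeAudioSession;

  constructor(session: FakeAudioSession) {
    this.session = session;
  }

  // Voice-agent path: configure full duplex and record that it is owned.
  activateAudioSession(): void {
    this.session.setCategory("playAndRecord");
    this.session.setActive(true);
    this.isFullDuplexActive = true;
  }

  // Plain STT path: reconfigure unless a voice-agent session is really active.
  startRecording(): void {
    if (!this.isFullDuplexActive) {
      this.session.setCategory("record");
    }
    this.session.setActive(true);
  }

  stopRecording(deactivateSession: boolean): void {
    if (deactivateSession) {
      this.session.setActive(false);
      this.isFullDuplexActive = false; // ownership ends with deactivation
    }
  }
}
```

With this shape, a stale `playAndRecord` category left behind by an ended call no longer suppresses the measurement-mode configuration on the next plain recording start.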

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-react-native/packages/core/ios/HybridAudioCapture.swift` around lines 143 - 151, The audio session setup in HybridAudioCapture.startRecording is incorrectly using audioSession.category as the only signal for whether to keep full-duplex mode. Add an explicit state flag in this flow, such as isFullDuplexActive, to track ownership from activateAudioSession and stopRecording(deactivateSession:), then use that flag to decide when to apply .record/.measurement. Update the startRecording logic so plain STT starts reconfigure to measurement mode unless a real voice-agent session is active, instead of relying on the stale .playAndRecord category.

coderabbitai · 2026-06-25T04:01:01Z

+        if session.category == .playAndRecord {
+            lock.withLock { $0.ownsSession = false }
+            return
+        }
+        try session.setCategory(.playback, mode: .default, options: [.duckOthers])
+        try session.setActive(true)
+        lock.withLock { $0.ownsSession = true }


🩺 Stability & Availability | 🟡 Minor

Gate shared-session reuse on explicit ownership, not stale category.

AVAudioSession.category == .playAndRecord persists even when the voice agent deactivates its session. If the agent ends a call but leaves the category configured, HybridAudioPlayback incorrectly identifies this "stale" state as an active shared session. This causes it to skip reconfiguring to .playback, retain ownsSession = false, and fail to clean up the session later, leaving the app in a suboptimal .playAndRecord state.

Replace the category-only check with a check for an active voice session controller, or require explicit state signaling to ensure the shared session is currently active before skipping configuration.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-react-native/packages/core/ios/HybridAudioPlayback.swift` around lines 182 - 188, The shared-session reuse check in HybridAudioPlayback’s audio-session setup is too loose because it keys off AVAudioSession.category alone, which can remain .playAndRecord after the voice agent has deactivated. Update the logic around the session configuration path to gate reuse on explicit active ownership/state from the voice-session controller instead of the stale category, so the code only skips setCategory(.playback, mode: .default, options: [.duckOthers]) and setActive(true) when a voice session is truly active. Keep the ownsSession flag in sync with that explicit state so cleanup still runs correctly when the agent is no longer holding the session.

coderabbitai · 2026-06-25T04:01:01Z

+  async stop(): Promise<void> {
+    if (this.stopped) return;
+    this.stopped = true;
+    try {
+      this.capture.stopRecording();
+    } catch {
+      /* noop */
+    }
+    try {
+      this.playback.stop();
+    } catch {
+      /* noop */
+    }
+    this.preRoll = [];
+    this.utterance = [];
+    this.inSpeech = false;
+    this.logger.info('Voice-agent mic capture stopped');
+  }


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Invalidate or await in-flight turns on stop.

processTurn() is fire-and-forget, so stop() can return while a native turn is still pending. If the driver is started again before that promise resolves, start() sets stopped = false, allowing the old result to pass Line 190 and play a stale reply in the new session.

Proposed fix

   private stopped = false;
   private processing = false;
+  private generation = 0;
+  private inFlightTurn: Promise<void> | undefined;
@@
   async start(): Promise<void> {
@@
     this.stopped = false;
+    this.generation += 1;
@@
   async stop(): Promise<void> {
     if (this.stopped) return;
     this.stopped = true;
+    this.generation += 1;
@@
     this.utterance = [];
     this.inSpeech = false;
+    this.speechMs = 0;
+    this.silenceMs = 0;
+    this.processing = false;
+    await this.inFlightTurn?.catch(() => undefined);
     this.logger.info('Voice-agent mic capture stopped');
   }
@@
     if (speechMs >= MIN_SPEECH_MS) {
@@
       this.processing = true;
-      void this.processTurn(audio).finally(() => {
-        this.processing = false;
-      });
+      const generation = this.generation;
+      this.inFlightTurn = this.processTurn(audio, generation).finally(() => {
+        if (this.generation === generation) {
+          this.processing = false;
+        }
+      });
@@
-  private async processTurn(audio: Uint8Array): Promise<void> {
-    if (this.stopped || audio.byteLength === 0) return;
+  private async processTurn(audio: Uint8Array, generation: number): Promise<void> {
+    if (this.stopped || this.generation !== generation || audio.byteLength === 0) return;
@@
-    if (this.stopped) return;
+    if (this.stopped || this.generation !== generation) return;

Also applies to: 170-173, 182-200
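The stale-turn race can be reproduced in isolation. The sketch below is a simplified, hypothetical `TurnDriver` (not the SDK's VoiceAgentMicDriver) that shows only the generation-token half of the idea: the token is captured before the await and compared afterwards, so a result from a previous session is dropped even though `start()` has reset `stopped`. `runTurn` is an assumed stand-in for the native turn call.

```typescript
// Hypothetical driver: `runTurn` stands in for the native voice-agent turn.
class TurnDriver {
  private stopped = true;
  private generation = 0;
  private readonly runTurn: (audio: string) => Promise<string>;
  readonly played: string[] = [];

  constructor(runTurn: (audio: string) => Promise<string>) {
    this.runTurn = runTurn;
  }

  start(): void {
    this.stopped = false;
    this.generation += 1; // new session, new token
  }

  stop(): void {
    this.stopped = true;
    this.generation += 1; // invalidate anything still in flight
  }

  // Captures the token before awaiting and re-checks it afterwards.
  async processTurn(audio: string): Promise<void> {
    const generation = this.generation;
    if (this.stopped) return;
    const reply = await this.runTurn(audio);
    if (this.stopped || this.generation !== generation) return; // stale result
    this.played.push(reply);
  }
}
```

Checking `stopped` alone would let the old reply through after a restart; the token comparison is what survives the `stop()` then `start()` sequence.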

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-react-native/packages/core/src/Features/VoiceAgent/VoiceAgentMicDriver.ts` around lines 87 - 104, The VoiceAgentMicDriver stop/start flow can leave an in-flight processTurn() from a previous session alive, allowing a stale native turn to complete after restart and affect the new session. Update VoiceAgentMicDriver so stop() either awaits or invalidates any outstanding turn work before returning, and make processTurn()/the turn-completion path check a session/token or cancellation state that survives start() resetting stopped; ensure the stale result cannot pass the completion guard and trigger playback after a new start.

coderabbitai · 2026-06-25T04:01:01Z

        async return(): Promise<IteratorResult<LLMStreamEventType>> {
          // Await the native cancel before resolving so back-to-back
          // cancel → generate sequences are race-free. Matches Swift
          // cancelGeneration() which awaits CppBridge.LLM.shared.cancelProto().
          try { await native.llmCancelProto(); } catch { /* noop */ }
-          if (inner) {
-            try { await inner.return?.(); } catch { /* noop */ }
-          }
+          finish();


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Guard llmCancelProto() behind active-stream state.

llmCancelProto() is a global native cancel entrypoint, not a request-scoped handle. Calling it unconditionally from return() means cleanup on an unopened or already-finished iterator can abort a different generation that started afterward.

Suggested fix

         async return(): Promise<IteratorResult<LLMStreamEventType>> {
-          try { await native.llmCancelProto(); } catch { /* noop */ }
+          if (started && !done) {
+            try {
+              await native.llmCancelProto();
+            } catch {
+              /* noop */
+            }
+          }
           finish();
           return { value: undefined as unknown as LLMStreamEventType, done: true };
         },
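To see why the guard matters, here is a self-contained sketch with a counter in place of the native call. `nativeCancel`, `cancelCalls`, and `makeStream` are hypothetical names for illustration; the real `native.llmCancelProto()` is not modeled beyond being global rather than request-scoped.

```typescript
// Hypothetical stand-in for a global (not request-scoped) native cancel.
let cancelCalls = 0;
async function nativeCancel(): Promise<void> {
  cancelCalls += 1;
}

function makeStream(tokens: string[]): AsyncIterableIterator<string> {
  let started = false;
  let done = false;
  let index = 0;
  return {
    [Symbol.asyncIterator]() {
      return this;
    },
    async next(): Promise<IteratorResult<string>> {
      started = true;
      const value = done ? undefined : tokens[index];
      if (value === undefined) {
        done = true;
        return { value: undefined, done: true };
      }
      index += 1;
      return { value, done: false };
    },
    async return(): Promise<IteratorResult<string>> {
      // Only cancel when this iterator still owns an active generation.
      if (started && !done) {
        try {
          await nativeCancel();
        } catch {
          /* noop */
        }
      }
      done = true;
      return { value: undefined, done: true };
    },
  };
}
```

Calling `return()` on an iterator that was never opened, or one that already ran to completion, leaves `cancelCalls` unchanged, so it cannot abort a generation that belongs to someone else.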


🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/LLM/RunAnywhere+TextGeneration.ts` around lines 268 - 273, Guard the native cancel in the iterator cleanup path so `return()` only calls `native.llmCancelProto()` when this `LLMStream` still owns an active generation. Use the stream’s active-state tracking in `RunAnywhere+TextGeneration` (and its `finish()`/cleanup flow) to skip cancel on unopened, already-finished, or stale iterators, preventing `return()` from aborting a newer generation started after the iterator was disposed.

coderabbitai · 2026-06-25T04:01:02Z

+      const micDriver = new VoiceAgentMicDriver();
+      void micDriver.start().catch((error) => {
+        logger.error(
+          `Voice-agent mic driver stopped: ${error instanceof Error ? error.message : String(error)}`
+        );
+      });
+      try {
+        yield* adapter.stream();
+      } finally {
+        // Breaking out of the consuming loop (or unsubscribe) tears down mic
+        // capture, mirroring Swift's `defer { micTask.cancel() }`.
+        await micDriver.stop();


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Await mic startup and use the manual async iterator loop.

Starting the mic driver fire-and-forget can leave the stream open with no audio on permission/start failure, and it can race with finally if the consumer unsubscribes while startup is still pending. Also replace yield* with an explicit iterator.next() loop for Hermes/Nitro async-iterable compatibility.

Proposed fix

       // turn fan out to this same handle callback, so collectors see them.
       const micDriver = new VoiceAgentMicDriver();
-      void micDriver.start().catch((error) => {
-        logger.error(
-          `Voice-agent mic driver stopped: ${error instanceof Error ? error.message : String(error)}`
-        );
-      });
+      await micDriver.start();
+      const iterator = adapter.stream()[Symbol.asyncIterator]();
       try {
-        yield* adapter.stream();
+        while (true) {
+          const { value, done } = await iterator.next();
+          if (done) break;
+          yield value;
+        }
       } finally {
+        await iterator.return?.();
         // Breaking out of the consuming loop (or unsubscribe) tears down mic
         // capture, mirroring Swift's `defer { micTask.cancel() }`.
         await micDriver.stop();

As per coding guidelines, sdk/runanywhere-react-native/**/*.ts should use “manual iterator.next() loops instead of for await...of due to Hermes limitations.”
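The control flow of that fix can be exercised outside the SDK. In the sketch below, `MicDriver` and `streamWithMic` are hypothetical names, and the event source is any async iterable; it assumes nothing about the real adapter beyond exposing `Symbol.asyncIterator`.

```typescript
// Hypothetical mic driver shape; only the control flow matters here.
interface MicDriver {
  start(): Promise<void>;
  stop(): Promise<void>;
}

async function* streamWithMic<T>(
  mic: MicDriver,
  source: AsyncIterable<T>
): AsyncGenerator<T, void, undefined> {
  // Awaiting start() makes a permission/start failure reject the first next()
  // instead of leaving the consumer on a stream that never gets audio.
  await mic.start();
  const iterator = source[Symbol.asyncIterator]();
  try {
    // Manual next() loop in place of `yield*` / `for await...of`.
    while (true) {
      const result = await iterator.next();
      if (result.done) break;
      yield result.value;
    }
  } finally {
    await iterator.return?.();
    await mic.stop(); // runs on normal completion and on early consumer exit
  }
}
```

Because `start()` is awaited before the `try`, a failed start never reaches `finally`, so `stop()` is only called for a driver that actually started.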


🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.ts` around lines 336 - 347, The mic startup in VoiceAgentMicDriver is fire-and-forget and the stream still uses adapter.stream() delegation, which can leave the voice session hanging on startup failure and is not Hermes-compatible. Update RunAnywhere+VoiceAgent to await micDriver.start() before entering the stream consumption path, and replace yield* adapter.stream() in the streaming logic with an explicit manual iterator.next() loop for async-iterable compatibility. Make sure the finally block still stops the mic driver after the loop exits.

Source: Coding guidelines

coderabbitai · 2026-06-25T04:01:02Z

+        // Configure audio session for playback (unless the caller owns it)
+        if managesAudioSession {
+            try configureAudioSession()
+        }


🩺 Stability & Availability | 🟠 Major

Snapshot session ownership per playback instance.

The managesAudioSession flag is a mutable property read independently during playback start and cleanup. If this flag changes between these two points, the cleanup logic may incorrectly deactivate a shared session or fail to deactivate a dedicated one, causing race conditions.

Resolve this by capturing the flag's value at the start of playback within the locked State struct and using that snapshot during cleanup to ensure consistent session management.
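A small sketch of the snapshot idea, in TypeScript for illustration only. `PlaybackModel` is hypothetical and omits the lock that AudioPlaybackManager's `State` struct would need in Swift; the log entries stand in for the real audio-session calls.

```typescript
// Hypothetical playback manager; log entries stand in for AVAudioSession work.
class PlaybackModel {
  managesAudioSession = true; // mutable, may change while audio is playing
  private ownsSessionForCurrentPlayback = false; // snapshot taken at start
  readonly log: string[] = [];

  play(): void {
    // Capture the decision once so start and cleanup agree.
    this.ownsSessionForCurrentPlayback = this.managesAudioSession;
    if (this.ownsSessionForCurrentPlayback) {
      this.log.push("configure");
    }
  }

  cleanup(): void {
    // Use the snapshot, not the live property.
    if (this.ownsSessionForCurrentPlayback) {
      this.log.push("deactivate");
    }
    this.ownsSessionForCurrentPlayback = false;
  }
}
```

Flipping `managesAudioSession` between `play()` and `cleanup()` then has no effect on the playback already in progress: a session that was configured is always deactivated, and a shared one is never touched.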

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift` around lines 141 - 144, Capture the session-ownership decision at playback start in AudioPlaybackManager by storing the current managesAudioSession value in the locked State snapshot before calling configureAudioSession(), then use that stored value during cleanup instead of re-reading the mutable property; update the start/stop flow in AudioPlaybackManager and its State handling so cleanup uses the same ownership choice for the whole playback instance.

coderabbitai · 2026-06-25T04:01:02Z

+                    let micDriver = VoiceAgentMicDriver(handle: handle)
+                    let micTask = Task {
+                        do {
+                            try await micDriver.run()
+                        } catch is CancellationError {
+                            // Expected when the consumer stops the session.
+                        } catch {
+                            SDKLogger.voiceAgent.error("Voice-agent mic driver stopped: \(error.localizedDescription)")
+                        }
+                    }

-                let adapter = VoiceAgentStreamAdapter(handle: handle)
-                for await event in adapter.stream() {
-                    if Task.isCancelled { break }
-                    continuation.yield(event)
+                    defer { micTask.cancel() }


🩺 Stability & Availability | 🟠 Major

Propagate mic-driver failure to the stream

If micDriver.run() throws (e.g., permission or audio session errors), the current task only logs the error while adapter.stream() waits indefinitely. Consumers will hang because no utterances are generated and the stream is never terminated.

Race the mic task with the stream loop and fail or finish the continuation if the mic driver exits unexpectedly.

Current flow risk
```swift
let micTask = Task { try await micDriver.run() } // Swallows errors
// Continues to stream loop which hangs if mic never starts
for await event in adapter.stream() { ... }
```
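The racing idea is language-independent. The following is a TypeScript analogue rather than the Swift fix itself: `streamUntilMicExits`, `runMic`, and the `Raced` type are hypothetical, and the choice to finish the stream quietly when the mic loop ends without an error is an assumption of this sketch.

```typescript
type Raced<T> =
  | { kind: "event"; result: IteratorResult<T> }
  | { kind: "micExit" };

async function* streamUntilMicExits<T>(
  runMic: () => Promise<void>,
  source: AsyncIterable<T>
): AsyncGenerator<T, void, undefined> {
  // Resolves when the mic loop ends normally, rejects when it fails.
  const micExit = runMic().then<Raced<T>>(() => ({ kind: "micExit" }));
  const iterator = source[Symbol.asyncIterator]();
  try {
    while (true) {
      const nextEvent = iterator
        .next()
        .then<Raced<T>>((result) => ({ kind: "event", result }));
      // A mic rejection surfaces here and propagates to the consumer.
      const winner = await Promise.race([nextEvent, micExit]);
      if (winner.kind === "micExit") return;
      const result = winner.result;
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Not awaited: a source stuck in a pending next() may never settle return().
    void iterator.return?.()?.catch(() => undefined);
  }
}
```

A consumer of this generator sees the mic error on its next `next()` call instead of waiting on an event source that will never produce anything.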

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift` around lines 225 - 236, The mic-driver task in RunAnywhere+VoiceAgent currently only logs failures from VoiceAgentMicDriver.run() and lets adapter.stream() continue waiting, which can hang the session. Update the flow around micTask and the stream loop so the mic task is raced against streaming, and if it exits unexpectedly with a non-cancellation error, propagate that failure or finish the continuation immediately. Use the existing symbols VoiceAgentMicDriver.run(), micTask, and adapter.stream() to locate the logic and make sure a dead mic session cannot leave the stream pending.

shubhammalhotra28 added 2 commits June 24, 2026 12:57

fixing swift

0a24ba3

fixes for rn and ios

6bf113c

cursor Bot reviewed Jun 25, 2026


coderabbitai Bot reviewed Jun 25, 2026


Siddhesh2377 approved these changes Jun 25, 2026


Siddhesh2377 added labels on Jun 25, 2026: enhancement (New feature or request), ios-sample (iOS example app), ios-sdk (iOS / Swift SDK), ready-to-merge (Approved and ready to merge), core (C++ commons core (runanywhere-commons))

shubhammalhotra28 merged commit 5d0e6df into main Jun 25, 2026
26 of 28 checks passed

-| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, Voice Keyboard |
+| 3 | `MoreHubView` | RAG, STT, TTS, VAD, Storage, iOS-only Voice Keyboard |

-            const std::string llm_text(llm.text);
+        if (!llm.text || llm.text[0] == '\0') {
+            rac_llm_result_free(&llm);
+            if (have_lifecycle_llm) {
+                rac::llm::release_lifecycle_llm(&llm_ref);
+            }
+            rac_stt_result_free(&stt);
+            if (have_lifecycle_stt) {
+                rac::lifecycle::release_lifecycle_stt(&stt_ref);
+            }
+            pending_emits.emplace_back([handle]() {
+                emit_component_failure(handle, "llm", RAC_ERROR_INVALID_STATE,
+                                       "LLM generation was empty");
+            });
+            error_code = RAC_ERROR_INVALID_STATE;
+            error_message = "LLM generation was empty";
+            rc = error_code;
+            goto cleanup_and_return;
+        }
+        {
+            const std::string stt_text(stt.text);
+            const std::string llm_text(llm.text);


cursor Bot Jun 25, 2026

Null LLM text crash

cursor Bot Jun 25, 2026

Turn error ends active UI
