Continual crashes with Gemma 4 E2B litertlm on iOS (works on Android emulator) #218

@bradcypert

Description

Every attempt to run inference with the Gemma 4 E2B .litertlm model crashes the app on iOS with a SIGSEGV. The crash is deterministic and 100% reproducible on my iOS device, but I do not see it on an Android emulator.

Environment:

  • flutter_gemma: 0.13.2
  • Model: Gemma 4 E2B (gemma-4-E2B-it.litertlm from litert-community HuggingFace)
  • Device: iPhone 16 Pro Max
  • OS: iPhone OS 26.2 (23C55) — beta
  • Flutter: 3.41.6 - stable

The crash is in MediaPipe's native LlmLiteRTExecutor::PrefillInternal. A memset is called with a null destination pointer (x0 = 0x0, size x2 = 0x1000), which suggests a tensor output buffer is null. It looks like the C++ code has no null guard before the memset, but I'm not sure why the tensor output buffer would be null to begin with.
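To illustrate what I mean by a null guard, here is a minimal sketch. ZeroFillChecked is a hypothetical helper I made up for this report; it is not MediaPipe or LiteRT code, and I have not seen the actual PrefillInternal source. It only shows the kind of check that would turn this crash into a reportable error.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of MediaPipe): zero a buffer only if the
// destination pointer is valid. Returns false for a null destination so the
// caller can report an error instead of faulting inside memset.
bool ZeroFillChecked(void* dst, std::size_t size) {
  if (dst == nullptr) {
    return false;  // null tensor buffer: nothing is written
  }
  std::memset(dst, 0, size);
  return true;
}
```

With a guard like this, the call that crashed here (null destination, size 0x1000) would return false rather than raise SIGSEGV. It would not explain why the buffer is null, which is probably the real bug.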

  Crash stack (Thread 18 — crashed):
  0   libsystem_platform.dylib   _platform_memset + 108
  1   Runner                     odml::infra::LlmLiteRTExecutor::PrefillInternal(tflite::impl::SignatureRunner*, absl::Span<int const>, bool) + 852
  2   Runner                     odml::infra::LlmLiteRTExecutor::Prefill(litert::lm::ExecutorInputs const&, litert::lm::ExecutorPrefillParams const&) + 1956
  3   Runner                     odml::infra::LockedLlmExecutor::Prefill(...) + 304
  4   Runner                     odml::infra::(anonymous namespace)::LlmExecutorCalculator::Process(mediapipe::CalculatorContext*) + 1884
  5   Runner                     mediapipe::CalculatorNode::ProcessNode(...)
  6   Runner                     mediapipe::internal::SchedulerQueue::RunCalculatorNode(...)
  7   Runner                     mediapipe::internal::SchedulerQueue::RunNextTask()
  8   Runner                     mediapipe::ThreadPool::RunWorker() + 128
  9   Runner                     mediapipe::ThreadPool::WorkerThread::ThreadBody(void*)

ARM Thread State at crash:

  x0: 0x0000000000000000  ← null destination for memset
  x1: 0x0000000000000000  ← fill value (0)
  x2: 0x0000000000001000  ← size (4096 bytes)
  esr: 0x92000046 (Data Abort) byte write Translation fault
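For anyone who wants to double check my reading of the esr value, this is a small sketch that splits it into the fields I looked at. The EsrFields struct and DecodeEsr function are names I invented for illustration; the bit positions (EC in bits 31:26, WnR in bit 6, DFSC in bits 5:0) are my understanding of the AArch64 ESR layout for data aborts.

```cpp
#include <cstdint>

// Hypothetical container for the ESR fields relevant to a data abort.
struct EsrFields {
  std::uint32_t exception_class;  // ESR bits [31:26] (EC)
  bool is_write;                  // ISS bit 6 (WnR): true means a write access
  std::uint32_t fault_status;     // ISS bits [5:0] (DFSC)
};

// Extract the three fields from a raw 32-bit ESR value.
EsrFields DecodeEsr(std::uint32_t esr) {
  return EsrFields{(esr >> 26) & 0x3F, ((esr >> 6) & 0x1) != 0, esr & 0x3F};
}
```

For 0x92000046 this gives EC = 0x24 (data abort), WnR = 1 (write), and DFSC = 0x06 (a translation fault), which matches the "byte write Translation fault" text in the crash log and is consistent with a write to unmapped address 0x0.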

Dart/Swift call chain that triggered it (Thread 11):

  InferenceSession.generateResponse(prompt:)  (InferenceModel.swift:144)
  closure #1 in PlatformServiceImpl.generateResponse(completion:)  (FlutterGemmaPlugin.swift:239)

Flutter usage:

  await FlutterGemma.installModel(
    modelType: ModelType.gemmaIt,
    fileType: ModelFileType.litertlm,
  ).fromNetwork(
    'https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/resolve/main/gemma-4-E2B-it.litertlm',
  ).install();

  final model = await FlutterGemma.getActiveModel(maxTokens: 4096);
  final session = await model.createSession(temperature: 0.7, topK: 40);
  await session.addQueryChunk(Message.text(text: prompt, isUser: true));
  final response = await session.getResponse();

What I've tried:

  • Confirmed the model file downloads successfully (no 401, file is valid)
  • Increased maxTokens from 1024 to 4096 -- crash persists
  • Reproduced across two separate builds

Expected behavior: Inference completes and returns a response.
Actual behavior: App crashes with SIGSEGV. LlmLiteRTExecutor::PrefillInternal calls memset on a null tensor buffer with no null check.

Notes:

  • The .litertlm format uses LlmLiteRTExecutor. I have not tested a .task format model on this device, so it's possible this is specific to the LiteRT executor path.
  • The crash is in native MediaPipe C++ code, and I don't believe it can be caught by Flutter/Dart error handling. Catching it wouldn't help me much anyway, as the app revolves around Gemma usage :)
