Bug: Android GPU samplers silently fall back to CPU — ~3× decode slowdown
On Android arm64 (flutter_gemma 0.15.0, native-v0.11.0-a), the OpenCL and WebGPU top-k samplers both fail to dlopen at runtime and the engine falls back to CPU sampling:
W/native: OpenCL sampler not available, falling back to statically linked C API: UNAVAILABLE
W/native: └ Could not load shared library libLiteRtTopKOpenClSampler.so:
dlopen failed: cannot locate symbol "LiteRtCreateEnvironment"
referenced by ".../base.apk!/lib/arm64-v8a/libLiteRtTopKOpenClSampler.so"
W/native: WebGPU sampler not available, falling back to statically linked C API: UNAVAILABLE
W/native: └ Could not load shared library libLiteRtTopKWebGpuSampler.so:
dlopen failed: cannot locate symbol "LiteRtCreateEnvironment" ...
W/native: GPU sampler unavailable. Falling back to CPU sampling.
Root cause (verified with llvm-readelf on the bundled .sos)
The two sampler libraries in prebuilt/android_arm64/ reference LiteRtCreateEnvironment as undefined, but their DT_NEEDED list contains only libm / libdl / liblog / libc:
$ llvm-readelf -d libLiteRtTopKOpenClSampler.so | grep NEEDED
(NEEDED) Shared library: [libm.so]
(NEEDED) Shared library: [libdl.so]
(NEEDED) Shared library: [liblog.so]
(NEEDED) Shared library: [libc.so]
The symbol is exported (GLOBAL DEFAULT) by the flutter_gemma-rebuilt libLiteRtLm.so, but Bionic (Nougat+) uses per-library linker namespaces — undefined references are only resolved against the caller's NEEDED chain, not against arbitrary already-loaded libs in the process. So even with libLiteRtLm.so already in the process, dlopen("libLiteRtTopKOpenClSampler.so") fails.
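One way to re-check both halves of this claim is to query each library's dynamic symbol table with llvm-nm. The following is a minimal sketch and not part of the build: `symbol_status` is a hypothetical helper, and the `NM` override (defaulting to `llvm-nm`) is an assumption so another compatible nm binary can be substituted.

```shell
# Hypothetical helper: classify how a shared library relates to a dynamic symbol.
# Prints "defined", "undefined", or "absent".
# NM is an assumed override for the nm binary; it defaults to llvm-nm.
symbol_status() {
  local lib="$1" sym="$2" nm="${NM:-llvm-nm}"
  # The last field of each nm output line is the symbol name, so match on it exactly.
  if "$nm" -D --defined-only "$lib" | awk -v s="$sym" '$NF == s { f = 1 } END { exit !f }'; then
    echo "defined"
  elif "$nm" -D --undefined-only "$lib" | awk -v s="$sym" '$NF == s { f = 1 } END { exit !f }'; then
    echo "undefined"
  else
    echo "absent"
  fi
}
```

If the analysis above holds, `symbol_status libLiteRtTopKOpenClSampler.so LiteRtCreateEnvironment` should print `undefined`, and the same query against `libLiteRtLm.so` should print `defined`.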
This matches the upstream report google-ai-edge/LiteRT-LM#2211, which measured a 2.87× end-to-end decode speedup (3.03 → 8.70 tok/s) after patching in the missing DT_NEEDED.
Why this affects flutter_gemma specifically
native/litert_lm/build_android.sh deliberately links LiteRt symbols statically into the rebuilt libLiteRtLm.so (see the comment at lines 96–102), so the upstream patchelf --add-needed libLiteRt.so workaround does not apply verbatim — there is no libLiteRt.so in flutter_gemma's Android distribution. The correct patch target for this distribution is libLiteRtLm.so.
Reproduction
- App on flutter_gemma 0.15.0, real Android arm64 device.
- Run any Gemma 4 inference path that decodes more than a few tokens.
- Observe the warnings above in logcat and ~2–3 tok/s decode on Gemma 4 E2B INT4.
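To make the last step scriptable, the fallback warning can be detected in a logcat dump. This is only a sketch: `sampler_fell_back` is a hypothetical helper name, and it matches the exact warning text quoted in the log above, so it would need updating if the message wording changes.

```shell
# Hypothetical helper: reads logcat text on stdin and succeeds (exit 0)
# if the CPU-fallback warning quoted in this report is present.
sampler_fell_back() {
  # -F treats the pattern as a literal string, so the dots are not wildcards.
  grep -qF 'GPU sampler unavailable. Falling back to CPU sampling.'
}
```

With a device attached, `adb logcat -d | sampler_fell_back && echo "CPU fallback detected"` would dump the current log buffer and report whether the fallback occurred.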
Proposed fix
In native/litert_lm/build_android.sh, after step 8 (the companion-lib copy), run patchelf --add-needed libLiteRtLm.so on both sampler .sos.
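A minimal sketch of what that step could look like follows. The function name `add_needed_to_samplers` and the directory argument are assumptions for illustration, not existing parts of build_android.sh; the only patchelf options used are `--print-needed` and `--add-needed`. The check before patching keeps the step idempotent across repeated builds.

```shell
# Hypothetical build step: make both sampler libraries declare libLiteRtLm.so
# as a DT_NEEDED dependency. The directory holding the .so files is passed as $1.
add_needed_to_samplers() {
  local dir="$1" lib path
  for lib in libLiteRtTopKOpenClSampler.so libLiteRtTopKWebGpuSampler.so; do
    path="$dir/$lib"
    if [ ! -f "$path" ]; then
      echo "skip: $path not found" >&2
      continue
    fi
    # Only patch when the entry is missing, so reruns do not add duplicates.
    if patchelf --print-needed "$path" | grep -qx 'libLiteRtLm.so'; then
      echo "ok: $lib already needs libLiteRtLm.so"
    else
      patchelf --add-needed libLiteRtLm.so "$path"
      echo "patched: $lib"
    fi
  done
}
```

It could be invoked as `add_needed_to_samplers prebuilt/android_arm64`, after which `llvm-readelf -d` on each sampler should list libLiteRtLm.so among the NEEDED entries.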