Project memory for this repo. When you change anything that affects future work — architecture, security behavior, new features, deprecated paths, public APIs, native JNI contracts, new scope — update this file as part of the same change. A future session reads this to reconstruct intent; if it drifts, work breaks.
Privacy-first, offline-only on-device AI assistant. No Google Play services, no network telemetry, no analytics. In-scope pillars: on-device LLM chat, RAG over user documents, vision-language models (VLM), voice (TTS+STT), Remote Server with bundled web UI, HF Explorer, on-device image generation / img2img / inpaint / 4× upscale via the :ai_sd AAR (re-pivoted on 2026-05-08), first-party plugin system with ONNX inference + capability-gated APIs (re-pivoted on 2026-05-11). Out of scope: tool calling, Termux integration. (Image generation was originally cut on 2026-04-20 and re-added on 2026-05-08. Plugin marketplace was also originally cut on 2026-04-20 and re-added on 2026-05-11 as a first-party plugin runtime: DexClassLoader, Plugin contract with @Composable Content(), capability-gated OnnxApi/HxsApi/NetworkApi, floating plugin dock with smooth switch transitions.)
Package: com.dark.tool_neuron · minSdk 29 · targetSdk 36 · abiFilters arm64-v8a, x86_64.
Modules:
- :app — UI (Compose), viewmodels, DI graph, activities, services.
- :hxs — encrypted key-value store (Kotlin wrapper + C++ core).
- :hxs_encryptor — crypto + integrity primitives: Argon2id, AES-GCM/ChaCha20-Poly1305, BoringSSL, ML-KEM-768, ML-DSA-65, Ed25519, HKDF, mmap+mlock SecureBuffer, plus the native security policy / auth / boot-integrity stack.
- :native-server — embedded OpenAI-compatible HTTP server (cpp-httplib + nlohmann/json header-only via FetchContent, no BoringSSL / OpenSSL / zlib dep). Powers Remote Server mode.
- :plugin-api — pure-Kotlin contract plugin authors compile against. Plugin interface with lifecycle + @Composable Content(), PluginContext, PluginCapability enum, PluginManifest, OnnxApi, HxsApi, NetworkApi. Compose deps are compileOnly; host provides them at runtime via classloader-parent.
- :plugin-exc — plugin host runtime. PluginExecutor, PluginLoader (DexClassLoader from <filesDir>/plugins/<id>/classes.dex + .so detection across SUPPORTED_ABIS), PluginRegistry, PluginInstance, CapabilityGate (PolicyEngine.hasSession + manifest cap check), PluginContainerActivity (hosts plugin's Content() with AnimatedContent fade+scale plugin-switch transitions), PluginDock (floating M3-expressive surface with circle chips per open plugin, tertiary-corner native-code badge), OnnxApiImpl (wraps onnxruntime-android 1.20.0, gated by AI_ONNX), HxsApiImpl (per-plugin collection plugin_<id>, indexed string keys, gated by HXS_READ/WRITE), NetworkApiImpl (WebNative.fetchBytes GET, gated by INTERNET).
- :download_manager, :networking — ancillary modules.
Prebuilt AARs in libs/:
- gguf_lib-release.aar — chat + VLM + embedding inference engine.
- ai_sherpa-release.aar — TTS / STT.
- ai_sd-release.aar — Stable Diffusion (text-to-image / img2img / inpaint / 4× upscale) via QNN on Snapdragon NPU and MNN on CPU. Currently the debug AAR is shipped because the release AAR's R8 minified the StableDiffusionManager.Companion.getInstance accessor; consumer-rules in :ai_sd need a -keep class com.dark.ai_sd.StableDiffusionManager$Companion { *; } before we can switch to release.
- Dev (debug): ./gradlew :app:compileDebugKotlin to verify compilation. ./gradlew :app:installDebug to install. Never assemble + adb install.
- Release: built from Android Studio, signed via local.properties keys TN_KEYSTORE_PATH / TN_KEYSTORE_PASSWORD / TN_KEY_ALIAS / TN_KEY_PASSWORD. If any are missing, release falls back to unsigned so dev flow isn't blocked.
- Native: ./gradlew :hxs_encryptor:externalNativeBuildDebug — BoringSSL + liboqs fetched via CMake FetchContent. The LSP flags 'openssl/mem.h' not found etc. as false positives — build-green is the source of truth, not clangd.
- Instrumented tests: ./gradlew :app:connectedDebugAndroidTest.
- HXS-only persisted storage. No SharedPreferences, Room, DataStore, or raw files. The only exception is app_bootstrap/k.bin (XOR-masked raw blob holding the Keystore-wrapped DEK) — it has to live outside the encrypted vault by construction.
- Security logic lives in C++/JNI. Kotlin wraps native; every auth / trust / policy decision is made native and crosses JNI as opaque token or bool. No boolean-trust-through-JNI, no Kotlin if (verify) gating.
- No comments in source except one-liner // for non-obvious WHY. No decorative banners, no block comments, no docstrings on internal/private. Names and structure must be self-documenting.
- Never write fully-qualified class names inline. import at the top; short name in the body.
- ViewModels live under com.dark.tool_neuron.viewmodel. Never co-locate a VM with its screen.
- Commit hygiene: conventional commits, no Co-Authored-By trailer, never push without explicit ask, never skip hooks. Don't commit unless the user explicitly asks.
- Research / exploration subagents run on Sonnet at low effort — not Opus — unless the user overrides.
- No TODOs, stubs, or half-implementations. Every task is coded end-to-end.
- When you change security-affecting state, update CLAUDE.md in the same change.
- One Scaffold only — the root AppScaffold. Screens take innerPadding: PaddingValues and render plain Column / LazyColumn / Box. Per-route top bars go in AppTopBar.kt's when, bottom bars in AppBottomBar.kt's when.
- Library modules must NOT minify. Only :app minifies. R8 collides on Type a.a is defined multiple times against pre-minified prebuilt jars (e.g. gguf_lib-release-runtime.jar) if libraries also pre-minify. Library rules go in each module's consumer-rules.pro.
- No spec/plan/research docs in the repo. Project memory belongs here. Implementation roadmaps belong in conversation context, not in *.md files at the repo root.
- UI: PasswordScreen / SetupPasswordScreen wrapped in SecureScreen (FLAG_SECURE). AppScaffold watches shouldLock and re-routes to PasswordScreen.
- App Kotlin: SecurityManager (only auth API the app consumes), SessionHolder (opaque 32-byte token), AppLockObserver (ProcessLifecycleOwner; clears on ON_STOP), NativeIntegrity (TOFU .so hashes + APK signer capture), AccessibilityGuard, RootGuard, PinStrength, AppPreferences (encrypted HXS + sealed AuthState), AppKeyStore (Android Keystore wrap/unwrap of DEK).
- Native (libhxs_encryptor.so): PolicyEngine — single is_allowed(feature_id, token) gate; AuthNative (Argon2id setup/verify, emits session token); BootIntegrity (JNI_OnLoad hooks, hook-baseline capture); IntegrityGuard (debugger / frida / xposed / sig / hash); CryptoEngine (AEAD, HKDF, Argon2id, Ed25519, X25519); HybridKem / HybridSign (X25519+ML-KEM-768, Ed25519+ML-DSA-65).
- AppKeyStore reads / writes filesDir/app_bootstrap/k.bin. Layout: [magic "TNDK"(4)][version(1)][iv_len(2)][iv][ct_len(2)][ct] masked byte-wise with a 32-byte hardcoded XOR key. XOR is obfuscation; the cryptographic protection is the Keystore-wrapped ciphertext inside.
- Keystore alias toolneuron_vault_dek_v1: AES-256-GCM, setIsStrongBoxBacked(true) with StrongBoxUnavailableException fallback to TEE. NOT setUserAuthenticationRequired(true) (chicken-and-egg with setup flow).
- First launch: generate 32-byte DEK via SecureRandom, wrap, write XOR-masked blob.
- Every launch: read, unmask, parse, unwrap.
- AppKeyStore.backing() classifies via KeyInfo.securityLevel (API 31+) or KeyInfo.isInsideSecureHardware → STRONGBOX / TEE / SOFTWARE_FALLBACK / UNKNOWN.
- AppKeyStore.wipe() deletes k.bin + Keystore alias.
- Legacy migration: if k.bin is missing but app_bootstrap/ has any other file, wipe app_bootstrap/* + app_prefs/* + Keystore alias, then re-bootstrap.
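The k.bin envelope described above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, the length fields are assumed to be big-endian, and the mask is assumed to repeat every 32 bytes. None of those details are stated in this file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BootstrapBlobSketch {
    static final byte[] MAGIC = "TNDK".getBytes(StandardCharsets.US_ASCII);

    // XOR is its own inverse, so the same routine masks and unmasks.
    static byte[] xorMask(byte[] data, byte[] maskKey) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ maskKey[i % maskKey.length]);
        }
        return out;
    }

    // Lays out [magic][version][iv_len][iv][ct_len][ct], then masks the whole blob.
    static byte[] encode(int version, byte[] iv, byte[] ct, byte[] maskKey) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + 2 + iv.length + 2 + ct.length);
        buf.put(MAGIC).put((byte) version);
        buf.putShort((short) iv.length).put(iv);
        buf.putShort((short) ct.length).put(ct);
        return xorMask(buf.array(), maskKey);
    }

    // Returns {iv, ct}; throws on a bad magic or a truncated blob.
    static byte[][] decode(byte[] masked, byte[] maskKey) {
        ByteBuffer buf = ByteBuffer.wrap(xorMask(masked, maskKey));
        byte[] magic = new byte[4];
        buf.get(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IllegalArgumentException("bad magic");
        }
        buf.get(); // version byte, not interpreted in this sketch
        byte[] iv = new byte[buf.getShort() & 0xFFFF];
        buf.get(iv);
        byte[] ct = new byte[buf.getShort() & 0xFFFF];
        buf.get(ct);
        return new byte[][] {iv, ct};
    }
}
```

The returned ct would still need the Keystore unwrap step; the mask only hides the structure from casual inspection.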
Every per-vault user-key is derived as HKDF(ikm = DEK, salt = installSignerHash, info = "tn.<scope>.user_key.v2"). The signer hash is SHA-256(packageInfo.signingInfo.firstSigner.toByteArray()), computed once per process via AppKeyStore.installSignerHash() and cached in cachedSignerHash (cleared on wipe()). On API < 28 it falls back to GET_SIGNATURES.
Why salt-bind to the signer: Keystore-wrapped DEK is already device-bound (a different device cannot unwrap k.bin). Signer-binding closes the same-device, replaced-APK attack — root + repack ToolNeuron with an attacker cert + boot it on the legit device. The Keystore alias is uid-scoped, so the patched APK can unwrap the DEK; but its signing certificate hashes to a different value, so every user-key derived under the attacker build is wrong. AEAD records fail to decrypt. The repo's openOrRebuild helper detects the open failure and wipes the vault, so the attacker gets a fresh empty vault — the original encrypted bytes on disk stay sealed under the legitimate signer's user-key forever.
If getPackageInfo(... GET_SIGNING_CERTIFICATES) returns null/empty (some weird OEM, broken install), installSignerHash() throws SecurityException and the app refuses to bootstrap. Don't add a fallback that returns zeros — that would defeat the binding.
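A minimal sketch of the signer-bound derivation, assuming HKDF with HMAC-SHA256 as in RFC 5869 (the hash function is not named in this file). UserKeySketch and its methods are hypothetical names; the real derivation lives in the native CryptoEngine.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class UserKeySketch {
    // HMAC-SHA256 over the concatenation of parts; checked exceptions are wrapped.
    static byte[] hmac(byte[] key, byte[]... parts) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            for (byte[] part : parts) {
                mac.update(part);
            }
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // RFC 5869: extract a PRK with the salt, then expand it with the info string.
    static byte[] hkdf(byte[] ikm, byte[] salt, byte[] info, int length) {
        byte[] prk = hmac(salt, ikm);
        byte[] okm = new byte[length];
        byte[] block = new byte[0];
        int written = 0;
        for (int counter = 1; written < length; counter++) {
            block = hmac(prk, block, info, new byte[] {(byte) counter});
            int n = Math.min(block.length, length - written);
            System.arraycopy(block, 0, okm, written, n);
            written += n;
        }
        return okm;
    }

    // ikm = DEK, salt = installSignerHash, info = "tn.<scope>.user_key.v2".
    static byte[] userKey(byte[] dek, byte[] signerHash, String scope) {
        String info = "tn." + scope + ".user_key.v2";
        return hkdf(dek, signerHash, info.getBytes(StandardCharsets.UTF_8), 32);
    }
}
```

With the same DEK, a different signer hash yields an unrelated key, which is the property the replaced-APK defense relies on.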
| Vault dir | Sealed under | Notes |
|---|---|---|
| app_bootstrap/k.bin | Android Keystore alias toolneuron_vault_dek_v1 (StrongBox/TEE), wrapped in XOR-masked envelope | Format: [magic "TNDK"(4)][version(1)][iv_len(2)][iv][ct_len(2)][ct], byte-XOR with hardcoded 32-byte key. The XOR is obfuscation; the cryptographic protection is the Keystore-wrapped ct. |
| app_prefs/ | tn.app_prefs.user_key.v2 | All preferences. AuthState rides a second AEAD layer keyed tn.app_prefs.auth_key.v2, AAD "tn.auth_state.v1". |
| chat_store_v2/ | tn.chats.user_key.v2 | Chats + messages. Replaces the legacy plaintext chat_store/ (deleted on first v2 boot). |
| chat_documents_meta_v1/ | tn.chat_documents.user_key.v2 | RAG document metadata (id, name, mime, chunk count, sourceId). Dir name is historical — the user-key is v2. |
| chat_documents/sources_v2/ | per-file AEAD via SourceFileVault | Each <sourceId>.bin is [iv(12)][ct][tag(16)] AEAD blob. Per-file key is HKDF(DEK, salt=signerHash, info="tn.chat_doc_source.user_key.v2@<sourceId>"). AAD = sourceId UTF-8 bytes (rename → decrypt fails). Replaces the legacy plaintext chat_documents/sources/ (deleted on first v2 boot). |
| rag_keyword_v1/ | tn.rag_keyword.user_key.v2 | BM25 inverted-index records. |
| download_history_v1/ | tn.download_history.user_key.v2 | Download history (id, displayName, type, status, totalBytes, completedAt, error). Capped at 50 newest; oldest pruned on insert. Fresh-created on first launch of the Downloads-screen build; no migration. |
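The per-file layout for chat_documents/sources_v2/ can be illustrated with the JDK's AES-GCM. This is a sketch under assumptions: which AEAD SourceFileVault actually uses is not stated here (the crypto module lists both AES-GCM and ChaCha20-Poly1305), and SourceFileSketch, seal and open are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Optional;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class SourceFileSketch {
    static final int IV_LEN = 12;
    static final int TAG_BITS = 128;

    // Binds the sourceId as AAD so a blob only opens under its original name.
    static Cipher cipher(int mode, byte[] key, byte[] iv, String sourceId)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(TAG_BITS, iv));
        c.updateAAD(sourceId.getBytes(StandardCharsets.UTF_8));
        return c;
    }

    // Produces [iv(12)][ct][tag(16)]; the JCE appends the tag to the ciphertext.
    static byte[] seal(byte[] key, String sourceId, byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            new SecureRandom().nextBytes(iv);
            byte[] ctAndTag = cipher(Cipher.ENCRYPT_MODE, key, iv, sourceId).doFinal(plain);
            byte[] blob = Arrays.copyOf(iv, IV_LEN + ctAndTag.length);
            System.arraycopy(ctAndTag, 0, blob, IV_LEN, ctAndTag.length);
            return blob;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Empty result means authentication failed: wrong key, wrong sourceId, or tampering.
    static Optional<byte[]> open(byte[] key, String sourceId, byte[] blob) {
        try {
            byte[] iv = Arrays.copyOf(blob, IV_LEN);
            Cipher c = cipher(Cipher.DECRYPT_MODE, key, iv, sourceId);
            return Optional.of(c.doFinal(blob, IV_LEN, blob.length - IV_LEN));
        } catch (AEADBadTagException e) {
            return Optional.empty();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Opening a blob under a different sourceId returns an empty result, which mirrors the "rename → decrypt fails" note in the table.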
v1 → v2 migration is destructive. Each repo's openOrRebuild tries openEncrypted with the v2 key. If that fails (existing v1 data sealed under the old non-signer-bound key), it wipes the vault dir and re-creates fresh. On first launch with a v2 build, an existing user loses their PIN, chat history, and RAG attachments — one time. The Keystore alias is preserved (so the DEK is still the same), only the per-vault user-keys change.
HexStorage.openEncrypted(path, appKey=DEK, userKey=HKDF(DEK, salt=signerHash, info="tn.app_prefs.user_key.v2"), encryptor). Auth-critical state rides a second AEAD layer: writeAuthState/readAuthState use key HKDF(DEK, salt=signerHash, info="tn.app_prefs.auth_key.v2") and AAD "tn.auth_state.v1". Ordinary flags (onboarding_complete, tc_accepted, setup_done, server settings, etc.) are plain encrypted records.
version(1) = 4
security_mode(1) — 0=NONE, 1=APP_PASSWORD
salt_len(2) + salt
hash_len(2) + hash
failed_attempts(2)
next_attempt_at_ms(8)
has_panic(1)
if has_panic:
panic_salt_len(2) + panic_salt
panic_hash_len(2) + panic_hash
last_seen_now_ms(8) — monotonic wall-clock anchor
Decoder accepts v1/v2/v3/v4 and zero-fills missing fields. Bump AuthState.VERSION and the decoder when extending.
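As a sketch of the v4 layout listed above, the encoder and decoder below serialize the fields in the stated order. Assumptions: multi-byte integers are big-endian, and only v4 is handled (the v1 to v3 field sets are not described in this file). AuthStateSketch is a hypothetical name, not the real AuthState class.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

final class AuthStateSketch {
    int securityMode;
    byte[] salt = new byte[0], hash = new byte[0];
    int failedAttempts;
    long nextAttemptAtMs;
    byte[] panicSalt, panicHash; // null when no panic PIN is set
    long lastSeenNowMs;

    byte[] encode() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(4); // version
            out.writeByte(securityMode);
            writeBlob(out, salt);
            writeBlob(out, hash);
            out.writeShort(failedAttempts);
            out.writeLong(nextAttemptAtMs);
            boolean hasPanic = panicSalt != null && panicHash != null;
            out.writeByte(hasPanic ? 1 : 0);
            if (hasPanic) {
                writeBlob(out, panicSalt);
                writeBlob(out, panicHash);
            }
            out.writeLong(lastSeenNowMs);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static AuthStateSketch decode(byte[] raw) {
        ByteBuffer in = ByteBuffer.wrap(raw);
        if (in.get() != 4) throw new IllegalArgumentException("sketch handles v4 only");
        AuthStateSketch s = new AuthStateSketch();
        s.securityMode = in.get() & 0xFF;
        s.salt = readBlob(in);
        s.hash = readBlob(in);
        s.failedAttempts = in.getShort() & 0xFFFF;
        s.nextAttemptAtMs = in.getLong();
        if (in.get() == 1) {
            s.panicSalt = readBlob(in);
            s.panicHash = readBlob(in);
        }
        s.lastSeenNowMs = in.getLong();
        return s;
    }

    // Length-prefixed blob: 2-byte length followed by the bytes.
    static void writeBlob(DataOutputStream out, byte[] b) throws IOException {
        out.writeShort(b.length);
        out.write(b);
    }

    static byte[] readBlob(ByteBuffer in) {
        byte[] b = new byte[in.getShort() & 0xFFFF];
        in.get(b);
        return b;
    }
}
```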
- hxs::auth::setup(pin) → {salt[16], hash[32]} — Argon2id t=4 / m=128 MiB (131072 KiB) / p=1 / outLen=32.
- hxs::auth::verify(pin, salt, stored_hash) → 32-byte session_token | null. Constant-time CRYPTO_memcmp, then policy::register_session(token).
- hxs::auth::invalidate() → policy::invalidate_session().
Every gated call: PolicyEngine.isAllowed(Feature, sessionToken). Native logic order:
1. tampered → false (latched one-way).
2. is_pro_feature(fid) (fid ≥ 1000) → false. This is the flip-point for the future license system.
3. is_unauth_feature(fid) (APP_LAUNCH, OPEN_VAULT, AUTH_SETUP, AUTH_VERIFY, UI_PASSWORD_SCREEN, UI_SETUP_SCREEN, UI_INTRO) → true.
4. passthrough is on (set when security_mode == NONE) → true.
5. Else require session_active && CRYPTO_memcmp(stored_token, given_token, 32) == 0.
State mutations: register_session, invalidate_session, set_passthrough, mark_tampered, reset_for_testing (test-only). Feature IDs in policy_engine.h mirrored as PolicyEngine.Feature in Kotlin — keep in sync.
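The decision order can be modelled in a few lines. This is a JVM illustration only; the real gate is native, and by this repo's rules such logic must not be reimplemented in Kotlin or Java. PolicySketch and the numeric feature IDs used with it are hypothetical, and MessageDigest.isEqual stands in for CRYPTO_memcmp as the constant-time comparison.

```java
import java.security.MessageDigest;
import java.util.Set;

final class PolicySketch {
    boolean tampered;      // one-way latch in the real engine
    boolean passthrough;   // set when security_mode == NONE
    byte[] sessionToken;   // null when no session is active
    final Set<Integer> unauthFeatures;

    PolicySketch(Set<Integer> unauthFeatures) {
        this.unauthFeatures = unauthFeatures;
    }

    // Checks run in the documented order; the first matching rule decides.
    boolean isAllowed(int featureId, byte[] givenToken) {
        if (tampered) return false;
        if (featureId >= 1000) return false; // PRO_* range
        if (unauthFeatures.contains(featureId)) return true;
        if (passthrough) return true;
        return sessionToken != null
                && givenToken != null
                && MessageDigest.isEqual(sessionToken, givenToken);
    }
}
```

Because the tamper check comes first, even features that need no authentication are denied once the latch is set.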
LockoutPolicy.backoffMillis(failed) — first 3 free, then 1m → 5m → 15m → 1h → 4h → 12h → 24h. WIPE_THRESHOLD = 10 triggers SecurityManager.hardWipe().
Clock-rollback defense: AuthState.lastSeenNowMs updated on every verify; if nowMs + CLOCK_SKEW_GRACE_MS (5 min) < lastSeenNowMs, the attempt is double-penalized and backoff extends from max(nowMs, lastSeenNowMs).
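A sketch of the backoff table and the rollback check. The reading of "first 3 free" as "failure counts 1 to 3 incur no delay, the 4th costs 1 minute" is an assumption, chosen because it lines the seven steps up with WIPE_THRESHOLD = 10. LockoutSketch and nextAttemptAtMs are hypothetical names.

```java
final class LockoutSketch {
    static final long MINUTE = 60_000L;
    static final long HOUR = 60 * MINUTE;
    static final long[] STEPS = {
        MINUTE, 5 * MINUTE, 15 * MINUTE, HOUR, 4 * HOUR, 12 * HOUR, 24 * HOUR
    };
    static final int FREE_ATTEMPTS = 3;
    static final int WIPE_THRESHOLD = 10;
    static final long CLOCK_SKEW_GRACE_MS = 5 * MINUTE;

    // Delay owed after the given number of consecutive failures.
    static long backoffMillis(int failed) {
        if (failed <= FREE_ATTEMPTS) return 0L;
        int index = Math.min(failed - FREE_ATTEMPTS - 1, STEPS.length - 1);
        return STEPS[index];
    }

    // True when the wall clock sits more than the grace window behind the anchor.
    static boolean clockRolledBack(long nowMs, long lastSeenNowMs) {
        return nowMs + CLOCK_SKEW_GRACE_MS < lastSeenNowMs;
    }

    // Backoff extends from the later of "now" and the last-seen anchor,
    // so rolling the clock back cannot shorten the wait.
    static long nextAttemptAtMs(int failed, long nowMs, long lastSeenNowMs) {
        return Math.max(nowMs, lastSeenNowMs) + backoffMillis(failed);
    }
}
```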
hardWipe(): session.clear() → PolicyEngine.invalidateSession() → PolicyEngine.markTampered() → prefs.clearAuthState() → keyStore.wipe(). keyStore.wipe() is scorched-earth: clears the cached DEK reference, recursively deletes everything under filesDir (models, voice, chat_store, chat_documents, model_store, plugin_store, rag_prefs, app_prefs, app_bootstrap, config, cache subtree) and cacheDir, then removes the Keystore alias. After hardWipe the app is in an unrecoverable in-process state (markTampered latches, files are gone), so the user must hit Restart on WipedScreen to bootstrap fresh.
Panic PIN: SecurityManager.setPanicPin(pin) writes a second Argon2id hash. The gate is securityMode == APP_PASSWORD only — the live-session check was removed because ProcessLifecycleOwner.ON_STOP (notification panel pull, brief focus loss) clears the session via AppLockObserver while a Compose dialog stays visible above the locked screen, producing a non-deterministic "Couldn't set panic PIN" failure when the user submits. The panic-PIN UI is reachable only from Settings, which is itself gated by shouldLock → PasswordScreen re-routing, so the persistent "lock is set" fact is sufficient. verifyPassword tries real first; on mismatch, if hasPanic, tries panic. Panic match → hardWipe() + VerifyResult.Wiped (UX-indistinguishable from "attempts exceeded"). clearPanicPin and disableLock use the same securityMode == APP_PASSWORD gate for the same reason.
PIN rules: 6 digits exactly. Weak PINs (all-same, monotonic ±1, top-20 commons) rejected at setup via PinStrength.evaluate.
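The PIN rules can be illustrated as below. The COMMON set is a short hypothetical stand-in, since this file does not enumerate the top-20 list, and the monotonic check assumes no wrap-around (890123 is not treated as a run). PinStrengthSketch is not the real PinStrength API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class PinStrengthSketch {
    // Hypothetical stand-in for the top-20 commons list.
    static final Set<String> COMMON =
            new HashSet<>(Arrays.asList("121212", "112233", "123123", "000007"));

    static boolean isSixDigits(String pin) {
        return pin.matches("[0-9]{6}");
    }

    static boolean allSame(String pin) {
        return pin.chars().allMatch(c -> c == pin.charAt(0));
    }

    // True for runs such as 123456 or 987654: every step is +1, or every step is -1.
    static boolean monotonic(String pin) {
        int step = pin.charAt(1) - pin.charAt(0);
        if (Math.abs(step) != 1) return false;
        for (int i = 2; i < pin.length(); i++) {
            if (pin.charAt(i) - pin.charAt(i - 1) != step) return false;
        }
        return true;
    }

    static boolean isAcceptable(String pin) {
        return isSixDigits(pin) && !allSame(pin) && !monotonic(pin) && !COMMON.contains(pin);
    }
}
```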
JNI_OnLoad in hxs_encryptor.cpp:
1. boot::install_ptrace_self_trace() — PTRACE_TRACEME.
2. boot::capture_hook_baselines() — first 32 bytes of auth::verify, policy::is_allowed, boot::hard_fail.
TNApplication.onCreate() in main process:
1. integrity.scanProcessEnvironment() — debugger / frida / xposed. Only FAIL_DEBUGGER | FAIL_FRIDA → BootIntegrity.hardFail(reasons). FAIL_XPOSED is not a hard fail — it's recorded into TNApplication.softEnvReasons and surfaced as tamperEvidence in the one-time RootWarningDialog later.
2. integrity.bootVerify() — TOFU .so hash walk rebound to install identity. Manifest layout v2: version(1) + signerHashLen(2)+signerHash + versionCode(8) + lastUpdateTime(8) + count(4) + (nameLen(2)+filename + hashLen(2)+hash)*. If {signerHash, longVersionCode, lastUpdateTime} differs (or no manifest), re-TOFU and store. Within the same install identity, filename-set + hash mismatch → FAIL_LIB_HASH → hard fail. Filenames are stored (not absolute paths) since Android reshuffles /data/app/~~…/. apk_signer_hash_v1 is also written for the future license-binding path.
3. BootIntegrity.verifyHookBaselines() — re-reads the prologue and compares; catches inline hooks.
4. accessibilityGuard.scan() — does NOT hard-fail any longer. Suspicious packages flow through ScaffoldViewModel into RootWarning.a11yPackages for the same one-time warning dialog.
5. appLockObserver.register().
Root detection was removed from the boot path. RootGuard.scan() runs from ScaffoldViewModel on first launch; if rooted and rootWarningShown == false, AppScaffold shows a one-time RootWarningDialog. Acknowledging flips the flag.
The same one-time-warning treatment now also covers two adjacent "rooted-user" tamper signals that previously hard-failed the app:
- Xposed / LSPosed / Riru — scan_process_environment still detects the /proc/self/maps substrings, but TNApplication.onCreate only hard-fails on FAIL_DEBUGGER | FAIL_FRIDA. A standalone FAIL_XPOSED bit is recorded into TNApplication.softEnvReasons (process-singleton, internal set) and surfaced as tamperEvidence in RootWarningDialog. FAIL_DEBUGGER and FAIL_FRIDA remain hard fails — those are active attack tools, not just a user-installed framework.
- Suspicious accessibility services — accessibilityGuard.scan() no longer hard-fails in release on SuspiciousAttached. The packages flow into RootWarning.a11yPackages and render as a third paragraph in the same dialog.
ScaffoldViewModel.RootWarning is (rootEvidence, tamperEvidence, a11yPackages) — three independent Set<String>. The dialog renders a paragraph per non-empty section with a single "I understand" button. acknowledgeRootWarning() flips rootWarningShown once and silences all three sources for that install.
BootIntegrity.hardFail(reason) → PolicyEngine.markTampered() + _exit(1), unless setRelaxedForTesting(true) (tests).
All detection strings in integrity.cpp use HXS_OBF(var, "literal") (compile-time XOR). Verified clean: strings libhxs_encryptor.so | grep -iE '^(frida|gadget|linjector|xposed|TracerPid)$' returns nothing.
- SessionHolder.active: StateFlow<Boolean> flips on AuthNative.verify success, off on clear().
- AppLockObserver (a DefaultLifecycleObserver on ProcessLifecycleOwner) calls session.clear() on ON_STOP when security.isLockEnabled. Clear also calls AuthNative.invalidate().
- ScaffoldViewModel.shouldLock = security.isLockEnabled && !session.active.
- AppScaffold re-routes to PasswordScreen with popUpTo(0) { inclusive = true }, except when on a non-interruptible route (PasswordScreen, SetupScreen, IntroScreen).
ui/components/SecureScreen.kt adds FLAG_SECURE on enter, clears on dispose. Applied to PIN entry. Not global (so users can screenshot their own chats). util/SecureClipboard.kt sets EXTRA_IS_SENSITIVE (TIRAMISU+) and auto-clears after 30 s if the clip still matches.
app/build.gradle.kts: isMinifyEnabled = true, isShrinkResources = true, isDebuggable = false, isJniDebuggable = false. ProGuard (app/proguard-rules.pro) strips Log.d/v/i via assumenosideeffects (keeps w/e), -repackageclasses '', -allowaccessmodification. Manifest: allowBackup=false; TempActivity + ColorShowcaseActivity un-exported; InferenceService runs in :inference process.
The license plumbing is live: every gated feature routes through PolicyEngine.isAllowed(Feature, sessionToken). Feature IDs ≥ 1000 are PRO_* and currently return false. To enable monetization:
- Flip the is_pro_feature branch in policy_engine.cpp from return false to "verify signed license blob".
- License blob layout (planned): {device_id_hash, features_bitmap, expiry_unix, nonce, signature} — Ed25519 or ML-DSA-65 signed; public key XOR-baked into native at build time.
- APK signer SHA-256 is already captured on first launch and stored as apk_signer_hash_v1. That's the anchor.
- Device id must be an attested Keystore key fingerprint (hardware-rooted), NOT Settings.Secure.ANDROID_ID.
Settings.Secure.ANDROID_ID. - License blob lives in a separate HXS collection sealed under the same DEK.
- On tamper detection, invalidate any loaded license too — single fail-closed path.
- policy_engine.{h,cpp} — is_allowed, session registry, tamper latch, reset_for_testing.
- auth.{h,cpp} — setup(pin), verify(pin, salt, hash), hardened Argon2id.
- boot_integrity.{h,cpp} — env scan, lib-hash verify, hook-baseline capture/verify, hard_fail, setRelaxedForTesting.
- integrity.{h,cpp} — debugger / frida / xposed checks, file hashing, APK sig compare.
- xor_str.h — HXS_OBF(var, "literal") compile-time XOR.
- crypto_engine.{h,cpp} — AEAD, HKDF, PBKDF2, Argon2id, Ed25519, X25519, SHA-256.
- memory_guard.{h,cpp} — SecureBuffer (mmap+mlock+mprotect), secure_zero, secure_compare.
- pq_kem.{h,cpp} — X25519 + ML-KEM-768 hybrid KEM.
- pq_sign.{h,cpp} — Ed25519 + ML-DSA-65 hybrid signatures.
- hxs_encryptor.cpp — JNI bindings + JNI_OnLoad.
- CMakeLists.txt — fetches BoringSSL + liboqs; -march=armv8-a+crypto+sha2 on arm64; LTO/gc-sections/icf on release; -Wl,-z,max-page-size=16384 on every owned native CMake target.
HxsEncryptor.kt, PolicyEngine.kt, AuthNative.kt, BootIntegrity.kt.
AppKeyStore, SessionHolder, SecurityManager, SecurityModule, AppPreferences, AuthState, VerifyResult, LockoutPolicy, NativeIntegrity, AppLockObserver, AccessibilityGuard, RootGuard, PinStrength, KeyFingerprint.
- ui/components/SecureScreen.kt — FLAG_SECURE wrapper.
- ui/screens/password_screen/PasswordScreen.kt + setup_screen/SetupPasswordScreen.kt — wrapped in SecureScreen.
- ui/screens/setup_screen/SetupThemeScreen.kt — first-run theme + palette.
- ui/screens/system_ui/AppScaffold.kt — single Scaffold; auto-lock + server-lockdown re-routing.
- util/SecureClipboard.kt, util/VlmPaths.kt.
- ui/screens/guide/ — hub + 7 detail screens via GuideDetailLayout + GuideTopBar.
- ui/screens/home_screen/PlusMenu.kt — Documents / Thinking / Attach image (image disabled until isVlmLoaded).
- ui/screens/home_screen/HomeScreenBottomBar.kt — image-attach button + mic button + transcribe equalizer.
- ui/screens/server/ServerScreen.kt + ServerTopBar.kt — Remote Server config + token + status + request log.
- ui/screens/hf_explorer/{HfExplorerScreen,HfRepoDetailScreen}.kt — search / filter / repo browser.
app/build.gradle.kts, app/proguard-rules.pro, hxs_encryptor/build.gradle.kts, gradle/libs.versions.toml, app/src/main/AndroidManifest.xml.
Vision rides on top of an active GGUF chat model via a separate mmproj projector file. Image data crosses AIDL via ParcelFileDescriptor[] (1 MB binder limit forbids byte[]).
VLM models live as <modelsDir>/vlm/<repoLeaf>/{base.gguf, mmproj.gguf}. A HuggingFace repo is detected as VLM if any .gguf file in its tree has mmproj (case-insensitive) in its name. Downloads pull both files into the per-repo folder; loading the base auto-loads the colocated mmproj. There is no manual "load projector" UI.
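The detection rule amounts to a filename check over the repo tree. A minimal sketch, assuming the .gguf extension is also matched case-insensitively (the text only states that for mmproj); VlmDetectSketch is a hypothetical name, not the ModelCatalog API.

```java
import java.util.List;
import java.util.Locale;

final class VlmDetectSketch {
    // A projector file is any .gguf whose name contains "mmproj", ignoring case.
    static boolean isMmproj(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".gguf") && lower.contains("mmproj");
    }

    // A repo counts as VLM when at least one file in its tree is a projector.
    static boolean isVlmRepo(List<String> fileNames) {
        return fileNames.stream().anyMatch(VlmDetectSketch::isMmproj);
    }
}
```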
boolean loadVlmProjector(String path, int threads, int imageMinTokens, int imageMaxTokens);
boolean loadVlmProjectorFromFd(in ParcelFileDescriptor pfd, int threads, int imageMinTokens, int imageMaxTokens);
void releaseVlmProjector();
boolean isVlmLoaded();
String getVlmInfo();
String getVlmDefaultMarker();
void generateVlm(String messagesJson, in ParcelFileDescriptor[] imageFds, int maxTokens, IGenerationCallback callback);
InferenceService.generateVlm(messagesJson, imageFds, maxTokens, cb) reads each PFD via AutoCloseInputStream.readBytes() on Dispatchers.IO, hands List<ByteArray> to engine.generateVlmFlow, and bridges GenerationEvent → IGenerationCallback. Read failures → callback.onError.
InferenceClient.isVlmLoaded: StateFlow<Boolean> mirrors service-side state. loadVlmProjector(path, threads=2) is the path-based load used by auto-load. generateVlm(context, messagesJson, imageUris, maxTokens): Flow<InferenceEvent> opens PFDs, hands the array to the service, closes after the call. InferenceCoordinator.run() per-iteration: if iteration==0 AND last user has non-empty imageUris AND isVlmLoaded.value → VLM route. buildMessagesJson(messages, vlmLastUserId=lastUser.id) prepends getVlmDefaultMarker() to the last user's content.
ModelSessionManager.load(model):
1. releaseVlmProjector() if currently loaded.
2. Load base.
3. On success, if pathType == FILE and the path is inside <modelsDir>/vlm/, call VlmPaths.colocatedMmproj(baseFile). Present → load. Missing → surface vlmAutoLoadError via StateFlow<String?>; UI shows VlmErrorBanner.
ModelSessionManager.unload() releases projector first.
ChatMessage.imageUris: List<String> persisted via ChatRepository.TAG_MSG_IMAGES = 8 (JSON array of URI strings). Image bytes never land on disk.
Chat.forkedFromChatId: String? persisted via ChatRepository.TAG_FORKED_FROM = 9 (chats collection). Set by ChatRepository.forkChat(sourceChatId, atMessageId) — clones every message up to and including the cut point into a new chat with title "<src> (fork)". Drawer renders TnIcons.Fork + "Forked" label next to the title when the field is non-null. Forking is gated on !isGenerating in HomeViewModel.forkFromMessage.
ModelCatalog.fetchRepo flags any repo whose tree contains a *mmproj*.gguf file; non-mmproj .gguf rows get isVlm=true, repoPath, mmprojFileName, mmprojFileUri, mmprojSizeBytes. Tag list adds "VLM". ModelStoreViewModel.downloadModel routes VLM base into vlmModelFile(repoPath, fileName); on completion, enqueues mmproj into the same folder under the same modelId. Finalize inserts a single ModelInfo whose path is the base .gguf.
PlusMenu Attach-image is disabled when !isVlmLoaded, with a "Attach image · VLM required" badge. PendingImageRow renders thumbnails via BitmapFactory.decodeStream. MessageBubble renders UserImageThumbnails when message.imageUris.isNotEmpty(). InstalledModelCard shows a "VLM" tag for paths under models/vlm/.
HomeViewModel and InferenceCoordinator take Application (for contentResolver + generateVlm(app, ...)).
Streaming TTS playback of assistant messages and tap-to-toggle STT input via the sherpa-onnx AAR. The AAR exposes VITS + Kokoro TTS and Whisper STT only; SupertonicTTS is not supported.
Install path: Store only. BYOM / SAF directory import was removed (2026-04-24). The Store downloads .tar.bz2 from sherpa-onnx GitHub releases, extracts into <filesDir>/voice/<tts|stt>/<folder>/, builds the sherpa-onnx config JSON, inserts ModelInfo + ModelConfig. Archive deleted after extraction. First TTS/STT download of each kind is auto-selected as active.
boolean loadTtsModel(String configJson); void unloadTtsModel(); boolean isTtsLoaded();
float[] synthesize(String text, int speakerId, float speed); int getTtsSampleRate();
boolean loadSttModel(String configJson); void unloadSttModel(); boolean isSttLoaded();
String recognize(in float[] samples, int sampleRate);
String recognizeFromFd(in ParcelFileDescriptor pfd, int sampleCount, int sampleRate);
synthesize and recognize are batch — there is no streaming callback. Streaming TTS is faked by sentence-chunking at the text layer.
app/src/main/java/com/dark/tool_neuron/voice/:
- TtsPlayer — sentence-chunk streaming via AudioTrack.MODE_STREAM + WRITE_BLOCKING. Cancellable per-chunk via _speakingId.value == messageId check.
- SttRecorder — AudioRecord 16 kHz mono ENCODING_PCM_FLOAT, source MediaRecorder.AudioSource.VOICE_RECOGNITION. Exposes isRecording, amplitude flows for UI.
- VoiceModelManager — @Singleton. Auto-loads active TTS/STT on first use by reading AppPreferences.activeTtsModelId / activeSttModelId. Uses Mutex to serialize loads. Injects Lazy<AppPreferences> to avoid eager construction in non-main processes.
- VoiceArchive — extraction. Streams .tar.bz2 through BZip2CompressorInputStream → TarArchiveInputStream, writes each entry into a per-archive folder, builds the sherpa-onnx config JSON. Per-entry safeResolve rejects path-traversal. Calls back onEntry(name) for per-file UI progress.
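A per-entry path-traversal check of the kind safeResolve performs can be sketched with java.nio.file. This is an assumed shape, not the actual VoiceArchive code, and it does not cover symlinks that already exist inside the extraction root.

```java
import java.nio.file.Path;

final class SafeResolveSketch {
    // Resolves an archive entry under root and rejects names that escape it,
    // such as "../../evil" or an absolute path.
    static Path safeResolve(Path root, String entryName) {
        Path base = root.toAbsolutePath().normalize();
        Path target = base.resolve(entryName).normalize();
        if (!target.startsWith(base)) {
            throw new SecurityException("entry escapes extraction root: " + entryName);
        }
        return target;
    }
}
```

Path.startsWith compares whole path elements, so a sibling folder that merely shares a name prefix with the root is not accepted.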
No modelType field on ModelInfo. ProviderType is canonical (GGUF / TTS / STT / EMBEDDING). HuggingFaceModel.modelType: String is the pre-install hint; ModelStoreViewModel.finalizeNonVlmDownload maps it to ProviderType at insert time. HomeViewModel.chatModels filters to ProviderType.GGUF.
AppPreferences keys active_tts_model and active_stt_model (encrypted HXS records). Empty → fallback to first installed model of that type. Voice models live under <filesDir>/voice/<tts|stt>/<folder>/. Voice deletes need deleteRecursively() since the folder is non-empty (current limitation: store delete uses File.delete).
TtsPlayer.sanitize(text) strips code fences / inline code / markdown emphasis / links / headers. splitIntoSentences(text) breaks at ./!/?/…/;/\n after ≥20 chars or at comma/space if ≥180 chars. Each chunk synth'd on Dispatchers.IO and written into the AudioTrack with WRITE_BLOCKING. AudioTrack is lazy at getTtsSampleRate(); recreated on rate change.
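One way to read the splitting rule is sketched below: a chunk closes at a sentence terminator once it holds at least 20 characters, or at a comma or space once it reaches 180. How the real splitIntoSentences treats whitespace and edge cases is not specified here, so the trimming and the dropping of empty chunks are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class SentenceSplitSketch {
    static final int MIN_CHARS = 20;
    static final int MAX_CHARS = 180;

    // '\u2026' is the horizontal ellipsis character.
    static boolean isTerminator(char c) {
        return c == '.' || c == '!' || c == '?' || c == '\u2026' || c == ';' || c == '\n';
    }

    static List<String> splitIntoSentences(String text) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            boolean sentenceEnd = isTerminator(c) && current.length() >= MIN_CHARS;
            boolean forcedBreak = (c == ',' || c == ' ') && current.length() >= MAX_CHARS;
            if (sentenceEnd || forcedBreak) {
                flush(current, chunks);
            }
        }
        flush(current, chunks); // trailing text without a terminator
        return chunks;
    }

    // Emits the trimmed chunk if it is non-empty, then resets the builder.
    static void flush(StringBuilder current, List<String> chunks) {
        String chunk = current.toString().trim();
        if (!chunk.isEmpty()) {
            chunks.add(chunk);
        }
        current.setLength(0);
    }
}
```

The 20-character floor keeps very short sentences such as "Hi." attached to the following one, which avoids synthesizing tiny audio chunks.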
SttRecorder.start() reads 1024-sample chunks in a tight loop, snapshots max abs into _amplitude, appends to a synchronized buffer. stop() snapshots into FloatArray, releases. The array is passed to InferenceClient.recognize(samples, 16000) from HomeViewModel.stopRecordingAndTranscribe, which pushes recognized text into _transcribedText: StateFlow<String?>. HomeScreenBottomBar observes and appends to its local text state, then calls consumeTranscribedText(). STT is unloaded after each transcription to free memory.
RECORD_AUDIO requested at first mic tap via ActivityResultContracts.RequestPermission; on grant, immediately startRecording(). No FOREGROUND_SERVICE_MICROPHONE (UI is held while recording).
- Speak / Stop button on assistant bubbles (MessageActions) when voiceTtsAvailable. Icon flips between TnIcons.Volume and TnIcons.PlayerStop, and shows a CircularProgressIndicator with stop-icon overlay while the TTS model is loading (isSpeakLoading).
- Mic ActionButton always rendered. No STT installed → navigate to ModelStore. Permission missing → request. Else startRecording().
- Recording crossfades the input bar to RecordingEqualizer ([X cancel] [waveform] [✓ stop]). Stop calls stopRecordingAndTranscribe.
- Image-attach button moved out of PlusMenu into the input bar.
- Voice errors surface through the same VlmErrorBanner component.
- No dedicated Voice Settings screen — Store manages downloads + first-install becomes active. Default TTS / STT swap surfaces in Settings → Voice section (SettingsViewModel.voiceSection); selecting a different model writes active_tts_model / active_stt_model and calls VoiceModelManager.unloadTts/Stt() so the next request reloads the new pick.
VoiceModelManager, TtsPlayer, SttRecorder are all @Singleton. HomeViewModel injects VoiceModelManager. ModelStoreViewModel injects AppPreferences to flag first install of each kind as active.
ModelCatalog.BUILT_IN_MODELS carries four sherpa-onnx releases: vits-piper-en_US-amy-low (TTS, ~30 MB), vits-piper-en_US-libritts-high (TTS, ~124 MB), sherpa-onnx-whisper-tiny-en (STT, ~75 MB), sherpa-onnx-whisper-tiny (STT, ~82 MB). URLs hit sherpa-onnx/releases/download/{tts,asr}-models/…. If sherpa-onnx restructures their releases, the Store surfaces the failure.
Embedded HTTP server exposing every installed engine over an OpenAI-compatible API on the local network. Standalone replacement for the rejected Ktor PR. As of 2026-05-11 the server is multi-engine: chat GGUF, VLM (chat + images), embeddings, TTS, STT, and image generation (txt2img / img2img / inpaint / 4x upscale). No TLS, no mDNS, no outbound calls.
Three processes:
- :app — UI, chat ViewModels, ServerController (AIDL client), InferenceClient (AIDL client of :inference). HXS / Keystore live here; nothing crosses out.
- :inference — InferenceService (chat-side llama.cpp + sherpa-onnx). Untouched by the server.
- :server — RemoteServerService, its own per-type engine instances (ServerEngine chat, ServerVlmEngine, ServerEmbeddingEngine, ServerTtsEngine, ServerSttEngine, ServerImageEngine), the embedded native HTTP server, the bearer token in native memory. Foreground (dataSync, stopWithTask="false"). Independent: app crash doesn't kill it, server crash doesn't kill the app.
:server doesn't open HXS. The bearer token, full engines catalog (per-model id / display name / file path / mmproj path / per-model config JSON / kind), web-UI HTML, and docs HTML are all handed across via AIDL start(configJson). When the user rotates the token in the UI, :app regenerates + persists, then pushes the new token to :server via IRemoteServerService.rotateToken(newToken).
When the user swipes the app away, :app and :inference both die. :server keeps running because it's foreground and because handleStart self-calls startService(Intent(this, RemoteServerService::class.java)) immediately before startForeground. That transitions it from bind-only to started lifecycle — without it, every binder client dying (which happens on swipe-to-kill) makes the service eligible for destruction even with a foreground notification. Reopening the app re-binds; ServerController calls currentSnapshotJson() and recentRequestEventsJson(100) to rehydrate the Server Screen with whatever's running right now.
Tapping the notification body opens MainActivity with EXTRA_OPEN_SERVER_SCREEN=true, which routes straight to the Server Screen. The Stop button on the notification fires startService(action=ACTION_STOP) against :server, which tears down in-process.
ServerEngineRegistry (in :server) holds one instance per kind. Each kind is loaded lazily on first request and cached:
| Kind | Wrapper | Library backing |
|---|---|---|
| gguf | ServerEngine | com.dark.gguf_lib.GGMLEngine |
| vlm | ServerVlmEngine | GGMLEngine + mmproj projector |
| embedding | ServerEmbeddingEngine | com.dark.gguf_lib.EmbeddingEngine |
| tts | ServerTtsEngine | com.dark.ai_sherpa.OfflineTts (VITS) |
| stt | ServerSttEngine | com.dark.ai_sherpa.OfflineRecognizer (Whisper) |
| image_gen | ServerImageEngine | com.dark.ai_sd.StableDiffusionManager (QNN/MNN) |
| image_upscaler | ServerImageEngine | same SDK, separate loadUpscaler path |
Why lazy-load: a 4 GB device cannot hold every engine simultaneously. On start(configJson) the primary chat GGUF (or first VLM if no chat) is preloaded so first /v1/chat/completions returns fast; everything else materialises on first request to that endpoint. Per-kind Mutex/synchronized lock prevents two requests racing the load. Engine instances are NOT reaped on idle in this iteration — only shutdownAll() on server stop. If RAM becomes a problem on smaller devices, the natural extension is a per-kind TTL cache or a POST /v1/admin/unload?kind=image_gen route.
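The lazy-load rule above can be sketched as one slot per engine kind. This is a language-neutral illustration written in C++; the real registry is Kotlin, and `Engine`, `LazyEngineSlot` and the loader callback are hypothetical names, not types from the repo.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded engine; not a type from the repo.
struct Engine {
    std::string model_id;
};

// One slot per engine kind: loads on first use, reuses while the id matches.
class LazyEngineSlot {
public:
    using Loader = std::function<std::shared_ptr<Engine>(const std::string&)>;

    explicit LazyEngineSlot(Loader loader) : loader_(std::move(loader)) {}

    // Returns the cached engine. Loading (or reloading when the requested id
    // changes) happens under the lock, so two requests cannot race the load.
    std::shared_ptr<Engine> get(const std::string& model_id) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (!engine_ || engine_->model_id != model_id) {
            engine_ = loader_(model_id);
        }
        return engine_;
    }

    // No idle reaping: the cached instance is only dropped on server stop.
    void shutdown() {
        std::lock_guard<std::mutex> guard(mutex_);
        engine_.reset();
    }

private:
    std::mutex mutex_;
    Loader loader_;
    std::shared_ptr<Engine> engine_;
};
```

A per-kind TTL cache would extend this by storing a last-used timestamp next to `engine_` and resetting it from a periodic sweep.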
Cross-process gotchas:
- :server loads StableDiffusionManager.getInstance(context) independently from :app. They each have their own native pipeline. The qnnlibs.tar.xz runtime extraction is uid-shared at <filesDir>/ai_sd_runtime/; whichever process extracts first wins, the other sees the existing files and skips. Image gen via the server therefore requires the user to have downloaded the SD runtime through the Image Task screen at least once.
- :server does NOT open HXS to read prefs.activeTtsModelId / activeSttModelId. Voice "default" is whatever the catalog ranks first per kind: ServerController.buildEnginesCatalog walks the ModelRepository in install order, so the active voice model from :app doesn't influence which voice the server uses as fallback. Clients pick explicitly via the model field. If the request model is unknown, the server falls back to first_of_kind.
- :inference (the chat-side llama.cpp service) is untouched by the server: no AIDL hops between :server and :inference. Server-side load and chat-side load are independent. This is intentional: it keeps the request path on one process and prevents the server lockdown from also blocking chat-side voice/image flows on the same engine.
| Method | Path | Auth | Stream | Purpose |
|---|---|---|---|---|
| GET | /, /index.html, /webui | public | - | Bundled Material-3 web UI |
| GET | /docs, /docs/ | public | - | API documentation |
| GET | /health | public | - | Liveness ping {status:ok} |
| GET | /v1/models | auth | - | Full enabled-engine catalog (id, type, owned_by) |
| POST | /v1/chat/completions | auth | yes | Chat GGUF; auto-routes to VLM when message contains image_url parts |
| POST | /v1/embeddings | auth | - | Dense embeddings: body {input: string \| string[]} |
| POST | /v1/audio/speech | auth | - | TTS: body {model, input, voice, speed}, returns audio/wav |
| POST | /v1/audio/transcriptions | auth | - | STT: multipart file=<wav> + model=<id> |
| POST | /v1/images/generations | auth | - | txt2img: body {model, prompt, negative_prompt?, steps?, cfg?, width?, height?, seed?}, returns {data:[{b64_json}]} |
| POST | /v1/images/edits | auth | - | img2img / inpaint: multipart image, optional mask, model, prompt, sampling params |
| POST | /v1/images/upscale | auth | - | 4x upscale: multipart image + model |
Every non-public route is gated by the same pre_routing_handler: rate-limit token bucket → ban list → bearer auth (constant-time compare, 20 consecutive fails → 1 h ban). The same post_routing_handler records every request into the 128-entry audit ring buffer + pushes it across to :app via InferenceBridge.onRequestEvent. No exceptions, no per-route auth bypass.
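A minimal sketch of the limiter half of that chain, using the numbers quoted in these notes (cap=30, refill=1/s, 20 consecutive fails, 1 h ban). `RateLimiter`, `ClientState` and the method names are hypothetical; the actual layout of server_rate_limit.{h,cpp} is not shown here, and time is passed in as plain seconds only to keep the sketch testable.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>

constexpr double kCapacity = 30.0;      // bucket size
constexpr double kRefillPerSec = 1.0;   // tokens regained per second
constexpr int kMaxAuthFails = 20;       // consecutive failures before a ban
constexpr double kBanSeconds = 3600.0;  // 1 h

// Per-client limiter state (hypothetical layout).
struct ClientState {
    double tokens = kCapacity;  // bucket starts full
    double last_refill = 0.0;   // time of the previous refill
    int auth_fails = 0;         // current streak of bearer failures
    double banned_until = 0.0;
};

class RateLimiter {
public:
    // Token bucket: refill for the elapsed time, then try to spend one token.
    bool allow(const std::string& client, double now) {
        ClientState& s = state_for(client, now);
        s.tokens = std::min(kCapacity, s.tokens + (now - s.last_refill) * kRefillPerSec);
        s.last_refill = now;
        if (s.tokens < 1.0) return false;
        s.tokens -= 1.0;
        return true;
    }

    // Ban-list lookup.
    bool is_banned(const std::string& client, double now) {
        return now < state_for(client, now).banned_until;
    }

    // Auth feedback: a success resets the streak, the 20th straight failure bans.
    void record_auth(const std::string& client, bool ok, double now) {
        ClientState& s = state_for(client, now);
        if (ok) { s.auth_fails = 0; return; }
        if (++s.auth_fails >= kMaxAuthFails) {
            s.banned_until = now + kBanSeconds;
            s.auth_fails = 0;
        }
    }

private:
    ClientState& state_for(const std::string& client, double now) {
        auto it = clients_.find(client);
        if (it == clients_.end()) {
            it = clients_.emplace(client, ClientState{}).first;
            it->second.last_refill = now;
        }
        return it->second;
    }
    std::unordered_map<std::string, ClientState> clients_;
};
```

A pre-routing hook would call `allow`, then `is_banned`, then the bearer check, and report the bearer result through `record_auth`.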
native-server/src/main/cpp/
server_core.{h,cpp} — httplib::Server lifecycle, pre/post-routing, every route registration
server_auth.{h,cpp} — bearer token store + constant-time compare + 401/403
server_crypto.{h,cpp} — getrandom(2) RNG, const_time_eq, base64url, base64 std, base64 decoder, secure_zero
server_models.{h,cpp} — typed catalog: id + display_name + path + mmproj_path + config_json + Kind + created
+ has_id_of_kind / first_of_kind / has_any_of_kind / build_list_response
server_audit.{h,cpp} — 128-entry ring buffer of request events
server_rate_limit.{h,cpp} — per-client token bucket (cap=30, refill=1/s) + auth-fail ban (20 fails → 1 h)
server_webui.{h,cpp} — set/clear/get/has HTML (mutex-protected std::string)
server_docs.{h,cpp} — same for /docs
server_staging.{h,cpp} — tmpfile dir for large binary payloads; staged paths handed across JNI (no byte[] copies)
wav_codec.{h,cpp} — minimal RIFF/PCM16 + IEEE-float decode (STT input); encode is on the Kotlin side
gen_session.{h,cpp} — chat / VLM streaming session: token queue + cancellation
reply_session.{h,cpp} — single-shot reply session for embeddings / TTS / STT / image; carries text or staged binary path
openai_schema.{h,cpp} — ChatRequest parser (detects has_images), VLM image part extractor (base64 data URLs ONLY),
error envelope, embedding response, transcription response, image response builders
jvm_bridge.{h,cpp} — JavaVM pin, JNI upcalls (startGeneration + startEmbedding + startTts + startStt
+ startImageGen + startImageUpscale + cancelGeneration + onRequestEvent)
native_server.cpp — JNI entry points (start/stop/token/catalog/bridge/feeders/staging/audit/rl/webui/docs) + JNI_OnLoad
CMake fetches cpp-httplib v0.18.5 and nlohmann/json v3.11.3, both header-only (HTTPLIB_COMPILE=OFF, HTTPLIB_REQUIRE_OPENSSL=OFF, HTTPLIB_REQUIRE_ZLIB=OFF, JSON_BuildTests=OFF). Same flags as :hxs_encryptor: c++17, -fvisibility=hidden, -fstack-protector-strong, LTO/gc-sections/icf release, -march=armv8-a+crypto+sha2 on arm64, -Wl,-z,max-page-size=16384. Read timeout was lifted from 15s → 60s and payload max from 1 MB → 64 MB to accommodate base64-encoded VLM images and multipart audio/image uploads.
Payload mechanics:
- Streaming (chat / VLM): gen_session queue with nativeFeedToken / nativeFeedDone / nativeFeedError; the httplib chunked content provider drains it onto an SSE stream.
- Single-response (embeddings / TTS / STT / image): reply_session. The bridge calls nativeFeedReplyText(replyId, body, mime) for JSON/text or nativeFeedReplyBinary(replyId, path, mime) for staged binary; the route handler blocks on session->wait(timeout).
- Big binary upload (multipart image, mask, wav): cpp-httplib decodes multipart natively; the route writes each part to <cacheDir>/server-staging/tn_<rand>_<name> via server_staging::write_bytes, hands the path to Java via JNI string (avoids byte[] JNI copies), and unlinks on response.
- Big binary download (TTS wav, generated PNG): Kotlin writes the bytes to the staged path, hands the path back via nativeFeedReplyBinary(path, mime), the C++ side reads + sends + unlinks. PNG responses are base64-encoded into JSON b64_json per OpenAI; WAV is sent as raw audio/wav.
- VLM image_url parts: only data:image/...;base64,... URLs are accepted. Network URLs return 400 (offline-only scope). Decoded bytes are staged to tmpfiles and the paths passed to InferenceBridge.startGeneration(..., imagePaths=[...]). Sanitised messages (image parts collapsed into text-only content) are forwarded to the engine alongside the path list; the Kotlin bridge reads each tmpfile and feeds the bytes to GGMLEngine.generateVlmFlow(imageData = [...]).
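The image_url acceptance rule can be sketched as a small parser. This assumes a standalone helper; `parse_image_data_url` and `DataUrlImage` are hypothetical names, and the real extractor in openai_schema.{h,cpp} may be structured differently.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

struct DataUrlImage {
    std::string mime;    // e.g. "image/png"
    std::string base64;  // payload, still base64-encoded
};

// Accepts only "data:image/<subtype>;base64,<payload>". Anything else
// (http://, https://, non-image data URLs, empty payloads) yields nullopt,
// which a route handler would map to a 400.
std::optional<DataUrlImage> parse_image_data_url(const std::string& url) {
    const std::string prefix = "data:";
    const std::string marker = ";base64,";
    if (url.compare(0, prefix.size(), prefix) != 0) return std::nullopt;

    const std::size_t marker_pos = url.find(marker, prefix.size());
    if (marker_pos == std::string::npos) return std::nullopt;

    std::string mime = url.substr(prefix.size(), marker_pos - prefix.size());
    if (mime.compare(0, 6, "image/") != 0 || mime.size() == 6) return std::nullopt;

    std::string payload = url.substr(marker_pos + marker.size());
    if (payload.empty()) return std::nullopt;

    return DataUrlImage{std::move(mime), std::move(payload)};
}
```

Rejecting everything that does not start with `data:` is what keeps the route offline-only: the parser never has to decide whether a host is reachable.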
Bundled at app/src/main/assets/server_webui.html. Single Material-3 SPA with a sidebar tab strip that swaps the main panel between four workspaces:
- Chat - preserved from the prior build: localStorage history, markdown rendering, streaming with blinking cursor, settings dialog, connection indicator. Adds an attach-image button (📎) that converts the uploaded image to a data:image/...;base64,... URL and appends it as an OpenAI multi-part image_url content entry on the next send. Server auto-detects and routes to the VLM engine.
- Embeddings - model select, multi-line input (one row per line), runs /v1/embeddings, shows vector count + first 8 dims of each row.
- Voice - two cards. TTS: model + text + voice id + speed, plays the returned WAV inline. STT: model + WAV upload, shows transcribed text.
- Image - segmented switch (Generate / Edit / Inpaint / Upscale). Prompt + negative + steps/CFG/width/height for diffusion modes. Input image file for Edit/Inpaint/Upscale. Mask file for Inpaint. Result is rendered inline from b64_json.
refreshModelCache() hits /v1/models once per tab activation and filters per-kind for the model dropdowns. JNI: nativeSetWebUiHtml(html) pushes the bundled file at server start; nativeClearWebUi() clears on stop. Same applies to /docs via nativeSetDocsHtml + app/src/main/assets/server_docs.html. The docs file documents every endpoint with copy-pasteable curl examples.
JSON object passed to IRemoteServerService.start(configJson). Built by ServerController.start() from :app:
{
"token": "tn_sk_<base64url>",
"port": 11434,
"bindMode": "ALL_INTERFACES | LOOPBACK_ONLY | WIFI_ONLY",
"webUiHtml": "<bundled assets/server_webui.html>",
"docsHtml": "<bundled assets/server_docs.html>",
"engines": [
{ "id":"...", "name":"...", "path":"<abs file path>", "type":"gguf|vlm|embedding|tts|stt|image_gen|image_upscaler",
"mmproj_path":"<vlm only, optional>", "config_json":"{...}", "created":1715000000, "primary":true|false }
]
}

engines is built by walking the entire installed ModelRepository and mapping ProviderType → engine kind. GGUF chat models living under <modelsDir>/vlm/ with a colocated *mmproj*.gguf are auto-classified as vlm. config_json merges the per-model loadingParamsJson + inferenceParamsJson from ModelConfig. URI-pathType models are skipped: the server only supports FILE paths because there's no clean way to trampoline content URIs across the :server process boundary.
:server-side Kotlin (in app/src/main/java/com/dark/tool_neuron/service/server/, runs in :server process)
- RemoteServerService.kt - plain Service, NOT @AndroidEntryPoint. Holds the ServerEngineRegistry, the ServerInferenceBridge, and a RemoteCallbackList<IRemoteServerCallback>. Implements IRemoteServerService.Stub inline. Foreground promotion happens inside the AIDL start(configJson) call, which parses the catalog, sets the staging dir, calls registry.setCatalog, preloads the primary chat (or first VLM if no chat), configures + starts the native HTTP server, and publishes a ServerSnapshot to all callbacks. onStartCommand only handles ACTION_STOP (the notification's Stop button). onCreate calls nativeSetStagingDir(<cacheDir>/server-staging/) so the cleanup path is wired even before a start.
- ServerEngine.kt - wraps GGMLEngine for chat GGUF. load(modelId, path, configJson) (carries the id so the registry can decide reload vs. reuse), unload(), generateMultiTurnFlow(...), setSampling, setSystemPrompt, stopGeneration. Same JSON shape InferenceService parses for chat (contextSize, threadMode, flashAttn, cacheTypeK/V, sampling, kvSink/Window/Evict).
- ServerVlmEngine.kt - separate GGMLEngine instance. ensureLoaded(modelId, basePath, mmprojPath, configJson) releases any prior projector, loads the base GGUF, then auto-loads the mmproj (preferring the explicit path; falling back to colocated *mmproj*.gguf). generateFlow(messagesJson, imageBytes, maxTokens) dispatches to GGMLEngine.generateVlmFlow(imageData=..., imageQuality=HIGH).
- ServerEmbeddingEngine.kt - wraps EmbeddingEngine. ensureLoaded(modelId, path, configJson) calls engine.load(path, threads, contextSize). Exposes embedBatch(texts, normalize=true); the bridge JSON-encodes the result for nativeFeedReplyText.
- ServerTtsEngine.kt - wraps sherpa-onnx OfflineTts (VITS only). ensureLoaded builds OfflineTtsConfig from the model config JSON; synthesize(text, speakerId, speed) returns mono float samples; sampleRate() exposes the codec rate so the WAV encoder writes the right header.
- ServerSttEngine.kt - wraps sherpa-onnx OfflineRecognizer (Whisper only). recognize(samples, sampleRate) returns the transcribed text or null on failure.
- ServerImageEngine.kt - wraps StableDiffusionManager. ensureRuntime() is gated on <filesDir>/ai_sd_runtime/qnnlibs.tar.xz existing (downloaded by :app-side ImageGenManager). loadDiffusion(id, name, path, width, height) walks the model dir, builds DiffusionModelConfig, and calls sdk.loadModel. loadUpscaler(id, path) toggles MNN vs. OpenCL based on filename. generate(params) is sdk.generateImageSync(...) (blocks until result). upscale(bitmap) posts the bitmap and .first {}s on upscaleState for Complete/Error. PNG encoding/decoding lives here too.
- ServerEngineRegistry.kt - single source of truth for catalog + lazy loading. chatFor, vlmFor, embedFor, ttsFor, sttFor, imageGenFor(width, height), upscalerFor. Each method picks the entry by id or falls back to firstOf(kind). Per-kind locks (Mutex for suspend, plain Object for sherpa's synchronous load) serialise concurrent loads.
- ServerInferenceBridge.kt - extends InferenceBridge. Each upcall (startGeneration / startEmbedding / startTts / startStt / startImageGen / startImageUpscale) launches a coroutine on a SupervisorJob IO scope, calls the right registry method, dispatches to the engine, and feeds the result back via NativeServer.nativeFeed*. Chat + VLM streaming uses the existing nativeFeedToken / nativeFeedDone / nativeFeedError triplet; everything else uses the single-shot nativeFeedReplyText / nativeFeedReplyBinary / nativeFeedReplyError. VLM marker is prefixed onto the last user message via engine.defaultMarker() (= engine.getVlmDefaultMarker() in the Kotlin call site).
- ServerCatalog.kt - typed catalog model + ServerEngineKind enum + JSON serializer matching the C++ set_catalog_json parser.
- ServerWavCodec.kt - Kotlin-side RIFF reader/writer used by TTS (encode floats → wav) and STT (decode wav → floats). Mirrors the native helper.
- ServerSnapshot (in RemoteServerService.kt, internal) - phase / modelId / modelName / host / displayHost / lanHost / port / bindModeName / wifiActive / reason. Serialised to JSON for cross-process shipping. modelId / modelName reflect the primary engine (chat GGUF or first VLM); the snapshot doesn't enumerate every loaded engine.
- BindResolver.kt, ServerTypes.kt - unchanged.
- ServerController.kt - @Singleton, AIDL client. start() walks ModelRepository.models.value, builds the full multi-engine catalog (one JSON entry per installed model), packages it with token / port / bind mode / web-UI HTML / docs HTML, and calls IRemoteServerService.start(configJson). URI-pathType models are silently skipped. The "selected chat model" pref still exists but is now only used to mark a "primary": true flag inside the engines array; if it's unset or invalid the first installed chat GGUF wins. Start enables as long as any engine is installed (was: required a chat model). stop() and rotateToken() forward via AIDL.
- viewmodel/ServerViewModel.kt - adds anyEngineInstalled: StateFlow<Boolean>. Chat-model selector card stays; selecting only seeds the primary flag.
- ui/screens/server/ServerScreen.kt - Start button enabled if any engine is installed OR a chat is selected.
Unchanged from the single-engine build. :app calls bindService(intent, conn, BIND_AUTO_CREATE). Android starts :server. AIDL stub returned. App registers callback, reads currentSnapshotJson() to rehydrate. App-killed-but-server-running: re-launching the app re-binds; currentSnapshotJson() returns phase=running with all live fields, plus recentRequestEventsJson(100) for the log card. Server foregrounds only during AIDL start, not on bindService alone — so a brief "exists, idle, no notification" state is impossible (we never enter it).
ScaffoldViewModel.serverRunning: StateFlow<Boolean> derived from ServerController.state. AppScaffold LaunchedEffect(serverRunning, currentRoute) re-routes to ServerScreen with popUpTo(0) { inclusive = true } when running and not already there. BackHandler(running) {} inside ServerScreen absorbs back. Drawer gesture hidden via showDrawer = currentRoute == HomeScreen.route && !serverRunning. Chat-side load/unload + sendMessage in HomeViewModel are gated on !serverController.isBusy; same for ModelStoreViewModel.downloadPack / downloadModel / setActive. Because the lockdown is UI-routing rather than per-VM gating, the Image Task / Voice screens are inherently unreachable while server is running — no per-VM gate needed there. The server owns whatever model state it has loaded for its current request; chat-side reload would have nothing to clobber anyway because they're separate engine instances in separate processes.
- ensureToken() calls nativeGenerateToken() → tn_sk_ + 32 random bytes base64url-encoded (getrandom(2)).
- Stored plaintext in encrypted HXS vault (AppPreferences.serverToken).
- Handed to native at start (nativeSetToken); zeroed on stop (nativeClearToken).
- 20 consecutive auth-fails from same client_addr → 1 h ban.
- Reveal in UI gates on session.isAllowed(AUTH_VERIFY). Rotate generates a new token + invalidates the old.
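A minimal sketch of the token shape, assuming unpadded base64url (RFC 4648 section 5 alphabet). Whether the real encoder pads is not stated in these notes. `base64url_encode` and `format_token` are hypothetical names; the caller supplies the random bytes here, whereas the native code draws them from getrandom(2).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Unpadded base64url: '-' and '_' replace '+' and '/' from standard base64.
std::string base64url_encode(const std::vector<std::uint8_t>& bytes) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    std::string out;
    out.reserve((bytes.size() * 4 + 2) / 3);
    std::uint32_t buffer = 0;  // bit accumulator; only the low bits are ever read
    int bits = 0;              // number of not-yet-emitted bits in the accumulator
    for (std::uint8_t b : bytes) {
        buffer = (buffer << 8) | b;
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            out.push_back(kAlphabet[(buffer >> bits) & 0x3F]);
        }
    }
    if (bits > 0) {  // flush the trailing partial group, zero-filled on the right
        out.push_back(kAlphabet[(buffer << (6 - bits)) & 0x3F]);
    }
    return out;
}

// "tn_sk_" + base64url(32 random bytes).
std::string format_token(const std::vector<std::uint8_t>& random32) {
    return "tn_sk_" + base64url_encode(random32);
}
```

Under the unpadded assumption, 32 bytes encode to 43 characters, so a token is 49 characters long including the prefix.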
server_token (String), server_port (String, validated [1024..65535], default 11434), server_bind_mode (String, default ALL_INTERFACES), server_auto_start (Boolean, reserved), server_configured (Boolean), server_selected_model (String — primary chat hint only as of multi-engine). All ride the same encrypted app_prefs vault.
HTTPS / TLS, mDNS / Bonjour, QR-pairing, dynamic-model-load over the wire, streaming usage metrics, request-log persistence to HXS, network-URL image fetching (offline-only). Audio transcoding — /v1/audio/transcriptions accepts WAV PCM (16-bit or 32-bit float) only; MP3/AAC etc. return a generic decode failure. Per-engine RAM accounting — engines stay loaded until server stop; no idle TTL. Image-gen progress streaming — /v1/images/generations is single-response (uses generateImageSync); the live diffusion intermediate previews available in the in-app Image Task screen are not exposed.
Rewritten 2026-04-29. All HF traffic flows through :networking (curl-impersonate Chrome116 + bundled CA bundle); the previous HttpURLConnection path is gone. Filter chips are populated dynamically from /api/models-tags-by-type; the README on the detail screen renders client-side from /{author}/{repo}/raw/main/README.md.
- repo/HuggingFaceApi.kt (Hilt @Singleton class) - URL builders + thin HTTP layer. Methods: fetchJson(url): Result<JSONObject>, fetchJsonArray(url): Result<JSONArray>, fetchRaw(url): Result<String>, probe(url): Result<Int>. All go through WebNative.fetch with Accept: application/json and Accept-Encoding: gzip. URL builders: modelInfoUrl, modelTreeUrl, resolveFileUrl, rawFileUrl, searchUrl, quickSearchUrl, trendingUrl, tagsByTypeUrl. Failures are typed via HfApiError (RateLimited(retryAfterSeconds), NotFound, Forbidden, Network, Parse, Http).
- repo/hf/HfClient.kt (Hilt @Singleton) - typed explorer endpoints over HuggingFaceApi: searchModels, quickSearch, trending, modelDetail, readme, tagsCatalog (cached 24h in encrypted app_prefs under keys hf_tags_catalog_v1 + hf_tags_catalog_v1_at).
- repo/hf/HfModels.kt - HfModelSummary, HfModelDetail (with HfSibling / HfGgufMeta / HfCardData), HfTrendingItem, HfQuickResult, HfTagsCatalog / HfTagEntry, HfGated enum (OPEN / GATED / AUTO).
- repo/hf/HfJsonParse.kt - internal org.json parsers for each shape.
- repo/HuggingFaceExplorer.kt - kept as a thin compat wrapper exposing searchModels / searchGgufRepos / fetchRepoDetail mapped to legacy ExplorerRepo / HfRepoDetail types for ModelStoreViewModel.
ModelCatalog and RepositoryValidator inject HuggingFaceApi directly. They no longer touch HttpURLConnection.
State flows: query, filters: HfFilters, results: List<HfModelSummary>, isSearching, searchError: HfApiError?, tagsCatalog, trending, history, hideAdded, detailState, fileFilter, fileSizeBucket, existingRepoPaths. On VM init: kicks off tagsCatalog() and trending(12) once each (cached).
Search trigger policy (intentional, rate-limit conservative):
- IME action / Search button → fires client.searchModels.
- Any chip toggle / sort change / param-range slider release → fires fresh search via updateAndSearch.
- Per-keystroke quicksearch is deliberately not wired.
HfFilters carries only fields that map to documented HF list params: libraries: Set<String> (default {"gguf"}, multiple filter=…), author: String (author=…), gated: GatedFilter (gated=true|false), paramsMinMillions/Max (num_parameters=min:7B,max:13B), sort: HfSort. The previous "kitchen sink" filters (apps, inference_provider, languages, licenses, regions, other-tags, quant chips, trained-dataset, pipeline-tag, inference-warm) and post-filter sliders (min-downloads, min-likes, recent-days) were dropped because (a) HF rejected the speculative URL params with HTTP 400, and (b) the heavy filter UI added clutter without unlocking working searches. Tags catalog still fetched + cached for future use; just not wired to chip rows yet.
- ui/screens/hf_explorer/HfExplorerScreen.kt - search hero (TnTextField + ActionButton submit), history strip when empty, sort row, gated/hide-added quick toggles, collapsible Filters card with: param-range slider, library chips (GGUF/Transformers/Safetensors/ONNX/MLX/Diffusers), author text. Trending strip when results empty + query empty. Result cards with author-initials avatar, downloads/likes/pipeline pills, tag chips, Gated badge variants (Gated / Gated · auto), Add/Added trailing icon. Errors render via ErrorBanner with rate-limit aware copy.
- ui/screens/hf_explorer/HfRepoDetailScreen.kt - HeaderCard with stats + gated badge; GatedNotice block when gated (license prompt preview + sign-in CTA); GgufCard (architecture, context, total bytes, BOS/EOS) when GGUF; CardDataView (license, base model, languages, task, tag chips); file filter pills + file rows; README rendered via lazyMarkdownItems from raw markdown; failure view distinguishes rate-limit / not-found / forbidden / network / parse / http.
Sealed in encrypted app_prefs (under existing tn.app_prefs.user_key.v2 HKDF key). Plaintext JSON never lands on disk. 24h TTL; forceRefresh = true bypasses. On every cold start the first explorer open hydrates from the cache; if expired or empty, hits /api/models-tags-by-type once and re-persists.
The Action Window's third tab is Attach (formerly Tools). It shows the current chat's attachments and a single full-width "Add attachment" button. Tapping it opens AttachmentPickerDialog with two paths:
- Pick from previous chats - opens PrevChatsPickerDialog, a full-screen Dialog with a list grouped by source chat title. Tapping a row re-attaches the document to the active chat.
- Pick from storage - launches ActivityResultContracts.OpenDocument with the existing MIME filter (text/*, pdf, json, xml, rtf, epub, odt, docx, pptx, xlsx).
Every attached document is stored content-addressed by SHA-256 of its bytes:
- <filesDir>/chat_documents/sources/<sourceId>.bin - raw bytes, written once per unique content; multiple chats sharing the same content share the file.
- <filesDir>/chat_documents_meta_v1/ - encrypted HXS collection holding (id, chatId, sourceId, name, mimeType, chunkCount, sizeBytes, addedAt). Sealed under HKDF(DEK, "tn.chat_documents.user_key.v1"). DocumentRepository.init migrates legacy plaintext at chat_documents/ (top-level files) into the encrypted vault on first launch. The sources/ subdirectory is preserved during migration.
- <filesDir>/rag_keyword_v1/ - native HXS-encrypted keyword index for hybrid retrieval. Sealed under HKDF(DEK, "tn.rag_keyword.user_key.v1"). Tokenization, inverted index construction, BM25 scoring all live in C++ (hxs/src/main/cpp/rag_keyword.{h,cpp}); only the wrapper class is in Kotlin. Inverted index is rebuilt in-RAM on every process start by scanning the HXS records (bounded by # of chunks).
- chat_documents HXS collection - same TAG layout as before: (1=id, 2=chatId, 3=name, 4=mimeType, 5=chunkCount, 6=sizeBytes, 7=addedAt, 8=sourceId). Persisted across restarts. Do not call documentRepo.clearAll() from RagManager.init; that's the previous (wrong) behavior that wiped doc history every boot.
- id is the compound <chatId>:<sourceId>. Same content attached to two chats produces two records sharing one sourceId.bin blob.
RagManager.hydrateChat(chatId) re-ingests persisted records into the live RAG engine on chat-open (the engine itself is rebuilt fresh per process). It tracks ingestedDocIds: MutableSet<String> to avoid duplicate ingests; the set clears on engine.close(). Hydration also re-populates the BM25 keyword index for text-format documents (idempotent: the keywordIndex.docCount(docId) > 0 check skips documents that are already indexed).
RagManager.attachExisting(currentChatId, source) is the prev-chat re-attach: builds the new compound docId, re-reads <sourceId>.bin, calls engine.ingestBytes(...), persists the new record. Idempotent — if the chat already has the same sourceId, returns the existing record.
RagManager.removeDocument(docId) removes the chunks from the engine + the BM25 keyword-index rows + record from HXS, and deletes <sourceId>.bin only when no other record references that sourceId (documentRepo.countWithSource(sourceId) == 0).
RagManager.augment(chatId, query, originalPrompt, maxContextTokens) returns RagAugmentation(augmentedPrompt, chunks):
- Optional multi-query - if appPrefs.ragMultiQuery, RagQueryRewriter asks the loaded chat model to generate 3 alternative phrasings of the user's query. Falls back to single-query if the model isn't loaded or the rewriter times out (8s).
- Per-query retrieval - for each query (original + variants), runs the dense engine engine.query(q) (capped at topN = DENSE_CANDIDATES = 20) AND a RagKeywordIndex.query(q, chatId, KEYWORD_CANDIDATES = 20) BM25 lookup against the native keyword index.
- RRF fusion - rrfFuseMany over every per-query ranking (dense and BM25 for each query; k = 60, identity = (docId, chunkIndex) pair). Items appearing in multiple rankings get summed RRF scores; items in only one ranking still score. Returns FUSED_POOL_SIZE = 12 candidates.
- Optional LLM rerank - if appPrefs.ragSmartRerank, RagReranker asks the loaded chat model to score each pooled chunk 1–5 against the query (single LLM call, 15s timeout, 256 max tokens). Returns reordered list. Falls back to RRF order if the model isn't loaded or scoring fails.
- Token budget - InferenceCoordinator.computeRagBudget(messages) derives contextSize - maxTokens - approxHistoryTokens - 256 (clamped to 256–4096). RagManager.buildAugmentedPrompt walks ranked chunks in order, summing approx tokens (chars/4), keeping until budget exhausted. Truncates the first chunk if it alone exceeds budget. Caps at FINAL_TOP_N = 8 chunks.
- Citation contract - the prompt instructs the LLM to cite chunks inline as [1], [2], etc. After generation, RagCitationMatcher.match(response, chunks) parses explicit [N] markers AND runs a 4-gram overlap check (≥3 hits = cited). Resulting List<Citation> is stored on the assistant ChatMessage via ChatRepository.TAG_MSG_CITATIONS = 13 (JSON array). UI: CitationStrip renders chip per citation below the message bubble; tap opens an AlertDialog with the snippet, doc name, score, and cited/possibly-used label.
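The fusion and budget steps can be sketched as follows. This is a C++ illustration of logic that lives in Kotlin. `rrf_fuse_many`, `compute_rag_budget` and `ChunkKey` are hypothetical names, and the 1-based rank in the RRF term is an assumption (the standard formulation), not something these notes confirm.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ChunkKey = std::pair<std::string, int>;  // (docId, chunkIndex)

// Reciprocal rank fusion: every ranking adds 1 / (k + rank) to an item's score,
// so items found by several rankings accumulate and single-ranking items still score.
std::vector<std::pair<ChunkKey, double>> rrf_fuse_many(
        const std::vector<std::vector<ChunkKey>>& rankings,
        double k = 60.0, std::size_t pool_size = 12) {
    std::map<ChunkKey, double> scores;
    for (const auto& ranking : rankings) {
        for (std::size_t i = 0; i < ranking.size(); ++i) {
            scores[ranking[i]] += 1.0 / (k + static_cast<double>(i + 1));
        }
    }
    std::vector<std::pair<ChunkKey, double>> fused(scores.begin(), scores.end());
    std::stable_sort(fused.begin(), fused.end(),
                     [](const auto& a, const auto& b) { return a.second > b.second; });
    if (fused.size() > pool_size) fused.resize(pool_size);
    return fused;
}

// contextSize - maxTokens - approxHistoryTokens - 256, clamped to [256, 4096].
int compute_rag_budget(int context_size, int max_tokens, int approx_history_tokens) {
    const int raw = context_size - max_tokens - approx_history_tokens - 256;
    return std::clamp(raw, 256, 4096);
}
```

For example, a 4096-token context with maxTokens = 512 and roughly 1000 history tokens leaves 4096 - 512 - 1000 - 256 = 2328 tokens for retrieved chunks.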
RagKeywordIndex is now native — backed by hxs::RagKeywordIndex in hxs/src/main/cpp/rag_keyword.{h,cpp}. Per-chunk records are stored in HXS-encrypted collection rag_chunks with TAG layout (1=docId, 2=chatId, 3=sourceId, 4=chunkIndex, 5=text). The C++ side maintains an in-memory unordered_map<term, vector<Posting{record_id, term_freq}>> rebuilt at construction by scanning the encrypted records. Tokenizer is ASCII alphanumeric + underscore + UTF-8 bytes ≥0x80 passthrough, lowercased, length 2-64. BM25 params k1=1.2, b=0.75. JNI surface: nativeRagIngest, nativeRagQuery, nativeRagRemoveDocument, nativeRagClear, nativeRagDocCount. Replaces the prior SQLite FTS5 implementation, which broke on devices with stripped SQLite (no fts5 module) and lived plaintext at rest.
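A minimal sketch of the tokenizer rule and the BM25 term score with the parameters quoted above. Several details are assumptions, not taken from rag_keyword.{h,cpp}: the 2-64 length is measured in bytes, out-of-range tokens are dropped rather than truncated, and the IDF uses the common log(1 + ...) variant. `tokenize` and `bm25_term_score` are hypothetical names.

```cpp
#include <cmath>
#include <string>
#include <vector>

// ASCII letters, digits and '_' are kept (letters lowercased); bytes >= 0x80
// pass through untouched so UTF-8 sequences stay intact; everything else
// separates tokens. Tokens outside 2..64 bytes are dropped (assumption).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    auto flush = [&]() {
        if (current.size() >= 2 && current.size() <= 64) tokens.push_back(current);
        current.clear();
    };
    for (unsigned char c : text) {
        if (c >= 'A' && c <= 'Z') {
            current.push_back(static_cast<char>(c - 'A' + 'a'));
        } else if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') ||
                   c == '_' || c >= 0x80) {
            current.push_back(static_cast<char>(c));
        } else {
            flush();
        }
    }
    flush();
    return tokens;
}

// Okapi BM25 contribution of one query term to one chunk.
double bm25_term_score(double term_freq, double doc_len, double avg_doc_len,
                       double num_docs, double docs_with_term,
                       double k1 = 1.2, double b = 0.75) {
    const double idf =
        std::log(1.0 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5));
    const double norm = term_freq + k1 * (1.0 - b + b * doc_len / avg_doc_len);
    return idf * term_freq * (k1 + 1.0) / norm;
}
```

A chunk's score for a query is the sum of `bm25_term_score` over the query's tokens, with `term_freq` read from the posting list for that term.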
Keyword-index limitation: only text-format documents are indexed. The native engine doesn't expose extracted text back to Kotlin (#329 is blocked-native), so binary formats (PDF/DOCX/EPUB/etc.) bypass BM25 and only get dense retrieval. RagManager.isTextLike(mime, name) decides via mime-prefix text/, application/{json,xml,rtf,javascript,yaml}, or extensions txt|md|markdown|json|xml|csv|tsv|html|htm|rtf|yaml|yml|log|ini|toml|properties|kt|java|py|js|ts.
RagChunker does Kotlin-side recursive splitting for the keyword-index (BM25) path (target 1024 chars, min 200, separators in priority \n\n / \n / . / ! / ? / ; / , / space). The native engine's chunking is independent, so chunk indices from the keyword index do not align with native engine indices. RRF treats them as separate items by (docId, chunkIndex) identity, which is fine.
Per attached document, a "Deep Index" sparkles-icon affordance in the Attach tab triggers RagManager.deepIndex(docId). The flow:
- Read source bytes from chat_documents/sources/<sha256>.bin (text-format docs only, gated by the RagManager.isTextLike mime/extension check).
- RagDocSummarizer asks the loaded chat model to write a one-sentence document summary (≤320 chars, 30 s timeout, 200 max tokens). One LLM call per document.
- RagChunker splits the source text into ~1024-char Kotlin-side chunks.
- For each chunk, prepend [Document context: <name> — <summary>] and re-ingest into the dense engine + BM25 index using compound docId ${origDocId}::ctx<idx>. The native engine internally re-chunks the (context + chunk) blob into multiple sub-chunks, each carrying the doc context. The original doc remains untouched.
- ChatDocument.isDeepIndexed = true is persisted (TAG 9 on the chat_documents collection); the UI shows a "Deep" badge next to the filename.
- RagManager.deepIndexing: StateFlow<Set<String>> exposes the in-flight set so the UI can show a spinner per row.
Inflation factor: ~Nx storage per deep-indexed doc, where N = (Kotlin chunk size + summary length) / native chunk size. For 1024-char Kotlin chunks + native chunk_size=256, ~5x more native chunks per doc.
Augment-side change: RagManager.augment strips ::ctx<n> suffixes when looking up the parent ChatDocument so citations group under the original doc, not its context-pseudo-children.
Cleanup: RagManager.removeDocument recurses through ingestedDocIds removing every ${origDocId}::ctx* from the engine + BM25 index before deleting the parent record.
Limitations: text-format docs only (PDFs/DOCX blocked by no native extract API). One doc-level summary per doc, NOT per-chunk Anthropic-style — simpler v1, marginally lower quality than per-chunk contextual retrieval but ~1 LLM call vs. N. Idempotent: skipped if already deep-indexed.
Settings → Chat & RAG → Retrieval debug opens RagDebugScreen (route NavScreens.RagDebug). VM is RagDebugViewModel (injects RagManager + ChatRepository). Renders:
- Status pill (ready + active embedding name).
- Chat dropdown to scope the test query.
- Query text field + Run button.
- Tabs: Fused (RRF result), Dense (raw native), BM25 (raw keyword index), Context (final assembled <context> block + token count), Engine (raw engine.info() JSON).
- Each hit card shows chunkIndex, score, docId, first 600 chars of text.
Backed by RagManager.debugQuery(chatId, query, budget) which returns RagDebugResult. Multi-query is NOT applied in the debug path (single-query for clarity).
model/DocExtension.kt enum maps mime + filename to a (label, tint) pair (PDF/DOCX/XLSX/PPTX/ODT/EPUB/RTF/MD/HTML/JSON/XML/CSV/TXT/OTHER). ExtensionBadge in ui/components/action_window/Attachments.kt renders a rounded card with the label centered, tinted from the entry's color. Used in the Attach tab and the prev-chats picker.
The PlusMenu's old "Documents" button is gone — attachments live entirely in the Attach tab now. PlusMenu shows only Thinking when supportsThinking; if not supported, PlusMenuCard returns null.
Replaces the prior Research pipeline (2026-05-15). Single-shot LLM-driven web search. User flips the Web Search toggle on the bottom action bar (or types /search <query>); next chat send becomes a web-search run.
Flow (viewmodel/WebSearchCoordinator.kt):
- Plan - coordinator emits WebSearchEvent.Plan(userQuery) so the card renders immediately.
- GenerateQueries - one LLM call (WebSearchPrompts.generateQueries) asking for exactly 3 numbered queries. Regex-parsed via QUERY_LINE_REGEX = ^\s*(?:\d+[.)\-:]|[-*•])\s+(.+)$. Failures fall through to WebSearchEvent.Failed.
- Search - for each of the 3 queries, WebSearcher.search(query, maxResults=5, idx) via WebNative.search (DDG HTML). Per-query results are deduped against a session-wide seenUrls set so cross-query overlap doesn't double-feed the synthesizer. Total cap: 3 queries × 5 results = 15 unique snippets.
- Synthesize - one LLM call (WebSearchPrompts.synthesize) with the user query + numbered [i] snippet list. Output is markdown with inline [1]/[2]/[3] citations and a trailing Sources section.
- Done - emits WebSearchEvent.Done(answer, sources); card renders the markdown answer + collapsible tappable source list (chip → LocalUriHandler.openUri).
No URL fetching. No document extraction. No iteration loop. The user-visible difference vs. old Research: seconds instead of minutes, single inline result instead of a "research document" archive screen.
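The GenerateQueries parsing step can be sketched in plain Kotlin. This is a minimal illustration, not the code in WebSearchPrompts.kt: the function name parseQueries and the choice to return whatever matched (leaving the "fewer than 3 means Failed" decision to the caller) are assumptions. Only the regex itself is taken from the notes above.

```kotlin
// Regex copied from the notes: accepts "1." / "2)" / "3:" / "4-" numbering or "-", "*", "•" bullets.
val QUERY_LINE_REGEX = Regex("""^\s*(?:\d+[.)\-:]|[-*•])\s+(.+)$""")

// Hypothetical helper: pulls at most `limit` query strings out of raw LLM output.
// Lines that do not look like list items (preamble, blank lines) are ignored.
fun parseQueries(llmOutput: String, limit: Int = 3): List<String> =
    llmOutput.lines()
        .mapNotNull { line -> QUERY_LINE_REGEX.find(line)?.groupValues?.get(1)?.trim() }
        .filter { it.isNotEmpty() }
        .take(limit)
```

A caller would treat a result shorter than 3 as the failure case and emit WebSearchEvent.Failed; that policy lives outside this sketch.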
State rides entirely on the chat message via webSearchRunId: String? (TAG_MSG_WEBSEARCH_RUN = 14) and webSearchState: String (TAG_MSG_WEBSEARCH_STATE = 15, JSON-serialized WebSearchUiState). ChatMessageList renders WebSearchCard instead of MessageBubble when webSearchRunId != null. Done runs survive process restart because the terminal state is on the message. No separate vault, no Documents archive, no DocumentViewer screen.
HomeViewModel.handleWebSearchEvent looks up (chatId, messageId) via webSearchMessages[runId], applies the event to the persisted state, and writes the updated message back. The map evicts on Done/Cancelled/Failed.
The card message stores the user's original query in msg.content (read by the card Header), and the synthesized answer in msg.webSearchState. InferenceCoordinator.buildMessagesJson is the single point that prepares chat history for the LLM — when it encounters a webSearchRunId != null message, it swaps content for WebSearchUiState.fromJson(webSearchState).answer.trim() so the model sees the synthesized markdown as the prior assistant turn (not the echoed user query). Cards with an empty answer (in-flight, cancelled, failed) are skipped entirely so the LLM doesn't get a blank assistant message.
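The substitution rule can be illustrated with a small, self-contained sketch. HistoryMessage and prepareHistory are hypothetical names, and the answer is passed in as an already-decoded string; the real InferenceCoordinator.buildMessagesJson decodes it from webSearchState via WebSearchUiState.fromJson and emits JSON rather than pairs. Emitting the card as an "assistant" turn follows the description above.

```kotlin
// Hypothetical, simplified view of a chat message for this sketch only.
data class HistoryMessage(
    val role: String,
    val content: String,
    val webSearchRunId: String? = null,
    val webSearchAnswer: String = "", // stands in for WebSearchUiState.fromJson(webSearchState).answer
)

// Returns (role, content) pairs as they would be handed to the LLM.
fun prepareHistory(messages: List<HistoryMessage>): List<Pair<String, String>> =
    messages.mapNotNull { msg ->
        if (msg.webSearchRunId == null) {
            // Ordinary message: passes through unchanged.
            msg.role to msg.content
        } else {
            // Web-search card: use the synthesized answer, or skip the card when it is empty
            // (in-flight, cancelled, failed) so the model never sees a blank assistant turn.
            val answer = msg.webSearchAnswer.trim()
            if (answer.isEmpty()) null else "assistant" to answer
        }
    }
```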
Same pattern as the old research lockdown — webSearchActive: StateFlow<Boolean> is derived from webSearchCoordinator.activeRuns.isNotEmpty(). sendMessage, loadModel, and unloadModel all early-return while a run is active because the chat LLM is borrowed for both the GenerateQueries and Synthesize calls.
ui/screens/web_search/WebSearchCard.kt is a single Surface with:
- Header (Globe icon, "Web search", user query)
- Queries strip (3 rows with per-query progress indicators — Circle / spinner / Check + hit count)
- AnimatedContent for current phase (Plan / Queries / Search / Synthesize / Done / Cancelled / Failed)
- Stop button while in flight
- For Done: markdown answer + collapsible N sources accordion with [i] chips opening URLs externally
- model/WebSearchEvent.kt: sealed event class + WebSearchHit data class.
- model/WebSearchUiState.kt: phase machine + JSON serde.
- repo/web_search/WebSearcher.kt: thin wrapper over WebNative.search.
- repo/web_search/WebSearchPrompts.kt: prompt templates + QUERY_LINE_REGEX.
- viewmodel/WebSearchCoordinator.kt: single coordinator (no repository, stateless across runs).
- ui/screens/web_search/WebSearchCard.kt: the only UI.
- Modified: ChatMessage (+ webSearchRunId, webSearchState), ChatRepository (TAG 14/15 renamed), HomeViewModel (toggle, slash parse, coordinator wiring, event mirror), HomeScreen{Body,BottomBar} + ToolsPickerWindow (Web search toggle), ChatMessageList (WebSearchCard render).
Re-pivoted into scope on 2026-05-08. Drop-in port of LocalDream's catalog (xororz/sd-qnn + xororz/sd-mnn + xororz/sdxl-qnn + xororz/upscaler) onto the existing TN model store. Four user-facing tasks: Generate (txt2img), Edit (img2img), Inpaint, Upscale 4×. Tasks #5–#8 from the SDK's surface (LaMa fast removal, MobileSAM segmentation, depth, AdaIN style transfer) are implemented in the AAR's C++ but not yet bound through JNI — out of scope until the bindings ship.
data/SocBucket.kt reads Build.SOC_MODEL (API ≥ 31; pre-S falls back to "CPU" and the user only sees CPU-bucket models). chipsetModelSuffixes maps known Snapdragons to one of three buckets:
SM8475, SM8450 → "8gen1"
SM8550, SM8550P, QCS8550, QCM8550, SM8650, SM8650P, SM8750, SM8750P,
SM8850, SM8850P, SM8735, SM8845 → "8gen2" (also covers 8 Gen 3 / Elite / Elite Gen 5)
any other SM* → "min"
non-Qualcomm → null (CPU-only)
isSdxlCapable(soc) is a stricter predicate: only {SM8650, SM8845, SM8750, SM8750P, SM8850, SM8850P} get the SDXL rows. SDXL contexts are baked at a single _8gen3.zip variant (no per-bucket file).
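The bucket table and the SDXL predicate can be expressed as a small lookup. This is a sketch under stated assumptions, not the contents of data/SocBucket.kt: the function name bucketFor, the set names, and the trim/uppercase normalization are hypothetical. The chipset lists and bucket strings are copied from the table above.

```kotlin
// Chipset lists copied from the notes; the set names are hypothetical.
private val GEN1_SOCS = setOf("SM8475", "SM8450")
private val GEN2_SOCS = setOf(
    "SM8550", "SM8550P", "QCS8550", "QCM8550", "SM8650", "SM8650P",
    "SM8750", "SM8750P", "SM8850", "SM8850P", "SM8735", "SM8845",
)
private val SDXL_SOCS = setOf("SM8650", "SM8845", "SM8750", "SM8750P", "SM8850", "SM8850P")

// Maps a Build.SOC_MODEL-style string to a bucket, or null for CPU-only devices.
fun bucketFor(socModel: String?): String? {
    val soc = socModel?.trim()?.uppercase() ?: return null // normalization is an assumption
    return when {
        soc in GEN1_SOCS -> "8gen1"
        soc in GEN2_SOCS -> "8gen2"
        soc.startsWith("SM") -> "min" // any other SM* part
        else -> null                   // non-Qualcomm, or the pre-S "CPU" fallback
    }
}

// Stricter gate for the SDXL rows.
fun isSdxlCapable(socModel: String?): Boolean =
    socModel != null && socModel.trim().uppercase() in SDXL_SOCS
```

Note that the known-chipset sets are checked before the SM prefix, which is what lets QCS8550 and QCM8550 land in "8gen2" even though they do not start with SM.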
ModelCatalog.imageModels() is computed per-call (not in BUILT_IN_MODELS const) so the Build.SOC_MODEL read picks up cleanly. When a Snapdragon bucket is available it emits 5 SD 1.5 NPU rows (AnythingV5, QteaMix, AbsoluteReality, CuteYukiMix, ChilloutMix), the 2 SDXL rows (gated on isSdxlCapable), and 2 upscaler rows (Real-ESRGAN x4 anime + UltraSharp v2 Lite). On non-Snapdragon devices it instead emits 5 SD 1.5 CPU/MNN rows from xororz/sd-mnn. qnn2.28 is baked into the URL as the SDK version token; if the AAR ever upgrades to qnn2.30 both the URL constant and the v3 upgrade marker need to bump together.
HuggingFaceModel carries new image-gen fields (isSdxl, requiresNpu, isUpscaler, featureLabel, defaultPrompt, defaultNegativePrompt, generationSize); modelType ∈ {"image_gen", "image_upscaler"} switches the download finalize path. ProviderType.IMAGE_GEN and ProviderType.IMAGE_UPSCALER are the canonical categories on ModelInfo after install.
ModelStoreViewModel.finalizeImageGenDownload extracts the QNN/MNN ZIP into <filesDir>/sd_models/<id>/ via java.util.zip.ZipFile with a hardened path-traversal check (entry's canonical path must start with the target's canonical path + File.separator). Archive deleted after extraction, ModelInfo inserted with path = the dir. finalizeImageUpscalerDownload is simpler: the upscaler is a single .bin file at <filesDir>/sd_upscalers/<id>/upscaler_<bucket>.bin, no extraction.
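A minimal sketch of the canonical-path check described above, using only java.util.zip and java.io. The function name extractZipSafely and the choice to throw IOException on a blocked entry are assumptions; the actual finalizeImageGenDownload may report the failure differently and also handles archive deletion and the ModelInfo insert.

```kotlin
import java.io.File
import java.io.IOException
import java.util.zip.ZipFile

// Extracts `archive` into `targetDir`, refusing any entry that would land outside it.
fun extractZipSafely(archive: File, targetDir: File) {
    // Trailing separator prevents "/data/sd_models/x-evil" from passing a "/data/sd_models/x" prefix test.
    val targetRoot = targetDir.canonicalPath + File.separator
    ZipFile(archive).use { zip ->
        for (entry in zip.entries()) {
            val outFile = File(targetDir, entry.name)
            // canonicalPath resolves ".." segments, so "../x" style names are caught here.
            if (!outFile.canonicalPath.startsWith(targetRoot)) {
                throw IOException("Blocked zip entry outside target dir: ${entry.name}")
            }
            if (entry.isDirectory) {
                outFile.mkdirs()
                continue
            }
            outFile.parentFile?.mkdirs()
            zip.getInputStream(entry).use { input ->
                outFile.outputStream().use { output -> input.copyTo(output) }
            }
        }
    }
}
```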
repo/ImageGenManager.kt is the Hilt @Singleton wrapper around StableDiffusionManager.getInstance(context). ensureRuntime() is mutex-guarded and fires StableDiffusionManager.initialize() on first use, which extracts qnnlibs.tar.xz from the AAR's bundled assets into <filesDir>/ai_sd_runtime/. Subsequent loadDiffusionModel(model, w, h) calls run model-specific load on the engine. The active model id is cached so re-entering Image Task screen with the same model is a no-op.
ui/screens/image_task/ImageTaskScreen.kt + ImageTaskTopBar.kt + viewmodel/ImageTaskViewModel.kt. Route: NavScreens.ImageTask ("image_task"). Reachable from the chat drawer's "Images" quick-link. The screen is one LazyColumn of cards:
- Task: ActionToggleGroup segmented switch (Generate / Edit / Inpaint / Upscale).
- Image model / Upscaler: list of installed models for the picked task; tapping a row triggers loadDiffusionModel or loadUpscaler.
- Prompt: TnTextField for prompt + negative prompt (hidden in Upscale mode).
- Settings: ActionToggleGroup rows for Steps, CFG, Scheduler, Resolution, Denoise (img2img / inpaint only).
- Input image: SAF OpenDocument picker, shown for Edit / Inpaint / Upscale.
- Run: Generate / Edit / Inpaint / Upscale 4× button, with Stop appearing during generation. LinearProgressIndicator binds to DiffusionGenerationState.Progress.progress.
- Output: renders the final Bitmap (or live intermediate if showDiffusionProcess is on).
- The :ai_sd library declares commons-compress and xz as api deps in its module build. When consuming as a path AAR (implementation(files(...))) Gradle does NOT pull transitive deps from a POM-less file dependency, so app/build.gradle.kts must declare both directly. xz was added to gradle/libs.versions.toml as org.tukaani:xz:1.12.
- app/proguard-rules.pro adds -keep class com.dark.ai_sd.** { *; } and -dontwarn com.dark.ai_sd.** alongside the existing gguf_lib / ai_sherpa rules.
- The AAR ships qnnlibs.tar.xz (~200 MB compressed) inside assets/qnnlibs/. First-run setup extracts it into filesDir/ai_sd_runtime/ and is observable through RuntimeSetupState. Don't move the runtime dir without bumping the SDK version token in catalog URLs.
- Currently the debug AAR is shipped (release AAR's R8 mangled StableDiffusionManager.Companion.getInstance). Once :ai_sd's consumer-rules.pro adds -keep class com.dark.ai_sd.StableDiffusionManager$Companion { *; } and the AAR is rebuilt, swap to release.
To add Object Removal (LaMa), Segmentation (MobileSAM), Depth (MiDaS / Depth Anything V2), or Style Transfer (AdaIN) — the C++ already implements all four; needs a small JNI surface on SDNativeLib + matching state flows on StableDiffusionManager + new ImageTaskMode values + per-feature catalog rows pointing at the matching HF repos.
To add SDXL on devices that aren't in isSdxlCapable — pipeline-level, not gating-level. SDK has textEmbeddingSize=768 hardcoded across the C++ pipeline + SDK keeps 4-channel latents + 77-token CLIP + LDM-style weight names. SDXL needs 2048-dim, dual CLIP, additional UNet conditioning inputs (text_embeds, time_ids), and matching sd_structure.h / lora_mapping.h entries. Out of scope for this pivot.
Dedicated screen aggregating every queued / downloading download plus a persistent history of completed / failed / cancelled ones. Reachable from a Download icon in the Model Store top bar (badge dot when active count > 0). Route: NavScreens.Downloads("downloads").
Two singletons sit between HxdManager and the UI:
- DownloadCoordinator (@Singleton, in repo/): subscribes to HxdManager.tasks once at first construction. Holds an in-memory ConcurrentHashMap<Int, DownloadLabel> (hxdId → displayName + type). On every emission, recomputes activeCount: StateFlow<Int> (count of QUEUED/CONNECTING/DOWNLOADING/PAUSED tasks) and detects per-task terminal transitions (COMPLETED/FAILED/CANCELLED). On a first-time terminal transition it removes the label, then writes a DownloadHistoryEntry via the repo. recordedTerminals: Set<Int> dedupes so a task that bounces between flow emissions only gets one history row. DownloadLabel.fromUrl(url) provides a fallback when the label was never registered (e.g. a download that survived process restart).
- DownloadHistoryRepository (@Singleton, in repo/): HXS-backed at <filesDir>/download_history_v1/. Sealed under HKDF(DEK, salt=signerHash, info="tn.download_history.user_key.v2"), the same v2-signer-bound pattern as every other vault. Collection name download_history. TAG layout 1=id (UUID), 2=displayName, 3=type, 4=status (ord), 5=totalBytes, 6=completedAt, 7=error. insert() writes + caps to MAX_ENTRIES=50 newest + flushes + refreshes the StateFlow. clearAll() deletes every record. Capping happens inside insert() so the vault can't grow unbounded.
Every enqueue site MUST call coordinator.registerLabel(hxdId, displayName, type) immediately after HxdManager.enqueue(...). Current sites:
- ModelStoreViewModel.downloadModel(model): label = model.name, type from downloadTypeOf(model) mapping (mmproj/vlm/gguf/ per-model-type strings).
- ImageGenManager.downloadRuntime(): label = "AI Image Runtime", type = "runtime".
Any new download path that calls HxdManager.enqueue MUST register a label too, otherwise the row falls back to url.substringAfterLast('/') and the history entry shows the raw filename instead of a human-readable name.
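The register-then-fall-back behavior can be sketched as follows. LabelRegistry and labelFor are hypothetical names used only for illustration, and the "unknown" type on the fallback path is an assumption; the real logic lives in DownloadCoordinator and DownloadLabel.fromUrl. The filename fallback via substringAfterLast('/') is taken from the paragraph above.

```kotlin
import java.util.concurrent.ConcurrentHashMap

data class DownloadLabel(val displayName: String, val type: String)

// Hypothetical stand-in for the label-tracking part of DownloadCoordinator.
class LabelRegistry {
    private val labels = ConcurrentHashMap<Int, DownloadLabel>()

    // Enqueue sites call this right after HxdManager.enqueue(...).
    fun registerLabel(hxdId: Int, displayName: String, type: String) {
        labels[hxdId] = DownloadLabel(displayName, type)
    }

    // Registered label if present, otherwise the raw filename from the URL.
    // The "unknown" type is an assumption for this sketch.
    fun labelFor(hxdId: Int, url: String): DownloadLabel =
        labels[hxdId] ?: DownloadLabel(url.substringAfterLast('/'), "unknown")
}
```

The sketch makes the cost of a missing registerLabel call visible: the row still renders, but with the filename rather than a human-readable name.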
DownloadCoordinator + DownloadHistoryRepository are :app-process-only by construction. Verified at build time: neither :server (service/server/) nor :inference (service/inference/) imports them. The coordinator is reached only via ModelStoreViewModel and ImageGenManager, both of which are constructed only in :app. If a future service-process class needs to know about downloads (e.g. a notification surfacing history in :server), it MUST go through AIDL — never inject the coordinator directly, because that would attempt HXS unwrap from a non-main process and crash.
- DownloadsViewModel (@HiltViewModel): exposes activeDownloads: StateFlow<List<ActiveDownloadItem>> (joined HxdManager.tasks + coordinator labels, filtered to non-terminal, sorted by hxdId) and history: StateFlow<List<DownloadHistoryEntry>> (passthrough from repo). Methods: cancel(hxdId), clearHistory().
- ActiveDownloadItem(hxdId, displayName, type, state).
- DownloadsScreen at ui/screens/downloads/DownloadsScreen.kt: single LazyColumn with section headers (Active · N then History · N with trash icon to clear). Uses LocalDimens / LocalTnShapes / TnIcons throughout; no inline dp constants for theme tokens. Empty state when both are empty (Download icon + "No downloads yet").
- DownloadsTopBar: minimal GuideTopBar-style top bar; dispatched from AppTopBar.kt's when block on NavScreens.Downloads.route.
For each ActiveDownloadItem:
- Header row: type-icon badge (mapped from type string) + display name + cancel icon (X).
- Body switches on HxdStatus:
  - QUEUED → "Queued" text
  - CONNECTING → TnIndeterminateProgressBar + "Connecting…"
  - PAUSED → TnProgressBar (if totalBytes > 0) + "Paused"
  - DOWNLOADING → TnProgressBar (or indeterminate if totalBytes ≤ 0) + "<pct>% · <downloaded> / <total>" left, state.speedFormatted right.
For each DownloadHistoryEntry:
- Status icon (CircleCheck primary / AlertTriangle error / X muted) + display name.
- Subtitle: "<status> · <relative time> · <size>" (size omitted if ≤ 0). Failed/cancelled rows append the first 48 chars of the error string when present.
- Relative time via android.text.format.DateUtils.getRelativeTimeSpanString(..., MINUTE_IN_MILLIS, FORMAT_ABBREV_RELATIVE).
ModelStoreScreen adds a Download ActionButton in the TopAppBar.actions slot, always visible (not gated on tab). When viewModel.activeDownloadCount > 0, a small MaterialTheme.colorScheme.primary circle (8.dp) is drawn at top-end via Box(contentAlignment = TopEnd) + Modifier.offset(x=-4.dp, y=4.dp). Count comes from ModelStoreViewModel.activeDownloadCount which is a direct passthrough of DownloadCoordinator.activeCount.
- No retry button on failed history rows. To retry, the user re-initiates from the Store.
- No per-row delete in history. The "Clear all" trash icon in the section header is the only deletion path.
- No pause/resume controls (HxdManager exposes them, but the existing Store UI doesn't surface them either — keeping parity).
- No filtering or search in history (50-entry cap keeps the list short enough).
- No grouping by type. Single chronological list ordered newest-first.
Floating black pill that morphs into a card, drawn over every app via TYPE_APPLICATION_OVERLAY. Lives in app/src/main/java/com/dark/tool_neuron/service/island/. Foreground service (IslandOverlayService, foregroundServiceType="dataSync", stopWithTask="false") keeps the overlay alive across recents-swipe. Window params are WRAP_CONTENT + FLAG_NOT_FOCUSABLE + FLAG_LAYOUT_NO_LIMITS + gravity TOP|CENTER_HORIZONTAL + x=0; the pill is always horizontally centered at the top of the screen. WRAP_CONTENT means the window resizes in lockstep with the morph animation (pill grows into card symmetrically both sides) so unrelated taps miss the window entirely in pill mode. The user calibrates only Y (offsetYDp) via the prototype screen slider — there is no X calibration because the pill is always centered.
Compose surface is a single updateTransition-driven Surface with iOS-style squircle. Shape comes from islandShape(cornerRadius) in IslandShapes.kt — a ContinuousRoundedRectangle built on a dedicated G2Continuity profile (extendedFraction = 0.6, arcFraction = 0.4, bezierCurvatureScale = 1.2, arcCurvatureScale = 1.2) tuned for the iOS Dynamic-Island squircle look. The profile is intentionally separate from the global TnContinuity in Shapes.kt so the island's aesthetic can drift without bleeding into the rest of the app. Other spacing/dimensions route through the existing LocalDimens / LocalTnShapes: outer padding = dimens.spacingSm, card content padding = dimens.spacingLg, action-button background shape = LocalTnShapes.current.full. Pixel-sized identity constants (PILL_W/H, CARD_W/H, CARD_CORNER_DP=32, PRESS_SCALE=0.92, SWIPE_THRESHOLD_DP=48) live in IslandGeometry because they aren't generic theme tokens.
Single-progress morph for perfect frame-sync. updateTransition(expanded) drives ONE animateFloat (progress: 0f → 1f). Width, height, cornerRadius, pill-icon-alpha, and card-content-alpha are all derived via lerp(...) of that one progress value — they CANNOT desync, drift, or arrive at the target on different frames. The morph spec is a custom spring(dampingRatio = 0.85f, stiffness = StiffnessMediumLow, visibilityThreshold = 0.0005f) — chosen instead of motionScheme.defaultSpatialSpec because M3 Expressive's default has a noticeable overshoot that, when applied to corner radius / size simultaneously, reads as juttery for a Dynamic-Island-style morph. 0.85f damping gives a tiny bit of bounce on settle without overshoot; StiffnessMediumLow runs the morph at ~400ms which feels fluid. Other motion specs still come from motionScheme: press scale uses fastSpatialSpec<Float>(), mode-swap slide uses defaultSpatialSpec<IntOffset>(), all cross-fades use fastEffectsSpec<Float>().
Cross-fade timing: pillAlpha = (1f - progress * 2f).coerceIn(0f, 1f) and cardAlpha = ((progress - 0.5f) * 2f).coerceIn(0f, 1f). Pill content fades out across the first half of the morph, card content fades in across the second half. At progress=0.5 both are 0 → clean handoff with no visual overlap.
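The two alpha curves are pure functions of the single progress value, so they can be checked in isolation. The formulas below are copied from the paragraph above; the standalone function names and the lerp helper are written out here only for illustration (Compose ships its own lerp utilities).

```kotlin
// Pill content fades out over the first half of the morph.
fun pillAlpha(progress: Float): Float = (1f - progress * 2f).coerceIn(0f, 1f)

// Card content fades in over the second half of the morph.
fun cardAlpha(progress: Float): Float = ((progress - 0.5f) * 2f).coerceIn(0f, 1f)

// Plain linear interpolation, as used to derive width / height / corner radius
// from the same progress value.
fun lerp(start: Float, stop: Float, fraction: Float): Float =
    start + (stop - start) * fraction
```

At progress = 0.5 both alphas evaluate to 0, which is the clean handoff point; at 0.25 the pill is at half opacity while the card is still fully transparent.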
Shape allocation guard: cornerRadius.roundToInt() keys the remember { islandShape(...) } block, so the kyant ContinuousRoundedRectangle is reconstructed only at 1dp granularity (~15 allocations across a full morph instead of ~60). 1dp jumps in corner radius are visually imperceptible; preventing per-frame Shape allocation is what keeps the morph smooth on mid-tier devices.
Modes + gestures. IslandMode enum has ASSISTANT and CONTROL. State lives locally in IslandSurface since modes don't need to outlive the composable. The current mode is visible in BOTH pill and card states:
- Pill state → centered glyph icon (Sparkles for ASSISTANT, Sliders for CONTROL) rendered at dimens.iconSm, animated via AnimatedContent with horizontal slide on mode change.
- Card state → full layout with glyph + title + action buttons, same AnimatedContent slide.
- Tap → onToggle() + HapticFeedbackConstants.CONFIRM (pill ↔ card).
- Long-press → onToggle() + HapticFeedbackConstants.LONG_PRESS (same logical action, stronger haptic to confirm a deliberate gesture).
- Press-and-hold animation → the onPress lambda flips a pressed boolean; pressScale animatable shrinks the surface to PRESS_SCALE = 0.92 via graphicsLayer { scaleX = pressScale; scaleY = pressScale } and snaps back on release.
- Horizontal swipe (works in BOTH pill and card state) → detectHorizontalDragGestures accumulates dragAmount. Crossing SWIPE_THRESHOLD_DP = 48 swaps mode (ASSISTANT ↔ CONTROL) with HapticFeedbackConstants.GESTURE_END haptic. Swiping on the pill changes the badge icon and pre-selects the mode for next expansion. Tap and drag are in separate pointerInput modifiers: Compose's per-modifier gesture isolation lets them coexist (tap claims on no-slop release, drag claims on slop-exceeded movement).
- Action button taps (mic/send/volume/settings inside the card) fire HapticFeedbackConstants.CONFIRM; the actions themselves aren't wired to anything yet, since this is still the prototype surface.
Icons via TnIcons. Assistant mode shows Sparkles + Mic/Send. Control mode shows Sliders + Volume/Settings. All icons are TnIcons (stroke-based, 24x24 viewport, tinted via Compose Icon(tint = Color.White)).
Placement is via WindowManager.LayoutParams.y, NOT Compose Modifier.offset. The service holds a single Animatable<Float> (animY) and feeds its value into windowManager.updateViewLayout(islandView, params) via a snapshotFlow collector. User-position slider changes snapTo instantly; smart-dodge changes animate with a bouncy spring (dampingRatio = 0.55f, stiffness = StiffnessMediumLow). Compose-side only renders the pill at its natural padding(8dp) position; it does NOT know about position or dodge. Reason: if you offset via Compose, the WindowManager window stays at its original rect — touches keep landing on the original location (so a menu button under the old pill stays untappable) AND the pill render gets clipped at the window's WRAP_CONTENT bounds (the pill visually disappears past the wrapped edge). Moving the window itself solves both.
IslandAccessibilityService watches TYPE_WINDOW_STATE_CHANGED + TYPE_WINDOW_CONTENT_CHANGED, coalesces 150 ms, walks rootInActiveWindow for isVisibleToUser && (isClickable || isLongClickable) nodes, and checks whether any clickable rect overlaps the pill's natural screen rect. The dodge output is a single Float Y nudge (dp) published to IslandPositionStore.dodgeY. Since the pill is always horizontally centered, the only thing the service can move is Y — there is no horizontal escape direction to choose.
The pill rect is computed manually from IslandGeometry constants + IslandPositionStore.position.value.offsetYDp + statusBarTopInsetPx() — NOT from the windows API. Manual computation is stable across animation frames; using windows would create an oscillation loop (dodge → pill moves → no overlap → dodge=0 → pill snaps back → overlap → …). Geometry: pillLeft = (screenWidth - pillWidth) / 2, pillTop = statusBar + outerPadding + offsetY. The accessibility service expands this by DODGE_MARGIN_DP = 8 for the search zone (proactive dodge before strict overlap) but uses the un-expanded rect for the actual push-down math.
Dodge math: for each clickable obstacle that strictly intersects pillRect, compute pushDown = obstacle.bottom + margin - pillRect.top. Take MAX across overlapping obstacles, clamp to [0, MAX_DODGE_DP = 96], convert to dp, publish to dodgeY. No obstacles → dodgeY = 0.
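The push-down computation can be sketched without any Android types. RectPx, computeDodgeDp, and the parameter names are hypothetical, and the order of operations (convert px to dp, then clamp to MAX_DODGE_DP) is one reading of the description above; clamping in px against 96 dp worth of pixels would give the same result.

```kotlin
// Minimal pixel rect for this sketch; the real service works with android.graphics.Rect.
data class RectPx(val left: Int, val top: Int, val right: Int, val bottom: Int) {
    // Strict intersection: rects that merely touch along an edge do not count.
    fun intersects(o: RectPx): Boolean =
        left < o.right && o.left < right && top < o.bottom && o.top < bottom
}

val MAX_DODGE_DP = 96f

// Returns the Y nudge in dp needed to clear every overlapping clickable obstacle.
fun computeDodgeDp(pill: RectPx, obstacles: List<RectPx>, marginPx: Int, density: Float): Float {
    val pushDownPx = obstacles
        .filter { it.intersects(pill) }
        .maxOfOrNull { it.bottom + marginPx - pill.top }
        ?: return 0f // no obstacles, no dodge
    return (pushDownPx / density).coerceIn(0f, MAX_DODGE_DP)
}
```

For example, with a pill whose top is at 100 px, one overlapping obstacle whose bottom is at 150 px, a 16 px margin, and density 2.0, the push-down is 66 px, i.e. 33 dp.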
IslandOverlayService observes both position and dodgeY:
- Slider Y changes → animY.snapTo(position + dodge) (instant; calibration tool must feel responsive)
- dodgeY changes → animY.animateTo(position + dodge, dodgeSpring) (bouncy 0.55 / StiffnessMediumLow)
- snapshotFlow { animY.value } collector pushes each frame's value to LayoutParams.y via updateViewLayout. Gravity stays TOP|CENTER_HORIZONTAL + x=0 always.
On TYPE_WINDOW_STATE_CHANGED (app switch) dodgeY is reset to 0f first so a stale dodge from app A doesn't leak into app B before the re-scan completes. On onUnbind (accessibility service disabled) dodgeY also resets to 0f.
The overlay window is tagged with LayoutParams.title = IslandGeometry.OVERLAY_WINDOW_TITLE ("TnIsland") for adb shell dumpsys accessibility debuggability — the accessibility service no longer uses it for pill-rect lookup (manual computation replaces that), but the title is kept because it costs nothing and helps inspect the overlay's bounds during troubleshooting.
The service is opt-in: requires BIND_ACCESSIBILITY_SERVICE (system-granted) which the user enables via Settings → Accessibility → Tool Neuron Island. The prototype screen surfaces a button that deep-links to Settings.ACTION_ACCESSIBILITY_SETTINGS. Detection of enabled state uses both IslandPositionStore.accessibilityActive (live, set in onServiceConnected/onUnbind) AND Settings.Secure.ENABLED_ACCESSIBILITY_SERVICES parsing (initial render before the service connects).
AccessibilityGuard now self-excludes context.packageName — our own service does NOT trip our own one-time RootWarningDialog. The existing allowlist (Google / Samsung / OEM accessibility services) remains intact for OTHER vendors.
- IslandOverlayService.kt: foreground service, owns the WindowManager view + Compose host.
- IslandComposeView.kt: FrameLayout host for the inner ComposeView with one-time ViewTreeOwner attach.
- IslandOverlayRoot.kt: IslandSurface(expanded, position, nudge, onToggle) morph composable.
- IslandGeometry.kt: pill width/height/padding/dodge constants shared between composable and accessibility service.
- IslandPosition.kt: IslandPosition(offsetXDp, offsetYDp) + IslandNudge(dxDp, dyDp).
- IslandPositionStore.kt: Kotlin object singleton; position, nudge, running, accessibilityActive StateFlows; setOffset (persisted), setNudge / clearNudge (transient), setRunning, setAccessibilityActive.
- IslandAccessibilityService.kt: window-event coalesce + clickable-node walk + dodge math.
- OverlayLifecycleOwner.kt: minimal LifecycleOwner / ViewModelStoreOwner / SavedStateRegistryOwner for the overlay window.
- app/src/main/res/xml/island_accessibility_service.xml: accessibility service config (events, flags, canRetrieveWindowContent="true").
- app/src/main/res/values/strings.xml: island_accessibility_label + island_accessibility_description.
Sequence: Intro → TermsConditions (if !tcAccepted) → DevNotes (if !onboardingComplete) → SetupScreen (lock mode) → (SetupPassword if password chosen) → SetupTheme → ModelSetup → SetupRag → Home.
ScaffoldViewModel.resolveStartDestination() ordering: tcAccepted, then onboardingComplete, then securitySetupDone, then modelSetupDone, then isLockEnabled → PasswordScreen, else HomeScreen.
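The check ordering can be written as a single when expression. This is a sketch, not the body of ScaffoldViewModel.resolveStartDestination(): the StartDestination enum and SetupFlags holder are hypothetical, and the destinations chosen for the securitySetupDone and modelSetupDone branches (SetupScreen, ModelSetup) are assumptions inferred from the setup sequence. Only the order of the checks and the final PasswordScreen / HomeScreen split are stated in the notes.

```kotlin
// Hypothetical destination names for this sketch.
enum class StartDestination { TermsConditions, DevNotes, SetupScreen, ModelSetup, PasswordScreen, HomeScreen }

// Hypothetical holder for the prefs flags involved.
data class SetupFlags(
    val tcAccepted: Boolean,
    val onboardingComplete: Boolean,
    val securitySetupDone: Boolean,
    val modelSetupDone: Boolean,
    val isLockEnabled: Boolean,
)

fun resolveStartDestination(f: SetupFlags): StartDestination = when {
    !f.tcAccepted -> StartDestination.TermsConditions   // checked FIRST
    !f.onboardingComplete -> StartDestination.DevNotes
    !f.securitySetupDone -> StartDestination.SetupScreen // assumed destination
    !f.modelSetupDone -> StartDestination.ModelSetup     // assumed destination
    f.isLockEnabled -> StartDestination.PasswordScreen
    else -> StartDestination.HomeScreen
}
```

Because when evaluates branches top to bottom, a user who has finished onboarding but never accepted the terms still lands on TermsConditions.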
- Route: NavScreens.TermsConditions ("terms_conditions").
- Screen: ui/screens/terms_conditions/TermsConditionsScreen.kt. Top bar: TermsConditionsTopBar.kt. Bottom bar: TermsConditionsBottomBar.kt. VM: viewmodel/TermsConditionsViewModel.kt writes prefs.tcAccepted = true on accept. BackHandler(true) {} absorbs back so users can't escape to Intro.
- AppScaffold handoff: onTermsAccepted calls markTermsAccepted() then navigates DevNotes (popping T&C). The same callback is reusable from Settings later: accept becomes a no-op + popBackStack when tcAccepted is already true.
- The screen body is plain-English use-at-your-own-risk language, not legalese. No "decline" button; close the app or accept.
The first interactive welcome screen for new users (NOT release notes for engineers). Compose-native section cards with icons covering: data stays here, chat models, voice, document attachments, vision, web search, local-network server, and a short "rough edges" honesty section. Lives at ui/screens/dev_notes/DevNotesScreenBody.kt. Replaces the previous markdown-blob version. fun DevNotesScreen(innerPadding: PaddingValues) signature is stable so TNavigation keeps working. All copy is plain-language; no jargon, no em dashes, no rule-of-three patterns.
- Route: NavScreens.SetupTheme ("setup_theme").
- Screen: ui/screens/setup_screen/SetupThemeScreen.kt. VM: viewmodel/SetupThemeViewModel.kt (injects ThemeController). Selection commits immediately.
- Continue button: SetupThemeBottomBar.kt, dispatched from AppBottomBar.kt. AppScaffold handoff: onSetupComplete → SetupTheme (TNavigation); SetupTheme → ModelSetup (AppBottomBar callback wires the navigation).
- Top bar: SetupScreenTopBar() dispatched from AppTopBar on SetupTheme.route.
- No "themeSetupDone" pref; defaults are valid on first launch.
ModelSetupScreen.kt shows three feature packs plus a "Custom" toggle for power users. Packs are bundles of catalog ids downloaded sequentially:
| Pack id | Includes | Approx size |
|---|---|---|
| chat_only | LFM2 350M | 200 MB |
| chat_voice | LFM2 350M + sherpa-onnx-whisper-tiny-en + vits-piper-en_US-amy-low | 310 MB |
| chat_voice_large | Qwen3 0.6B + same STT + same TTS | 530 MB |
Pack content is defined as PACK_CONTENTS: Map<String, List<String>> in ModelStoreViewModel. downloadPack(packId: String) resolves each catalog id and enqueues via downloadModel. Reuses the existing downloadByQuickStartId for chat-model resolution (preferred quant priority: Q4_K_M → Q4_K_S → Q4_0 → Q5_K_M → Q5_K_S → Q8_0 then smallest-by-size). The Custom side keeps "Browse all models" (opens ModelStore) and "Pick a local file" (SAF picker → ModelImportTypePicker).
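The quant preference rule (walk the priority list, then fall back to the smallest file) can be sketched as below. QuantFile and pickQuant are hypothetical names, and matching a quant tag by case-insensitive substring of the filename is an assumption; the actual resolution happens inside downloadByQuickStartId. The priority order is copied from the paragraph above.

```kotlin
// Hypothetical view of one GGUF file in a repo listing.
data class QuantFile(val name: String, val sizeBytes: Long)

// Priority order copied from the notes.
val QUANT_PRIORITY = listOf("Q4_K_M", "Q4_K_S", "Q4_0", "Q5_K_M", "Q5_K_S", "Q8_0")

// Returns the first file matching the highest-priority quant tag,
// or the smallest file when none of the preferred tags are present.
fun pickQuant(files: List<QuantFile>): QuantFile? {
    for (tag in QUANT_PRIORITY) {
        files.firstOrNull { it.name.contains(tag, ignoreCase = true) }?.let { return it }
    }
    return files.minByOrNull { it.sizeBytes }
}
```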
Hub + 7 detail screens, all single-Scaffold (accept innerPadding: PaddingValues):
- Hub AppGuideScreen.kt: three categories ("Getting started" / "Advanced AI" / "Your phone, your data"). Cards dispatch via onOpenEntry(key).
- GuideDetailLayout(innerPadding, lede, icon, steps: List<GuideStep>, tips). Steps numbered, optional visual composable.
- GuideTopBar(title, onBack) dispatched from AppTopBar.kt for each guide route.
- Detail screens: GuideChatScreen, GuideModelsScreen, GuideRagScreen, GuideVlmScreen, GuideVoiceScreen, GuideSecurityScreen, GuideThemesScreen (and optionally GuideServerScreen for Remote Server).
- Adding a feature: add GuideEntry in AppGuideScreen.guideCategories(), key in GuideEntryKeys, route in NavScreens, detail screen, composable(...) registration in TNavigation, and a when case in AppTopBar.kt.
49+ instrumented tests across 7 classes (PhaseOne/Two/Three/Four, ExtraHardening, Resilience, ExampleInstrumentedTest). All green on Pixel_Tablet AVD API 35. Any test mutating native global state must call PolicyEngine.resetForTesting() in @Before AND BootIntegrity.setRelaxedForTesting(true) BEFORE any hardFail path (otherwise the process _exit(1)s mid-test).
- Encrypt WAL (
hxs/src/main/cpp/wal.cpp) — plaintext even in encrypted mode. Real audit finding; needs HXS WAL format work. - Native cert pinning — low priority; offline-only scope.
- Play Integrity opt-in — conflicts with privacy-first.
- Don't re-introduce
Settings.Secure.ANDROID_ID— Keystore-attested identities only. - Don't take Argon2 below
t=4 / m=131072 / p=1. Constants inauth.h. - Don't re-add
OPENSSL_NO_ASM=1— ARM crypto matters for performance. - Don't expand the unauth feature set in
policy_engine.cpp::is_unauth_featurewithout explicit threat-model review. - Don't remove
setRelaxedForTestingwiring; tests rely on it. - Don't emit plaintext detection strings in native code — wrap them in
HXS_OBF. - Don't switch back to
verifyPassword(): Boolean. The contract isVerifyResult. - Don't route any gated feature around
PolicyEngine.isAllowed. - Don't collapse the Quick-Start quant preference list in
TNavigation.kt. The priorityQ4_K_M → Q4_K_S → Q4_0 → Q5_K_M → Q5_K_S → Q8_0then smallest-by-size keeps the "Tiny & Fast" download tiny. - Don't send VLM image bytes as
byte[]over AIDL —ParcelFileDescriptor[]only (1 MB binder limit). - Don't read images on the main thread in
InferenceService.generateVlm— PFD reads happen in thescope.launchDispatchers.IO collector. - Don't drop the VLM marker prefix when
isVlmLoaded.buildMessagesJson(messages, vlmLastUserId)must prependgetVlmDefaultMarker(). - Don't key VLM repo detection off anything other than the
mmprojsubstring (case-insensitive). Repos usemmproj-<name>-F16.gguf,*-mmproj-*.gguf, etc. - Don't re-add a manual "Load projector" UI. Auto-load is the contract.
- Don't flatten the VLM folder layout. Base + mmproj as siblings under
models/vlm/<repoLeaf>/. - Don't register the mmproj as its own
ModelInfo. Mmproj is a sibling on disk. - Don't skip
releaseVlmProjector()at the top ofModelSessionManager.loadand.unload. - Don't break the setup-flow handoff. Order is Intro → TermsConditions (if !tcAccepted) → DevNotes (if !onboardingComplete) → SetupScreen → SetupTheme → ModelSetup → SetupRag → Home.
ScaffoldViewModel.resolveStartDestination()checkstcAcceptedFIRST, beforeonboardingComplete. AppScaffold callback chain:onTermsAccepted → DevNotes,onSetupComplete → SetupTheme,onThemeSetupComplete → ModelSetup,onModelSetupComplete → SetupRag,onRagSetupComplete → Home. T&C must come before DevNotes — DevNotes is informational; T&C is the user's legal acknowledgment. Re-ordering them is a regression. - Don't fold the ModelSetup "Packs" toggle back to a single-model picker. The packs flow is the default for non-technical users;
ModelStoreViewModel.downloadPack(packId)enqueues every catalog id inPACK_CONTENTSfor that pack id and the Custom toggle covers Browse-all + Pick-local for power users. Pack ids:chat_only,chat_voice,chat_voice_large. Voice catalog ids are pulled fromModelCatalog.BUILT_IN_MODELS(sherpa-onnx-whisper-tiny-en,vits-piper-en_US-amy-low); chat repos resolve viadownloadByQuickStartIdso the existing quant-priority is preserved. - Don't drop the auto-active + auto-load contract on chat send. Three coupled rules:
  (a) `ModelStoreViewModel.finalizeNonVlmDownload` MUST mark a freshly-installed GGUF model as `isActive=true` when no other GGUF is currently active. Without this, pack-based setup leaves the user with no active chat model and the next chat-send opens the manual model picker instead of generating. Voice models use the separate `prefs.activeTtsModelId` / `prefs.activeSttModelId` first-install pattern in `finalizeVoiceDownload`, and that path stays intact.
  (b) `HomeViewModel.sendMessage` MUST (1) fall back to `chatModels.value.firstOrNull()?.also { modelRepo.setActive(it.id) }` when `activeModel.value` is null but a chat model is installed, and (2) call `modelSession.load(active)` then check `loadState.value is ModelLoadState.Active` before invoking `runGeneration` when the engine is not yet loaded. The user's typed message must already be persisted to chat history BEFORE the load coroutine kicks off so the input bar clears and nothing is lost on slow loads. Only fall back to opening `_loadModelWindow` when there are zero chat models installed at all.
  (c) `HomeScreenBottomBar.canSend` MUST be `text.isNotBlank() && !isGenerating && (isModelLoaded || installedModels.isNotEmpty())`. Tying `canSend` to `isModelLoaded` alone re-introduces the original bug where the load-model composable pops up on send instead of generating, because pack-based setup completes with the engine still unloaded.
  Together these three put the manual load-model composable on the rare-empty path only. The common path is: type, hit Send, the inline pill flips to Loading then Generating, the response streams. The native `Model loaded (ctx=...)` log appearing seconds after Send is normal and expected on the first send post-launch.
- Don't re-restrict
`canEdit` in `ChatMessageList` to user messages only, and don't re-hardcode `canEdit = false` on `AssistantBubble`. Both user AND assistant messages are editable; the VM entry point is `HomeViewModel.editMessage(messageId, newContent)`, which branches by role. User message: save + delete every later message + regenerate from there (existing semantics). Assistant message: save in place, do NOT touch subsequent messages and do NOT regenerate. Editing a message whose `archivedByCompactId != null` (folded into a compact summary) is rejected at the VM layer. The `EditMessageDialog`'s confirm button uses `"Save & regenerate"` for user edits and `"Save"` for AI edits; those labels are deliberate UX signaling that the two paths behave differently.
- Don't re-add `&& !archived` to `canFork` in `MessageBubble`. Archived (folded-into-compact) messages AND `CompactSummary` cards themselves must remain forkable. That is the whole point of the fork action for compacted history: it lets the user spin off a new chat from a summary point or from any pre-summary turn. `ChatRepository.forkChat(sourceChatId, atMessageId)` is role/kind-agnostic, so the VM imposes no restriction either. The `CompactSummaryCard` carries its own `MessageActions` row with `canFork = true` and everything else off (no edit, no delete, no regen); adding delete to compact summaries would orphan their archived predecessors from any visible history.
- Don't put back
`documentRepo.clearAll()` in `RagManager.init`. Doc records persist across restarts; the engine re-ingests lazily through `hydrateChat(chatId)`. Wiping breaks the prev-chats picker.
- Don't generate a UUID-based docId for chat documents. Use `id = "$chatId:$sourceId"` so re-attach is idempotent and `removeDocument` can reference-count the source blob.
- Don't ingest a chat document without first writing its bytes to `<filesDir>/chat_documents/sources/<sha256>.bin`. The picker re-ingests from that file on demand.
- Don't downgrade `DocumentRepository` back to `openPlaintext`. Metadata is sealed under HKDF(DEK, "tn.chat_documents.user_key.v1") at `chat_documents_meta_v1/`. The init's legacy migration is one-shot: re-running it on already-migrated installs only deletes top-level files in `chat_documents/` (preserving `sources/`), so it's safe to leave in place.
- Don't drop the BM25
`RagKeywordIndex` from the augment path. Pure dense retrieval is the regression we already fixed. The `RagManager.augment` flow is: optional multi-query (LLM rewriter) → per-query (dense + BM25) → `rrfFuseMany` → optional LLM rerank → token-budget trim → top-N. Keep the order; flipping rerank before fusion gives the rerank LLM nothing useful to look at.
- Don't move the BM25 index back to Kotlin / SQLite. The tokenizer + inverted index + scoring all live in `hxs::RagKeywordIndex` (C++) and the records are encrypted via the existing HXS AEAD path. Reasons: (1) tamper resistance: index ranking logic is harder to manipulate when it's behind libhxs.so; (2) on-device portability: some Android OEMs strip the FTS5 module from system SQLite, breaking SQLite-based BM25 entirely; (3) privacy: chunk text was previously plaintext on disk in `databases/rag_keyword_v1.db`. The new vault at `<filesDir>/rag_keyword_v1/` is sealed under `HKDF(DEK, "tn.rag_keyword.user_key.v1")` like the rest of the app's data.
- Don't bypass `appPrefs.ragSmartRerank` / `appPrefs.ragMultiQuery` toggles. Both Phase 2 features are user-opt-in (off by default) because they each cost an LLM call per query. The rerank prompt is in `RagReranker.buildPrompt`; the variants prompt is in `RagQueryRewriter.buildPrompt`. Don't strip the `withTimeoutOrNull(15s/8s)` either: the chat model can hang on bad input.
- Don't change the Citation TAG byte.
`ChatRepository.TAG_MSG_CITATIONS = 13` for assistant messages; older messages without the TAG decode with `citations = emptyList()`. JSON shape: `{sourceId, docId, chunkIndex, score, name, mimeType, snippet, cited}`. `RagCitationMatcher.match` writes them on every assistant turn that ran through RAG augmentation; `MessageBubble` renders them via `CitationStrip` (chip per citation, tap → AlertDialog).
- Don't pass binary-format documents through the FTS5 indexer. Native engine is the only thing that can extract their text. `RagManager.isTextLike(mime, name)` is the gate: text/* mimes, structured-text mimes (json/xml/rtf/yaml/javascript), and known text extensions (txt/md/html/csv/log/code). Binary docs (pdf/docx/epub/etc.) get dense-only retrieval until a Kotlin extractor or a native API addition lands (see #329).
- Don't change the RRF identity key. Items in the fused pool are keyed by `(docId, chunkIndex)`. Native chunks and FTS5 chunks use independent indices, so identical text from both rankers is treated as two separate hits. That is intentional, since they come from different chunk boundaries. RRF still works correctly.
- Don't enable algorithmic darkening / unguarded toggles in the Settings → Chat & RAG section without a `prefs.<key>` write paired with a `_<flow>.value = value` update. The `combine(...)` flow has 13 inputs now; if the toggle's StateFlow doesn't update, the UI stays stale until process restart.
- Don't drop
`-Wl,-z,max-page-size=16384` in any owned native CMake target. Android 15+ / Play Store requires 0x4000 LOAD alignment on arm64 + x86_64. Verify: `unzip -p libs/ai_sherpa-release.aar jni/arm64-v8a/libai_sherpa.so > /tmp/s.so && readelf -l /tmp/s.so | awk '/LOAD/{getline;print $NF}'` → `0x4000`.
- Don't `secureWipe` the userKey passed to `HexStorage.openEncrypted` / `createEncrypted`. `hxs.cpp` keeps a `NewGlobalRef`; zeroing turns every later AEAD op into a zero-key op (silent decrypt failure on next launch).
- Don't zero `AppKeyStore.cachedDek` in-place inside `wipe()`. The same ByteArray is held by HXS via `NewGlobalRef` (it's the `appKey` passed into `openEncrypted`). Filling it with zeros breaks every in-flight HXS op in the running process and crashes the app immediately after panic-PIN wipe. Just null the reference and let process-kill on the WipedScreen Restart button handle physical memory clearing.
- Don't tighten
`migrateLegacyIfNeeded()` back to bootstrap-only. After `hardWipe()`, `app_bootstrap/` is empty (k.bin deleted) but `app_prefs/*` still has records sealed under the now-gone DEK. The migration must fire when EITHER the bootstrap dir has stale files OR `app_prefs/` has files but `k.bin` is missing; otherwise the next launch tries to decrypt records under a fresh DEK and throws `SecurityException`.
- Don't downgrade `keyStore.wipe()` back to deleting only `k.bin` + the Keystore alias. The panic-PIN/wipe contract is "delete everything the user owns": models, voice files, chat history, RAG documents, plugin state, repo config, the lot. Implementation: `context.filesDir.listFiles()?.forEach { it.deleteRecursively() }` + `context.cacheDir.listFiles()?.forEach { it.deleteRecursively() }` + alias delete. Anything held mmap'd by `:inference` or `:server` survives via POSIX inode-after-unlink until those processes die, but the data is gone after the user taps Restart.
- Don't run Argon2id on the main thread.
`SecurityManager.setPassword` / `verifyPassword` / `setPanicPin` each take ~1.5 s on a Pixel 8; calling them from a Compose `onSubmit` lambda freezes the UI. Every call site in viewmodels must wrap with `viewModelScope.launch { withContext(Dispatchers.Default) { … } }`. `PasswordViewModel.submit` already does this; `SettingsViewModel.openPinDialog / openDisableLockDialog / openPanicPinDialog` and `SetupViewModel.submitPassword(onSuccess)` were main-thread until 2026-04-27.
- Try the panic PIN BEFORE the lockout gate in `SecurityManager.verifyPassword`. A duress-PIN must work even when the user is locked out; that is the whole point. Order: (1) try panic against `panicSalt` / `panicHash`, hardWipe + return Wiped on match; (2) check `nextAttemptAtMs > nowMs` and return LockedOut; (3) try real PIN; (4) bump counter on miss.
- Prime `PasswordViewModel._lockedUntilMs` from `security.snapshotLockoutState().nextAttemptAtMs` at construction. If you initialize it to `0L`, an app restart while locked shows the password input again (the persisted lockout only kicks in after a SUBMIT, letting the user freely retry inputs). The countdown screen must come up immediately on launch when the lock is still active.
- Don't eagerly construct
`AppKeyStore` or `AppPreferences` from any process other than main. `InferenceService` runs in `:inference`. `TNApplication.isMainProcess()` early-returns; integrity / pref Hilt fields must be `dagger.Lazy<T>`.
- Don't wrap individual screens in their own `Scaffold`. One Scaffold = `AppScaffold`. Per-route bars go in `AppTopBar.kt` / `AppBottomBar.kt` `when` blocks.
- Don't set `isMinifyEnabled = true` on any library module. Only `:app` minifies. Library minification collides on `Type a.a is defined multiple times` against pre-minified prebuilts.
- Don't remove the per-step `visual` composables in guide detail screens. Update them if the real UI changes.
- Don't key the TOFU `.so` manifest by absolute path. Key by filename and resolve against `nativeLibraryDir`.
- Don't verify the `.so` manifest across app updates without rebinding to install identity. Mismatched `{signerHash, longVersionCode, lastUpdateTime}` triggers re-TOFU, not hard-fail.
- Don't re-add a root hard-fail. One-time `RootWarningDialog`, gated on `rootWarningShown`.
- Don't re-add a hard-fail for
`FAIL_XPOSED` or `AccessibilityGuard.SuspiciousAttached`. `scan_process_environment` returns a bitmask; `TNApplication.onCreate` only hard-fails on `FAIL_DEBUGGER | FAIL_FRIDA`. The `FAIL_XPOSED` bit lands in `TNApplication.softEnvReasons` and `accessibilityGuard.scan()` results land in `ScaffoldViewModel.resolveInitialRootWarning`; both surface as additional paragraphs in the existing `RootWarningDialog`. Reason: rooted users almost universally also have LSPosed and/or one third-party a11y service installed (banking-overlay-blocker, password manager, Tasker, Shizuku-helper). Hard-failing the whole app on first launch is a worse user experience than warning once and letting them opt in. The active-attack tools (debugger, `frida-server`) still hard-fail because those are not "device customisation", they're "someone is poking at the running process right now".
- Don't bind `:inference` from `:app` with `BIND_IMPORTANT`. That flag elevates `:inference` into the same OOM-priority bucket as `:app` (or higher), so on low-RAM devices like Snapdragon 662 / 4 GB phones the kernel's lowmemorykiller picks `:app` to evict instead of the process actually holding the multi-GB model mmap. Plain `BIND_AUTO_CREATE` keeps `:inference` on the foreground-service path (it calls `startForeground` in `onCreate`), so it's still well-protected, but `:app` is no longer demoted relative to it. `InferenceClient.performBind` uses `BIND_AUTO_CREATE` only.
- Don't leave
`_service.first { it != null }` un-timed in the `generate` / `generateMultiTurn` / `generateVlm` `callbackFlow`s. Wrap with `withTimeoutOrNull(BIND_TIMEOUT_MS)` and emit `InferenceEvent.Error("Inference service unavailable")` on null; otherwise a permanently failed rebind hangs the UI in `_isGenerating = true` forever with no error surfaced.
- Don't drop the `failPendingLoads` call from `onNullBinding`. All four `ServiceConnection` death paths (`onServiceDisconnected`, `onBindingDied`, `onNullBinding`, `unbind`) must drain `pendingLoads` and resume them with `Result.failure`, otherwise a load coroutine that started right before service-process death suspends forever.
- Don't wire the foreground notification's Stop button as
`PendingIntent.getService(... InferenceService, ACTION_STOP)`. That kills `:inference`, but `:app`'s `BIND_AUTO_CREATE` binding is still alive: `onServiceDisconnected` fires, our handler calls `rebind()`, and Android respawns `:inference` immediately. The user's Stop appears to do nothing. The correct wiring is `PendingIntent.getBroadcast(... InferenceStopReceiver, NOTIFICATION_STOP)` → receiver runs in `:app` → calls `InferenceClient.requestUserStop(context)` → sets `userStopRequested = true` AND `unbind()`s first (removes the BIND_AUTO_CREATE anchor so respawn can't happen) → THEN `startService(ACTION_STOP)` so `:inference` runs its `unloadEverything + stopForeground + stopSelf + killSelfProcess` teardown. The `userStopRequested` flag also short-circuits any `onServiceDisconnected` / `onBindingDied` that races the unbind, so even the race window can't trigger a respawn. Both the belt and the suspenders are required: the flag alone doesn't help if the binding is still alive, and the unbind alone leaves a tiny window between the kill and the unbind delivering on the binder thread.
- Don't drop the
`try { modelSession.load(active) } catch (Exception)` wrapper in `HomeViewModel.sendMessage`'s auto-load branch. `viewModelScope` is `SupervisorJob`-backed, but an unhandled throw from a child coroutine still reaches the thread's default uncaught handler and crashes `:app`. The existing `Result<String>` contract from `InferenceClient.loadModel` keeps the happy path safe; the wrapper is defense-in-depth for any future code that throws here.
- Don't drop the pre-load RAM check in `InferenceService.loadModel`. It compares `File(path).length()` against `MemAvailable:` from `/proc/meminfo`: if `memAvail in 1 until (modelSize * 6 / 5)`, surface a clear "Not enough free memory" error to the client immediately. Without this guard, low-RAM devices try to mmap a model larger than physical RAM, the kernel thrashes on page faults, and `:inference` gets killed mid-load, which the user sees as a generic "service died" with no actionable message. Skip the check when `memAvail <= 0` (read failed) so the gate doesn't accidentally block valid loads.
- Don't strip the `logDeviceProfile()` call at the top of `InferenceService.loadModel`. It logs `Build.MODEL`, `Build.SOC_MODEL`, `Build.SUPPORTED_ABIS`, `MemTotal`, and the `Features:` line from `/proc/cpuinfo` exactly once per service-process lifetime. This is the only telemetry that lets remote debugging triage a "model load crash on $unknown_device" report, specifically whether `dotprod` / `i8mm` / `fp16` are present (Snapdragon 662 / Cortex-A53 lacks all three; many llama.cpp dispatch paths assume at least asimd-dotprod).
- Don't re-add a plaintext HXS container for the bootstrap DEK. Raw XOR-masked
`k.bin` only.
- Don't skip `migrateLegacyIfNeeded()` in `AppKeyStore.init`.
- Don't drop the signer-binding salt from any user-key derivation. Every `encryptor.deriveKey(ikm = dek, salt = ?, info = ?)` call in app code MUST use `salt = keyStore.installSignerHash()` (not `salt = dek`, not `salt = null`). Sites: `AppPreferences.init`, `AppPreferences.deriveAuthKey`, `DocumentRepository.init`, `RagKeywordIndex.init`, `ChatRepository.init`, `SourceFileVault.keyFor`. Without this salt, a same-device replaced-APK attack (root + repack with attacker's cert) can unwrap the DEK via the inherited uid-scoped Keystore alias and decrypt every vault. Salting with the signer hash means the patched APK derives a different user-key and AEAD fails, so the data stays sealed.
- Don't add a "fallback to all zeros" or "return empty bytes on failure" path in `AppKeyStore.computeSignerHash()` / `installSignerHash()`. If the platform can't read the install signer, throw `SecurityException` and let the app refuse to bootstrap. A zero-fallback collapses the binding for any device with a broken signature lookup, which is exactly the path an attacker would try to engineer.
- Don't bump user-key info strings without bumping the version suffix (
`v2` → `v3`). Changing an info string invalidates existing v2 vaults: the open-or-rebuild helper detects the AEAD failure on `openEncrypted` and wipes the dir. A documented loss is intentional; a silent loss because the info string was edited inline is not.
- Don't downgrade `ChatRepository` back to `openPlaintext`. Chats and messages are sealed under `tn.chats.user_key.v2`. Plaintext on disk was the cross-device readability hole, closed in this build.
- Don't bypass
`SourceFileVault` for `chat_documents/sources_v2/` reads or writes. Every byte of every attached RAG document is AEAD-sealed per-file under a key bound to (DEK, signerHash, sourceId), with AAD = sourceId. This means: (a) cross-device read fails (no DEK unwrap), (b) cross-build read fails (different signer), (c) renaming a file breaks decryption (AAD mismatch, which defends against record-substitution). Direct `File(...).readBytes()` / `writeBytes()` is forbidden for source files.
- Don't preserve the legacy `chat_store/` (plaintext) or `chat_documents/sources/` (plaintext) directories on a v2 build. `ChatRepository.init` and `SourceFileVault.init` both `deleteRecursively()` them on first launch; this is the migration path that closes the historical leakage. Re-introducing those dirs (e.g. for a "compatibility shim") puts plaintext back on disk.
- Don't unwrap the DEK in `:server` or `:inference`. `:server` runs in its own process and never opens HXS: token, model path, and config are pushed via AIDL `start(configJson)`. `:inference` follows the same rule. Only `:app` (main process) holds the DEK, and only the main process derives signer-bound user-keys. Cross-process key handoff would defeat the binding.
- Don't link
`:native-server` against BoringSSL / OpenSSL / zlib. Header-only httplib + getrandom(2).
- Don't add a new HTTP route without auth pre-routing. Only `/`, `/index.html`, `/webui`, `/health` are in `auth::is_public_path`. Never make `/v1/models` or `/v1/chat/completions` public.
- `RemoteServerService` lives in `:server` (its own process). Don't fold it back into `:app`. `:server` MUST NOT open HXS: token / model path / config / asset HTML are passed in via AIDL `start(configJson)`. Token rotation pushes from `:app` via `rotateToken(newToken)`.
- Don't remove the
`startService(Intent(this, RemoteServerService::class.java))` call inside `handleStart` (just before `startForeground`). The service is otherwise bind-only and gets destroyed when `:app` dies or is swiped from recents. The self-start transitions it to the "started" lifecycle so a foreground notification + `stopWithTask="false"` keeps it alive across client death.
- Don't let any UI escape the server lockdown. `LaunchedEffect(serverRunning, currentRoute)` with `popUpTo(0) { inclusive = true }`; `BackHandler(enabled = running) {}` in ServerScreen; drawer gated by `showDrawer = currentRoute == HomeScreen.route && !serverRunning`.
- Don't persist the server token outside the encrypted HXS `app_prefs` vault.
- `/v1/models` returns the full enabled-engine catalog (every installed model across `gguf` / `vlm` / `embedding` / `tts` / `stt` / `image_gen` / `image_upscaler`) with per-entry `type` + `owned_by`. Each model's JSON entry must carry `id`, `path`, `type`, `config_json`; for `vlm` rows also `mmproj_path`. Don't shrink this back to "currently loaded only": clients pick the model per request via the OpenAI `model` field, and the registry lazy-loads on first use. Don't silently expose models with `pathType == CONTENT_URI`; they're filtered in `ServerController.buildEnginesCatalog` because the server has no SAF trampoline.
- Don't bypass
`ServerEngineRegistry` from any new route. Per-kind locks (`Mutex` for the suspend-based loaders, plain `synchronized` for the sherpa-onnx ones) serialise concurrent loads: a second request for the same engine waits for the first load to complete instead of racing it. New endpoints add a new typed `Kind` to `server_models::Kind`, a new wrapper class, a new registry method, a new `InferenceBridge.start<X>` upcall, and a new `nativeFeedReply*` consumption path. Don't shortcut by reaching into `GGMLEngine` / `StableDiffusionManager` directly from the route handler; every engine touch must go through the registry so loading discipline is preserved.
- Don't route `/v1/chat/completions` to the VLM engine unless the request has at least one `image_url` content part. `oai::extract_images_from_messages` is the single decision point: it sets `request.has_images = true` only when a part of type `image_url` is detected in any message. Don't move that detection upstream into the rate-limit / auth layer; pre-routing must stay pure.
- Don't accept network URLs in `image_url` parts on the server. Privacy / offline-only scope: only `data:image/...;base64,...` is parsed; `http(s)://...` returns 400 `invalid_image`. Adding outbound fetch from `:server` would mean adding curl-impersonate to the native server, which doubles its native deps and breaks the "no BoringSSL/OpenSSL/zlib in `:native-server`" rule.
- Don't pass big binary payloads (TTS audio, generated images, multipart image/mask inputs, multipart WAV) as
`byte[]` over JNI. The contract is: write the bytes to `<cacheDir>/server-staging/tn_<rand>_<name>` via `server_staging`, then hand the path across JNI as a Java string. The C++ route reads/writes the file directly. Cleanup happens after the response is sent (`staging::unlink_safe`) and at every server stop (`staging::purge_all`). JNI byte[] copies for 4 MB images are measurable overhead and have caused OOMs in adjacent products.
- Don't change `reply_session.wait(timeout_ms)` to take 0 or -1 unconditionally. The defaults exist for a reason: embeddings 120s, TTS 180s, STT 180s, image gen 600s, image upscale 300s. On a Snapdragon 8 Gen 1 a 20-step SDXL run can flirt with 5 min; the 600s ceiling is the cliff before we say "something's wedged".
- Don't drop the in-process `ServerImageEngine.ensureRuntime()` check that gates on `<filesDir>/ai_sd_runtime/qnnlibs.tar.xz` existing. The user must have triggered the SD runtime download via the in-app Image Task screen at least once before the server's image endpoints work. Image gen routes return a clean 500 with "image engine unavailable or runtime not installed" if not.
- Don't share `StableDiffusionManager` state between `:app` and `:server`. Each process has its own `getInstance(context)` (class loaders are per-process). They cooperate only via the shared on-disk `<filesDir>/ai_sd_runtime/` directory; whichever process initialises first extracts qnnlibs, the other sees the existing files and skips re-extraction. Don't add IPC between the two image stacks.
- Don't fold the
`:server`-side `ServerTtsEngine` / `ServerSttEngine` back into using `InferenceClient` AIDL. `:server` MUST own its sherpa-onnx instances directly: the AIDL hop would mean (a) crossing into `:inference`, which has its own active TTS/STT for in-app voice, and (b) yanking that state out from under the user when the chat-side mic is in flight. The two stacks are intentionally independent.
- Don't read voice / embedding "active" preferences from inside `:server`. The server doesn't open HXS. `ServerTtsEngine.ttsFor(modelId)` falls back to `catalog.firstOf(TTS)` when the requested id isn't found, which is "whatever's listed first in `ModelRepository.models`" (install order). If the request omits `model`, the server picks the first installed engine of that kind. Clients that want a specific voice MUST send `model` explicitly.
- Don't broaden the OpenAI streaming contract to non-chat routes. Embeddings, TTS, STT, image gen are all single-response. Adding
`stream:true` support to them would require either a new SSE schema (no OpenAI precedent for images), or partial-bytes-over-chunked-transfer (sherpa-onnx and SD AAR are batch-only; no per-step callback exposes individual samples). Stay aligned with OpenAI's published shapes.
- Don't trim the `server_->set_payload_max_length` setting back to 1 MB. The 64 MB cap is sized to fit base64-encoded VLM image_urls (4 MB raw ≈ 5.4 MB b64) plus multipart audio uploads (whisper-friendly 30s WAV at 16kHz 16-bit ≈ 960 KB) plus 4× upscale inputs. The same applies to the read/write timeouts (60s / 120s): TTS synthesis of a single sentence on a Tensor G3 takes ~3s, but a 200-character paragraph runs longer.
- Don't add audio transcoding to
`/v1/audio/transcriptions`. WAV-only is the documented contract. Bringing in ffmpeg or symphonia would balloon the native footprint. If clients want MP3/AAC support, they decode client-side before upload (the bundled web UI's STT panel already only accepts `.wav`).
- Don't bind the server only to the Wi-Fi IP by default. `ALL_INTERFACES` (0.0.0.0) is the default so the loopback URL is reachable from the device's own browser regardless of Wi-Fi state. Display two URLs: loopback (always works) + LAN (when Wi-Fi is up).
- Don't display `serverPort` from raw HXS without validation. Getter validates [1024..65535]; setter clamps. Effective port (post-bind) is written back from `nativeBoundPort()`.
- Don't drop the `serverController.isBusy` gating on chat-side load/unload/send. The server owns the loaded model; uncontrolled chat-side reload would yank state mid-request.
- Don't add a
`modelType: String` field to `ModelInfo`. `ProviderType` is canonical. `HuggingFaceModel.modelType: String` is the pre-install hint mapped at insert time.
- Don't add a streaming-synthesize AIDL method. The AAR's `OfflineTts.generate` is synchronous. Streaming TTS = client-side text chunking.
- Don't record STT at anything other than 16 kHz mono `ENCODING_PCM_FLOAT` from `MediaRecorder.AudioSource.VOICE_RECOGNITION`.
- Don't skip the mid-chunk cancellation check in `TtsPlayer`'s write loop.
- Don't pass `dataDir` / `espeak-ng-data` as `content://`. sherpa-onnx wants filesystem paths.
- Don't resurrect a BYOM / SAF directory import path for voice. Store-only.
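
Since streaming TTS is defined above as client-side text chunking, here is a minimal Kotlin sketch of one way to split text into sentence-sized chunks before handing each chunk to a synchronous synthesize call. The function name `chunkForTts`, the `maxChars` limit, and the splitting rule are all assumptions for illustration; the project's actual chunker may work differently.

```kotlin
// Hypothetical sentence chunker (not the project's real implementation).
// Splits after '.', '!' or '?' followed by whitespace, then packs sentences
// into chunks of at most maxChars characters where possible.
fun chunkForTts(text: String, maxChars: Int = 200): List<String> {
    val sentences = text.trim()
        .split(Regex("(?<=[.!?])\\s+"))
        .filter { it.isNotBlank() }
    val chunks = mutableListOf<String>()
    val current = StringBuilder()
    for (sentence in sentences) {
        // Flush the current chunk if appending this sentence would exceed the limit.
        if (current.isNotEmpty() && current.length + 1 + sentence.length > maxChars) {
            chunks += current.toString()
            current.clear()
        }
        if (current.isNotEmpty()) current.append(' ')
        current.append(sentence)
    }
    if (current.isNotEmpty()) chunks += current.toString()
    return chunks
}
```

A single sentence longer than `maxChars` is kept whole in this sketch rather than split mid-sentence. Each returned chunk would be synthesized and played in order, with the cancellation check running between chunks.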
- Don't make the Mic button conditional on `voiceSttAvailable`. It is always rendered; the click handler routes to Store if no STT model is installed.
- Don't skip `voiceManager.unloadStt()` in `HomeViewModel.stopRecordingAndTranscribe`'s `finally`.
- Don't auto-request `FOREGROUND_SERVICE_MICROPHONE`.
- Don't cram more than three quick-links into a single drawer `SpaceEvenly` row. The drawer layout is two separate rows: a "chat tools" row under the New Chat button (Store / Docs / Server) and an "info" row at the bottom (Guide / Dev Notes / Credits). Settings sits as a gear `ActionButton` in the drawer header next to the title; it is not in either row because it is global, not chat-related. Keep three-and-three; pushing four into either row reintroduces the touch-target squashing the original 6-in-a-row had.
- Don't move the Credits screen out of fullscreen. Route is
`NavScreens.Credits` (`"credits"`). `AppScaffold.isFullscreen` includes it alongside Intro and Password so the AppTopBar / AppBottomBar are hidden: the screen owns the full viewport and draws its own theme-coloured background. Audio is `R.raw.credits` (mp3 in `app/src/main/res/raw/`); `MediaPlayer.create` is acquired via `remember`, started in `DisposableEffect`, released on dispose. `setOnCompletionListener { onExit() }` exits when the audio ends. Scroll is a `verticalScroll(scrollState, enabled = false)` Column animated by `scrollState.animateScrollTo(maxValue, tween(durationMillis = mediaPlayer.duration, easing = LinearEasing))`, keyed on `scrollState.maxValue` and `durationMs` so the first emission with non-zero `maxValue` triggers the crawl. User scroll is disabled to keep the timing deterministic; tap or back exits. Colours pull from `MaterialTheme.colorScheme` (`surface` background, `primary` title, `onSurface` lines, `onSurfaceVariant` section labels), so the screen adapts to dark/light theme.
- Don't let `VoiceModelManager` construct `AppPreferences` eagerly. Use `dagger.Lazy<AppPreferences>`.
- Don't add `*.md` spec / plan / research / TODO docs at the repo root. Project memory lives here. Implementation roadmaps belong in conversation context.
- Don't auto-scroll on every streaming token via
`LazyListState.scrollToItem(index)`. That fights the user's drag and locks manual scroll mid-generation. The pattern is: track `stickToBottom` from `snapshotFlow { isScrollInProgress }` (re-evaluate when scroll settles via `!canScrollForward`), gate the auto-scroll `LaunchedEffect` on `stickToBottom && !isScrollInProgress`, and use `scrollToItem(last, scrollOffset = Int.MAX_VALUE)` so the bottom of the growing streaming bubble stays in view.
- Don't drop the `:native-server` consumer-rules.pro keeps for `NativeServer`, `NativeServer$*`, `InferenceBridge`, `BindMode`. The native HTTP server's JVM bridge invokes `InferenceBridge.startGeneration / cancelGeneration / onRequestEvent` via JNI on a `NewGlobalRef`'d jobject; renaming or stripping breaks the JNI method lookup at runtime.
- Don't drop the
`com.dark.networking.**` keep + `dontwarn` block from `app/proguard-rules.pro`, or the `WebNative` / `WebResponse` / `WebBytesResponse` / `WebSearchResult` keeps from `networking/consumer-rules.pro`. `WebNative` is a Kotlin `object` with `@JvmStatic external fun nativeFetch / nativeFetchBytes / nativeSearch / nativeHasBackend / nativeBackendName / nativeSetCaBundle / nativeSetProfile`; the JNI binding is `Java_com_dark_networking_WebNative_nativeFetch` etc., so the class FQCN must survive R8. Without these keeps, every HF Explorer / web-search / model-catalog HTTP call dies on release with `UnsatisfiedLinkError` (build is green; runtime crash). Verify post-R8: `grep com.dark.networking.WebNative app/build/outputs/mapping/release/mapping.txt` should show `com.dark.networking.WebNative -> com.dark.networking.WebNative:` (identity mapping).
- Don't keep `com.dark.ai_sd.**` in `app/proguard-rules.pro`. The AAR was removed in commit `9d79a3b`, so the rule is dead weight.
- Don't drop the `com.dark.gguf_lib.**` / `com.dark.ai_sherpa.**` keep + `dontwarn` block. The prebuilt AARs are already minified and rely on specific class+method names for JNI lookup.
- Don't strip
`lockedUntilMs` and `wiped` from `PasswordScreen`. Both flow through `PasswordViewModel` from `VerifyResult.LockedOut(retryAtMs)` and `VerifyResult.Wiped`. The screen branches `wiped → WipedScreen → "Restart" → finishAffinity + Process.killProcess(myPid)` (post-hardWipe, `PolicyEngine.markTampered()` is latched and the process is unrecoverable in-place). `lockedUntilMs > now → LockedOutScreen` with a 500 ms-tick countdown that clears itself once the timestamp passes.
- Don't add a "set panic PIN" path that doesn't go through `SecurityManager.setPanicPin`. It gates on `securityMode == APP_PASSWORD` (NOT `session.isAllowed(AUTH_DISABLE)`, which was a non-deterministic timing trap; see Panic PIN section) and writes the second Argon2id hash into the same encrypted `AuthState` blob. UI lives in `Settings → Privacy` and only renders when `isLockEnabled`. `SettingsViewModel._panicPinSet` mirrors `security.hasPanicPin` and must reset to false on `setPassword`, `disableLock`, and `Wiped`. The same gate change applies to `clearPanicPin` and `disableLock`.
- Don't fan out
`RemoteCallbackList` broadcasts from `InferenceService` without holding `sdBroadcastLock`. `RemoteCallbackList.beginBroadcast()` is not nestable: calling it from one thread while another is between `beginBroadcast()` and `finishBroadcast()` throws `IllegalStateException: beginBroadcast() called while already in a broadcast` and kills `:inference`. `startSdForwarding` launches five parallel collectors (backend / generation / isGenerating / upscale / runtimeSetup) on `Dispatchers.IO`; without serialisation they race on `sdClients.beginBroadcast()` and the service crash-loops immediately. The fix is `synchronized(sdBroadcastLock)` around the entire `fanoutSd` body; the same lock can serve `tnClients` if a similar pattern is ever added there.
- Don't bump `TAG_MSG_WEBSEARCH_RUN` away from `14` or `TAG_MSG_WEBSEARCH_STATE` away from `15`. Older messages without these tags decode with `webSearchRunId = null` and `webSearchState = ""`. New chat-message fields must use TAG ≥ 16.
- Don't reintroduce a runtime-only
`webSearchEvents: Map<String, WebSearchEvent>` in HomeViewModel. The card's state must come from `WebSearchUiState.fromJson(message.webSearchState)` because (a) opening a different chat while a run is in flight should NOT bleed the running run's state into a completed chat's card; (b) after process restart, completed web-search cards must keep showing their Done state. `HomeViewModel.handleWebSearchEvent` is the single write point: it looks up `(chatId, messageId)` via `webSearchMessages[runId]`, reads the message, applies the event, writes back.
- Don't lift the web-search lockdown. While `webSearchActive.value`, `HomeViewModel.sendMessage / loadModel / unloadModel` all early-return; the chat LLM is borrowed for both the GenerateQueries and Synthesize calls.
- Don't drop the web-search content swap in `InferenceCoordinator.buildMessagesJson`. Web-search cards store the user's query in `msg.content` (used by the card Header) and the synthesized answer in `msg.webSearchState`. The single point of LLM-history assembly MUST swap `content` for `WebSearchUiState.fromJson(webSearchState).answer.trim()` when `webSearchRunId != null`, and SKIP the message entirely when the answer is blank (in-flight / cancelled / failed). Without the swap the model sees `assistant: "<echoed user query>"` instead of the actual research, and the next chat turn proceeds as if the search never happened. Without the skip, in-flight / failed cards inject an empty assistant turn that confuses the model.
- Don't replace `WebNative.search` with `HttpURLConnection` or any other client. The `:networking` curl-impersonate Chrome116 fingerprint is the single allowed pipe for DDG. The 3 queries × 5 results contract assumes that backend.
- Don't expand the web-search flow back into a multi-iteration pipeline. The user-facing contract is "3 queries, snippets, answer, done". Adding iterations / fetches / per-page extraction is what the old Research pipeline was; it was deleted on 2026-05-15 because it was minutes-slow and barely better than snippet-only synthesis.
- Don't switch `WebSearchCoordinator` from `tryEmit` to suspending `emit(...)` in the `catch (CancellationException)` / `catch (Throwable)` blocks. The catch fires on a cancelled Job, and `withContext(Dispatchers.Default)` throws CE immediately on a cancelled context, so the Cancelled event never reaches the SharedFlow and the card freezes mid-flight. `_events` has a 64-slot buffer; `tryEmit` always succeeds.
- Don't change `WebSearchPrompts.QUERY_LINE_REGEX`. The synthesis prompt asks for the format `1. <query>` / `2. <query>` / `3. <query>` and the regex `^\s*(?:\d+[.)\-:]|[-*•])\s+(.+)$` parses that AND tolerates common LLM deviations (`-` / `*` / `•` bullets, `:` after the number). Tightening the regex breaks smaller models that don't follow numbered-list instructions perfectly; loosening it picks up the LLM's preamble lines.
- Don't fall back to
`java.net.HttpURLConnection` for any HuggingFace API call: search, model info, tree, raw README, tags-by-type, trending, quicksearch, resolve. Every HF request goes through `:networking` (`WebNative.fetch`) so it inherits the curl-impersonate Chrome116 fingerprint + bundled `cacert.pem` + strict cert verify. The hub is `repo/HuggingFaceApi.kt` (Hilt singleton class, not an object); `repo/hf/HfClient.kt` builds typed endpoints on top. `ModelCatalog` and `RepositoryValidator` inject `HuggingFaceApi`. Same rule applies for any future HF or non-HF HTTP target: `:networking` is the only allowed pipe.
- Don't change `WebNative.fetch` back to `Result<String>`. The contract is `Result<WebResponse>` where `WebResponse(status: Int, body: String, error: String?)`. `Result.failure` is reserved for transport-layer issues (DNS, TLS handshake, native call collapse). HTTP non-2xx comes back as `Result.success(WebResponse(status=4xx, ...))`, so callers can react to 429 (rate limited) vs 404 (not found) vs 401/403 (auth). Old behavior of returning `null` on non-2xx silently masked HF API bugs (e.g. invalid `expand=` params returning 400) for years.
- Don't log full URLs (with query string) to `ANDROID_LOG_WARN` from `net_jni.cpp`. Use the `host_of(url)` helper. Search queries are user PII (typed model names, sometimes sensitive). Status code + host is the maximum log surface.
- Don't add per-keystroke quicksearch / autocomplete to the HF Explorer search bar. Search fires on the Search button, the IME
`Search` action, or a filter chip touch, never on every typed character. The HF API has a 500-call/5min unauthenticated rate limit per IP; per-keystroke autocomplete burns it on a single typing session. Slider drags fire only on `onValueChangeFinished` (one call per drag). Post-filter sliders (`minDownloads`, `minLikes`, `recentDays` in `HfPostFilters`) update `visibleResults()` locally without an API call. `HfClient.quickSearch` exists for future use but the UI must not wire it to typing.
- Don't write the HF tags catalog (`/api/models-tags-by-type` payload) anywhere outside the encrypted `app_prefs` vault. Keys are `hf_tags_catalog_v1` (JSON string) and `hf_tags_catalog_v1_at` (Long unix-millis), 24h TTL. The catalog feeds the dynamic filter chips; falling back to `HfFilterTaxonomy` constants is OK but only for the brief window before the catalog hydrates. Plaintext-on-disk is forbidden; use the encrypted prefs API only.
- Don't pass any device-identifying value to `WebNative.fetch`'s `headers` map. The map is intended for protocol headers (`Accept`, `Accept-Encoding`, future `Authorization`). Adding `X-Install-Id`, `User-Agent` overrides with TN-identifying suffixes, or anything that would let HF (or any future server) fingerprint a specific install is a privacy regression.
- Don't add the
`expand=tags`, `expand=downloads`, etc. parameters back to `searchUrl`. Those query params are for `/api/models/{id}/tree/...`, NOT `/api/models?search=...`. HF returns 400 when they're present on the list endpoint. The minimal list response already includes `id`, `author`, `gated`, `tags`, `pipeline_tag`, `downloads`, `likes`, `lastModified`, `createdAt`, which is sufficient for `HfModelSummary` without `full=true`.
- Don't snake_case the `sort` URL param. HF API wants camelCase: `trendingScore`, `lastModified`, `createdAt`, `downloads`, `likes`. The legacy code emitted `trending_score` / `last_modified` / `created_at` and HF returned 400. Source of truth is `HfSort.apiKey` in `repo/HfFilters.kt`.
- Don't add speculative URL params to `HuggingFaceApi.searchUrl` without verifying they're documented for `/api/models`. `apps=`, `inference_provider=`, `inference=warm`, `filter=region:us` (with `region:` prefix), `filter=dataset:foo` (with `dataset:` prefix), `pipeline_tag=`: all of these were added historically and have been removed because they trigger HF 400. Only documented params are kept: `search`, `author`, `filter` (plain tag values, stackable), `gated`, `num_parameters`, `sort`, `limit`, `skip`. If you need a new filter, verify it works against a curl-built URL first; don't add it to the URL builder on the assumption that HF tolerates it.
- Don't reintroduce post-filter sliders (`minDownloads`, `minLikes`, `recentDays`) into `HfFilters` / `HfPostFilters` / the VM. They were UI clutter without unlocking new searches: `sort=downloads` already gets the user "popular" results in the right order, and "recent" is `sort=lastModified`. The user explicitly asked to drop them; bringing them back without consent is a regression.
- Don't change the SoC-bucket mapping in
`data/SocBucket.kt` without first verifying `xororz/sd-qnn` still uses the same `_8gen1.zip` / `_8gen2.zip` / `_min.zip` filename suffixes. We pull our QNN model archives directly from LocalDream's HF repo, so a bucket rename or new chip class needs a re-validation against the actual `tree/main` listing. 8gen3 / 8 Elite intentionally route to the `8gen2` bucket because Qualcomm's HTP V73 contexts are forward-compatible; don't add a new "8gen3" bucket without uploading new archives first.
- Don't show NPU image-gen rows on non-Snapdragon devices. `imageModels()` is gated on `SocBucket.bucket(soc) != null`. Falling back through to NPU rows on Tensor / Dimensity / Exynos would download QNN contexts that can't load; surface only the `xororz/sd-mnn` CPU/MNN variants on those devices.
- Don't show SDXL rows on a SOC that's not in `SocBucket.SDXL_ELIGIBLE_SOCS`. The SDXL contexts ship only as `_8gen3.zip` and Qualcomm AI Hub hasn't compiled them for older NPUs. The rest of the pipeline still uses 768-dim CLIP under the hood, so even if you forced the download, generation would crash on the dimension mismatch; keep both gates (SDXL row visibility + 2048-dim future pipeline work) in lock-step.
- Don't bypass the path-traversal check in
`unzipInto`. Each entry's canonical path must start with `target.canonicalPath + File.separator` (or equal `target.canonicalPath` itself for the top-level dir). Skip `..` entries pre-canonicalization too. The QNN ZIPs from `xororz/sd-qnn` have flat layouts today, but a malicious mirror could craft `../../files/key.bin` entries; the check is the only line of defense.
- Don't unwrap the SDK runtime onto an external dir. `<filesDir>/ai_sd_runtime/` is the correct location: internal storage, app-private, excluded from backups (`allowBackup=false` is set elsewhere). The QNN `.so`s extracted there are device-specific and shouldn't roam.
- Don't open a fresh
`StableDiffusionManager` per request. It's a process singleton (`getInstance(context)`), wrapped by Hilt's `ImageGenManager`. The init-mutex inside `ensureRuntime()` covers the `qnnlibs.tar.xz` extraction. Calling `initialize()` twice is a no-op but rebuilding the manager would tear down the persistent native sessions used across generations.
- Don't ship the release AAR yet. `ai_sd-release.aar` ran R8 on the SDK side and renamed `StableDiffusionManager.Companion.getInstance` past Kotlin's compile-time visibility. Keep `ai_sd-debug.aar` copied as `libs/ai_sd-release.aar` until `:ai_sd`'s `consumer-rules.pro` adds `-keep class com.dark.ai_sd.StableDiffusionManager$Companion { *; }` and the AAR is rebuilt.
- Don't remove the standalone QNN upscaler implementation. The original AAR's
`nativeLoadUpscaler` for QNN was a stub that only stashed the model path; `nativeUpscaleImage` would then fail with "Upscaler model not provided" because the QnnModel was never built. This was filled in on 2026-05-08 by porting LocalDream's per-request load pattern: `sd_pipeline::loadStandaloneQnnUpscaler(modelPath)` in `model_loader.cpp` calls `createQnnModel(path, "upscaler")` + `initializeQnnApp("Upscaler", upscalerApp)` and assigns to the global `upscalerApp`, mirroring `main.cpp:3203` in LocalDream. Prerequisite: `sd_pipeline::ensureQnnSystemReady(systemLibPath, backendPath)` must be called first to populate `g_qnnSystemFuncs` + `g_backendPathCmd`; `ai_sd_jni.cpp::nativeInitRuntime` does this best-effort using `<libDir>/libQnn{System,Htp}.so`. The Kotlin caller (`ImageGenManager.loadUpscaler`) just calls `sdk.loadUpscaler(path, useMnn=path.endsWith(".mnn"), useOpenCL=...)` and the AAR's JNI dispatches to the right load path. Don't restore the .mnn-only `IllegalStateException` guard; the QNN path works now.
- Don't lift the upscaler input cap above 1024 max-edge in `ImageTaskViewModel.runUpscale`. The 4× output of a 2048² input is 8192² px × 4 bytes ≈ 256 MB, which OOMs on bitmap allocation in `DiffusionManager.createBitmapFromRgb` even with `largeHeap=true`. The 1024 cap produces 4096² px × 4 bytes ≈ 64 MB, which fits comfortably. Combined with `android:largeHeap="true"` in the manifest, larger inputs MIGHT work on flagship devices, but the failure mode (OOM during bitmap return) is silent + crashy, so keep the cap and let users downscale beforehand if they need higher fidelity.
- Don't declare `commons-compress` and `xz` as anything weaker than `implementation` in `app/build.gradle.kts`. They are required by the AAR's runtime extraction path; `implementation(files(...))` AAR consumption does NOT pull transitive POM deps, so without explicit declarations the app crashes with `NoClassDefFoundError: org.tukaani.xz.XZInputStream` on first init.
- Don't switch image-gen tasks to a separate ViewModel per task.
`ImageTaskViewModel` is the single VM for all four modes (Generate, Img2Img, Inpaint, Upscale); sharing prompt / model selection / preview state across modes is intentional so the user can tweak a prompt then quickly switch from Generate to Edit without re-typing.
- Don't reuse the chat model picker for image-gen models. They're separate `ProviderType` rows (`IMAGE_GEN`, `IMAGE_UPSCALER`) on `ModelInfo` and live in `<filesDir>/sd_models/` / `<filesDir>/sd_upscalers/`, never in the GGUF chat model dir. The store routes them through `finalizeImageGenDownload` / `finalizeImageUpscalerDownload` and they should not appear in `chatModels` filters anywhere.
- Don't drop the
`context.packageName` self-exclusion in `AccessibilityGuard.scan`. Our own `IslandAccessibilityService` registers under our own package; without the self-skip every user who enables the smart-dodge accessibility service would trip our own one-time `RootWarningDialog` ("Suspicious accessibility services attached: com.dark.tool_neuron"). The exclusion is targeted: only OUR pkg is allowed; every other non-allowlisted service still surfaces in the dialog.
- Don't duplicate the island pill / dodge geometry constants between `IslandOverlayRoot.kt` and `IslandAccessibilityService.kt`. Both read from `IslandGeometry` (PILL_W_DP, PILL_H_DP, OUTER_PADDING_DP, DODGE_MARGIN_DP, MAX_DODGE_DP). If the constants are duplicated and the pill grows or shrinks, the service's pillRect computation diverges from where the overlay actually draws → dodge will compute against the wrong rect and either over-dodge (visible jitter) or under-dodge (pill stays on top of buttons).
- Don't call
`HxdManager.enqueue(...)` from any new code path without immediately following it with `downloadCoordinator.registerLabel(hxdId, displayName, type)`. The Downloads screen's history is built from those labels; skipping the call means the history row falls back to `url.substringAfterLast('/')`, which is unreadable for the user (full hex repo IDs, etc.). The two enforced sites today are `ModelStoreViewModel.downloadModel` and `ImageGenManager.downloadRuntime`; new sites must follow the same pattern.
- Don't inject `DownloadCoordinator` or `DownloadHistoryRepository` into any class that runs in `:server` or `:inference`. Both are `:app`-process-only by construction (HXS access lives in the main process; cross-process injection would attempt to unwrap the DEK from a process that doesn't have the Keystore alias scoped to it and crash). If a service-process feature needs download state, route through AIDL.
- Don't bump `DownloadHistoryRepository.MAX_ENTRIES` past 50 without a UI plan for the list growing. 50 entries × ~256 bytes is ~13 KB at rest (trivial), but the screen's LazyColumn isn't paginated and the section header has no count limit. Larger caps need at least a "show more" affordance or per-day grouping.
- Don't write into the `tn.download_history.*` info-string namespace from anywhere other than `DownloadHistoryRepository`. The v2 suffix is also load-bearing: bumping to v3 invalidates every existing user's history vault (the open-or-rebuild helper will wipe it). If the schema needs a breaking change, write a migration that reads via the v2 key and re-inserts under v3, then bump.
- Don't compute the AccessibilityService's pill rect against the current
`nudge` value. Use the calibrated `IslandPositionStore.position.value` only. Dodge math must answer "where should the pill move to be clear", not "where would the pill stay if it's already nudged"; feeding nudge back into the calc creates an oscillation loop (the pill dodges off the button → button no longer overlaps → nudge resets to 0 → button overlaps again → dodge → …).
- Don't clear the `IslandPositionStore.nudge` on `TYPE_WINDOW_CONTENT_CHANGED`. Only `TYPE_WINDOW_STATE_CHANGED` (app/activity switch) resets the nudge before re-scanning. Content changes (RecyclerView scrolls, dialog toggles, text edits) fire dozens of events per second; clearing on every one would make the pill flash back to its calibrated home position constantly. The scan that follows each content event publishes a fresh nudge if needed; publishing is idempotent if the new nudge equals the old one.
- Don't lift the 150 ms coalesce on the accessibility-event handler. `TYPE_WINDOW_CONTENT_CHANGED` can fire many times per second in any UI with animations or live updates (chat message streaming, video player UI, progress bars). Walking the entire node tree per event will pin a CPU core; debouncing to one scan per 150 ms keeps the smart dodge under ~1 % CPU on a mid-tier device.
- Don't hard-code
`node.recycle()` calls in `IslandAccessibilityService.collectClickableRects`. On API ≥ 33 `recycle()` is a no-op (or worse, can throw `IllegalStateException` if the node was already pooled); minSdk is 29, but most of the install base is on API 33+ by now. GC is the right cleanup path. If explicit recycling is ever needed on older API levels, guard it with `Build.VERSION.SDK_INT < Build.VERSION_CODES.TIRAMISU`.
- Don't widen
`IslandAccessibilityService`'s event filter beyond `typeWindowStateChanged|typeWindowContentChanged`. The XML config gates the system from delivering us anything else; `typeViewClicked`, `typeViewFocused`, etc. would fire on every user interaction and add no value (we only care about WHERE clickable things are, not WHEN they're touched). Privacy: narrowest event set = least screen-content exposure.
- Don't bind the IslandAccessibilityService to anything other than the system's `BIND_ACCESSIBILITY_SERVICE` permission gate. The manifest entry is `android:exported="true"` (required by OS binder) with `permission="BIND_ACCESSIBILITY_SERVICE"` (only the system has it). Removing `exported="true"` breaks the bind; removing the permission opens the service to arbitrary IPC.
- Don't move the island pill via Compose `Modifier.offset` for placement (user offset or smart-dodge nudge). The pill's position must come from `WindowManager.LayoutParams.x/y` via `windowManager.updateViewLayout`. Two specific failures when you offset via Compose: (1) the WindowManager window stays at the original WRAP_CONTENT rect, so touches keep landing on the original screen location, and a back/menu button under the original pill spot remains untappable even after the pill visually moves; (2) the pill Surface renders past the wrapped window bounds and gets clipped (visually disappears or half-cuts) once the offset exceeds the wrapped content height. The morph animation (pill → card) still grows the WRAP_CONTENT window correctly because that's a size change, not a position change.
- Don't drop the
`flagRetrieveInteractiveWindows` from `xml/island_accessibility_service.xml`. The accessibility service needs `getWindows()` access to find our own overlay's actual on-screen rect via `AccessibilityWindowInfo.getBoundsInScreen()`. Without this flag, `windows` comes back empty and the service falls back to the manual rect computation (calibrated position + status-bar inset + padding), which is fragile across OEMs that handle status-bar / display-cutout differently.
- Don't make slider drags animate the WindowManager position.
`IslandPositionStore.position.drop(1).collect { animY.snapTo(...) }` uses `snapTo`, not `animateTo`. The slider is a calibration tool; animating each tick lags the drag and feels sluggish. Only the smart-dodge `dodgeY` flow uses `animY.animateTo(target, dodgeSpring)`. Both observers feed the same Animatable, so the snapshotFlow → updateViewLayout collector handles whichever wrote last.
- Don't merge `IslandContinuity` (in `IslandShapes.kt`) with the global `TnContinuity` (in `Shapes.kt`). They are intentionally separate: the island wants a slightly more "puffed out" iOS-squircle profile (`bezierCurvatureScale = 1.2, arcCurvatureScale = 1.2, extendedFraction = 0.6`) than the rest of the app's surfaces. Sharing them means tuning one changes the other, and the island's aesthetic should be free to drift with iOS / OneUI Dynamic-Island design trends without dragging buttons / cards / chips along.
- Don't put `dp` literals for outer padding, card padding, action-button background, or column spacing in `IslandOverlayRoot`. Use `LocalDimens.current.spacingSm` / `.spacingLg` / `.actionIconSize` / `.iconMd` / `.iconLg` and `LocalTnShapes.current.full` for the action button surface. The `IslandGeometry` static constants are reserved for pixel-sized identity values (PILL_W/H, CARD_W/H/CORNER, PRESS_SCALE, SWIPE_THRESHOLD) that genuinely don't belong in the global theme system, and for service-side values (OUTER_PADDING_DP, DODGE_MARGIN_DP, MAX_DODGE_DP) the AccessibilityService needs outside Compose.
- Don't fragment the morph back into multiple
`transition.animateFloat` calls, one per value with its own spec. The morph is ONE `progress: Float` animated value; width / height / cornerRadius / pillAlpha / cardAlpha are all `lerp(...)`-derived from progress in the same composition pass. Reasons: (a) any two independent `animateFloat` calls with different visibility thresholds can land on their target on different frames, producing a "settle stutter" where the surface jitters at the end; (b) when the morph value derivations all read from one State, Compose snapshots them atomically, so there is no torn frame where width is at the target but corner is still mid-flight. The lerp-from-progress pattern also forces all springs to share a single dynamic, which is the only way to guarantee they reach the target together.
- Don't use `motionScheme.defaultSpatialSpec` for the morph progress. The custom `spring(dampingRatio = 0.85f, stiffness = StiffnessMediumLow)` was picked specifically over the M3 Expressive default because the default's overshoot, when applied simultaneously to corner radius and size, reads as a jittery "ripple" instead of a smooth squircle morph. Press scale, mode-swap slide, and cross-fades still use `motionScheme.*Spec`; only the main progress driver overrides. If you swap the override back, the morph will feel choppy on mid-tier devices regardless of how fast the frame timing is.
- Don't drop the `cornerRadius.roundToInt()` key on the `remember { islandShape(...) }` block. Allocating a new shape per frame (60 allocations per second during the morph) churns the kyant `ContinuousRoundedRectangle` path computation and causes visible jitter on devices without aggressive Skia caching. 1dp granularity is visually imperceptible at this scale and reduces allocations by ~4x.
- Don't drop the pill-mode badge or restrict swipe to expanded state. The mode glyph must render in BOTH states (Sparkles for ASSISTANT, Sliders for CONTROL) so the user knows what mode the next expansion will open into. Swipe must work in BOTH states for the same reason: the user should be able to pre-select a mode while the pill is collapsed, then tap to expand directly into that mode. Gating swipe on `expanded` makes the pill feel like a black-box icon with no obvious affordance.
- Don't put the cross-fade alpha calc on its own `animateFloat`. `pillAlpha = (1f - progress * 2f).coerceIn(0f, 1f)` and `cardAlpha = ((progress - 0.5f) * 2f).coerceIn(0f, 1f)` are derived from the SAME progress as size/corner. The handoff at progress=0.5 (both alphas at 0) is exact, frame-perfect, and never visible. Adding a separate alpha animation re-introduces the desync problem.
- Don't drop the
`pressed` scale feedback. The `onPress` lambda in `detectTapGestures` flips `pressed = true`, awaits release via `tryAwaitRelease()` in a `try { ... } finally { pressed = false }` block, and `pressScale` animates between 1.0 and `IslandGeometry.PRESS_SCALE = 0.92` via `graphicsLayer`. Without the `finally` block, a cancelled gesture (e.g. swipe initiated mid-press) leaves the surface visually pressed forever.
- Don't put tap and drag detectors in the same `pointerInput` block; keep them in separate modifier calls. Compose's per-modifier gesture isolation lets `detectTapGestures` and `detectHorizontalDragGestures` coexist on the same surface: tap claims on no-slop release, drag claims on slop-exceeded movement. Combined into one block they'd race for the same pointer stream and one would always lose. The drag block must early-return when `!expanded` so pill-mode taps aren't swallowed by stillborn drag-detector setup.
- Don't drop the haptic feedback calls (`view.performHapticFeedback(HapticFeedbackConstants.*)`) on any island gesture: `CONFIRM` on tap and action-button press, `LONG_PRESS` on long-press, `GESTURE_END` on mode-swap completion. Haptic is the only feedback that confirms the island registered the gesture; the visual scale is too subtle on its own. The constants are HapticFeedbackConstants, not the older deprecated `VirtualKey` / `LongPress` IDs.
- Don't make
`IslandMode` state live in the service or `IslandPositionStore`. It's purely UI state: a local `var mode by remember { mutableStateOf(IslandMode.ASSISTANT) }` inside `IslandSurface`. The mode resets to ASSISTANT each time the overlay is re-attached because the user almost always wants the default mode on first interaction. If a persistent "preferred mode" preference is ever needed, store it in HXS / SharedPreferences and seed the `mutableStateOf` from it; don't bring the service in to mediate.
- Don't reintroduce X-axis movement (slider, dodge, or otherwise). The pill is permanently centered horizontally via `gravity = TOP|CENTER_HORIZONTAL + x=0`. The dodge primitive is Y-only: if obstacles overlap the center-top pill rect, push down; otherwise no movement. Adding X back would require a separate horizontal slot decision (left? right? cheaper distance?), exactly the heuristic we dropped because it's not deterministic across OEMs / app layouts. Center-top is reliably free space on almost every app (apps avoid centering buttons there because of the camera cutout / status bar); on the rare apps that DO have a center-top button, push-down handles it.
- Don't use `kotlinx.coroutines.Dispatchers.Main` for the IslandOverlayService scope. Compose `Animatable.animateTo()` requires a `MonotonicFrameClock` in the coroutine context, and plain `Dispatchers.Main` (kotlinx) doesn't carry one; `androidx.compose.ui.platform.AndroidUiDispatcher.Main` does (it's the Choreographer-backed dispatcher Compose itself uses). Without this the service crashes with `IllegalStateException: A MonotonicFrameClock is not available in this CoroutineContext` the first time it tries to animate the placement. Same rule applies to any future Service that wants to drive `Animatable.animateTo` from a service-owned CoroutineScope.
- Don't compute the pill rect for the accessibility service via the
`windows` API. Use the manual computation in `IslandAccessibilityService.computeNaturalPillRect`: `screenWidth`, `IslandGeometry.PILL_W_DP`, `statusBarTopInsetPx()`, and `IslandPositionStore.position.value.offsetYDp` give the natural rect deterministically. Using `windows` returns the CURRENT animated position, which oscillates: dodge → pill moves down → next scan finds no overlap → dodge=0 → pill snaps back → overlap → dodge → … . Manual computation answers "would the pill overlap at its natural position", which is the right question.
- Don't drop the `setDodgeY(0f)` clear on `TYPE_WINDOW_STATE_CHANGED` in `IslandAccessibilityService.onAccessibilityEvent`. Window-state-changed is the app-switch boundary; clearing the dodge first means the pill snaps back to centered position immediately when the user switches apps, then the next scan establishes the correct dodge for the new app. Without the clear, a dodge from app A leaks into app B for the ~150 ms coalesce window; visually the pill stays pushed down when you switch to an app that doesn't need it.
- Don't drop the launcher skip in `IslandAccessibilityService.scanAndPublish`. Any foreground package whose name contains `"launcher"` (case-insensitive substring) → publish `dodgeY = 0f` and return without scanning. Launchers are entirely clickable surfaces (app icons / widgets / search bars / quick toggles): virtually every pixel under the pill is a clickable node, so the dodge math would push the pill down to the cap (`MAX_DODGE_DP = 96`) and leave it there permanently on the home screen. The pill belongs at center-top on the launcher because the user can just move it themselves if they need to tap something specific. The substring match misses launchers without "launcher" in the package name (e.g. `com.miui.home`; `com.sec.android.app.launcher` and `com.huawei.android.launcher` do match); if a specific OEM launcher needs to be added later, expand the check rather than swapping to a `PackageManager.resolveActivity(CATEGORY_HOME)` query. The substring check is cheap and runs on every coalesced scan (don't make a PackageManager round-trip hot-path code).
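For reference, the `unzipInto` path-traversal rule above can be sketched as a standalone predicate. This is a minimal illustration, not the project's actual code: the helper name `isSafeZipEntry` is hypothetical, and only the two checks named in the rule (the `..` pre-filter and the canonical-prefix test) are shown.

```kotlin
import java.io.File

// Hypothetical helper (not the real unzipInto): true only when `entryName`
// resolves to `target` itself or to a path strictly below it.
fun isSafeZipEntry(target: File, entryName: String): Boolean {
    // Pre-canonicalization reject: any ".." path segment is refused outright.
    if (entryName.split('/', '\\').any { it == ".." }) return false

    val root = target.canonicalPath
    val resolved = File(target, entryName).canonicalPath

    // Appending File.separator prevents a sibling such as "<target>-evil"
    // from passing a bare startsWith(root) test.
    return resolved == root || resolved.startsWith(root + File.separator)
}
```

A caller would run this per entry before opening any output stream and skip (or abort on) entries that fail.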
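To make the `QUERY_LINE_REGEX` tolerance concrete, here is a small sketch that applies the pattern quoted in the rule above to raw model output. The pattern is copied verbatim from that rule; the `parseQueries` helper and the sample strings are assumptions for illustration only.

```kotlin
// Pattern copied from the QUERY_LINE_REGEX rule above.
val QUERY_LINE_REGEX = Regex("""^\s*(?:\d+[.)\-:]|[-*•])\s+(.+)$""")

// Hypothetical helper: one candidate per line, preamble lines are dropped
// because they start with neither a number marker nor a bullet.
fun parseQueries(raw: String): List<String> =
    raw.lineSequence()
        .mapNotNull { line -> QUERY_LINE_REGEX.find(line)?.groupValues?.get(1)?.trim() }
        .filter { it.isNotEmpty() }
        .toList()
```

Numbered lines (`1.`, `2)`, `3:`) and bullet lines (`-`, `*`, `•`) all yield their query text, while a sentence such as "Here are the queries:" produces no match.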
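The arithmetic behind the 1024 max-edge upscaler cap can be checked with a few lines. This sketch assumes a square input and 4 bytes per output pixel (as in an ARGB_8888 bitmap); the function names are hypothetical.

```kotlin
// Assumption: square input, 4 bytes per pixel in the returned bitmap.
fun upscaledBitmapBytes(inputEdgePx: Int, scale: Int = 4, bytesPerPixel: Int = 4): Long {
    val outEdge = inputEdgePx.toLong() * scale
    return outEdge * outEdge * bytesPerPixel
}

// Whole mebibytes, rounded down.
fun toMiB(bytes: Long): Long = bytes / (1024L * 1024L)
```

With these definitions, `toMiB(upscaledBitmapBytes(2048))` is 256 and `toMiB(upscaledBitmapBytes(1024))` is 64, matching the figures quoted in the rule.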
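The single-progress morph rule above can be illustrated without Compose. In this sketch every visual value is a pure function of one `progress` float; the two alpha formulas are the ones quoted in the rule, while `MorphFrame`, `morphFrame`, and the size parameters are hypothetical names chosen for the example.

```kotlin
// Hypothetical holder for everything derived from one progress value.
data class MorphFrame(
    val width: Float,
    val cornerRadius: Float,
    val pillAlpha: Float,
    val cardAlpha: Float,
)

fun lerp(start: Float, stop: Float, fraction: Float): Float =
    start + (stop - start) * fraction

// All outputs read the same `progress`, so they cannot settle on different frames.
fun morphFrame(
    progress: Float,
    pillW: Float, cardW: Float,
    pillCorner: Float, cardCorner: Float,
): MorphFrame = MorphFrame(
    width = lerp(pillW, cardW, progress),
    cornerRadius = lerp(pillCorner, cardCorner, progress),
    // Pill fades out over the first half, card fades in over the second half.
    pillAlpha = (1f - progress * 2f).coerceIn(0f, 1f),
    cardAlpha = ((progress - 0.5f) * 2f).coerceIn(0f, 1f),
)
```

At `progress = 0.5f` both alphas evaluate to exactly 0, which is the handoff point the rule describes.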
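The `Result<WebResponse>` contract above separates transport failures from HTTP status codes. A caller-side sketch of that split might look like the following; the `WebResponse` shape is taken from the rule, while `FetchOutcome` and `classify` are hypothetical names, not part of `:networking`.

```kotlin
// Shape as described in the WebNative.fetch rule above.
data class WebResponse(val status: Int, val body: String, val error: String?)

// Hypothetical outcome buckets for illustration.
enum class FetchOutcome { OK, RATE_LIMITED, NOT_FOUND, AUTH, OTHER_HTTP, TRANSPORT }

fun classify(result: Result<WebResponse>): FetchOutcome =
    result.fold(
        onSuccess = { response ->
            // HTTP-level outcome: the request reached the server.
            when (response.status) {
                in 200..299 -> FetchOutcome.OK
                429 -> FetchOutcome.RATE_LIMITED
                404 -> FetchOutcome.NOT_FOUND
                401, 403 -> FetchOutcome.AUTH
                else -> FetchOutcome.OTHER_HTTP
            }
        },
        // Result.failure is reserved for DNS / TLS / native-call problems.
        onFailure = { FetchOutcome.TRANSPORT },
    )
```

Keeping non-2xx responses on the success branch is what lets a 400 from a bad query parameter surface as `OTHER_HTTP` instead of disappearing into a null.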
Whenever you change anything on the list below, update this file as part of the same change:
- Security architecture or threat model
- Any auth flow or API surface (SecurityManager, SessionHolder, PolicyEngine, AuthNative, BootIntegrity)
- Any sealed state layout (AuthState, NativeIntegrity manifest, license blob)
- New feature IDs or reshuffling of the pro-feature range
- New persistent keys in HXS (`app_prefs` or `app_bootstrap`)
- New integrity checks, obfuscation scheme, crypto primitives
- New DI bindings touching the security graph
- Changes to release-build hardening (ProGuard, signing, manifest flags)
- Anything in "Things still deferred" moving in or out of scope
- Any new "Things NOT to regress" item discovered along the way
If the CLAUDE.md update isn't part of your diff, the change isn't finished.