Commit 857cfe0

docs: refresh cycle 13 research queue
1 parent 3df1e5b commit 857cfe0

3 files changed

Lines changed: 125 additions & 4 deletions

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
1+
# Cycle 13 Findings - 2026-06-04
2+
3+
## Scope
4+
5+
- Repository: `SwiftFloris`
6+
- Baseline: clean detached worktree at pushed `master` `3df1e5b`
7+
(`docs: refresh cycle 12 research queue`), described as
8+
`v1.8.246-1-g3df1e5b`.
9+
- Sync: `git pull --rebase origin master` reported up to date before this
10+
cycle.
11+
- Constraint: research/docs only. No feature source, tests, build files, or
12+
assets were edited.
13+
14+
## Anti-Duplicate Checks
15+
16+
- Did not duplicate R12-1. R12-1 targets the temp-file replacement primitive;
17+
this cycle targets read/reset interleaving in the stats count path.
18+
- Did not duplicate v1.8.234 post-hotfix regression coverage. That release
19+
covers locale-scoped flush behavior, not `totalEntryCount()` serialization
20+
with reset cleanup.
21+
- Did not reopen the broader personal n-gram `ConcurrentHashMap` and
22+
pending-commit fixes from the pushed pass-2 audit commits.
23+
- Left the trigram tab/control-character normalization audit for a later cycle
24+
so this row stays focused on stats/reset consistency.
25+
26+
## Local Evidence
27+
28+
- `PersonalBigramStore.kt:224-242` builds a locale-tag set from persisted
29+
`personal_bigrams_*.tsv` files and `tablesByLocale.keys`, then calls
30+
`ensureLoaded(localeTag)` for each tag without holding `loadGuard`.
31+
- `PersonalBigramStore.kt:367-376` clears in-memory bigram state and deletes
32+
`personal_bigrams_*` files under `loadGuard`.
33+
- `PersonalTrigramStore.kt:229-245` and `PersonalTrigramStore.kt:368-375`
34+
repeat the same count/reset shape for persisted trigram files.
35+
- `TypingStatsScreen.kt:137-143` displays the personal bigram and trigram
36+
counts in Settings, making a stale or resurrected count user-visible.
37+
- `PersonalNgramFlushIsolationTest.kt:64-68` checks that reset is the only broad
38+
cleanup path, but it does not require `totalEntryCount()` to share the reset
39+
serialization boundary.
40+
- `docs/AUDIT_2026-05-28.md:58-60` records the race between
41+
`totalEntryCount()` and `resetAndAwait()`.
42+
43+
## Roadmap Changes Fed
44+
45+
- R13-1: Serialize personal n-gram stats counting with reset cleanup. The
46+
implementation should make file enumeration, loaded-key collection, and
47+
`ensureLoaded()` run under the same reset-safe boundary in both stores or
48+
compute from a reset-safe snapshot, with focused coverage that fails if stats
49+
counting can reload a locale around reset deletion.
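
The first option in R13-1 (running file enumeration, loaded-key collection, and `ensureLoaded()` under the same boundary as reset) can be sketched roughly as follows. This is a minimal illustration, not the SwiftFloris implementation: `SketchNgramStore`, `ensureLoadedLocked`, and the simplified two-column TSV parsing are assumptions made for the example, and only `loadGuard`, `tablesByLocale`, `totalEntryCount()`, and the `personal_bigrams_*.tsv` naming come from the findings above.

```kotlin
import java.io.File
import java.nio.file.Files
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

// Hypothetical stand-in for a personal n-gram store. Names mirror the report,
// but the bodies are illustrative only.
class SketchNgramStore(private val dir: File) {
    private val loadGuard = ReentrantLock()
    private val tablesByLocale = HashMap<String, MutableMap<String, Int>>()

    // Caller must already hold loadGuard.
    private fun ensureLoadedLocked(localeTag: String): Map<String, Int> =
        tablesByLocale.getOrPut(localeTag) {
            val table = HashMap<String, Int>()
            val file = File(dir, "personal_bigrams_$localeTag.tsv")
            if (file.isFile) {
                file.forEachLine { line ->
                    val parts = line.split('\t')
                    if (parts.size == 2) table[parts[0]] = parts[1].toIntOrNull() ?: 0
                }
            }
            table
        }

    // Enumeration, key collection, and loading all happen inside one critical
    // section, so reset() cannot run between any two of those steps.
    fun totalEntryCount(): Int = loadGuard.withLock {
        val persisted = dir.listFiles().orEmpty()
            .map { it.name }
            .filter { it.startsWith("personal_bigrams_") && it.endsWith(".tsv") }
            .map { it.removePrefix("personal_bigrams_").removeSuffix(".tsv") }
        val tags = persisted.toSet() + tablesByLocale.keys
        tags.sumOf { ensureLoadedLocked(it).size }
    }

    // Clears memory and deletes files under the same lock as the count path.
    fun reset() = loadGuard.withLock {
        tablesByLocale.clear()
        dir.listFiles().orEmpty()
            .filter { it.name.startsWith("personal_bigrams_") }
            .forEach { it.delete() }
    }
}

fun main() {
    val dir = Files.createTempDirectory("ngram-sketch").toFile()
    File(dir, "personal_bigrams_en-US.tsv").writeText("hello world\t3\ngood morning\t1\n")
    val store = SketchNgramStore(dir)
    println(store.totalEntryCount()) // two entries loaded from the file
    store.reset()
    println(store.totalEntryCount()) // nothing left to enumerate or reload
    dir.deleteRecursively()
}
```

Holding one lock across the whole count keeps the change small, at the cost of blocking reset while locale files are read. The snapshot alternative named in R13-1 would trade that for extra bookkeeping.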
50+
51+
## Non-Adds
52+
53+
- No source fix was made in this cycle.
54+
- No new dictionary retention, export, permission, or network behavior was
55+
proposed.
56+
- No broad personal-dictionary or typing-stats refactor was proposed. The target is
57+
the existing `totalEntryCount()` / `resetAndAwait()` consistency contract in
58+
the two personal n-gram stores.

RESEARCH_REPORT.md

Lines changed: 22 additions & 4 deletions
@@ -1,6 +1,6 @@
11
# SwiftFloris Research Report
22

3-
This report summarizes current research conclusions. The full 2026-05-25 research plan is archived at `docs/archive/research/RESEARCH_FEATURE_PLAN_2026-05-25.md`. Deep-research pass refreshed **2026-06-03** (post-v1.8.204), with 2026-06-04 freshness notes through Cycle 12 and v1.8.246 implementation notes.
3+
This report summarizes current research conclusions. The full 2026-05-25 research plan is archived at `docs/archive/research/RESEARCH_FEATURE_PLAN_2026-05-25.md`. Deep-research pass refreshed **2026-06-03** (post-v1.8.204), with 2026-06-04 freshness notes through Cycle 13 and v1.8.246 implementation notes.
44

55
2026-06-04 implementation note: v1.8.241 closed R4-3. `MimeTypeFilter`
66
constructor stdout logging is removed, aggregate helper semantics are documented
@@ -38,6 +38,13 @@ must remain local generated output rather than review evidence.
3838
AndroidX Core `1.19.0` remains blocked on the API 37 behavior-gate because the
3939
published `core-1.19.0.aar` metadata declares `minCompileSdk=37`.
4040

41+
2026-06-04 Cycle 13 note: after the Cycle 12 docs push, `master` is clean at
42+
`3df1e5b` (`v1.8.246-1-g3df1e5b`). Cycle 13 rechecked the deferred
43+
`totalEntryCount()` / `resetAndAwait()` audit against live personal bigram and
44+
trigram stores plus the typing-stats UI. This cycle adds R13-1: serialize
45+
personal n-gram stats counting with reset cleanup so a stats refresh cannot
46+
reload or report stale locales around a reset.
47+
4148
2026-06-04 Cycle 12 note: after the Cycle 11 docs push, `master` is clean at
4249
`8b68d3e` (`v1.8.238-1-g8b68d3e`). Cycle 12 rechecked the personal n-gram
4350
persistence data-loss audit against live `PersonalBigramStore`,
@@ -196,6 +203,7 @@ Top opportunities (one line each):
196203
24. **Editor content-generation lifecycle** — delayed start/selection content jobs are cancelled or superseded across reset/finishInput and input-connection switches before they can publish state or touch a captured `InputConnection` (R10-1). [Closed]
197204
25. **Preference-store init splash recovery** — async `initAndroid` failures now stage a crash report, unblock the splash wait, and redirect to crash recovery before normal Settings content renders (R11-1). [Closed]
198205
26. **Personal n-gram file replacement** — bigram/trigram flush fallback deletes the live file before a successful replacement exists (R12-1, P2). [Verified]
206+
27. **Personal n-gram stats/reset serialization** — `totalEntryCount()` can enumerate/load persisted bigram/trigram locales outside the reset lock while `resetAndAwait()` clears and deletes those files (R13-1, P2). [Verified]
199207

200208
No Critical or Major reliability/security defects were found that are not already on the roadmap or in the deferred audit lists. The remaining heavy work (glide model training, Vosk addon, F-Droid submission, device-only visual verification) stays maintainer-gated as the existing roadmap records.
201209

@@ -277,7 +285,9 @@ Privacy-first multilingual IME. `:app` is Apache-2.0-ceiling, no network permiss
277285
- **Personal n-gram persistence (partial):** locale-scoped flushes and
278286
concurrency guards are in place, but bigram/trigram file replacement still
279287
deletes the destination before a second rename attempt. R12-1 keeps the
280-
last-known-good n-gram file until replacement succeeds. [Verified]
288+
last-known-good n-gram file until replacement succeeds. R13-1 adds the
289+
adjacent stats/reset consistency gap: `totalEntryCount()` should not reload or
290+
report stale locales around `resetAndAwait()` cleanup. [Verified]
281291
- Established surfaces (autocorrect/SymSpell, glide classifier, clipboard, addons, voice handoff, sync, MCP, hardware-keyboard import) are covered by `COMPLETED.md` and the audits; no net-new gap surfaced beyond what the roadmap already tracks.
282292

283293
## Competitive Landscape
@@ -316,6 +326,10 @@ Privacy-first multilingual IME. `:app` is Apache-2.0-ceiling, no network permiss
316326
- **[Medium] Personal n-gram atomic replacement** → R12-1. Replace bigram and
317327
trigram TSV files without deleting the live destination before a successful
318328
replacement exists.
329+
- **[Medium] Personal n-gram stats/reset serialization** → R13-1. Move
330+
bigram/trigram `totalEntryCount()` file enumeration and `ensureLoaded()` under
331+
the reset-safe boundary, or compute counts from a reset-safe snapshot, so
332+
Settings stats cannot resurrect or display stale learning counts after reset.
319333
- **[Closed v1.8.219] Remaining diagnostic `printStackTrace()` paths** → R2-2. `RestoreScreen` failure diagnostics now use `flogError`, restore UI copy falls back to the existing "Unknown error" string for null/blank throwable messages, and `CrashUtility.writeToFile` logs through `LogTopic.CRASH_UTILITY`.
320334
- **[High] Local release ledger drift** → R3-1. Three code-fix commits after
321335
the v1.8.225 docs marker are untagged and absent from the release ledger.
@@ -407,7 +421,9 @@ Privacy-first multilingual IME. `:app` is Apache-2.0-ceiling, no network permiss
407421
- **Personal n-gram durability boundary:** `PersonalNgramFlushIsolationTest`
408422
pins locale-scoped flush behavior, but the stores still need a shared
409423
atomic-replace contract so persistence failures cannot destroy the previous
410-
locale file.
424+
locale file. Cycle 13 adds the related read/reset boundary: stats counting
425+
should share reset serialization or use a reset-safe snapshot before it can
426+
call `ensureLoaded()` on persisted locale files.
411427
- **User-dictionary navigation policy:** `UserDictionaryEntryPolicy` correctly
412428
centralizes leave/mutation/transfer gates. v1.8.232 keeps that policy and
413429
adds a visible response when Compose back handling blocks the gesture during
@@ -420,7 +436,7 @@ Privacy-first multilingual IME. `:app` is Apache-2.0-ceiling, no network permiss
420436

421437
## Security / Privacy / Data Safety
422438

423-
No net-new permission or data-egress finding. The settings-search additions are display/navigation only; the no-results Browse all settings action (RA-2), synonym keyword coverage (RA-3), and query-change scroll reset (RA-10) do not weaken the no-network posture. R2-1 and R2-2 closed as local diagnostic-safety work without adding network, telemetry, or broad file export. R11-1 closes the async side of startup diagnostics by surfacing preference-store init failures through the existing local crash recovery path without adding storage, permissions, or outbound data. R12-1 is local personal-prediction durability hardening and does not change dictionary retention, export, permissions, or outbound data. R3-2 is also local-only clipboard filtering. R3-3 closed as sync-crypto contract hardening before transport activation, with no new permission or native dependency. R4-1/R4-2/R4-3/R4-4 are closed local correctness/a11y/API-contract work. WS12 and WS10/WS15 are docs/resource-only and do not change permissions, retention, or storage behavior. R5-1 closed as trust-boundary hardening for optional addon APKs: it keeps the no-network addon screen but requires explicit trust before non-co-signed packages become active. R6-1 is local editor critical-section hardening and does not change storage, permissions, or outbound data. R7-1 closed as privacy posture hardening for the existing incognito mode and `FLAG_SECURE` contract, not a permission change. R9-1 is privacy-state hardening for existing local suggestion and smart-compose paths: it keeps the no-network posture and ensures `IME_FLAG_NO_PERSONALIZED_LEARNING` / incognito decisions are request-scoped across async work. R10-1 is local editor-session lifecycle hardening and does not change storage, permissions, or outbound data. R8-1 is UI feedback for an already-blocked dictionary operation path and does not change data retention, dictionary mutation, or export/import permissions. 
WS13 now explicitly includes the deferred `StickerMediaProvider.openFile` SAF allow-list validation so forged encoded sticker URIs are rejected without broadening file access. The deferred audit lists (`docs/AUDIT_2026-06-02.md`) remain the authority for crypto/parsing/lifecycle hardening; this pass does not duplicate them.
439+
No net-new permission or data-egress finding. The settings-search additions are display/navigation only; the no-results Browse all settings action (RA-2), synonym keyword coverage (RA-3), and query-change scroll reset (RA-10) do not weaken the no-network posture. R2-1 and R2-2 closed as local diagnostic-safety work without adding network, telemetry, or broad file export. R11-1 closes the async side of startup diagnostics by surfacing preference-store init failures through the existing local crash recovery path without adding storage, permissions, or outbound data. R12-1 is local personal-prediction durability hardening and does not change dictionary retention, export, permissions, or outbound data. R13-1 is local stats/reset consistency hardening for the same personal n-gram files and likewise does not change retention, export, permissions, or outbound data. R3-2 is also local-only clipboard filtering. R3-3 closed as sync-crypto contract hardening before transport activation, with no new permission or native dependency. R4-1/R4-2/R4-3/R4-4 are closed local correctness/a11y/API-contract work. WS12 and WS10/WS15 are docs/resource-only and do not change permissions, retention, or storage behavior. R5-1 closed as trust-boundary hardening for optional addon APKs: it keeps the no-network addon screen but requires explicit trust before non-co-signed packages become active. R6-1 is local editor critical-section hardening and does not change storage, permissions, or outbound data. R7-1 closed as privacy posture hardening for the existing incognito mode and `FLAG_SECURE` contract, not a permission change. R9-1 is privacy-state hardening for existing local suggestion and smart-compose paths: it keeps the no-network posture and ensures `IME_FLAG_NO_PERSONALIZED_LEARNING` / incognito decisions are request-scoped across async work. R10-1 is local editor-session lifecycle hardening and does not change storage, permissions, or outbound data. 
R8-1 is UI feedback for an already-blocked dictionary operation path and does not change data retention, dictionary mutation, or export/import permissions. WS13 now explicitly includes the deferred `StickerMediaProvider.openFile` SAF allow-list validation so forged encoded sticker URIs are rejected without broadening file access. The deferred audit lists (`docs/AUDIT_2026-06-02.md`) remain the authority for crypto/parsing/lifecycle hardening; this pass does not duplicate them.
424440

425441
## UX & Accessibility
426442

@@ -448,6 +464,8 @@ The keyboard surface already has a strong a11y baseline (`ACCESSIBILITY.md`, `To
448464
the async failure contract is covered by focused JVM/Robolectric tests.
449465
7. R12-1 needs a focused file-replacement/flush test; no maintainer product
450466
decision is required.
467+
8. R13-1 needs a focused personal n-gram stats/reset test; no maintainer
468+
product decision is required.
451469

452470
## Archived Evidence
453471

ROADMAP.md

Lines changed: 45 additions & 0 deletions
@@ -176,6 +176,51 @@ These are genuine blockers — each needs an account, key, sibling repo, ML infr
176176

177177
## Research-Driven Additions
178178

179+
### Researcher Queue (Cycle 13 - 2026-06-04)
180+
181+
- [x] 🔬 `personal-ngram-stats-reset-race-recheck-2026-06-04` - synced
182+
`master` after the Cycle 12 docs push, rechecked the deferred
183+
`totalEntryCount()` / `resetAndAwait()` audit against live bigram/trigram
184+
stores and the typing-stats screen. This cycle adds one focused row for
185+
serializing personal n-gram stats counting with reset cleanup.
186+
187+
#### Personal n-gram stats consistency
188+
189+
- [ ] 🤖 P2 — Serialize personal n-gram stats counting with reset cleanup (R13-1)
190+
- Why: Settings -> Typing stats refreshes personal bigram/trigram totals from
191+
`totalEntryCount()`, but each store builds its persisted-locale set outside
192+
`loadGuard` and then calls `ensureLoaded(localeTag)`. `resetAndAwait()`
193+
clears in-memory tables and deletes matching files under `loadGuard`, so a
194+
stats refresh can observe stale filenames or reload a locale around a reset
195+
and make a just-cleared learning store appear non-empty. The bug is distinct
196+
from R12-1's file replacement durability: this is the read/reset
197+
interleaving visible to the Settings stats UI.
198+
- Evidence: `PersonalBigramStore.kt:224-242` lists
199+
`personal_bigrams_*.tsv`, merges `tablesByLocale.keys`, and then calls
200+
`ensureLoaded(localeTag)` for each tag without holding `loadGuard`;
201+
`PersonalBigramStore.kt:367-376` clears bigram tables and deletes matching
202+
files under `loadGuard`; `PersonalTrigramStore.kt:229-245` and
203+
`PersonalTrigramStore.kt:368-375` repeat the same count/reset shape for
204+
trigrams; `TypingStatsScreen.kt:137-143` displays both counts immediately in
205+
Settings; `PersonalNgramFlushIsolationTest.kt:64-68` only checks that reset
206+
is the broad cleanup path, not that stats counting is serialized with it;
207+
the deferred audit records the race in `docs/AUDIT_2026-05-28.md:58-60`.
208+
- Touches: `PersonalBigramStore.kt`, `PersonalTrigramStore.kt`, and a focused
209+
JVM/source contract test such as `PersonalNgramFlushIsolationTest` or a new
210+
personal n-gram stats/reset test. Keep the R12-1 atomic-replace work and the
211+
v1.8.234 per-locale flush isolation intact.
212+
- Acceptance: both stores compute `totalEntryCount()` under the same
213+
serialization boundary as `resetAndAwait()` or from a reset-safe snapshot;
214+
file enumeration, `tablesByLocale` key collection, and `ensureLoaded()` do
215+
not interleave with reset deletion; a reset followed by or racing with a
216+
stats refresh returns zero rather than resurrecting deleted locales; tests
217+
fail if either store reintroduces unlocked file enumeration plus
218+
`ensureLoaded()` in `totalEntryCount()`.
219+
- Verify: `./gradlew.bat :app:testDebugUnitTest --tests
220+
"dev.patrickgold.florisboard.ime.dictionary.PersonalNgramFlushIsolationTest"`
221+
or the new focused stats/reset test class.
222+
- Complexity: S
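
One possible shape for the focused stats/reset coverage is a race harness like the sketch below. It is an assumption-laden model rather than the project's test: `ModelStore`, `raceOnce`, and the in-memory `persisted` map (standing in for the TSV files on disk) are hypothetical, and a real test would exercise `PersonalBigramStore` and `PersonalTrigramStore` directly.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

// Hypothetical in-memory model: `persisted` stands in for locale files on disk.
class ModelStore(initial: Map<String, Int>) {
    private val guard = ReentrantLock()
    private val persisted = HashMap<String, Int>(initial) // locale tag -> entries "on disk"
    private val loaded = HashMap<String, Int>()           // locale tag -> entries in memory

    // Loads every persisted locale and sums the counts under the reset lock.
    fun totalEntryCount(): Int = guard.withLock {
        for ((tag, count) in persisted) loaded.putIfAbsent(tag, count)
        loaded.values.sum()
    }

    // Clears memory and "deletes files" under the same lock.
    fun reset() = guard.withLock {
        loaded.clear()
        persisted.clear()
    }
}

// Races one stats refresh against one reset and reports
// (count seen during the race, count seen after both finished).
fun raceOnce(): Pair<Int, Int> {
    val store = ModelStore(mapOf("en-US" to 2, "de-DE" to 5))
    val start = CountDownLatch(1)
    var duringRace = -1
    val counter = thread { start.await(); duringRace = store.totalEntryCount() }
    val resetter = thread { start.await(); store.reset() }
    start.countDown()
    counter.join()
    resetter.join()
    return duringRace to store.totalEntryCount()
}

fun main() {
    repeat(1_000) {
        val (duringRace, afterReset) = raceOnce()
        // A serialized count sees either everything (before reset) or nothing.
        check(duringRace == 0 || duringRace == 7) { "torn count: $duringRace" }
        // Once reset has completed, no later count may resurrect a locale.
        check(afterReset == 0) { "reset was undone: $afterReset" }
    }
    println("ok")
}
```

The second assertion is the one that matters for R13-1: a count that ran concurrently with reset may legitimately report the pre-reset total, but any count taken after reset completes has to be zero.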
223+
179224
### Researcher Queue (Cycle 12 - 2026-06-04)
180225

181226
- [x] 🔬 `personal-ngram-atomic-replace-recheck-2026-06-04` - synced
