Suggestions: fix fuzzy skip bug and add distance-2 diacritic fallback #4
Open
pashol wants to merge 1 commit into
Two improvements to suggestion accuracy:

1. Fix off-by-one in the fuzzy-skip guard: `i + 1 >= max_count` → `i >= max_count`. Previously, distance-1 fuzzy search was disabled whenever only 1 Cdict slot remained (e.g. when UserDictionary filled 2 slots), silently blocking typo correction in that case.
2. Add a distance-2 fallback for diacritic substitutions. The cdict distance function operates on raw UTF-8 bytes; `ö` encodes as 2 bytes, so "oppis" is byte-distance 2 from "öppis", which is out of reach for the existing distance-1 call. The new block fires only for pure-ASCII input (≥5 chars) with empty suggestion slots, keeping noise low while surfacing accented candidates.
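To make the byte-distance argument concrete, here is a minimal sketch that compares edit distance over code points with edit distance over UTF-8 bytes. It assumes a plain Levenshtein metric and a precomposed `ö` (U+00F6); the metric cdict actually implements may differ, and the `levenshtein` helper is hypothetical, not part of this PR.

```python
def levenshtein(a, b):
    """Classic edit distance over two sequences (str or bytes).

    Assumption: plain Levenshtein; cdict's real metric may differ.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


typed = "oppis"
wanted = "\u00f6ppis"  # "öppis" with a precomposed ö

# Over code points the two words differ by one substitution.
char_distance = levenshtein(typed, wanted)

# Over UTF-8 bytes, ö is 0xC3 0xB6, so one substitution plus one
# insertion is needed.
byte_distance = levenshtein(typed.encode("utf-8"), wanted.encode("utf-8"))

print(char_distance, byte_distance)  # 1 2
```

Under these assumptions, a distance-1 query over bytes cannot reach `öppis` from `oppis`, which is the gap the distance-2 fallback is meant to cover.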
Summary
- `i + 1 >= max_count` → `i >= max_count` in `query_suggestions`. The old condition disabled distance-1 typo correction whenever only 1 suggestion slot remained, e.g. when UserDictionary already filled 2 slots. Now fuzzy fires whenever any slot is open.
- The cdict `distance()` function works on raw UTF-8 bytes. `ö` encodes as 2 bytes, so `oppis` is byte-distance 2 from `öppis`, unreachable with the existing distance-1 call. A new block after the main loop calls `dict.distance(word, 2, remaining)` as a last-resort filler, guarded to pure-ASCII input (≥5 chars) so it only fires in the "user omitted an accent" scenario and doesn't produce noisy candidates otherwise.
- `is_pure_ascii` helper: O(n) guard used by the distance-2 block.

Test plan

- `oppis` → `öppis` appears as a suggestion
- […] `hon`, type `honeowner` → `homeowner` appears (previously blocked by the off-by-one)
- […] `ö` → distance-2 block does NOT fire (`is_pure_ascii` guard)
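The scenarios in the test plan can be mimicked with a toy model of the two guards. This is only a sketch: it assumes `i` counts the suggestion slots already filled and that `max_count` is 3 (implied by "2 slots filled, 1 remaining"). The function names and the `min_len` parameter are hypothetical and are not taken from the actual code.

```python
def skip_fuzzy_old(filled, max_count):
    # Old guard: skips distance-1 search while one slot is still open.
    return filled + 1 >= max_count


def skip_fuzzy_new(filled, max_count):
    # New guard: skips only when every slot is already filled.
    return filled >= max_count


def wants_distance2_fallback(word, filled, max_count, min_len=5):
    # Assumed shape of the fallback condition: open slots, long enough
    # input, and pure-ASCII input (the role is_pure_ascii plays in the PR).
    return filled < max_count and len(word) >= min_len and word.isascii()


MAX_COUNT = 3  # assumption: three suggestion slots

# UserDictionary already filled 2 of 3 slots.
print(skip_fuzzy_old(2, MAX_COUNT))  # True: fuzzy search was blocked
print(skip_fuzzy_new(2, MAX_COUNT))  # False: fuzzy search now runs

# Accent omitted: fallback is eligible. Accent typed: it is not.
print(wants_distance2_fallback("oppis", 0, MAX_COUNT))       # True
print(wants_distance2_fallback("\u00f6ppis", 0, MAX_COUNT))  # False
```

The ASCII check keeps the wider search away from input that already contains diacritics, where distance-2 candidates would mostly be noise.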