fix: return empty list for non-positive top_k in cosine_topk#423
Open
koriyoshi2041 wants to merge 1 commit into
Open
fix: return empty list for non-positive top_k in cosine_topk#423koriyoshi2041 wants to merge 1 commit into
koriyoshi2041 wants to merge 1 commit into
Conversation
cosine_topk selected results with argpartition using actual_k = min(k, n). For k == 0 this evaluated the [-0:] slice and returned the entire corpus instead of nothing, and k < 0 returned n-1 results. top_k flows unvalidated from the retrieve config, so a tier set to top_k=0 silently retrieved every memory. Return [] for k <= 0, and guard cosine_topk_salience the same way (where k < 0 likewise sliced off the wrong entries).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
cosine_topkreturns the entire corpus whentop_kis0, andn-1results whentop_kis negative, instead of returning nothing.src/memu/database/inmemory/vector.py:For
k == 0,actual_kis0, and theelsebranch slices[-0:]which is[0:]— the whole array. Fork == -1it slices[1:](n-1items).top_kreachescosine_topkunvalidated fromretrieve_config.{category,item,resource}.top_k, so a user setting a tier'stop_kto0to disable it silently gets the full corpus back rather than an empty result — an over-retrieval bug that also inflates token usage downstream.Fix: return
[]fork <= 0.cosine_topk_salienceis guarded the same way —scored[:0]already returned[], butscored[:-1](fork = -1) incorrectly dropped the last entry, so both now behave consistently.Added
tests/test_vector.pycoveringk=0/k=-1(empty) and a positive-kordering case for both functions.pytest tests/test_vector.pypasses;ruffclean.