LCORE-1470: Inline BYOK uses user-facing ids for LLS RAG db filtering#1325
Conversation
WalkthroughThe Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can scan for known vulnerabilities in your dependencies using OSV Scanner.OSV Scanner will automatically detect and report security vulnerabilities in your project's dependencies. No additional configuration is required. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/utils/vector_search.py (2)
353-354: Misleading comment: Solr exclusion now applies to all cases, not just request-level overrides.The comment "Request-level override: filter out Solr store" is outdated. With the unified resolution path, this filter is applied regardless of whether IDs came from
vector_store_idsor config. Consider updating to reflect the actual behavior.📝 Suggested comment fix
- # Request-level override: filter out Solr store, use the rest + # Exclude Solr store from BYOK RAG queries (Solr is handled separately by _fetch_solr_rag) vector_store_ids_to_query = [🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/vector_search.py` around lines 353 - 354, The comment above the vector_store_ids_to_query assignment is misleading—update it to state that the Solr store is always excluded (applies to both request-level vector_store_ids and config-derived IDs) rather than saying "Request-level override"; locate the vector_store_ids_to_query declaration in src/utils/vector_search.py and replace the comment to clearly describe the unified resolution path and that Solr is filtered out for all cases.
320-330: Docstring could clarify thatvector_store_idsare user-facing IDs that get translated.The inline comment
# User-facingis helpful, but the docstring (lines 328-330) doesn't mention the translation behavior. Future maintainers may not realize these IDs are resolved to internal llama-stack IDs.📝 Suggested docstring update
Args: client: The AsyncLlamaStackClient to use for the request query: The search query - configuration: Application configuration - vector_store_ids: Optional list of vector store IDs to query. - If provided, only these stores will be queried. If None, all stores - (excluding Solr) will be queried. + vector_store_ids: Optional list of user-facing rag_ids to query. + These are translated to internal llama-stack vector_db_ids before + querying. If provided, only these stores will be queried. If None, + configured inline stores (excluding Solr) will be queried.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/vector_search.py` around lines 320 - 330, Update the docstring for the function that returns tuple[list[RAGChunk], list[ReferencedDocument]] to explicitly state that the vector_store_ids parameter is a user-facing list of IDs which will be resolved/translated into internal llama-stack vector-store IDs before querying; clarify that when vector_store_ids is provided only those user-facing stores (after translation) are queried and when None all stores except Solr are queried. Mention the translation step and its purpose so maintainers understand that vector_store_ids are not raw internal IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/utils/vector_search.py`:
- Around line 353-354: The comment above the vector_store_ids_to_query
assignment is misleading—update it to state that the Solr store is always
excluded (applies to both request-level vector_store_ids and config-derived IDs)
rather than saying "Request-level override"; locate the
vector_store_ids_to_query declaration in src/utils/vector_search.py and replace
the comment to clearly describe the unified resolution path and that Solr is
filtered out for all cases.
- Around line 320-330: Update the docstring for the function that returns
tuple[list[RAGChunk], list[ReferencedDocument]] to explicitly state that the
vector_store_ids parameter is a user-facing list of IDs which will be
resolved/translated into internal llama-stack vector-store IDs before querying;
clarify that when vector_store_ids is provided only those user-facing stores
(after translation) are queried and when None all stores except Solr are
queried. Mention the translation step and its purpose so maintainers understand
that vector_store_ids are not raw internal IDs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ffe5e818-ef89-4e59-bc94-cfd202376d31
📒 Files selected for processing (2)
src/utils/vector_search.pytests/unit/utils/test_vector_search.py
Description
In the method
_fetch_byok_ragwhich is called when performing inline BYOK, we pass the user facing rag_ids (vector_store_idsfield in request) which should be translated to llama-stackvector_store_ids.This last step is missing, which leads to filtering not working using the user facing rag_ids.
The PR fixes this issue.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Added unit tests
Summary by CodeRabbit
Bug Fixes
Tests