feat: Knowledge write node chunk embeding #4402
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected. Please follow our release note process to remove it.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing

    self.post_embedding(document_model_list, knowledge_id, workspace_id)

    write_content_list = [{
        "name": document.get("name"),

The code appears to be functional but could benefit from a few enhancements and optimizations. Here are some suggestions:
- Code Formatting: Ensure consistent indentation and spacing for better readability:
      ... }).refresh()
- Variable Naming: Use meaningful variable names to improve code clarity, e.g., write_content_list instead of write_content.
- Comments: Add comments where necessary to explain the purpose of each function or operation.
- Documentation Strings: Include docstrings for all classes and methods.
- Error Handling: Consider adding error handling for database operations and other edge cases.

Here is a revised version with these improvements:
@@ -20,6 +20,7 @@
from common.utils.common import bulk_create_in_batches
from knowledge.models import Document, KnowledgeType, Paragraph, File, FileSourceType, Problem, ProblemParagraphMapping
from knowledge.serializers.common import ProblemParagraphObject, ProblemParagraphManage
+from knowledge.serializers.document import DocumentSerializers
class ParagraphInstanceSerializer(serializers.Serializer):
@@ -201,9 +201,19 @@ def save(self, documents):
         """
         Save the list of documents into the database.

         :param documents: List of dictionaries containing document data.
         :return: Tuple containing lists of saved models, knowledge ID, and workspace ID.
         """
         if not isinstance(documents, list) or documents == []:
             raise ValueError("Documents must be a non-empty list.")

         # Proceed with saving the documents...
@@ -228,7 +248,19 @@ def execute(self, documents, **kwargs) -> NodeResult:
         """
         # Execute the logic to process the documents
         document_model_list, knowledge_id, workspace_id = self.save(documents)
-        # Call a static method to perform additional operations like embedding generation
+        # Post-processing step to generate embeddings, update metadata, etc., after documents have been created.
+        # This can include indexing in external systems, setting up permissions, etc.
+        self.post_embedding(document_model_list, knowledge_id, workspace_id)
         write_content_list = [
             {
                 "name": document.get("name"),

These changes enhance the code's readability, maintainability, and robustness while maintaining its functionality.
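
The input-validation and error-handling suggestions above can be sketched in isolation as follows. This is a minimal, self-contained illustration rather than the project's code: validate_documents, save_documents, and the persist callback are hypothetical names standing in for the real save method and its database writes, and wrapping failures in RuntimeError is an assumption about what a caller might want.

```python
from typing import Any, Callable, Dict, List


def validate_documents(documents: Any) -> List[Dict[str, Any]]:
    """Reject anything that is not a non-empty list of dictionaries."""
    if not isinstance(documents, list) or not documents:
        raise ValueError("Documents must be a non-empty list.")
    for index, document in enumerate(documents):
        if not isinstance(document, dict):
            raise ValueError(f"Document at index {index} must be a dictionary.")
    return documents


def save_documents(
    documents: Any,
    persist: Callable[[List[Dict[str, Any]]], int],
) -> int:
    """Validate first, then hand off to a (hypothetical) persistence callback.

    `persist` stands in for the real database write and is assumed to
    return the number of rows it stored.
    """
    valid = validate_documents(documents)
    try:
        return persist(valid)
    except Exception as exc:
        # Wrap storage failures so callers see one consistent error type,
        # while keeping the original exception available as __cause__.
        raise RuntimeError(f"Failed to save {len(valid)} document(s)") from exc
```

Validating before any write keeps malformed input from reaching the database layer, and chaining the original exception preserves the underlying cause for debugging.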