Skip to content

fix: serialization of nested ChatMessage in GeneratedAnswerdataclass#9497

Merged
Amnah199 merged 6 commits intomainfrom
fix-deserilization-answer
Jun 6, 2025
Merged

fix: serialization of nested ChatMessage in GeneratedAnswerdataclass#9497
Amnah199 merged 6 commits intomainfrom
fix-deserilization-answer

Conversation

@Amnah199
Copy link
Copy Markdown
Contributor

@Amnah199 Amnah199 commented Jun 5, 2025

Related Issues

  • fixes serialization bug in GeneratedAnswer where nested ChatMessage objects in the meta.all_messages created by AnswerBuilder were not being properly serialized/deserialized. This came up as a JSON serializable error while working on breakpoints.

Snapshot of the case:

{'answer_builder': {'answers': [GeneratedAnswer(... meta={... 'all_messages': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text='The capital of France is Paris.')], _name=None, _meta={'model': ...}]}}

Proposed Changes:

Updated the de/serialization of GeneratedAnswer to properly handle nested ChatMessage objects by ensuring they are converted to their dictionary representation.

How did you test it?

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@Amnah199 Amnah199 marked this pull request as ready for review June 5, 2025 13:37
@Amnah199 Amnah199 requested review from a team as code owners June 5, 2025 13:37
@Amnah199 Amnah199 requested review from dfokina and julian-risch and removed request for a team June 5, 2025 13:37
@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented Jun 5, 2025

Pull Request Test Coverage Report for Build 15487358601

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.006%) to 90.424%

Totals Coverage Status
Change from base Build 15486521866: 0.006%
Covered Lines: 11520
Relevant Lines: 12740

💛 - Coveralls

Copy link
Copy Markdown
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite good to me already. I have only minor change requests.

Comment thread test/components/builders/test_answer_builder.py Outdated
Comment thread test/dataclasses/test_answer.py Outdated
"meta": {"meta_key": "meta_value"},
"meta": {
"meta_key": "meta_value",
"all_messages": [ChatMessage.from_user("What is the answer?").to_dict()],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment: I thought about pros and cons of using ChatMessage.from_user("What is the answer?").to_dict() here instead of

{
    "_role": "user",
    "_meta": {},
    "_name": None,
    "_content": [
        {
            "text": "What is the answer?"
        }
    ],
}

I believe it's okay to use ChatMessage.from_user here because with this test, we don't want to test the to_dict representation of the ChatMessage and we don't want the test to fail if the to_dict method in future provides a different output. We do want to test that all_messages contains whatever ChatMessage.to_dict() returns, which is exactly what this test does. 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly and if ChatMessage serialization gets refactored we don't have to worry about updating these tests.

)
assert isinstance(answer, Answer)

def test_to_dict(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we keep the plain test_to_dict and test_from_dict tests that don't have any meta in the GeneratedAnswer too.

Comment thread haystack/dataclasses/answer.py Outdated
return default_to_dict(self, data=self.data, query=self.query, documents=documents, meta=self.meta)

# Serialize ChatMessage objects to dicts
meta = self.meta.copy()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to copy if all_messages doesn't contain any ChatMessages, correct? Therefore we could move the copy call into the if block. We would need to change the if condition then.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a shallow copy of meta inside the if block.

@Amnah199 Amnah199 requested a review from julian-risch June 5, 2025 21:38
Copy link
Copy Markdown
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@Amnah199 Amnah199 merged commit 1d6a9f6 into main Jun 6, 2025
19 checks passed
@Amnah199 Amnah199 deleted the fix-deserilization-answer branch June 6, 2025 09:46
Comment on lines +104 to +105
meta = self.meta
all_messages = meta.get("all_messages")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Amnah199 sorry this is a bit late, but we could have used _serializable_value from haystack.tracing.utils on the entire meta field. This way it would go in and try to serialize everything that has a to_dict method instead of making this specific to ChatMessage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants