[Bug]: Embedding queue messages drop their request wait id during serialization #1379

@officialasishkumar

Bug Description

EmbeddingMsg assigns each message an id, and request-scoped wait tracking registers that id for direct embedding work such as add_skill(wait=True). However, EmbeddingMsg.to_dict() delegates to dataclasses.asdict(), while id is not declared as a dataclass field. The queued JSON therefore omits the registered id. When the embedding worker deserializes the queued message, EmbeddingMsg.from_dict() creates a fresh id, so TextEmbeddingHandler marks a different id as complete and the original pending embedding root remains pending until the request times out.
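
The mechanism can be shown without OpenViking installed. The sketch below uses a hypothetical stand-in class, `DemoMsg`, not the real `EmbeddingMsg`. How the real class assigns `id` is an assumption here (a plain attribute set in `__post_init__`). The behavior of `dataclasses.asdict()` is standard: it only serializes declared fields.

```python
import dataclasses
import uuid


@dataclasses.dataclass
class DemoMsg:
    """Hypothetical stand-in for EmbeddingMsg; not the real class."""

    message: str
    context: dict

    def __post_init__(self):
        # Assumption: id is set as a plain instance attribute, so it is
        # not a declared dataclass field.
        self.id = str(uuid.uuid4())

    def to_dict(self):
        # asdict() walks dataclasses.fields(self) only, so id is skipped.
        return dataclasses.asdict(self)

    @classmethod
    def from_dict(cls, data):
        # data has no "id" key, so __post_init__ assigns a fresh id.
        return cls(**data)


original = DemoMsg("hello", {"uri": "viking://agent/skills/demo"})
payload = original.to_dict()
restored = DemoMsg.from_dict(payload)

id_in_payload = "id" in payload
ids_match = restored.id == original.id
print(id_in_payload, ids_match)
```

Under these assumptions the payload contains only `message` and `context`, and the restored object carries a different id than the one that was registered.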

Steps to Reproduce

  1. Create an EmbeddingMsg.
  2. Register msg.id in RequestWaitTracker as an embedding root.
  3. Serialize the message with msg.to_dict() and deserialize it with EmbeddingMsg.from_dict(), matching the queue path.
  4. Have the embedding handler mark the deserialized message id as complete.
  5. Observe that the originally registered root id is still pending.

Expected Behavior

The queued EmbeddingMsg should preserve its id across serialization so the embedding worker marks the same root id that wait=True registered. Direct embedding waits such as add_skill(wait=True) should complete when their embedding job finishes.
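
One possible way to get this behavior, sketched on a hypothetical stand-in class (`FixedDemoMsg`) rather than the real `EmbeddingMsg`, is to declare `id` as a dataclass field with a `default_factory`. The field names and the tolerant `from_dict()` are assumptions for illustration; the actual fix in OpenViking may look different.

```python
import dataclasses
import uuid


@dataclasses.dataclass
class FixedDemoMsg:
    """Hypothetical stand-in; the real EmbeddingMsg may differ."""

    message: str
    context: dict
    telemetry_id: str = ""
    # Declared as a field, so dataclasses.asdict() includes it.
    id: str = dataclasses.field(default_factory=lambda: str(uuid.uuid4()))

    def to_dict(self):
        return dataclasses.asdict(self)

    @classmethod
    def from_dict(cls, data):
        # Keep only known fields. A payload that carries "id" keeps it;
        # a payload without "id" gets a new one from default_factory.
        known = {f.name for f in dataclasses.fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


fixed = FixedDemoMsg(
    "hello", {"uri": "viking://agent/skills/demo"}, telemetry_id="tm-demo"
)
round_tripped = FixedDemoMsg.from_dict(fixed.to_dict())

# Payloads written before the fix have no "id" key and still load.
legacy = FixedDemoMsg.from_dict({"message": "hello", "context": {}})
```

Accepting payloads without `id` keeps messages that were already queued before such a change readable, at the cost of those messages still receiving a fresh id.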

Actual Behavior

EmbeddingMsg.to_dict() omits id, from_dict() generates a new id, and request-scoped waits that registered the original id can remain pending until timeout.
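
Why the wait stays pending can be illustrated with a simplified tracker. `DemoWaitTracker` below is a hypothetical stand-in that only borrows method names from the reproduction script; it assumes that marking an unregistered id is silently ignored, which may not match the internals of the real `RequestWaitTracker`.

```python
import uuid


class DemoWaitTracker:
    """Hypothetical, simplified stand-in for request-scoped wait tracking."""

    def __init__(self):
        self._pending = {}

    def register_request(self, telemetry_id):
        self._pending.setdefault(telemetry_id, set())

    def register_embedding_root(self, telemetry_id, root_id):
        self._pending[telemetry_id].add(root_id)

    def mark_embedding_done(self, telemetry_id, root_id):
        # Assumption: an id that was never registered is ignored.
        self._pending[telemetry_id].discard(root_id)

    def is_complete(self, telemetry_id):
        return not self._pending[telemetry_id]


tracker = DemoWaitTracker()
tracker.register_request("tm-demo")

registered_id = str(uuid.uuid4())   # id held by the producer
regenerated_id = str(uuid.uuid4())  # id created after deserialization
tracker.register_embedding_root("tm-demo", registered_id)

# The worker reports the regenerated id, so the registered root stays pending.
tracker.mark_embedding_done("tm-demo", regenerated_id)
complete_after_wrong_id = tracker.is_complete("tm-demo")

# Only the originally registered id clears the request.
tracker.mark_embedding_done("tm-demo", registered_id)
complete_after_right_id = tracker.is_complete("tm-demo")
```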

Minimal Reproducible Example

from openviking.storage.queuefs.embedding_msg import EmbeddingMsg
from openviking.telemetry.request_wait_tracker import RequestWaitTracker

telemetry_id = "tm-demo"
tracker = RequestWaitTracker()
tracker.register_request(telemetry_id)

# Register the message id as an embedding root, as wait=True does.
msg = EmbeddingMsg("hello", {"uri": "viking://agent/skills/demo"}, telemetry_id=telemetry_id)
tracker.register_embedding_root(telemetry_id, msg.id)

# Round-trip through the same serialization the queue path uses.
queued = msg.to_dict()
restored = EmbeddingMsg.from_dict(queued)

# The embedding handler marks the deserialized message id as complete.
tracker.mark_embedding_done(telemetry_id, restored.id)

# Both assertions should pass; on affected versions the first one fails.
assert restored.id == msg.id
assert tracker.is_complete(telemetry_id)

Error Logs

AssertionError: restored.id differs from the registered id, leaving the request incomplete

OpenViking Version

main at 17dea04 / v0.3.5-era code

Python Version

Python 3.12.3

Operating System

Linux

Model Backend

Other

Additional Context

This affects direct embedding roots. Semantic-root waits are less exposed because they register SemanticMsg.id, and SemanticMsg does declare id as a dataclass field.
