[feature]: kv_batch_put with the mooncake-store backend uses the non-zero-copy API, which causes a relatively large local buffer memory overhead. #78

@pxp531

Description

When we use the tq+mooncake-store backend, the data volume is relatively small in text scenarios but relatively large in multi-modal scenarios. The amount of data to put grows linearly with GBS and the number of images, and may approach 10 GB. With mooncake-store as the backend, kv_batch_put uses the non-zero-copy API, which requires a relatively large local buffer to copy the data into. Only the put operation needs this buffer, but the get client inherits the same configuration, so each machine ends up reserving nearly gpu_per_node * local buffer size of memory that is never used. The problem is most visible in the multi-modal scenario.
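To make the scale concrete, here is a rough back-of-the-envelope sketch of the overhead described above. The function name and the numbers (8 GPUs per node, a 10 GiB local buffer) are assumptions chosen for illustration, not values taken from tq or mooncake-store; it also assumes one get client per GPU.

```python
GIB = 1024 ** 3


def unused_get_buffer_bytes(gpu_per_node: int, local_buffer_bytes: int) -> int:
    """Estimate memory reserved on one node by get clients that inherit
    the put-sized local buffer but never use it.

    Assumes one get client per GPU (hypothetical deployment layout).
    """
    return gpu_per_node * local_buffer_bytes


# Hypothetical example values, not measured ones.
gpu_per_node = 8
local_buffer_bytes = 10 * GIB  # sized for a multi-modal kv_batch_put

overhead = unused_get_buffer_bytes(gpu_per_node, local_buffer_bytes)
print(f"unused buffer memory per node: {overhead / GIB:.0f} GiB")
```

Under these assumed values, the get side alone would hold on the order of tens of GiB per machine without ever using it, which is why the overhead is much less noticeable in text-only runs with small buffers.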
