BAAI/bge-m3 model fails on SageMaker ml.g4dn.xlarge instance. #804

@mjmanoj001

System Info

After deploying the BAAI/bge-m3 model on AWS SageMaker with an ml.g4dn.xlarge instance, it works for some time but soon fails, returning only a list of None values.
I didn't test it, but it probably does not work with the NVIDIA T4 GPU.
Problem:
The endpoint returns [[None, None, ....]]

Once it starts returning None, it always returns None. The only way to recover is to redeploy, but the same thing happens again every time. It looks as if the worker process has died while the server keeps responding with None.
No log entry indicates any problem. The log reports success even when the response is None.

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Steps to reproduce:

  1. Copy the deployment script from https://huggingface.co/BAAI/bge-m3?sagemaker_deploy=true
  2. Change the instance_type to "ml.g4dn.xlarge"
  3. Test with 10K inputs. It initially returns correct embeddings, but after a few thousand requests it starts returning [[None, None, ....]]
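
To pin down when the failure starts, the steps above can be wrapped in a small loop that reports the index of the first request whose response contains None. This is only a sketch: `first_bad_request` and `embed` are hypothetical names, and `embed` is assumed to be a thin wrapper around whatever call the deployment script uses to query the endpoint (for example `predictor.predict({"inputs": text})`), returning a list of embedding vectors.

```python
from typing import Callable, Iterable, List, Optional

# One embedding vector per input; values become None once the endpoint degrades.
Embeddings = List[List[Optional[float]]]


def first_bad_request(
    embed: Callable[[str], Embeddings], texts: Iterable[str]
) -> Optional[int]:
    """Return the index of the first request whose response contains None.

    `embed` is a hypothetical wrapper around the endpoint call; it is an
    assumption of this sketch, not part of the deployment script.
    Returns None if every response looks healthy.
    """
    for index, text in enumerate(texts):
        vectors = embed(text)
        if any(value is None for row in vectors for value in row):
            return index
    return None
```

Logging the returned index together with a timestamp on each redeploy would show whether the failure appears after a similar number of requests every time.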

Expected behavior

It should return the correct embeddings, or an error message explaining why the embeddings could not be generated.
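
Until the endpoint reports the failure itself, a client-side check can turn the silent None response into an explicit error. The sketch below assumes the response has already been parsed into a list of vectors; `validate_embeddings` is a hypothetical helper, not part of any SageMaker or Hugging Face API.

```python
from typing import List, Optional, Sequence


def validate_embeddings(
    vectors: Sequence[Optional[Sequence[Optional[float]]]],
) -> List[List[float]]:
    """Raise ValueError if any embedding is missing or contains None.

    Hypothetical client-side guard: it only inspects the parsed response
    and makes no call to the endpoint.
    """
    checked: List[List[float]] = []
    for row_index, row in enumerate(vectors):
        if row is None or any(value is None for value in row):
            raise ValueError(
                f"embedding {row_index} contains None; the endpoint may need redeploying"
            )
        checked.append([float(value) for value in row])
    return checked
```

Calling this on every response means downstream code never stores a vector of None values, and the first bad response surfaces as an exception instead of a logged success.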
