Update utils.py#41

Open
jiyzhang wants to merge 2 commits into
arcee-ai:mainfrom
jiyzhang:patch-1
@jiyzhang

In the function lm_stream_generator, the token returned by mlx_lm.lm_stream_generate is a mlx_lm.GenerationResponse, which includes an mx.array field:

    @dataclass
    class GenerationResponse:
        """
        The output of :func:`stream_generate`.

        Args:
            text (str): The next segment of decoded text. This can be an empty string.
            token (int): The next token.
            logprobs (mx.array): A vector of log probabilities.
            prompt_tokens (int): The number of tokens in the prompt.
            prompt_tps (float): The prompt processing tokens-per-second.
            generation_tokens (int): The number of generated tokens.
            generation_tps (float): The tokens-per-second for generation.
            peak_memory (float): The peak memory used so far in GB.
            finish_reason (str): The reason the response is being sent: "length", "stop" or `None`
        """

        text: str
        token: int
        logprobs: mx.array
        prompt_tokens: int
        prompt_tps: float
        generation_tokens: int
        generation_tps: float
        peak_memory: float
        finish_reason: Optional[str] = None

As pydantic doesn't support mx.array, the array survives model_dump() unchanged, and json.dumps() then fails on the line below:

    yield f"data: {json.dumps(chunk.model_dump())}\n\n"
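
For readers without MLX installed, the failure can be reproduced with the standard library alone. This is a minimal sketch, not the fastmlx code: `FakeArray` and `FakeResponse` are hypothetical stand-ins for mx.array and GenerationResponse, and `try_dump` is a helper invented for illustration. It only shows that json.dumps() rejects any object type it has no encoder for.

```python
import json
from dataclasses import dataclass, asdict


class FakeArray:
    """Hypothetical stand-in for mx.array; json has no encoder for it."""

    def __init__(self, values):
        self.values = list(values)


@dataclass
class FakeResponse:
    """Hypothetical, trimmed-down stand-in for GenerationResponse."""

    text: str
    token: int
    logprobs: FakeArray


def try_dump(response):
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(asdict(response))
    except TypeError as exc:
        return f"TypeError: {exc}"


# The array-like field passes through asdict() untouched, so json.dumps() fails.
result = try_dump(FakeResponse(text="hi", token=42, logprobs=FakeArray([-0.1, -2.3])))
print(result)
```

The same shape of error appears in the server traceback, with `array` in place of `FakeArray`.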

@Blaizzy
Contributor

Blaizzy commented Dec 20, 2024

Thanks for the fix @jiyzhang!

once the tests clear I will merge

@jiyzhang
Author

> Thanks for the fix @jiyzhang!
>
> once the tests clear I will merge

My update changed the interface of the function lm_stream_generator, which might be why the test fails. I'll push a new version that keeps the output structure.

@Blaizzy
Contributor

Blaizzy commented Dec 20, 2024

You need to run

pre-commit run --all

The token returned by mlx_lm.lm_stream_generate includes an mx.array field, which makes the Pydantic dump fail with the error "TypeError: Object of type array is not JSON serializable".

@jiyzhang
Author

I just added a new commit which changes the mx.array to [] so the server won't fail with the error:

    File "/Users/macsmith/miniconda3/envs/fastmlx310/lib/python3.10/site-packages/fastmlx/utils.py", line 406, in lm_stream_generator
        yield f"data: {json.dumps(chunk.model_dump())}\n\n"
    TypeError: Object of type array is not JSON serializable

This commit doesn't change the response format or values.
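
The idea of swapping the array for [] before serializing can be sketched as follows. This is an illustration under stated assumptions, not the code in the commit: `FakeArray`, `FakeResponse`, `to_sse` and the `keep_logprobs` flag are hypothetical names, and the `tolist()` branch assumes the array type exposes such a method (mx.array is understood to, but that is not verified here).

```python
import json
from dataclasses import dataclass, asdict, replace


class FakeArray:
    """Hypothetical stand-in for mx.array with a tolist() method."""

    def __init__(self, values):
        self.values = list(values)

    def tolist(self):
        return list(self.values)


@dataclass
class FakeResponse:
    """Hypothetical, trimmed-down stand-in for GenerationResponse."""

    text: str
    token: int
    logprobs: object


def to_sse(response, keep_logprobs=False):
    """Format a response as a server-sent event with a JSON-safe logprobs field."""
    # Either drop the array (as described in this PR) or convert it to a plain list.
    logprobs = response.logprobs.tolist() if keep_logprobs else []
    safe = replace(response, logprobs=logprobs)
    return f"data: {json.dumps(asdict(safe))}\n\n"


event = to_sse(FakeResponse(text="hi", token=42, logprobs=FakeArray([-0.5, -1.5])))
event_with_logprobs = to_sse(
    FakeResponse(text="hi", token=42, logprobs=FakeArray([-0.5, -1.5])),
    keep_logprobs=True,
)
print(event, end="")
```

Replacing the field with [] keeps the keys of the streamed payload the same, which matches the goal of leaving the response format untouched; converting to a list instead would keep the values at the cost of a larger payload per token.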


2 participants