[vLLM support] FireRedLID vLLM support #46

@PatchouliTIS

Summary

I have adapted FireRedLID for vLLM inference and submitted a PR to the vLLM project:
vllm-project/vllm#39290

The converted model weights are available on Hugging Face:
https://huggingface.co/PatchyTisa/FireRedLID-vllm

Architecture

FireRedLID in vLLM follows the Whisper-style encoder-decoder pattern:

  • Encoder: ConformerEncoder (shared architecture with FireRedASR2)
  • Decoder: TransformerDecoder (6 layers with cross-attention)
  • Vocabulary: 120 LID tokens (dict.txt)
  • Output: up to 2 tokens per utterance (e.g. "en" or "zh mandarin")
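
For anyone consuming the output, here is a minimal sketch of splitting the decoded text into its one or two tokens. It is not part of the PR: `LidResult` and `parse_lid_output` are hypothetical names, and it assumes the tokens are whitespace-separated (as in "zh mandarin") and that the second token, when present, names a variant of the first.

```python
from typing import NamedTuple, Optional


class LidResult(NamedTuple):
    """Hypothetical container for a decoded LID result."""
    language: str
    variant: Optional[str]  # assumption: second token refines the first


def parse_lid_output(text: str) -> LidResult:
    """Split decoded LID text such as "en" or "zh mandarin" into tokens."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty LID output")
    if len(tokens) > 2:
        # The issue states the model emits at most 2 tokens per utterance.
        raise ValueError(f"expected at most 2 tokens, got {len(tokens)}")
    variant = tokens[1] if len(tokens) == 2 else None
    return LidResult(language=tokens[0], variant=variant)
```

With this, `parse_lid_output("zh mandarin")` gives a language of "zh" and a variant of "mandarin", while `parse_lid_output("en")` leaves the variant as `None`.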

Usage

Server:

vllm serve PatchyTisa/FireRedLID-vllm -tp=1 --dtype=float32

Client:

python examples/online_serving/openai_lid_client.py \
    --audio_paths audio_en.wav audio_zh.wav audio_fr.wav
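
If you would rather call the server directly, something like the sketch below might work. I have not checked which endpoint `openai_lid_client.py` uses; this assumes the model is exposed through the OpenAI-compatible transcription endpoint that vLLM provides for Whisper-style models, on the default `http://localhost:8000/v1`. `identify_language` is a hypothetical helper name, and the `openai` Python package must be installed.

```python
from pathlib import Path


def identify_language(
    audio_path: str,
    base_url: str = "http://localhost:8000/v1",  # assumption: default vLLM port
    model: str = "PatchyTisa/FireRedLID-vllm",
) -> str:
    """Send one audio file to the server and return the decoded LID text."""
    # Imported lazily so this module can be loaded without the package.
    from openai import OpenAI

    # vLLM does not check the key unless the server was started with one.
    client = OpenAI(base_url=base_url, api_key="EMPTY")
    with Path(audio_path).open("rb") as audio_file:
        # Assumption: LID results come back through the transcription route.
        response = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
        )
    return response.text
```

Calling `identify_language("audio_en.wav")` against a running server should return the decoded token string, provided the endpoint assumption holds.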
