Add model support for DeepSeek-V4 (deepseek_v4) and Ling-2.6-flash (bailing_hybrid)

## Model Support Request

Loading either model with `mlx-lm` fails with the corresponding error:

```
ValueError: Model type deepseek_v4 not supported.
ValueError: Model type bailing_hybrid not supported.
```
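A quick way to confirm the cause before loading any weights is to check whether a module named after the `model_type` exists. The sketch below assumes that mlx-lm resolves a model type to a module of the same name under `mlx_lm.models`; the helper names `read_model_type` and `has_model_module` are hypothetical and not part of mlx-lm.

```python
import importlib.util
import json


def read_model_type(config_path):
    """Return the `model_type` field from a Hugging Face style config.json."""
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f)["model_type"]


def has_model_module(model_type, package="mlx_lm.models"):
    """Check whether `<package>.<model_type>` can be found as a module.

    Assumption: mlx-lm looks up a model implementation by importing a module
    named after the model type. Returns False if the package itself is
    missing or cannot be imported.
    """
    try:
        return importlib.util.find_spec(f"{package}.{model_type}") is not None
    except ImportError:
        return False
```

With `mlx-lm` installed, `has_model_module("deepseek_v4")` and `has_model_module("bailing_hybrid")` would be expected to return `False` on versions that raise the errors above.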

### 1. DeepSeek-V4-Flash (DeepSeekMoE)
- **HF repo:** https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
- **Model type in config.json:** `deepseek_v4`
- **Architecture:** DeepSeekMoE with DSA (DeepSeek Sparse Attention) and MLA (Multi-head Latent Attention)
- **Total params:** 284B, Active: ~13B per token
- **MLX quantized versions exist:** mlx-community has 6bit, mxfp4, 3bit-DQ, 2bit-DQ
- **Already supported in SGLang and vLLM**

### 2. Ling-2.6-flash (Bailing Hybrid)
- **HF repo:** https://huggingface.co/inclusionAI/Ling-2.6-flash
- **Model type in config.json:** `bailing_hybrid`
- **Architecture:** MoE with MLA and hybrid attention (modeling code: `bailing_moe_v2_5`)
- **Total params:** 104B, Active: ~7.4B per token
- **MLX quantized version exists:** mlx-community/Ling-2.6-flash-mlx-4bit
- **Already supported in SGLang**

### Why this matters
Both models were open-sourced in April 2026 and target agent workflows. MLX is the primary inference backend for Apple Silicon Macs, but users currently cannot run these models locally with `mlx-lm`.

### Implementation approach
Reference model code is available in the HF repos:
- DeepSeek-V4: `modeling_deepseek.py`
- Ling-2.6-flash: `modeling_bailing_moe_v2_5.py`

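The PyTorch implementations need to be translated to MLX. Both architectures appear to follow patterns similar to models that `mlx-lm` already supports, so existing MoE and MLA implementations may serve as a starting point.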
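As a rough illustration of the first translation step, a port usually begins with a config dataclass that accepts the full `config.json` dict and keeps only the fields the model needs. The sketch below is an assumption about how that could look: `SketchModelArgs` and its three fields are hypothetical placeholders, not the actual field list of either model.

```python
import inspect
from dataclasses import dataclass


@dataclass
class SketchModelArgs:
    # Hypothetical subset of fields; the real config.json defines many more.
    model_type: str
    hidden_size: int
    num_hidden_layers: int

    @classmethod
    def from_dict(cls, params):
        """Build the args from a config dict, ignoring undeclared keys."""
        known = inspect.signature(cls).parameters
        return cls(**{k: v for k, v in params.items() if k in known})
```

Filtering unknown keys lets the same loader accept configs that carry extra fields, which is useful while only part of the architecture has been ported.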

### Request
Please add support for both model types in the mlx-lm model registry.

Add model support for DeepSeek-V4 (deepseek_v4) and Ling-2.6-flash (bailing_hybrid) #1233
