
feat: Add KL Divergence command #1146

Open

spicyneuron wants to merge 5 commits into ml-explore:main from spicyneuron:kl-divergence

Conversation

@spicyneuron
Contributor

This PR adds a new mlx_lm.kld command for comparing a candidate model against a baseline. The main use case is comparing quantizations of the same model.

Some design decisions worth mentioning:

  • The cache only stores the baseline top-K logprobs at each position, plus a single tail-mass bucket for the rest of the distribution (see the sketch after this list). This is similar in spirit to the DWQ approach and avoids creating massive full-logit caches while still preserving the part of the distribution that matters most.
  • The metric direction is explicitly KL(baseline || model).
  • The cache records tokenizer metadata and validates candidate compatibility before evaluation. That helps prevent accidentally comparing models whose vocab sizes or token-id mappings do not line up.
  • Eval token loading now goes through a shared load_eval_tokens() helper so perplexity and kld use the same sampling path. That keeps the evaluation setup more consistent and easier to reason about across tools.
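
To make the top-K-plus-tail caching concrete, here is a minimal sketch of how a per-position KL(baseline || model) can be recovered from the cached values. It is illustrative rather than the PR's actual implementation: the function name, argument shapes, and epsilon guard are assumptions.

```python
import mlx.core as mx

def kl_from_topk(topk_ids, topk_logprobs, model_logprobs, eps=1e-9):
    """Approximate KL(baseline || model) for one position.

    topk_ids: (K,) baseline top-K token ids stored in the cache.
    topk_logprobs: (K,) baseline log-probabilities for those ids.
    model_logprobs: (V,) candidate model log-probabilities at the same position.
    """
    p_topk = mx.exp(topk_logprobs)                       # baseline probs over its top-K ids
    q_topk_logprobs = mx.take(model_logprobs, topk_ids)  # candidate logprobs over the same ids
    # Exact contribution from the part of the distribution that was cached.
    kl = mx.sum(p_topk * (topk_logprobs - q_topk_logprobs))
    # Everything outside the top-K collapses into a single tail bucket on both sides.
    p_tail = 1.0 - mx.sum(p_topk)
    q_tail = 1.0 - mx.sum(mx.exp(q_topk_logprobs))
    kl = kl + p_tail * (mx.log(p_tail + eps) - mx.log(q_tail + eps))
    return kl
```

Collapsing the tail this way keeps the cache at K + 1 values per position instead of a full vocabulary-sized logit vector, which is what makes caching the baseline cheap enough to reuse across candidates.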

This PR also wires the new command into the CLI and adds tests covering cache creation, cache reuse, KL computation, and tokenizer mismatch handling.
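
As a rough illustration of the tokenizer-compatibility validation mentioned above, the check could look something like the sketch below. The cached field names and the sample_tokens spot-check are hypothetical, not the PR's actual cache schema.

```python
def validate_tokenizer(cache_meta: dict, tokenizer) -> None:
    """Refuse to evaluate a candidate whose tokenizer does not match the cached baseline."""
    # Vocab sizes must agree, otherwise per-token-id comparisons are meaningless.
    if cache_meta["vocab_size"] != tokenizer.vocab_size:
        raise ValueError(
            f"Vocab size mismatch: cache has {cache_meta['vocab_size']}, "
            f"candidate tokenizer has {tokenizer.vocab_size}"
        )
    # Spot-check a few recorded id -> token mappings to catch reordered vocabularies.
    for token_id, token in cache_meta.get("sample_tokens", {}).items():
        if tokenizer.convert_ids_to_tokens(int(token_id)) != token:
            raise ValueError(
                f"Token id {token_id} does not map to the same token as in the baseline cache"
            )
```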

@spicyneuron changed the title from Add KL Divergence command to feat: Add KL Divergence command on Apr 13, 2026