Fix speculator config for models with explicit head_dim #517

Draft
MeganEFlynn wants to merge 1 commit into main from attention_head_dim_fix

@MeganEFlynn
Collaborator

Purpose

Models like Laguna XS and Qwen3.6-27B have a hidden_size (5120) that is not divisible by num_attention_heads (24) because they use an explicit head_dim (256). LlamaConfig's validate_architecture rejects this, so this PR recomputes num_attention_heads as hidden_size // head_dim (5120 // 256 = 20) for the speculator.

Description

This PR changes how we calculate the number of attention heads so that initialization no longer fails when the hidden state dimension isn't divisible by the number of attention heads in the model, which is a limitation of the Llama config.
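For reviewers, the recomputation described above can be sketched roughly as follows. This is not the diff from this PR: the function name `recompute_num_attention_heads` and its error handling are hypothetical, and it only assumes the config exposes `hidden_size`, `num_attention_heads`, and an optional `head_dim`.

```python
from typing import Optional


def recompute_num_attention_heads(
    hidden_size: int,
    num_attention_heads: int,
    head_dim: Optional[int] = None,
) -> int:
    """Hypothetical helper: pick a head count that divides hidden_size evenly."""
    if head_dim is None:
        # No explicit head_dim: the head size is implied by the head count,
        # so the original count must already divide hidden_size.
        if hidden_size % num_attention_heads != 0:
            raise ValueError(
                f"hidden_size={hidden_size} is not divisible by "
                f"num_attention_heads={num_attention_heads}"
            )
        return num_attention_heads

    # Explicit head_dim: derive the head count from it instead, as the
    # PR description states (hidden_size // head_dim).
    if hidden_size % head_dim != 0:
        raise ValueError(
            f"hidden_size={hidden_size} is not divisible by head_dim={head_dim}"
        )
    return hidden_size // head_dim


# Values quoted in the PR description for Qwen3.6-27B.
heads = recompute_num_attention_heads(5120, 24, head_dim=256)
print(heads)  # 20, since 20 * 256 == 5120
```

Note that the recomputed count (20) differs from the verifier model's own count (24); the sketch only ensures the speculator's config passes the divisibility check.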

Related Issue

NA

Tests

With this PR, Qwen 3.6 27B runs, whereas it previously failed at initialization.

I have filled in:

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan/results, such as providing test command and pasting the results.
  • (Optional) The necessary documentation update.
  • I (a human) have written or reviewed the code in this PR to the best of my ability.

Models like Qwen3.6-27B have hidden_size (5120) not divisible by
num_attention_heads (24) because they use an explicit head_dim (256).
LlamaConfig's validate_architecture rejects this, so recompute
num_attention_heads as hidden_size // head_dim for the speculator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MeganEFlynn MeganEFlynn requested review from fynnsu and shanjiaz May 12, 2026 23:47
@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2bfc0356-108b-44dd-a8a8-10068fd6176f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@MeganEFlynn MeganEFlynn requested a review from dsikka May 12, 2026 23:47

2 participants