Implement Prefix Tuning for Gemma models. #631

Open
copybara-service[bot] wants to merge 1 commit into main from test_899479928

@copybara-service copybara-service Bot commented Apr 22, 2026

Implement Prefix Tuning for Gemma models.

google-cla Bot commented Apr 22, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@copybara-service copybara-service Bot changed the title from "Prefix tuning support for Gemma models." to "Implement Prefix Tuning for Gemma models." Apr 23, 2026
@copybara-service copybara-service Bot force-pushed the test_899479928 branch 3 times, most recently from 1e518ad to 9a0f5ed April 23, 2026 19:22
PiperOrigin-RevId: 899479928
BenjaminBossan added a commit to huggingface/peft that referenced this pull request May 5, 2026
There was an issue with applying prefix tuning to Gemma 4 because the
model uses different head dimensions for layers that use sliding window
attention. As prefix tuning only initializes a single projection matrix
that is used for all layers, this would lead to a shape mismatch.

The solution is to "overprovision" the matrix and then slice the prefix
down to size if the layer is smaller. This is not quite as parameter
efficient as it could be, but the overhead shouldn't be too large.

For robustness, we also skip layers if the matrix is underprovisioned,
but we warn about it and raise an error if all layers are skipped.
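
The slicing and skipping behavior described above can be sketched roughly as follows. This is a simplified illustration, not the actual PEFT code: the function name `slice_prefix_per_layer`, the plain nested-list layout of the prefix, and the head dimensions 256 and 512 are all assumptions made for the example.

```python
import warnings


def slice_prefix_per_layer(prefix, layer_head_dims):
    """Slice one shared prefix down to each layer's head dimension.

    `prefix` is a non-empty list of rows (one per virtual token), each of
    the provisioned width. Hypothetical layout, chosen for illustration.
    """
    provisioned_dim = len(prefix[0])
    sliced, skipped = {}, []
    for layer_idx, head_dim in enumerate(layer_head_dims):
        if head_dim > provisioned_dim:
            # Underprovisioned: the prefix is too narrow for this layer.
            skipped.append(layer_idx)
            continue
        # Overprovisioned or exact fit: keep only the leading columns.
        sliced[layer_idx] = [row[:head_dim] for row in prefix]
    if skipped and not sliced:
        raise ValueError("Prefix is too narrow for every layer.")
    if skipped:
        warnings.warn(f"Skipping prefix for layers {skipped}: prefix too narrow.")
    return sliced


# Hypothetical head dims: sliding-window layers use 256, the others 512.
head_dims = [256, 512, 256]
prefix = [[0.0] * max(head_dims) for _ in range(4)]  # 4 virtual tokens
sliced = slice_prefix_per_layer(prefix, head_dims)
shapes = {idx: (len(p), len(p[0])) for idx, p in sliced.items()}
print(shapes)
```

Sizing the prefix to the largest head dimension means no layer is skipped; the warning and error paths only trigger when the prefix is narrower than some or all layers.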

Alternatively, we could implement one projection per layer, each with the
right size, like in google-deepmind/gemma#631.
However, this would be a big refactor and also very hard to make
backwards compatible with existing checkpoints, so going with the less
efficient solution is preferable.
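
To get a rough feel for the overhead being traded away, the two approaches can be compared by counting prefix values per virtual token. This is a back-of-the-envelope sketch, not an exact PEFT parameter count: the layer widths and token count are made-up numbers, the helper names are hypothetical, and constant factors such as separate key and value prefixes are ignored because they cancel in the ratio.

```python
def overprovisioned_size(num_virtual_tokens, layer_dims):
    # Every layer gets a prefix as wide as the widest layer.
    return num_virtual_tokens * len(layer_dims) * max(layer_dims)


def per_layer_size(num_virtual_tokens, layer_dims):
    # Every layer gets a prefix of exactly its own width.
    return num_virtual_tokens * sum(layer_dims)


# Hypothetical model: three narrow sliding-window layers per wide layer.
layer_dims = [256, 256, 256, 512]
num_virtual_tokens = 20

over = overprovisioned_size(num_virtual_tokens, layer_dims)
exact = per_layer_size(num_virtual_tokens, layer_dims)
overhead = over / exact
print(over, exact, overhead)
```

Under these assumed numbers the overprovisioned prefix is 1.6 times the size of the per-layer one; the real ratio depends on the model's actual mix of layer widths.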

This PR also contains an independent, single-line fix to a prefix tuning
test that was referencing a nonexistent model.