Skip to content

feat: Adds embed_tokens, lm_head as trainable for vocab expansion in peft and enables tying of adapters#625

Merged
dushyantbehl merged 12 commits into
foundation-model-stack:mainfrom
romitjain:feat/embed-lmhead-tying-and-expansion
Nov 24, 2025
Merged

feat: Adds embed_tokens, lm_head as trainable for vocab expansion in peft and enables tying of adapters#625
dushyantbehl merged 12 commits into
foundation-model-stack:mainfrom
romitjain:feat/embed-lmhead-tying-and-expansion

Conversation

@romitjain
Copy link
Copy Markdown
Collaborator

@romitjain romitjain commented Oct 30, 2025

Description of the change

  1. If the embedding layer has been resized, this PR adds embed_tokens and lm_head as trainable layers
  2. If any of the tied layers are added in modules_to_save, enable the flag to ensure weight tying between adapters of the tied layers
  3. If any of the tied layers are added in target_modules, enable the flag to ensure weight tying between adapters of the tied layers

Related issue number

Closes https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/1673

How to verify the PR

For 1

 python -m pytest tests/test_sft_trainer.py::test_run_causallm_lora_add_special_tokens

For 2 and 3

 python -m pytest tests/test_sft_trainer.py -k "tied_weights"

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

Signed-off-by: romit <romit@ibm.com>
@github-actions
Copy link
Copy Markdown

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

@github-actions github-actions Bot added the feat label Oct 30, 2025
Comment thread tuning/sft_trainer.py Outdated
Signed-off-by: romit <romit@ibm.com>
@romitjain romitjain marked this pull request as ready for review November 4, 2025 12:21
Comment thread tests/artifacts/language_models/__init__.py Outdated
Comment thread pyproject.toml Outdated
Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
Comment thread tests/test_sft_trainer.py
Comment thread tests/artifacts/language_models/maykeye-tinyllama-v0/config.json
Signed-off-by: romit <romit@ibm.com>
@dushyantbehl dushyantbehl added the on hold This PR is on hold and will not be merged right away label Nov 7, 2025
Comment thread tests/test_sft_trainer.py Outdated
Comment thread tests/test_sft_trainer.py Outdated
Comment thread tuning/utils/config_utils.py
Comment thread tuning/sft_trainer.py Outdated
Comment thread pyproject.toml Outdated
Copy link
Copy Markdown
Collaborator

@dushyantbehl dushyantbehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change look okay...minor clarifications requested

…g into feat/embed-lmhead-tying-and-expansion
Signed-off-by: romit <romit@ibm.com>
@romitjain
Copy link
Copy Markdown
Collaborator Author

@dushyantbehl I have made changes based on your comments. For the PEFT patch, let's wait until they merge it.

Signed-off-by: romit <romit@ibm.com>
Comment thread tests/test_sft_trainer.py Outdated
(["embed_tokens", "lm_head"], ["embed_tokens"]),
],
)
@pytest.mark.skip(reason="Not released in a tagged version of PEFT")
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feature is not supported for v0.18.0

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we skip this based on the version of peft?

from packaging import version
@pytest.mark.skipif(
    version.parse(peft.__version__) < version.parse("0.18.0"),
    reason=f"Not released in PEFT <= 0.18.0"
)

@dushyantbehl dushyantbehl removed the on hold This PR is on hold and will not be merged right away label Nov 24, 2025
Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
Copy link
Copy Markdown
Collaborator

@dushyantbehl dushyantbehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dushyantbehl
Copy link
Copy Markdown
Collaborator

Thanks @romitjain !

@dushyantbehl dushyantbehl merged commit 87d90b4 into foundation-model-stack:main Nov 24, 2025
9 checks passed
@romitjain romitjain deleted the feat/embed-lmhead-tying-and-expansion branch November 24, 2025 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants