**docs/design-docs/checkpointing.md**

```sh
uv run --extra mcore examples/converters/convert_megatron_to_hf.py \
    ... \
    --hf-ckpt-path=<path_to_save_hf_ckpt>
```

## Converting Megatron LoRA Adapter Checkpoints to Hugging Face Format

When training with [LoRA (Low-Rank Adaptation)](../guides/sft.md#lora-configuration) on the Megatron backend, the resulting checkpoint contains only the adapter weights alongside the base model configuration. The `convert_lora_to_hf.py` script supports two export modes:

- **Merged**: fold the LoRA adapter into the base model and export a single standalone Hugging Face checkpoint.
- **Adapter-only**: export only the LoRA adapter weights in [Hugging Face PEFT](https://huggingface.co/docs/peft) format, keeping the base model separate.

This script requires Megatron-Core, so make sure to launch with the `mcore` extra.

### Option A — Merged checkpoint

Loads the base model, applies the LoRA adapter weights on top, and saves the merged result in Hugging Face format. The output can be used directly with `AutoModelForCausalLM.from_pretrained` or passed to the [evaluation pipeline](../guides/eval.md).

**Example:**

```sh
uv run --extra mcore python examples/converters/convert_lora_to_hf.py \
    --base-ckpt=<path_to_base_megatron_ckpt> \
    --adapter-ckpt=<path_to_lora_adapter_ckpt> \
    --hf-model-name=Qwen/Qwen2.5-7B \
    --hf-ckpt-path=<path_to_save_merged_hf_ckpt>
```

| Flag | Description |
| --- | --- |
| `--base-ckpt` | Path to the base model's Megatron checkpoint directory (the `iter_XXXXXXX` folder). |
| `--adapter-ckpt` | Path to the LoRA adapter's Megatron checkpoint directory (must contain a `run_config.yaml` with a `peft` section). |
| `--hf-model-name` | Hugging Face model identifier used to resolve the model architecture and tokenizer (e.g. `Qwen/Qwen2.5-7B`). |
| `--hf-ckpt-path` | Output directory for the merged Hugging Face checkpoint. Must not already exist. |
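
Conceptually, the merge in Option A folds the low-rank update into the base weight matrix, `W_merged = W + (alpha / r) * B A`, so the merged model needs no adapter at inference time. A minimal NumPy sketch of that folding (shapes and scaling follow the standard LoRA convention; all names here are illustrative, not the script's internals):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 6, 2, 4     # LoRA rank r and scaling factor alpha
W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in))         # LoRA down-projection
B = rng.normal(size=(d_out, r))        # LoRA up-projection

# Serving base + adapter separately: y = W x + (alpha / r) * B (A x)
x = rng.normal(size=(d_in,))
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merging folds the update into one dense matrix: W' = W + (alpha / r) * B A
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_adapter, y_merged)  # identical outputs after merging
```

Because the fold is exact, the merged checkpoint and the base-plus-adapter pair produce the same outputs, up to floating-point rounding.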

### Option B — Adapter-only checkpoint

Exports only the LoRA adapter weights in Hugging Face PEFT format without merging them into the base model. This is useful when you want to serve the base model and adapter separately (e.g. with vLLM's LoRA support).

**Example:**

```sh
uv run --extra mcore python examples/converters/convert_lora_to_hf.py \
    ...
```
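
For orientation, the adapter-only output is a standard PEFT adapter directory: just the adapter tensors plus a small config, roughly like the sketch below (illustrative layout; the exact file set follows the PEFT convention):

```
<hf_ckpt_path>/
├── adapter_config.json        # LoRA settings (r, lora_alpha, target_modules, ...)
└── adapter_model.safetensors  # adapter weights only; the base model is not included
```
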
**docs/guides/sft.md**

For more details on LoRA, see [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685).

### Exporting a LoRA Checkpoint to Hugging Face Format

After training with LoRA on the Megatron backend, the `convert_lora_to_hf.py` script supports two export modes:

- **Merged**: fold the adapter into the base model and export a single standalone Hugging Face checkpoint for inference or evaluation.
- **Adapter-only**: export only the adapter weights in Hugging Face PEFT format, keeping the base model separate (e.g. for use with vLLM's LoRA support).

See the [Checkpointing documentation](../design-docs/checkpointing.md#converting-megatron-lora-adapter-checkpoints-to-hugging-face-format) for full usage details and examples.