Commit b8a4586
Refactor: Eagle data loading (#668)
## What does this PR do?
**Type of change:** Refactor <!-- Use one of the following: Bug fix, new
feature, new example, new tests, documentation. -->
**Overview:**
Jira ticket: https://jirasw.nvidia.com/browse/OMNIML-2955
Main changes:
- Consolidate Eagle data loading with @ChenhanYu's implementation of
`transformers_dataset.py`.
- Refactor: moved the following logic from `example/main.py` into
`modelopt/torch` for a cleaner example entry point:
  - default config selection and merging with the custom config
  - tokenizer post-processing (chat template and pad_tok_id)
  - d2t loading
- Implementation refactor: in the HF workflow, reuse the base model's input
hidden states as input_embedding instead of computing them from input_ids.
This has two main benefits:
  - Easier VLM support, since VLMs use varied embedding-processing logic.
  - Better training efficiency.
- Deprecate eagle1 in the example. It is still available through a
custom config.
- Other minor fixes and README updates.
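
The "default config selection and merging with the custom config" step can be pictured with a minimal sketch like the one below. It is illustrative only: `DEFAULT_EAGLE_CONFIG`, its keys, and `merge_config` are hypothetical names rather than the actual `modelopt/torch` API, and the PR description does not say whether the real merge is recursive.

```python
import copy

# Hypothetical defaults for illustration; not the real ModelOpt EAGLE config.
DEFAULT_EAGLE_CONFIG = {
    "num_hidden_layers": 1,
    "draft_vocab_size": 32000,
    "rope_scaling": {"type": "linear", "factor": 1.0},
}


def merge_config(default, custom):
    """Return a new dict where values from `custom` override `default`.

    Nested dicts are merged key by key; any other value is replaced.
    Neither input is mutated.
    """
    merged = copy.deepcopy(default)
    for key, value in (custom or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A user-supplied config only needs to list the fields it changes.
custom_config = {"draft_vocab_size": 16000, "rope_scaling": {"factor": 2.0}}
final_config = merge_config(DEFAULT_EAGLE_CONFIG, custom_config)
```

Keeping this step inside the library rather than in the example script means the example entry point only has to pass the optional custom config through.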
## Usage
<!-- You can potentially add a usage example below. -->
```python
# Add a code snippet demonstrating how to use this
```
## Testing
<!-- Mention how have you tested your change if applicable. -->
Verified that the training curves after the changes (both online and
offline) are identical to those of the original branch:
<img width="1073" height="634" alt="image"
src="https://github.com/user-attachments/assets/abfd7bea-c82c-48a7-8181-68c5a9e4da8d"
/>
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
  * Added draft vocabulary cache support for EAGLE model training,
enabling runtime vocabulary customization via the `--draft_vocab_cache`
parameter
  * Introduced new data loading utilities with sharding, streaming, and
tokenization support for large-scale training
  * Added optional `--log_steps` configuration to the training launcher
* **Documentation**
  * Updated EAGLE configuration guides with draft vocabulary cache setup
instructions and examples
* **Refactor**
  * Restructured data pipeline for offline training with improved dataset
handling and batching
  * Updated command-line arguments across training scripts (`--input-data`
replaces `--input-file`)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
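
As a rough illustration of what a draft vocabulary cache (the d2t mapping) might hold, the sketch below assumes the offset convention used by EAGLE-3 style checkpoints, where `target_id = draft_id + d2t[draft_id]`. The PR description does not spell out the cache format, so `build_d2t` and `draft_to_target` are hypothetical helpers, and plain Python lists stand in for tensors.

```python
from collections import Counter


def build_d2t(token_ids, draft_vocab_size):
    """Build a draft-to-target offset table from observed target token ids.

    The `draft_vocab_size` most frequent target ids form the draft
    vocabulary. They are sorted ascending, so draft id i maps to the
    i-th smallest kept target id, stored as the offset target_id - i.
    """
    counts = Counter(token_ids)
    kept = sorted(tok for tok, _ in counts.most_common(draft_vocab_size))
    return [target_id - draft_id for draft_id, target_id in enumerate(kept)]


def draft_to_target(draft_id, d2t):
    """Map a draft-vocabulary id back to the target-vocabulary id."""
    return draft_id + d2t[draft_id]


# Toy corpus: target ids 7, 5 and 9 are the three most frequent.
corpus = [5, 5, 5, 9, 9, 2, 7, 7, 7, 7]
d2t = build_d2t(corpus, draft_vocab_size=3)
target_ids = [draft_to_target(i, d2t) for i in range(len(d2t))]
```

Computing such a table once and loading it at training time is the kind of workflow a cache parameter like `--draft_vocab_cache` suggests, though the actual file format and loader live in the PR's code.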
---------
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
1 parent 9e38041 commit b8a4586
16 files changed
Lines changed: 678 additions & 700 deletions
File tree
- examples/speculative_decoding
- scripts
- modelopt/torch
- speculative
- eagle
- plugins
- utils/plugins
- tests/examples/speculative_decoding