Commit cca50d1
committed
fix: handle PreservingDataset in dataset concatenation (#2116)
`concatenate_datasets` from HuggingFace only accepts HF Dataset objects,
causing a ValueError when `use_preserving_dataset=True`. Added a
`merge_datasets` helper that detects PreservingDataset instances and
merges their data lists directly instead.
Closes #2116
Signed-off-by: Adi Krish <143638558+RudimentaryChef@users.noreply.github.com>1 parent fe3c4fc commit cca50d1
3 files changed
Lines changed: 78 additions & 6 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | | - | |
21 | 20 | | |
22 | 21 | | |
23 | 22 | | |
| |||
29 | 28 | | |
30 | 29 | | |
31 | 30 | | |
| 31 | + | |
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| |||
89 | 89 | | |
90 | 90 | | |
91 | 91 | | |
92 | | - | |
| 92 | + | |
93 | 93 | | |
94 | 94 | | |
95 | 95 | | |
| |||
144 | 144 | | |
145 | 145 | | |
146 | 146 | | |
147 | | - | |
| 147 | + | |
148 | 148 | | |
149 | 149 | | |
150 | 150 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
17 | | - | |
| 17 | + | |
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
| |||
25 | 25 | | |
26 | 26 | | |
27 | 27 | | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
28 | 31 | | |
29 | 32 | | |
30 | 33 | | |
31 | 34 | | |
32 | 35 | | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
33 | 49 | | |
34 | 50 | | |
35 | 51 | | |
| |||
134 | 150 | | |
135 | 151 | | |
136 | 152 | | |
137 | | - | |
| 153 | + | |
138 | 154 | | |
139 | 155 | | |
140 | 156 | | |
| |||
199 | 215 | | |
200 | 216 | | |
201 | 217 | | |
202 | | - | |
| 218 | + | |
203 | 219 | | |
204 | 220 | | |
205 | 221 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
313 | 313 | | |
314 | 314 | | |
315 | 315 | | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
| 345 | + | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
| 350 | + | |
| 351 | + | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
| 355 | + | |
| 356 | + | |
| 357 | + | |
| 358 | + | |
| 359 | + | |
| 360 | + | |
| 361 | + | |
| 362 | + | |
| 363 | + | |
| 364 | + | |
| 365 | + | |
| 366 | + | |
| 367 | + | |
| 368 | + | |
| 369 | + | |
| 370 | + | |
| 371 | + | |
0 commit comments