Commit 072d15e
Add Support for LTX-2.3 Models (#13217)
* Initial implementation of perturbed attn processor for LTX 2.3
* Update DiT block for LTX 2.3 + add self_attention_mask
* Add flag to control using perturbed attn processor for now
* Add support for new video upsampling blocks used by LTX-2.3
* Support LTX-2.3 BigVGAN v2-style vocoder
* Initial implementation of LTX-2.3 vocoder with bandwidth extender
* Initial support for LTX-2.3 per-modality feature extractor
* Refactor so that text connectors own all text encoder hidden_states normalization logic
* Fix some bugs for inference
* Fix LTX-2.X DiT block forward pass
* Support prompt timestep embeds and prompt cross attn modulation
* Add LTX-2.3 configs to conversion script
* Support converting LTX-2.3 DiT checkpoints
* Support converting LTX-2.3 Video VAE checkpoints
* Support converting LTX-2.3 Vocoder with bandwidth extender
* Support converting LTX-2.3 text connectors
* Don't convert any upsamplers for now
* Support self attention mask for LTX2Pipeline
* Fix some inference bugs
* Support self attn mask and sigmas for LTX-2.3 I2V, Cond pipelines
* Support STG and modality isolation guidance for LTX-2.3
* make style and make quality
* Make audio guidance values default to video values
* Update to LTX-2.3 style guidance rescaling
* Support cross timesteps for LTX-2.3 cross attention modulation
* Fix RMS norm bug for LTX-2.3 text connectors
* Perform guidance rescale in sample (x0) space following original code
* Support LTX-2.3 Latent Spatial Upsampler model
* Support LTX-2.3 distilled LoRA
* Support LTX-2.3 Distilled checkpoint
* Support LTX-2.3 prompt enhancement
* Make LTX-2.X processor non-required so that tests pass
* Fix test_components_function tests for LTX2 T2V and I2V
* Fix LTX-2.3 Video VAE configuration bug causing pixel jitter
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Refactor LTX-2.X Video VAE upsampler block init logic
* Refactor LTX-2.X guidance rescaling to use rescale_noise_cfg
* Use generator initial seed to control prompt enhancement if available
* Remove self attention mask logic as it is not used in any current pipelines
* Commit fixes suggested by Claude Code (guidance in sample (x0) space, denormalize after timestep conditioning)
* Use constant shift following original code
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
1 parent 6761336 commit 072d15e
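Several items above concern guidance: performing guidance and its rescale in sample (x0) space, and refactoring the rescale to use `rescale_noise_cfg`. The following is a minimal sketch of that idea, not the code from this commit. It assumes a flow-matching convention in which `x_t = (1 - sigma) * x0 + sigma * noise` and the model predicts `velocity = noise - x0`; the function names `to_x0`, `to_velocity`, and `guide_in_x0_space` are hypothetical, and plain Python lists stand in for tensors.

```python
import statistics


def to_x0(x_t, velocity, sigma):
    # Assumed convention: x_t = (1 - sigma) * x0 + sigma * noise and
    # velocity = noise - x0, which gives x0 = x_t - sigma * velocity.
    return [x - sigma * v for x, v in zip(x_t, velocity)]


def to_velocity(x_t, x0, sigma):
    # Inverse of to_x0 under the same assumed convention.
    return [(x - s) / sigma for x, s in zip(x_t, x0)]


def guide_in_x0_space(x_t, v_cond, v_uncond, sigma,
                      guidance_scale, guidance_rescale=0.0):
    """Hypothetical helper: classifier-free guidance applied to x0 estimates."""
    x0_cond = to_x0(x_t, v_cond, sigma)
    x0_uncond = to_x0(x_t, v_uncond, sigma)

    # Standard classifier-free guidance, applied to x0 instead of velocity.
    x0_guided = [u + guidance_scale * (c - u)
                 for c, u in zip(x0_cond, x0_uncond)]

    if guidance_rescale > 0.0:
        # Match the std of the guided estimate to the conditional one,
        # then blend with the unrescaled result by guidance_rescale.
        factor = statistics.stdev(x0_cond) / statistics.stdev(x0_guided)
        x0_guided = [guidance_rescale * factor * g + (1.0 - guidance_rescale) * g
                     for g in x0_guided]

    # Convert back so a velocity-based scheduler step can consume the result.
    return to_velocity(x_t, x0_guided, sigma)


x_t = [0.5, -1.0, 2.0, 0.0]
v_cond = [1.0, 0.5, -0.5, 2.0]
v_uncond = [0.0, 0.0, 0.0, 0.0]
guided = guide_in_x0_space(x_t, v_cond, v_uncond, sigma=0.5,
                           guidance_scale=3.0, guidance_rescale=1.0)
```

With `guidance_rescale=1.0` the guided x0 estimate has the same standard deviation as the conditional one; with `0.0` the rescale step is skipped. A tensor version would compute the standard deviation per sample over the non-batch dimensions.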
File tree
13 files changed, +2494 -566 lines changed
- scripts
- src/diffusers
- loaders
- models
- autoencoders
- transformers
- pipelines/ltx2
- tests/pipelines/ltx2
Lines changed: 58 additions & 24 deletions