8-bit optimizers don't work with FSDP #89
Labels: Contributions Welcome, Duplicate, FSDP, Optimizers, To Discuss Internally
When I use 8-bit Adam with FSDP, I get the following error:
RuntimeError: output tensor must have the same type as input tensor
If my understanding is correct, there seems to be a casting issue. Is there a workaround for this?
TIA.