* make mamba
* add quick debug
* add quick debug
* revert debug verbosity
* Learning rate scheduler changed (Constant)
* Cosine 0.01 decay
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Add AutoHandler
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Add Auto cfg option for AutoHandler
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Len gets called before open
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* path/filepath typo fix
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Partitioning fix from mup-search
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Warmup interval change
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Schedule change
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Constant schedule
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* LR schedule change (cool down and constant lr)
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Update dataset_utils.py
Added a check for length of doc
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* LR schedule change (Warmup + constant)
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Update main_training.py
cleanup for main_training.py
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Mirror doc len check into AHandler, fix mypy in autoHandler
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Linting
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Further linting
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* More mypy type fix
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Rename main_training.py to main_training_mamba.py
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Added main_training_llama.py file
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Rename fms_to_hf.py to fms_to_hf_mamba.py
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Added fms_to_hf_llama.py file
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Delete fms_fsdp/utils/config_utils.py
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Added mamba variant 9.8b
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Incremental mypy fix
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Fix imports (mypy)
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* Rename adapters to work correctly
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
* linting
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
---------
Signed-off-by: Davis Wertheimer <davis.wertheimer@ibm.com>
Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
Co-authored-by: Linsong Chu <lchu@us.ibm.com>
Co-authored-by: Davis Wertheimer <dww78@cornell.edu>
Co-authored-by: Antoni Viros i Martin <aviros@ibm.com>