miss update #3122
Conversation
BenjaminBossan
left a comment
Thanks for updating the MiSS paper and examples. I have two small comments, please check.
```diff
 miss_config = MissConfig(
-    r = 64
+    r = 64,
+    miss_dropout = 0.01
```
I think adding dropout of 0.01 here can be confusing. It may appear like that is the recommended setting, but skimming the paper, this doesn't appear to be the case.
We observe that many papers do not apply dropout to MiSS while LoRA variants typically do, which makes the comparison unfair. We would like to clarify that MiSS also supports a dropout parameter.
I see, I think that's fair; I'd just not put 0.01 here, which I assume has no noticeable effect. Is there a rate you'd recommend that works best?
Yes, 0.01 is not necessarily the optimal setting, but researchers often use it as a default. However, it can be effective in preventing overfitting in RL training.
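For concreteness, a minimal sketch of enabling this, assuming the `MissConfig` API shown in the diff above; the base model and `target_modules` are illustrative placeholders, and 0.01 is a common default rather than a tuned recommendation:

```python
from transformers import AutoModelForCausalLM
from peft import MissConfig, get_peft_model

# Illustrative base model; swap in the model you actually fine-tune.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

miss_config = MissConfig(
    r=64,
    miss_dropout=0.01,  # mainly useful to curb overfitting, e.g. in RL training
    target_modules=["q_proj", "v_proj"],  # placeholder module names
)

model = get_peft_model(base_model, miss_config)
model.print_trainable_parameters()
```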
By the way, is the official PEFT arena still being updated? Currently, there is no unified training and evaluation setup for fair comparison, so I believe maintaining a well-designed PEFT arena is important. A good benchmark can significantly promote the development of PEFT methods. I’d be happy to help if needed.
> Yes, 0.01 is not necessarily the optimal setting, but researchers often use it as a default. However, it can be effective in preventing overfitting in RL training.
I was not aware.
> By the way, is the official PEFT arena still being updated?
Yes. We recently merged some new PEFT methods and will rerun it sometime soon. But generally, unless we have any indication that something important changed (say, a bug in PEFT that affects performance), we won't rerun the old experiments, just the new ones.
> Currently, there is no unified training and evaluation setup for fair comparison, so I believe maintaining a well-designed PEFT arena is important. A good benchmark can significantly promote the development of PEFT methods. I’d be happy to help if needed.
We see it the same way and definitely want to expand our support in that area. Not sure if you saw that, but we're currently adding an image generation (#3082) and a reinforcement learning (#3078) benchmark. We always welcome help with that. Easiest is to contribute new experiments, but other forms of contribution are also welcome (ideally discussed with us first to be aligned).
Besides that, we added a feature to convert non-LoRA adapters into LoRA adapters (yet unreleased, see https://huggingface.co/docs/peft/main/en/package_reference/lora_conversion). This should allow methods like MiSS to be more easily adopted by downstream packages that only support LoRA, like Diffusers or vLLM. The conversion is not perfect: it's lossy (using SVD) and right now only supports "easy conversion" targets of type W' = W_0 + dW (so `bat` works but `mini` doesn't). This is also an area where we welcome support.
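To illustrate why the conversion is lossy, here is a minimal sketch of the SVD idea for additive targets of type W' = W_0 + dW; this shows the technique only and is not the actual code in `conversion.py`:

```python
import torch

def delta_to_lora(delta_w: torch.Tensor, rank: int) -> tuple[torch.Tensor, torch.Tensor]:
    """Rank-`rank` approximation of an additive adapter delta dW via SVD."""
    U, S, Vh = torch.linalg.svd(delta_w, full_matrices=False)
    lora_B = U[:, :rank] * S[:rank]  # (out_features, rank), singular values folded in
    lora_A = Vh[:rank, :]            # (rank, in_features)
    return lora_A, lora_B            # dW ≈ lora_B @ lora_A

dW = torch.randn(256, 256)
A, B = delta_to_lora(dW, rank=16)
# Nonzero reconstruction error: everything beyond the top 16 singular
# directions is discarded, which is exactly why the conversion is lossy.
print(torch.linalg.norm(dW - B @ A))
```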
> Yes. We recently merged some new PEFT methods and will rerun it sometime soon. But generally, unless we have any indication that something important changed (say, a bug in PEFT that affects performance), we won't rerun the old experiments, just the new ones.
My intention is not to rerun the experiments, but rather to build a more complete PEFT pipeline.
> We see it the same way and definitely want to expand our support in that area. Not sure if you saw that, but we're currently adding an image generation (#3082) and a reinforcement learning (#3078) benchmark. We always welcome help with that. Easiest is to contribute new experiments, but other forms of contribution are also welcome (ideally discussed with us first to be aligned).
^^
> Besides that, we added a feature to convert non-LoRA adapters into LoRA adapters (yet unreleased, see https://huggingface.co/docs/peft/main/en/package_reference/lora_conversion). This should allow methods like MiSS to be more easily adopted by downstream packages that only support LoRA, like Diffusers or vLLM. The conversion is not perfect: it's lossy (using SVD) and right now only supports "easy conversion" targets of type W' = W_0 + dW (so `bat` works but `mini` doesn't). This is also an area where we welcome support.
MiSS can be converted into the LoRA form without any loss. I will add this section to the paper as soon as possible. Meanwhile, I would also like to integrate it into PEFT—how should I proceed?
> My intention is not to rerun the experiments, but rather to build a more complete PEFT pipeline.
If you have a suggestion, LMK. Ideally, let's open a new issue for better visibility.
> MiSS can be converted into the LoRA form without any loss. I will add this section to the paper as soon as possible. Meanwhile, I would also like to integrate it into PEFT—how should I proceed?
If exact conversion is possible, we definitely want to support that. The main logic lives here:
`peft/src/peft/tuners/lora/conversion.py`, line 81 at commit 2d48882.
I think we could add a check for the PEFT type and then, instead of running this loop:
`peft/src/peft/tuners/lora/conversion.py`, lines 234 to 269 at commit 2d48882,
we dispatch to a MiSS-specific function for exact conversion. If you want, feel free to start a (draft) PR and I'll help you with the implementation details.
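To make the suggested structure concrete, a rough sketch of such a dispatch; the function names and signatures here are hypothetical placeholders, not the actual internals of `conversion.py`:

```python
from peft import PeftType  # assumes PeftType.MISS exists once MiSS is in PEFT

def _convert_miss_exact(peft_config, state_dict):
    # Placeholder for the exact, loss-free MiSS -> LoRA rewrite (no SVD needed).
    raise NotImplementedError

def _convert_via_svd_loop(peft_config, state_dict, rank):
    # Placeholder for the existing generic, lossy SVD loop over target modules.
    raise NotImplementedError

def convert_adapter(peft_config, state_dict, rank):
    """Hypothetical dispatch point at the PEFT-type check described above."""
    if peft_config.peft_type == PeftType.MISS:
        return _convert_miss_exact(peft_config, state_dict)
    return _convert_via_svd_loop(peft_config, state_dict, rank)
```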
BenjaminBossan
left a comment
Thanks for the additional info. PR LGTM.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
I’m very happy to share that MiSS has been accepted to ICLR 2026. Here are a few small updates.