Greetings,
First of all, I really appreciate the work you have done; it has been really useful for my experiments.
Now, to my question: I have seen that, for fine-tuning, the batch normalization layers are generally frozen.
Did the authors of this repository try something like that?
Just curious about it. I do plan to implement it myself, roughly along the lines of the sketch below (any tips on that would be appreciated, btw), but I was wondering if the authors of this repo or the paper have any insights on it.
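
For concreteness, this is a minimal sketch of what I mean by "freezing" the batch norm layers, assuming a PyTorch model (the `freeze_batchnorm` helper name is my own, not part of this repo):

```python
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> None:
    """Keep BatchNorm running stats fixed and stop updates to their affine params."""
    bn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    for module in model.modules():
        if isinstance(module, bn_types):
            module.eval()  # use stored running mean/var instead of batch statistics
            for param in module.parameters():
                param.requires_grad = False  # freeze gamma/beta
```

My understanding is that this would need to be called after every `model.train()` call, since `train()` switches the BN layers back into training mode.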