Dear authors,
Thank you for open-sourcing your great work.
Could you please release the model checkpoints after safety training?
Alternatively, could you provide guidance on how you performed safety training on HH-RLHF
(e.g., minimal instructions to reproduce your setup with the OpenRLHF codebase),
so that I can check details not already specified in Appendix B, such as the LR scheduler?
Appreciate it.
Best,
Arthur