Add Gaussian Noise Option to SyntheticBanditDataset by usaito · Pull Request #188 · st-tech/zr-obp

usaito · 2022-12-03T10:11:40Z

Adds the option to choose reward_noise_distribution="normal" or reward_noise_distribution="truncated_normal" when using obp.dataset.SyntheticBanditDataset with reward_type="continuous". Previously, truncated normal noise was the only option, which was too restrictive for OPE experiments.
Fixes a bug in obp.policy.QLearner when fitting_method="iw".
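
For readers unfamiliar with the difference between the two noise options, the following is a minimal standard-library sketch, not obp's actual implementation. The helper name `sample_reward_noise`, the unit standard deviation, and the truncation bounds of [-1, 1] are assumptions made purely for illustration; only the two option strings come from this PR.

```python
import random


def sample_reward_noise(rng, distribution, std=1.0, low=-1.0, high=1.0):
    """Draw one zero-mean noise term.

    Hypothetical helper for illustration only; it is not part of obp.
    `std`, `low`, and `high` are assumed values, not obp defaults.
    """
    if distribution == "normal":
        # Plain Gaussian noise: unbounded support.
        return rng.gauss(0.0, std)
    if distribution == "truncated_normal":
        # Rejection sampling: redraw until the value falls in [low, high].
        while True:
            value = rng.gauss(0.0, std)
            if low <= value <= high:
                return value
    raise ValueError(f"unknown noise distribution: {distribution!r}")


rng = random.Random(12345)
normal_noise = [sample_reward_noise(rng, "normal") for _ in range(2000)]
truncated_noise = [
    sample_reward_noise(rng, "truncated_normal") for _ in range(2000)
]

# Truncated noise always stays inside the bounds; Gaussian noise does not.
print(min(truncated_noise), max(truncated_noise))
print(min(normal_noise), max(normal_noise))
```

With Gaussian noise the resulting rewards are unbounded, whereas truncated noise keeps them within a fixed range. In obp itself the choice is made through the reward_noise_distribution argument described above.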

usaito added 2 commits December 3, 2022 04:14

add gaussian noise option to SyntheticBanditDataset

b890c82

fix a bug in QLearner when importance weighting is applied

204777c
