Previously, visits were only updated when backpropagation was called with a reward value.
@ #48 attempted to fix this by always propagating values up, independently of the reward.
However, this introduces a new bug: visits are overcounted, because every call to `_backpropagate` increments `.visits` on all ancestor nodes.
I think it might be better to separate the visit update from `_backpropagate`. There are a couple of options:
- Visits could be updated when a node is selected, with the increment implemented in the `BaseSelector` class.
- Alternatively, visits could be updated in the `_select` function in `search_tree.py`.
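
To make the idea concrete, here is a rough sketch of counting visits at selection time and leaving backpropagation to handle values only. It is not the project's actual code: the `Node` fields, the free functions `select` and `backpropagate`, and the least-visited selection policy are all assumptions made for illustration, and the real `BaseSelector` / `_select` / `_backpropagate` signatures may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    # Hypothetical stand-in for the project's node class.
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0


def select(root: Node) -> Node:
    """Walk from the root to a leaf, counting one visit per node on the path."""
    node = root
    node.visits += 1
    while node.children:
        # Placeholder policy (assumption): pick the least-visited child.
        node = min(node.children, key=lambda child: child.visits)
        node.visits += 1
    return node


def backpropagate(node: Node, reward: float) -> None:
    """Add the reward to the node and its ancestors; visit counts are untouched."""
    current: Optional[Node] = node
    while current is not None:
        current.value += reward
        current = current.parent


# Usage: one selection followed by two backpropagation calls.
root = Node()
first, second = Node(parent=root), Node(parent=root)
root.children = [first, second]

leaf = select(root)
backpropagate(leaf, 1.0)
backpropagate(leaf, 0.5)
```

With this split, calling backpropagation several times for the same selected node no longer inflates `.visits` on its ancestors, since each node on the path is counted once per selection.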