Fix AlphaZero auto-reset carry state #1315

Open
Yusuke-Mukuta wants to merge 1 commit into sotetsuk:main from Yusuke-Mukuta:fix/alphazero-board-init-in-loop

Conversation

@Yusuke-Mukuta

Preserve the reward and terminal flag returned by `auto_reset` for training targets, then clear `rewards`, `terminated`, and `truncated` before carrying the reset state into the next self-play step. This prevents MCTS from searching from a freshly reset initial board that still carries terminal metadata from the previous episode.
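
For reviewers, the intended ordering can be sketched in plain Python. This is a toy illustration only, not the code in this PR: `ToyState`, `toy_auto_reset_step`, `self_play_step`, and `INITIAL_BOARD` are hypothetical names, the real state is a batched JAX structure, and the toy step only mimics the relevant behavior (the board is reset on termination while the reward and terminal flag still describe the finished episode).

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ToyState:
    """Hypothetical stand-in for an environment state (not the real library type)."""
    board: Tuple[int, ...]
    rewards: Tuple[float, float]
    terminated: bool
    truncated: bool


INITIAL_BOARD = (0, 0, 0)


def toy_auto_reset_step(state: ToyState, action: int) -> ToyState:
    """Toy step with auto-reset semantics (assumed for illustration).

    When the move ends the episode, the returned board is already the reset
    initial board, but rewards/terminated still describe the finished episode.
    """
    board = state.board[:action] + (1,) + state.board[action + 1:]
    done = all(cell != 0 for cell in board)
    if done:
        return ToyState(INITIAL_BOARD, (1.0, -1.0), True, False)
    return ToyState(board, (0.0, 0.0), False, False)


def self_play_step(state: ToyState, action: int):
    """Return (training record, state to carry into the next self-play step)."""
    stepped = toy_auto_reset_step(state, action)

    # 1. Record the outcome first, so training targets keep the real
    #    reward and terminal flag of the episode that just ended.
    record = {"rewards": stepped.rewards, "terminated": stepped.terminated}

    # 2. Clear episode metadata before carrying the state forward, so the
    #    next search does not see a reset board flagged as terminal.
    carried = replace(
        stepped,
        rewards=(0.0, 0.0),
        terminated=False,
        truncated=False,
    )
    return record, carried
```

With a board one move away from completion, `self_play_step(ToyState((1, 1, 0), (0.0, 0.0), False, False), 2)` yields a record that still reports the terminal reward, while the carried state is the initial board with all three fields cleared. For non-terminal steps the clear is a no-op in this toy, which is why it is applied unconditionally here.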
