Fixed step used to log SAC summary#1008
Conversation
| self.summary = tf.summary.merge_all() | ||
|
|
||
| def _train_step(self, step, writer, learning_rate): | ||
| del step |
There was a problem hiding this comment.
if step is not used anymore, why not removing it from the arguments?
There was a problem hiding this comment.
good point! I will do that instead, wasn't sure if it followed a train_step template used by other algorithms as well, but it appears that is not the case
There was a problem hiding this comment.
removed step, and fixed issue of wrapping replay_buffer on each call to learn (now only happens on the first one). Although let me know if you think it is better to check if self.model.replay_buffer is wrapped instead using isinstance(self.model.replay_buffer, HindsightExperienceReplayWrapper)
…ping of HER replay_buffer
aad197b to
8123b35
Compare
|
|
||
| def learn(self, total_timesteps, callback=None, log_interval=100, tb_log_name="HER", | ||
| reset_num_timesteps=True): | ||
| replay_wrapper = self.replay_wrapper if not self.wrapped_buffer else None |
There was a problem hiding this comment.
this look like another issue... Please open a separate PR (you can include the minimal code and traceback there to reproduce the error).
Although it is small, it is better to solve one problem at a time ;)
fixes #1007
Description
Changed call to writer.add_summary to use self.num_timesteps in the case that num_timesteps is not reset after multiple calls to model.learn(..., reset_num_timesteps=False).
Motivation and Context
I have raised an issue to propose this change (required for new features and bug fixes)
Issues #1007 shows an example where tensorboard plotted line is discontinuous due to incorrect step counter
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)
Checklist:
I've read the CONTRIBUTION guide (required)
I have updated the changelog accordingly (required).
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.
I have ensured pytest and pytype both pass (by running make pytest and make type).