Update hyper params and set seeds #3384

Merged
sekyondaMeta merged 7 commits into pytorch:main from splion-360:splion-360/rl-dqn-fixes
Jun 16, 2025

Conversation

@splion-360

Contributor

Fixes #3080

Description

  • This PR updates the hyperparameters for the CartPole-v1 environment in the DQN tutorial to better match the results shown in the reference image (only the tutorial file is modified).
  • A fixed seed has been added to ensure reproducibility of training behavior and evaluation outcomes.

Checklist

  • The issue that is being fixed is referred to in the description (see above "Fixes [Improve] - Pytorch Reinforcement DQN Tutorial #3080")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

@pytorch-bot

pytorch-bot Bot commented Jun 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3384

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4a9b408 with merge base 2c4c99d (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions Bot added labels Jun 4, 2025: docathon-h1-2025 (A label for the docathon in H1 2025), medium, Reinforcement Learning (Issues relating to reinforcement learning tutorials)
@sekyondaMeta

Contributor

@vmoens Mind taking a look at these changes?

@vmoens vmoens left a comment

Contributor

LGTM
Before approving, do you have a learning curve to share before / after these changes?

@splion-360

Contributor Author

Hello @vmoens. I have attached the learning curves for the CartPole-v1 environment in the DQN tutorial. These plots were generated after confirming the behavior over 2 complete runs with the seeds fixed for reproducibility (as mentioned in issue #3080).

Before fix

[Image: learning_curve-pre-fix]

After fix

[Image: learning_curve-post-fix]

@svekars svekars requested a review from malfet June 5, 2025 19:27
"cpu"
)

# set the seeds for reproducibility

Contributor

Just FYI, we are already doing it in the CI; not sure if it's helpful to do something like that for all users...
Maybe add a paragraph saying to uncomment those if you want fixed output all the time.

Contributor

I think it's good practice to have it be part of the script, and I'd keep it here.
It helps when you run it locally, since RL is usually very seed-dependent.
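
For readers following this thread, here is a minimal sketch of what seeding at the top of a training script can look like. It is not the code from this PR: the seed value `42` and the helper name `set_seeds` are hypothetical, and PyTorch is seeded only if it happens to be installed.

```python
import random

SEED = 42  # hypothetical value; the seed chosen in the PR is not shown in this thread


def set_seeds(seed: int) -> None:
    """Seed the random number generators a DQN script typically draws from."""
    # Python's RNG, commonly used for epsilon-greedy draws and replay sampling.
    random.seed(seed)
    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; only the Python RNG was seeded
    # PyTorch's RNG, which governs weight initialization and tensor sampling.
    torch.manual_seed(seed)


# Seeding twice with the same value should replay the same random draws.
set_seeds(SEED)
first = [random.random() for _ in range(3)]
set_seeds(SEED)
second = [random.random() for _ in range(3)]
print(first == second)
```

The environment has its own source of randomness. With Gymnasium, passing the seed to the first `env.reset(seed=...)` call is the usual way to fix it; whether the tutorial does exactly that is an assumption here.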

@sekyondaMeta

Contributor

@vmoens & @malfet Can we go ahead and merge this?

@svekars svekars requested review from malfet and vmoens June 13, 2025 19:28

@vmoens vmoens left a comment

Contributor

I'm happy with these changes!

@sekyondaMeta sekyondaMeta merged commit ab2aafd into pytorch:main Jun 16, 2025
20 checks passed

Labels

cla signed, docathon-h1-2025 (A label for the docathon in H1 2025), medium, Reinforcement Learning (Issues relating to reinforcement learning tutorials)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Improve] - Pytorch Reinforcement DQN Tutorial

6 participants