About Reinforcement Learning #21

@gmftbyGMFTBY

First of all, thank you for open-sourcing the code for this wonderful work.
I have a question about the reinforcement learning part. In your implementation, the policy-gradient fine-tuning of the parameters is run on the training dataset.
In my understanding, though, an RL setup should use a user simulator as the environment when updating the parameters. Could you explain why you chose the training dataset instead?
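
To make the distinction concrete, here is a minimal toy sketch of what I mean. It is not taken from your repository: the three-action softmax policy, `dataset_reward`, and `ToyUserSimulator` are all hypothetical names I made up for illustration. The same REINFORCE update is used in both cases; the only difference is whether the reward comes from comparing against a logged example or from interacting with a simulated user.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, lr=0.1):
    """One REINFORCE update for a softmax policy over discrete actions."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    reward = reward_fn(action)
    # Gradient of log pi(action) w.r.t. the logits is one_hot(action) - probs.
    return [
        w + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (w, p) in enumerate(zip(logits, probs))
    ]


def dataset_reward(logged_action):
    """(a) Hypothetical offline reward: match against a logged training example."""
    return lambda action: 1.0 if action == logged_action else 0.0


class ToyUserSimulator:
    """(b) Hypothetical environment: a simulated user that scores each action."""

    def __init__(self, preferred_action):
        self.preferred_action = preferred_action

    def step(self, action):
        return 1.0 if action == self.preferred_action else 0.0


def train(reward_fn, steps=2000, seed=0):
    """Run REINFORCE from uniform logits and return the final action probabilities."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        logits = reinforce_step(logits, reward_fn, rng)
    return softmax(logits)


# Reward taken from the fixed dataset versus reward returned by the simulator.
offline_policy = train(dataset_reward(logged_action=2))
online_policy = train(ToyUserSimulator(preferred_action=1).step)
```

In this toy the two reward sources look almost identical, but in a real dialogue setting the simulator could respond to actions that never appear in the logged data, which is the case I am asking about.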
Thank you very much!
