DuelingDQN#127
Conversation
|
Just so you know, I had in mind to implement rainbow: DLR-RM/stable-baselines3#622 btw, double dqn is implemented here: https://github.com/osigaud/stable-baselines3/blob/feat/double-dqn/stable_baselines3/dqn/dqn.py#L170-L176 Related: DLR-RM/stable-baselines3#487 |
Agree, it would be much better
Thanks for the ref! |
|
@araffin I want your opinion on this: |
| "MultiInputPolicy": MultiInputPolicy, | ||
| } | ||
|
|
||
| def __init__( |
There was a problem hiding this comment.
if init is the same as DQN, I guess you can drop it.
There was a problem hiding this comment.
The only (but necessary) diff is the policy type hint.
you mean DQN a special case of DuelingDQN?
you are mainly changing the policy, no? |
I'm changing only the policy. |
Yes, it makes more sense in that order. But this would require putting DuelingDQN in the main repo.
I'll make a draft PR in SB3 |
yes, so for now, I would keep it as is. My goal with Vanilla DQN in SB3 was to keep it as simple as possible to be a reference when learning or implementing other things. |
|
On second thought, it might be more practical to work in reverse. I mean, instead of implementing RAINBOW all at once, I was thinking of implementing the features one by one upon DQN, disabling them by default. And at the renaming DQN to RAINBOW, changing the defaults. Roughly we would have: Currently: class DQN:
def __init__(self, policy, env, ...): ...We implement dueling: class DQN:
def __init__(self, policy, env, ..., dueling=False): ...
class DuelingDQN(DQN):
def __init__(self, policy, env, ...):
super().__init__(policy, env, ..., dueling=True)Then we implement PER class DQN:
def __init__(self, policy, env, ..., dueling=False, per=False): ...
class DuelingDQN(DQN):
def __init__(self, policy, env, ..., per=False):
super().__init__(policy, env, ..., dueling=True, per=per)
class PERDQN(DQN):
def __init__(self, policy, env, ..., dueling=False):
super().__init__(policy, env, ..., dueling=dueling, per=True)And so on. Once all the features have been implemented, we can rename DQN to RAINBOW, changing only the default values, and add a DQN class that inherits from RAINBOW: class RAINBOW:
def __init__(self, policy, env, ..., dueling=True, per=True): ...
class DuelingDQN(RAINBOW):
def __init__(self, policy, env, ..., per=False):
super().__init__(policy, env, ..., dueling=True, per=per)
class PERDQN(RAINBOW):
def __init__(self, policy, env, ..., dueling=False):
super().__init__(policy, env, ..., dueling=dueling, per=True)
class DQN(RAINBOW):
def __init__(self, policy, env, ...):
super().__init__(policy, env, ..., dueling=False, per=False)The main constraint is that when renaming DQN -> RAINBOW, and then creating a DQN class that inherits from RAINBOW, we introduce a breaking change, since arguments from DQN are removed (per, dueling, etc.). They would have to be deprecated first. |
why not implement all at once? Compared to DDPG vs TD3, obtaining DQN from Rainbow is not trivial (correct me if I'm wrong). As DQN is a key algorithm in RL, I would really like to keep it simple so people can study it (even if it means duplicated code when implementing DQN extensions). What you propose is what I had in mind (so Dueling special case of RAINBOW) except for DQN. |
We could. I don't know if the same reasoning applies to other rainbow features.
Right now I imagine that it should not be complicated, but I may be wrong, I would have to look into it in detail.
So DQN would be a special case for which we would keep a readable implementation by not deriving it from RAINBOW, right? It seems legitimate. |
Description
TODOs
Naming convention
Method : Dueling DQN
Class name : DuelingDQN
Lowercase : dueling_dqn
branch : dueling-dqn
Context
Types of changes
Checklist:
make format(required)make check-codestyleandmake lint(required)make pytestandmake typeboth pass. (required)Note: we are using a maximum length of 127 characters per line