The paper says to leverage the output state policy in non-memory environments and the full state policy or hidden state policy in memory environments.
However, in configs.yaml, the POPGym task is the only one where actor.inputs is set to [stoch, hidden]. Why is that? Looking forward to your reply.