[Performance] Add _skip_maybe_reset flag to bypass auto-reset in step_and_maybe_reset#3560
vmoens wants to merge 2 commits into gh/vmoens/241/base from
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3560

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:

⏳ No Failures, 14 Pending
As of commit db0dc99 with merge base a4301ee.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 83.0031μs | 82.2934μs | 12.1516 KOps/s | 12.3255 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1478ms | 0.1462ms | 6.8386 KOps/s | 7.0491 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1106s | 0.1101s | 9.0830 Ops/s | 8.8721 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.6900μs | 2.6855μs | 372.3723 KOps/s | 394.9506 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.4513μs | 37.2260μs | 26.8630 KOps/s | 27.4249 KOps/s | |
| test_simple | 0.5460s | 0.5450s | 1.8349 Ops/s | 1.7410 Ops/s | |
| test_transformed | 1.0925s | 1.0898s | 0.9176 Ops/s | 0.8901 Ops/s | |
| test_serial | 1.7057s | 1.6960s | 0.5896 Ops/s | 0.5783 Ops/s | |
| test_parallel | 1.0283s | 1.0230s | 0.9775 Ops/s | 0.9698 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3354ms | 41.9960μs | 23.8118 KOps/s | 23.9635 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 49.3210μs | 23.2128μs | 43.0797 KOps/s | 43.7010 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 71.0520μs | 23.4910μs | 42.5696 KOps/s | 41.9813 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 40.0210μs | 12.7626μs | 78.3541 KOps/s | 77.3343 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 73.7810μs | 43.8408μs | 22.8098 KOps/s | 22.7904 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 69.2510μs | 25.6485μs | 38.9886 KOps/s | 39.6125 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 57.3710μs | 26.0510μs | 38.3863 KOps/s | 38.6520 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 45.3710μs | 15.5654μs | 64.2452 KOps/s | 65.8822 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 76.8910μs | 47.4002μs | 21.0969 KOps/s | 21.6968 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 65.5510μs | 28.9358μs | 34.5592 KOps/s | 35.9540 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 62.3110μs | 26.5893μs | 37.6092 KOps/s | 38.6890 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 52.6810μs | 15.7163μs | 63.6283 KOps/s | 64.8689 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 89.2710μs | 49.8309μs | 20.0679 KOps/s | 20.2866 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 64.4920μs | 31.1695μs | 32.0827 KOps/s | 32.4490 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 61.7420μs | 29.0510μs | 34.4222 KOps/s | 35.0114 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 48.0010μs | 17.9982μs | 55.5613 KOps/s | 56.3126 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 74.3920μs | 46.8265μs | 21.3554 KOps/s | 21.0114 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 68.1220μs | 28.2646μs | 35.3799 KOps/s | 36.1060 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.3469ms | 30.3672μs | 32.9302 KOps/s | 33.9840 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 50.4310μs | 17.5561μs | 56.9602 KOps/s | 59.1063 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.1256ms | 49.4498μs | 20.2225 KOps/s | 20.5808 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 61.5310μs | 31.0492μs | 32.2070 KOps/s | 33.0431 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 62.6810μs | 32.1827μs | 31.0726 KOps/s | 31.7941 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 57.7710μs | 19.7330μs | 50.6766 KOps/s | 51.9227 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 80.3810μs | 51.8173μs | 19.2986 KOps/s | 19.2604 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 61.0410μs | 33.4831μs | 29.8658 KOps/s | 31.2584 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 62.4210μs | 32.0299μs | 31.2208 KOps/s | 31.4430 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 48.7710μs | 19.4108μs | 51.5177 KOps/s | 51.5474 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 85.8910μs | 53.9050μs | 18.5511 KOps/s | 18.7227 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 67.0510μs | 36.1931μs | 27.6296 KOps/s | 28.1999 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 61.2020μs | 33.8740μs | 29.5211 KOps/s | 28.8336 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 53.0310μs | 22.0807μs | 45.2884 KOps/s | 46.0184 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8534s | 0.7436s | 1.3448 Ops/s | 1.3396 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7156s | 0.6038s | 1.6561 Ops/s | 1.6328 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7285s | 1.6437s | 0.6084 Ops/s | 0.6044 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5128s | 1.4287s | 0.6999 Ops/s | 0.7011 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9781s | 1.8978s | 0.5269 Ops/s | 0.5226 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7679s | 1.6815s | 0.5947 Ops/s | 0.5915 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6833s | 4.6231s | 0.2163 Ops/s | 0.2145 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5118s | 4.4196s | 0.2263 Ops/s | 0.2262 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9888s | 1.9011s | 0.5260 Ops/s | 0.5364 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6654s | 1.5898s | 0.6290 Ops/s | 0.6326 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.6565ms | 10.2464ms | 97.5954 Ops/s | 99.4797 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 19.2830ms | 17.3880ms | 57.5109 Ops/s | 57.3714 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2013ms | 0.1316ms | 7.5970 KOps/s | 7.6171 KOps/s | |
| test_values[td1_return_estimate-False-False] | 29.1241ms | 28.0120ms | 35.6989 Ops/s | 35.8448 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 17.7579ms | 17.4786ms | 57.2128 Ops/s | 56.9585 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 43.4091ms | 41.5788ms | 24.0507 Ops/s | 24.5953 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.7168ms | 17.5589ms | 56.9511 Ops/s | 57.2624 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.1119ms | 9.0168ms | 110.9042 Ops/s | 113.1399 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.0260ms | 1.5504ms | 645.0011 Ops/s | 644.5285 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6028ms | 0.4319ms | 2.3153 KOps/s | 2.3954 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.5900ms | 34.5702ms | 28.9266 Ops/s | 29.0534 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.1904ms | 1.7229ms | 580.4208 Ops/s | 583.8918 Ops/s | |
| test_dqn_speed[False-None] | 1.6161ms | 1.4161ms | 706.1730 Ops/s | 709.5376 Ops/s | |
| test_dqn_speed[False-backward] | 2.0699ms | 2.0195ms | 495.1781 Ops/s | 507.8040 Ops/s | |
| test_dqn_speed[True-None] | 0.8557ms | 0.5910ms | 1.6921 KOps/s | 1.7081 KOps/s | |
| test_dqn_speed[True-backward] | 1.0902ms | 1.0367ms | 964.6133 Ops/s | 944.3419 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9197ms | 0.5547ms | 1.8027 KOps/s | 1.7333 KOps/s | |
| test_ddpg_speed[False-None] | 3.2035ms | 2.8492ms | 350.9760 Ops/s | 351.3395 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1774ms | 4.0553ms | 246.5917 Ops/s | 242.3480 Ops/s | |
| test_ddpg_speed[True-None] | 8.3865ms | 1.5468ms | 646.5090 Ops/s | 677.2813 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5580ms | 2.4706ms | 404.7657 Ops/s | 366.7891 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 2.1942ms | 1.4510ms | 689.1965 Ops/s | 684.2492 Ops/s | |
| test_sac_speed[False-None] | 9.7774ms | 8.1842ms | 122.1860 Ops/s | 121.8226 Ops/s | |
| test_sac_speed[False-backward] | 11.8295ms | 11.3154ms | 88.3752 Ops/s | 87.3953 Ops/s | |
| test_sac_speed[True-None] | 2.3100ms | 2.1918ms | 456.2420 Ops/s | 450.2571 Ops/s | |
| test_sac_speed[True-backward] | 4.3844ms | 4.1145ms | 243.0400 Ops/s | 238.1631 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.3000ms | 2.1770ms | 459.3426 Ops/s | 443.3157 Ops/s | |
| test_redq_speed[False-None] | 16.1258ms | 10.7872ms | 92.7028 Ops/s | 93.9334 Ops/s | |
| test_redq_speed[False-backward] | 23.8786ms | 17.8547ms | 56.0075 Ops/s | 55.2967 Ops/s | |
| test_redq_speed[True-None] | 4.7296ms | 4.5217ms | 221.1536 Ops/s | 211.7969 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.7502ms | 4.5008ms | 222.1851 Ops/s | 222.0178 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.7401ms | 11.0010ms | 90.9007 Ops/s | 89.4470 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.1630ms | 15.7735ms | 63.3975 Ops/s | 62.0689 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.0490ms | 3.5793ms | 279.3873 Ops/s | 260.4865 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.5626ms | 7.1323ms | 140.2074 Ops/s | 119.1163 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.8674ms | 3.5153ms | 284.4728 Ops/s | 268.9070 Ops/s | |
| test_td3_speed[False-None] | 8.4518ms | 8.1851ms | 122.1734 Ops/s | 120.9637 Ops/s | |
| test_td3_speed[False-backward] | 11.2576ms | 10.9838ms | 91.0434 Ops/s | 89.7200 Ops/s | |
| test_td3_speed[True-None] | 1.8570ms | 1.8192ms | 549.7024 Ops/s | 540.1317 Ops/s | |
| test_td3_speed[True-backward] | 3.6853ms | 3.5785ms | 279.4485 Ops/s | 254.4475 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8789ms | 1.7866ms | 559.7096 Ops/s | 541.2705 Ops/s | |
| test_cql_speed[False-None] | 29.6532ms | 26.7860ms | 37.3330 Ops/s | 37.9403 Ops/s | |
| test_cql_speed[False-backward] | 38.7165ms | 35.5538ms | 28.1264 Ops/s | 28.4453 Ops/s | |
| test_cql_speed[True-None] | 12.8383ms | 12.1422ms | 82.3572 Ops/s | 77.9540 Ops/s | |
| test_cql_speed[True-backward] | 17.5938ms | 17.2964ms | 57.8156 Ops/s | 57.0957 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.7340ms | 12.3334ms | 81.0810 Ops/s | 80.5148 Ops/s | |
| test_a2c_speed[False-None] | 5.4260ms | 5.2592ms | 190.1432 Ops/s | 182.4159 Ops/s | |
| test_a2c_speed[False-backward] | 13.9422ms | 11.9277ms | 83.8384 Ops/s | 84.6310 Ops/s | |
| test_a2c_speed[True-None] | 4.0401ms | 3.7881ms | 263.9822 Ops/s | 259.4143 Ops/s | |
| test_a2c_speed[True-backward] | 9.1763ms | 8.7509ms | 114.2737 Ops/s | 102.5565 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.1627ms | 3.7979ms | 263.3028 Ops/s | 259.6615 Ops/s | |
| test_ppo_speed[False-None] | 6.1430ms | 5.9201ms | 168.9155 Ops/s | 167.1764 Ops/s | |
| test_ppo_speed[False-backward] | 12.7589ms | 12.3957ms | 80.6733 Ops/s | 79.6698 Ops/s | |
| test_ppo_speed[True-None] | 4.3479ms | 3.8079ms | 262.6129 Ops/s | 264.7291 Ops/s | |
| test_ppo_speed[True-backward] | 9.1122ms | 8.6737ms | 115.2908 Ops/s | 115.8063 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.2466ms | 3.7669ms | 265.4714 Ops/s | 265.4588 Ops/s | |
| test_reinforce_speed[False-None] | 4.7938ms | 4.6093ms | 216.9524 Ops/s | 219.2279 Ops/s | |
| test_reinforce_speed[False-backward] | 7.6102ms | 7.4403ms | 134.4028 Ops/s | 134.9365 Ops/s | |
| test_reinforce_speed[True-None] | 3.4744ms | 3.0155ms | 331.6228 Ops/s | 329.9157 Ops/s | |
| test_reinforce_speed[True-backward] | 8.0224ms | 7.8814ms | 126.8817 Ops/s | 119.3689 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.4169ms | 2.9714ms | 336.5437 Ops/s | 331.4936 Ops/s | |
| test_iql_speed[False-None] | 25.0992ms | 20.1844ms | 49.5431 Ops/s | 49.6090 Ops/s | |
| test_iql_speed[False-backward] | 35.1038ms | 30.2875ms | 33.0170 Ops/s | 32.7932 Ops/s | |
| test_iql_speed[True-None] | 9.3924ms | 8.4893ms | 117.7947 Ops/s | 114.8729 Ops/s | |
| test_iql_speed[True-backward] | 16.9542ms | 16.5516ms | 60.4170 Ops/s | 58.7231 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.0235ms | 8.5177ms | 117.4025 Ops/s | 116.6854 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2851ms | 6.1558ms | 162.4494 Ops/s | 161.5793 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.9385ms | 0.3444ms | 2.9036 KOps/s | 2.7077 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5753ms | 0.3184ms | 3.1410 KOps/s | 2.9225 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1880ms | 5.8996ms | 169.5034 Ops/s | 168.6989 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1257ms | 0.2873ms | 3.4802 KOps/s | 2.8332 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4420ms | 0.2662ms | 3.7571 KOps/s | 3.5067 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6542ms | 1.2809ms | 780.7152 Ops/s | 692.6229 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5483ms | 1.2078ms | 827.9822 Ops/s | 752.8565 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.3003ms | 6.0956ms | 164.0524 Ops/s | 164.7594 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0770ms | 0.4718ms | 2.1198 KOps/s | 2.3108 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8269ms | 0.4604ms | 2.1719 KOps/s | 2.3703 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0249ms | 5.8648ms | 170.5091 Ops/s | 167.3423 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0263ms | 0.3620ms | 2.7623 KOps/s | 3.2880 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6140ms | 0.3209ms | 3.1165 KOps/s | 3.0994 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0904ms | 5.8324ms | 171.4565 Ops/s | 168.7798 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9633ms | 0.3565ms | 2.8052 KOps/s | 2.8378 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5110ms | 0.3280ms | 3.0483 KOps/s | 3.1855 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1284ms | 6.0018ms | 166.6170 Ops/s | 165.6041 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2142ms | 0.5006ms | 1.9978 KOps/s | 1.9929 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7752ms | 0.4917ms | 2.0336 KOps/s | 2.0952 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4917ms | 5.0950ms | 196.2710 Ops/s | 44.9245 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.5661ms | 1.9364ms | 516.4285 Ops/s | 511.8734 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.3390ms | 1.2505ms | 799.6655 Ops/s | 755.3821 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6410s | 17.9936ms | 55.5754 Ops/s | 194.6503 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 4.9657ms | 1.8332ms | 545.4859 Ops/s | 562.6339 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.1923ms | 1.1040ms | 905.7853 Ops/s | 1.1206 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 9.3642ms | 5.3559ms | 186.7117 Ops/s | 187.1775 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.0170ms | 1.9389ms | 515.7572 Ops/s | 464.0667 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 12.4662ms | 1.4917ms | 670.3980 Ops/s | 880.4847 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 42.7203ms | 39.9823ms | 25.0110 Ops/s | 25.2850 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.8388ms | 18.8778ms | 52.9724 Ops/s | 54.3072 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 45.2076ms | 40.9583ms | 24.4151 Ops/s | 24.3534 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 21.0835ms | 19.4907ms | 51.3065 Ops/s | 53.5378 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.8262ms | 42.6058ms | 23.4710 Ops/s | 23.2749 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 0.5768s | 31.6836ms | 31.5620 Ops/s | 48.6429 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8524ms | 0.2235ms | 4.4734 KOps/s | 4.2682 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.5795ms | 1.3818ms | 723.6887 Ops/s | 714.3018 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.5436ms | 2.3926ms | 417.9604 Ops/s | 420.4667 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1065ms | 2.9238ms | 342.0222 Ops/s | 342.7436 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.6079ms | 0.1407ms | 7.1052 KOps/s | 7.3010 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3443ms | 0.1922ms | 5.2034 KOps/s | 5.1638 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9544ms | 1.7887ms | 559.0755 Ops/s | 575.4885 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.7487ms | 1.3080ms | 764.5083 Ops/s | 780.9584 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.5812ms | 1.1377ms | 878.9487 Ops/s | 874.0932 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7776ms | 3.5747ms | 279.7437 Ops/s | 275.6482 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 10.3692ms | 5.6396ms | 177.3165 Ops/s | 176.6251 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.4975ms | 7.3213ms | 136.5884 Ops/s | 142.1230 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4306ms | 0.2802ms | 3.5695 KOps/s | 3.4806 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6336ms | 1.4941ms | 669.2985 Ops/s | 654.3839 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.6341ms | 2.5097ms | 398.4565 Ops/s | 397.0481 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4747ms | 3.1412ms | 318.3459 Ops/s | 317.8926 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 33.7115ms | 32.8472ms | 30.4440 Ops/s | 30.1779 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 65.0133ms | 64.7006ms | 15.4558 Ops/s | 15.3073 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.2435ms | 37.7225ms | 26.5094 Ops/s | 26.2956 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 74.2895ms | 73.6565ms | 13.5765 Ops/s | 13.4424 Ops/s |
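
The Change column in the table above is empty, so the comparison between the Ops and Ops on Repo HEAD columns has to be done by hand. A minimal sketch of that arithmetic follows. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the PR or of the benchmark tooling, and the sign convention (positive means the PR is faster than HEAD) is an assumption, since the source does not show how the Change value is normally computed. The input strings are copied from rows of the table.

```python
# Hypothetical helpers for reading the benchmark table; not part of the PR.
# Only the two units that appear in the table are handled.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '12.1516 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Throughput change of the PR relative to repo HEAD, in percent.

    Assumed convention: positive means the PR run was faster than HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_simple": ("1.8349 Ops/s", "1.7410 Ops/s"),
    "test_tensor_to_bytestream_speed[numpy]": ("372.3723 KOps/s", "394.9506 KOps/s"),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]": (
        "905.7853 Ops/s",
        "1.1206 KOps/s",
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1f}%")
```

Normalizing units before dividing matters for rows such as the last one, where the PR value is reported in Ops/s and the HEAD value in KOps/s. Each row comes from a single benchmark run, so small differences in either direction may be noise rather than an effect of the change.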

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.1249μs | 79.6404μs | 12.5564 KOps/s | 12.5510 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1420ms | 0.1390ms | 7.1929 KOps/s | 7.2414 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1026s | 0.1023s | 9.7749 Ops/s | 9.8832 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.4447μs | 2.4387μs | 410.0493 KOps/s | 411.4385 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.1041μs | 36.1976μs | 27.6261 KOps/s | 27.8511 KOps/s | |
| test_simple | 0.8960s | 0.8071s | 1.2389 Ops/s | 1.2546 Ops/s | |
| test_transformed | 1.3514s | 1.3491s | 0.7412 Ops/s | 0.7246 Ops/s | |
| test_serial | 2.2579s | 2.2547s | 0.4435 Ops/s | 0.4339 Ops/s | |
| test_parallel | 1.8949s | 1.8018s | 0.5550 Ops/s | 0.5592 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2887ms | 40.9994μs | 24.3906 KOps/s | 25.4311 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 52.4900μs | 22.3297μs | 44.7835 KOps/s | 44.6835 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 56.9310μs | 23.1007μs | 43.2888 KOps/s | 43.7983 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 43.5500μs | 12.3895μs | 80.7134 KOps/s | 80.4517 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 95.2310μs | 42.9382μs | 23.2893 KOps/s | 23.3487 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 46.9200μs | 24.4670μs | 40.8714 KOps/s | 40.4733 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 54.8410μs | 24.9740μs | 40.0416 KOps/s | 39.4676 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 51.1210μs | 15.0362μs | 66.5060 KOps/s | 66.3630 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.1169ms | 45.2808μs | 22.0844 KOps/s | 22.2409 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 50.0800μs | 27.6261μs | 36.1977 KOps/s | 36.3466 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 57.0210μs | 25.8395μs | 38.7004 KOps/s | 39.0335 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 45.4700μs | 14.9445μs | 66.9142 KOps/s | 66.0216 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 84.5220μs | 47.7170μs | 20.9569 KOps/s | 20.8074 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 67.9510μs | 29.7404μs | 33.6243 KOps/s | 33.5771 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 70.5200μs | 27.8301μs | 35.9323 KOps/s | 35.9143 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 56.3810μs | 17.5774μs | 56.8912 KOps/s | 56.4334 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 90.1810μs | 46.4392μs | 21.5336 KOps/s | 21.8989 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 66.5910μs | 27.1918μs | 36.7758 KOps/s | 36.5022 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.6985ms | 29.6342μs | 33.7448 KOps/s | 35.3020 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 59.5510μs | 16.6872μs | 59.9263 KOps/s | 60.3925 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 84.7310μs | 47.6480μs | 20.9872 KOps/s | 21.0610 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 58.7110μs | 29.8448μs | 33.5067 KOps/s | 33.5428 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 56.9310μs | 31.8912μs | 31.3567 KOps/s | 32.2791 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 51.1710μs | 18.9820μs | 52.6814 KOps/s | 52.5777 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 97.0110μs | 50.0281μs | 19.9888 KOps/s | 19.7864 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 99.4810μs | 32.3648μs | 30.8978 KOps/s | 30.9879 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 62.6800μs | 30.9468μs | 32.3135 KOps/s | 32.3082 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 52.6000μs | 19.1659μs | 52.1759 KOps/s | 52.9331 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 93.0110μs | 52.3100μs | 19.1168 KOps/s | 19.0272 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 61.6710μs | 34.7115μs | 28.8089 KOps/s | 28.8541 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 98.2810μs | 33.7607μs | 29.6202 KOps/s | 30.5035 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 57.0310μs | 21.4839μs | 46.5464 KOps/s | 47.2159 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7143s | 0.7080s | 1.4125 Ops/s | 1.3702 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.6993s | 0.5964s | 1.6767 Ops/s | 1.6571 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7096s | 1.6152s | 0.6191 Ops/s | 0.6197 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4859s | 1.4013s | 0.7136 Ops/s | 0.7114 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9495s | 1.8603s | 0.5375 Ops/s | 0.5396 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7396s | 1.6528s | 0.6050 Ops/s | 0.6030 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6046s | 4.5202s | 0.2212 Ops/s | 0.2220 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5246s | 4.3748s | 0.2286 Ops/s | 0.2300 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9806s | 1.8746s | 0.5334 Ops/s | 0.5438 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6324s | 1.5500s | 0.6451 Ops/s | 0.6308 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 20.4069ms | 19.4693ms | 51.3628 Ops/s | 52.1930 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1341s | 3.5902ms | 278.5374 Ops/s | 270.5801 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1046ms | 81.1038μs | 12.3299 KOps/s | 12.1055 KOps/s | |
| test_values[td1_return_estimate-False-False] | 48.5151ms | 46.9693ms | 21.2905 Ops/s | 21.3312 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.4032ms | 1.0768ms | 928.7004 Ops/s | 926.1204 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 79.4752ms | 75.9731ms | 13.1626 Ops/s | 13.1692 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3125ms | 1.0697ms | 934.7992 Ops/s | 936.5424 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 20.5029ms | 19.6391ms | 50.9188 Ops/s | 50.7323 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9982ms | 0.7374ms | 1.3561 KOps/s | 1.3529 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7153ms | 0.6620ms | 1.5105 KOps/s | 1.5127 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5644ms | 1.4741ms | 678.3610 Ops/s | 681.3210 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7419ms | 0.6775ms | 1.4761 KOps/s | 1.4905 KOps/s | |
| test_dqn_speed[False-None] | 1.6943ms | 1.5564ms | 642.5054 Ops/s | 642.3170 Ops/s | |
| test_dqn_speed[False-backward] | 2.2867ms | 2.1989ms | 454.7667 Ops/s | 457.4076 Ops/s | |
| test_dqn_speed[True-None] | 0.6675ms | 0.5909ms | 1.6922 KOps/s | 1.6232 KOps/s | |
| test_dqn_speed[True-backward] | 1.1867ms | 1.1335ms | 882.2596 Ops/s | 780.7849 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7056ms | 0.6228ms | 1.6057 KOps/s | 1.6092 KOps/s | |
| test_ddpg_speed[False-None] | 3.3488ms | 2.9671ms | 337.0350 Ops/s | 340.7469 Ops/s | |
| test_ddpg_speed[False-backward] | 4.5974ms | 4.2053ms | 237.7958 Ops/s | 231.9708 Ops/s | |
| test_ddpg_speed[True-None] | 1.4991ms | 1.3632ms | 733.5653 Ops/s | 728.5087 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4335ms | 2.3829ms | 419.6515 Ops/s | 388.2468 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.5249ms | 1.3540ms | 738.5645 Ops/s | 697.3843 Ops/s | |
| test_sac_speed[False-None] | 8.9975ms | 8.3323ms | 120.0149 Ops/s | 118.1177 Ops/s | |
| test_sac_speed[False-backward] | 11.5740ms | 11.2474ms | 88.9092 Ops/s | 86.6503 Ops/s | |
| test_sac_speed[True-None] | 2.4082ms | 1.9115ms | 523.1494 Ops/s | 521.2789 Ops/s | |
| test_sac_speed[True-backward] | 3.5948ms | 3.5278ms | 283.4605 Ops/s | 264.8247 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 17.1291ms | 10.0688ms | 99.3172 Ops/s | 100.4175 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.3279ms | 9.3594ms | 106.8449 Ops/s | 107.3034 Ops/s | |
| test_redq_deprec_speed[False-backward] | 12.8856ms | 12.4404ms | 80.3831 Ops/s | 79.3917 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.2999ms | 2.6879ms | 372.0424 Ops/s | 357.0060 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.6305ms | 4.1954ms | 238.3548 Ops/s | 225.8239 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.2275ms | 9.4801ms | 105.4847 Ops/s | 103.7991 Ops/s | |
| test_td3_speed[False-None] | 8.4676ms | 8.2445ms | 121.2930 Ops/s | 121.7305 Ops/s | |
| test_td3_speed[False-backward] | 11.0832ms | 10.5786ms | 94.5307 Ops/s | 92.8426 Ops/s | |
| test_td3_speed[True-None] | 1.7170ms | 1.6844ms | 593.6771 Ops/s | 587.5772 Ops/s | |
| test_td3_speed[True-backward] | 3.1435ms | 3.0537ms | 327.4714 Ops/s | 302.5590 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 48.7199ms | 25.1290ms | 39.7947 Ops/s | 38.5228 Ops/s | |
| test_cql_speed[False-None] | 17.5966ms | 17.2967ms | 57.8144 Ops/s | 57.5407 Ops/s | |
| test_cql_speed[False-backward] | 23.0059ms | 22.4946ms | 44.4552 Ops/s | 43.8223 Ops/s | |
| test_cql_speed[True-None] | 3.5657ms | 3.4003ms | 294.0959 Ops/s | 286.0417 Ops/s | |
| test_cql_speed[True-backward] | 5.6506ms | 5.5410ms | 180.4716 Ops/s | 178.1185 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 18.3221ms | 11.9471ms | 83.7020 Ops/s | 82.8251 Ops/s | |
| test_a2c_speed[False-None] | 3.7093ms | 3.2655ms | 306.2287 Ops/s | 306.8623 Ops/s | |
| test_a2c_speed[False-backward] | 6.6226ms | 6.1376ms | 162.9289 Ops/s | 164.5039 Ops/s | |
| test_a2c_speed[True-None] | 1.5463ms | 1.4632ms | 683.4365 Ops/s | 682.2850 Ops/s | |
| test_a2c_speed[True-backward] | 3.1902ms | 3.1266ms | 319.8362 Ops/s | 302.9035 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.2316ms | 1.0776ms | 927.9786 Ops/s | 916.7603 Ops/s | |
| test_ppo_speed[False-None] | 3.9746ms | 3.8938ms | 256.8189 Ops/s | 252.8948 Ops/s | |
| test_ppo_speed[False-backward] | 7.4749ms | 6.9794ms | 143.2795 Ops/s | 138.1828 Ops/s | |
| test_ppo_speed[True-None] | 1.7005ms | 1.5836ms | 631.4625 Ops/s | 623.7318 Ops/s | |
| test_ppo_speed[True-backward] | 3.2939ms | 3.2473ms | 307.9441 Ops/s | 285.2499 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.2260ms | 1.1370ms | 879.5421 Ops/s | 857.4530 Ops/s | |
| test_reinforce_speed[False-None] | 3.2239ms | 2.3771ms | 420.6820 Ops/s | 429.8048 Ops/s | |
| test_reinforce_speed[False-backward] | 3.5572ms | 3.3732ms | 296.4580 Ops/s | 289.4510 Ops/s | |
| test_reinforce_speed[True-None] | 1.6311ms | 1.4261ms | 701.2365 Ops/s | 703.6131 Ops/s | |
| test_reinforce_speed[True-backward] | 3.1637ms | 3.1075ms | 321.8006 Ops/s | 301.8917 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.6842s | 10.4747ms | 95.4678 Ops/s | 113.2954 Ops/s | |
| test_iql_speed[False-None] | 10.0669ms | 9.5489ms | 104.7242 Ops/s | 103.2378 Ops/s | |
| test_iql_speed[False-backward] | 13.7118ms | 13.2463ms | 75.4926 Ops/s | 74.3089 Ops/s | |
| test_iql_speed[True-None] | 2.3840ms | 2.2761ms | 439.3496 Ops/s | 430.1899 Ops/s | |
| test_iql_speed[True-backward] | 4.8900ms | 4.8001ms | 208.3273 Ops/s | 196.9331 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.7716ms | 10.1661ms | 98.3660 Ops/s | 101.1574 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.3478ms | 5.8926ms | 169.7034 Ops/s | 173.0552 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9582ms | 0.3478ms | 2.8751 KOps/s | 2.9041 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6204ms | 0.4049ms | 2.4697 KOps/s | 3.0501 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9803ms | 5.6930ms | 175.6538 Ops/s | 178.8836 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.3178ms | 0.3709ms | 2.6961 KOps/s | 3.3019 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5798ms | 0.3564ms | 2.8060 KOps/s | 3.4096 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.7354ms | 1.4091ms | 709.6632 Ops/s | 766.7663 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6368ms | 1.3468ms | 742.5216 Ops/s | 829.0745 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.1218ms | 6.0294ms | 165.8529 Ops/s | 175.5531 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1928ms | 0.4834ms | 2.0686 KOps/s | 2.0491 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7124ms | 0.4441ms | 2.2517 KOps/s | 2.3720 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8274ms | 5.6901ms | 175.7445 Ops/s | 181.5940 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6999ms | 0.3871ms | 2.5833 KOps/s | 2.7348 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6253ms | 0.3653ms | 2.7375 KOps/s | 2.8707 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8297ms | 5.6133ms | 178.1486 Ops/s | 178.3309 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8228ms | 0.3901ms | 2.5636 KOps/s | 2.7502 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6004ms | 0.3722ms | 2.6867 KOps/s | 2.8750 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.2418ms | 5.7389ms | 174.2486 Ops/s | 170.7448 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9519s | 1.9006ms | 526.1577 Ops/s | 1.9583 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7012ms | 0.4712ms | 2.1222 KOps/s | 2.1265 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.5863ms | 5.0606ms | 197.6035 Ops/s | 196.8185 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.8454ms | 1.7918ms | 558.1117 Ops/s | 447.2875 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.6008ms | 1.0136ms | 986.5922 Ops/s | 1.0074 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.4806ms | 4.9791ms | 200.8387 Ops/s | 195.5451 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 12.9221ms | 2.0617ms | 485.0315 Ops/s | 462.4501 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 3.5353ms | 1.1966ms | 835.7167 Ops/s | 1.0263 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.6893s | 18.9502ms | 52.7699 Ops/s | 43.2767 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 7.9971ms | 2.1070ms | 474.6122 Ops/s | 477.4768 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.4360ms | 1.1330ms | 882.6385 Ops/s | 847.7341 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 42.2770ms | 38.6516ms | 25.8721 Ops/s | 25.0401 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.3710ms | 17.8813ms | 55.9244 Ops/s | 53.7021 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 44.0957ms | 40.1461ms | 24.9090 Ops/s | 24.3893 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.3111ms | 18.3609ms | 54.4636 Ops/s | 52.5077 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.4449ms | 42.1447ms | 23.7278 Ops/s | 23.4866 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.1909ms | 19.8633ms | 50.3441 Ops/s | 49.3443 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.9094ms | 0.2292ms | 4.3634 KOps/s | 4.3282 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7187ms | 1.4769ms | 677.0880 Ops/s | 662.6937 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7637ms | 2.3938ms | 417.7520 Ops/s | 405.6280 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.2769ms | 3.0750ms | 325.2041 Ops/s | 319.0962 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.3737ms | 0.1598ms | 6.2596 KOps/s | 5.9488 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3430ms | 0.2330ms | 4.2926 KOps/s | 3.9655 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.1669ms | 1.9613ms | 509.8537 Ops/s | 513.6646 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.7421ms | 1.4251ms | 701.6900 Ops/s | 653.6124 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.4566ms | 1.1350ms | 881.0616 Ops/s | 882.4118 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8706ms | 3.6817ms | 271.6132 Ops/s | 261.5268 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 6.3146ms | 6.0288ms | 165.8712 Ops/s | 165.3997 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 8.8078ms | 7.5948ms | 131.6689 Ops/s | 136.0147 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.5451ms | 0.2799ms | 3.5725 KOps/s | 3.4922 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.8352ms | 1.5791ms | 633.2842 Ops/s | 633.1781 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.7774ms | 2.5278ms | 395.6001 Ops/s | 385.0150 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4734ms | 3.2591ms | 306.8362 Ops/s | 300.1090 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.2071ms | 32.9983ms | 30.3046 Ops/s | 29.7993 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 65.1660ms | 64.7396ms | 15.4465 Ops/s | 15.3107 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 37.5790ms | 37.0860ms | 26.9644 Ops/s | 26.1044 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 75.0881ms | 73.4548ms | 13.6138 Ops/s | 12.5112 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 56.9760ms | 56.3142ms | 17.7575 Ops/s | 17.5602 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1128s | 0.1117s | 8.9506 Ops/s | 8.8288 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 58.7702ms | 57.7742ms | 17.3088 Ops/s | 16.9363 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1170s | 0.1153s | 8.6724 Ops/s | 8.5919 Ops/s |
We don't want that.
The proper way to do auto-reset with torch.compile is to ALWAYS compute the reset, then select between reset and non-reset values in tensordict_ using torch.where. We should have a toy auto-reset env that we can compile as an example, and we should discuss how to make an extension point out of this, but we should NOT skip maybe_reset entirely. I would rather make it a no-op when auto-reset is used with the masking described above, or implement the masking inside maybe_reset itself, which I would find more natural.
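For illustration, the masking idea could look roughly like the sketch below. It assumes plain tensors rather than a TensorDict, and `masked_auto_reset`, `next_obs`, and `reset_obs` are hypothetical names, not part of TorchRL. The reset values are computed for every env in the batch and `torch.where` picks between them per env, so there is no data-dependent branch on the done flags.

```python
import torch


def masked_auto_reset(
    next_obs: torch.Tensor, reset_obs: torch.Tensor, done: torch.Tensor
) -> torch.Tensor:
    """Pick reset values where done is True and stepped values elsewhere.

    Hypothetical helper for illustration only; not a TorchRL API.
    """
    # done has shape [batch, 1] and broadcasts over the feature dimension,
    # so the selection is elementwise with no Python-level branching.
    return torch.where(done, reset_obs, next_obs)


# Observations produced by the step, one row per env in the batch.
next_obs = torch.arange(6, dtype=torch.float32).reshape(3, 2)
# Stand-in for what reset() would return; always computed, for every env.
reset_obs = torch.zeros(3, 2)
# Only the second env finished its episode.
done = torch.tensor([[False], [True], [False]])

merged = masked_auto_reset(next_obs, reset_obs, done)
```

Here only the second row of `merged` is taken from `reset_obs`; the other rows keep their stepped values. The trade-off is that the reset is always computed, even when no env is done.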
For environments that handle resets internally in _step() (e.g., GPU-batched
auto-resetting envs), the maybe_reset() call in step_and_maybe_reset() is
redundant overhead: it checks done flags, clones tensors, and potentially
calls reset() again. Adding _skip_maybe_reset = True on such envs skips
this entire codepath.
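As a rough sketch of the behaviour described above, the flag could gate the reset path as follows. `ToyEnv` and `AutoResetToyEnv` are hypothetical stand-ins written for this example; they are not the TorchRL `EnvBase` implementation, and the actual diff may differ.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment; not the TorchRL EnvBase."""

    # Class-level default: keep the usual maybe_reset path.
    _skip_maybe_reset = False

    def __init__(self):
        self.maybe_reset_calls = 0

    def step(self, state):
        # Placeholder dynamics: advance an integer state by one.
        return state + 1

    def maybe_reset(self, state):
        # In a real env this would check done flags and reset where needed;
        # here it only counts how often it is called.
        self.maybe_reset_calls += 1
        return state

    def step_and_maybe_reset(self, state):
        next_state = self.step(state)
        if self._skip_maybe_reset:
            # The env is assumed to reset inside step(), so the stepped
            # state is reused as the root of the next step.
            return next_state, next_state
        return next_state, self.maybe_reset(next_state)


class AutoResetToyEnv(ToyEnv):
    """Env that is assumed to handle resets internally in step()."""

    _skip_maybe_reset = True


base_env = ToyEnv()
base_env.step_and_maybe_reset(0)

auto_env = AutoResetToyEnv()
auto_env.step_and_maybe_reset(0)
```

With the flag left at its default, `maybe_reset` runs once per step; on the subclass that sets it to `True`, it is never called.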
Made-with: Cursor