[Performance] Add _skip_maybe_reset flag to bypass auto-reset in step_and_maybe_reset #3560
vmoens wants to merge 1 commit into gh/vmoens/241/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3560
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit aba1d20 with merge base 0a1aea6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.9848μs | 81.5420μs | 12.2636 KOps/s | 12.4927 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1401ms | 0.1391ms | 7.1916 KOps/s | 6.9324 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1233s | 0.1231s | 8.1217 Ops/s | 8.1619 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.4780μs | 2.4713μs | 404.6414 KOps/s | 395.0210 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.8348μs | 38.6497μs | 25.8734 KOps/s | 25.9372 KOps/s | |
| test_simple | 0.5560s | 0.5514s | 1.8135 Ops/s | 1.7220 Ops/s | |
| test_transformed | 1.0960s | 1.0943s | 0.9138 Ops/s | 0.8940 Ops/s | |
| test_serial | 1.6881s | 1.6840s | 0.5938 Ops/s | 0.5860 Ops/s | |
| test_parallel | 1.1570s | 1.0573s | 0.9458 Ops/s | 0.9508 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1668ms | 41.0679μs | 24.3499 KOps/s | 23.8629 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 54.1110μs | 23.6761μs | 42.2368 KOps/s | 43.3213 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 94.3420μs | 23.9637μs | 41.7299 KOps/s | 42.6630 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 39.8210μs | 13.1404μs | 76.1014 KOps/s | 78.7653 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.1179ms | 46.0142μs | 21.7324 KOps/s | 22.8084 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 53.5110μs | 26.3198μs | 37.9942 KOps/s | 39.7702 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 62.3820μs | 26.7492μs | 37.3842 KOps/s | 37.9795 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 57.7810μs | 15.9600μs | 62.6567 KOps/s | 65.0950 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 95.5320μs | 48.7684μs | 20.5051 KOps/s | 21.5345 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 61.7310μs | 29.3040μs | 34.1250 KOps/s | 35.6445 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 95.3920μs | 26.9025μs | 37.1713 KOps/s | 38.8712 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 42.1710μs | 15.8677μs | 63.0209 KOps/s | 64.8986 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 82.1910μs | 51.0036μs | 19.6065 KOps/s | 20.3497 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 78.9010μs | 32.0297μs | 31.2211 KOps/s | 32.7587 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 84.6110μs | 29.4429μs | 33.9640 KOps/s | 35.5852 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 95.4810μs | 18.5197μs | 53.9964 KOps/s | 55.7958 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 0.1045ms | 48.0179μs | 20.8256 KOps/s | 21.4973 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 57.4510μs | 29.1167μs | 34.3445 KOps/s | 35.8828 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.5584ms | 31.5330μs | 31.7128 KOps/s | 33.7982 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 50.2210μs | 17.9429μs | 55.7323 KOps/s | 58.5374 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 80.7520μs | 50.9024μs | 19.6454 KOps/s | 21.1105 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.1074ms | 32.0556μs | 31.1958 KOps/s | 33.0032 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 57.4910μs | 34.0309μs | 29.3851 KOps/s | 31.7887 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 59.4410μs | 20.1981μs | 49.5096 KOps/s | 51.3076 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 93.7020μs | 53.8050μs | 18.5856 KOps/s | 19.3341 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 63.5110μs | 34.8426μs | 28.7005 KOps/s | 29.8742 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 0.1137ms | 33.4293μs | 29.9139 KOps/s | 31.5456 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 50.1910μs | 20.1999μs | 49.5053 KOps/s | 50.9637 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 89.5210μs | 56.3065μs | 17.7600 KOps/s | 18.6632 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 68.1710μs | 37.4180μs | 26.7251 KOps/s | 28.0033 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 69.8510μs | 35.5055μs | 28.1647 KOps/s | 29.3303 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 96.7010μs | 23.1067μs | 43.2774 KOps/s | 46.0760 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7144s | 0.7113s | 1.4059 Ops/s | 1.3474 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7111s | 0.6031s | 1.6580 Ops/s | 1.6475 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7266s | 1.6274s | 0.6145 Ops/s | 0.6085 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4951s | 1.4064s | 0.7110 Ops/s | 0.7013 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9670s | 1.8779s | 0.5325 Ops/s | 0.5291 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7526s | 1.6618s | 0.6018 Ops/s | 0.6014 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6615s | 4.5738s | 0.2186 Ops/s | 0.2194 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.3737s | 4.2911s | 0.2330 Ops/s | 0.2279 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9439s | 1.8727s | 0.5340 Ops/s | 0.5275 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.8007s | 1.6373s | 0.6108 Ops/s | 0.6186 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.4674ms | 10.3071ms | 97.0201 Ops/s | 96.5607 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 18.9397ms | 17.9421ms | 55.7350 Ops/s | 55.2761 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2141ms | 0.1258ms | 7.9482 KOps/s | 7.6070 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.7205ms | 27.3082ms | 36.6190 Ops/s | 36.0681 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.8061ms | 18.2135ms | 54.9044 Ops/s | 54.0437 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 40.9556ms | 40.7534ms | 24.5378 Ops/s | 24.3324 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.6741ms | 18.2117ms | 54.9096 Ops/s | 54.1366 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.3428ms | 9.2251ms | 108.3999 Ops/s | 109.8685 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7454ms | 1.5637ms | 639.5011 Ops/s | 626.1433 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5438ms | 0.4233ms | 2.3624 KOps/s | 2.3415 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 30.4127ms | 29.8131ms | 33.5423 Ops/s | 28.3638 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1.9326ms | 1.7932ms | 557.6712 Ops/s | 554.1874 Ops/s | |
| test_dqn_speed[False-None] | 1.5829ms | 1.4369ms | 695.9655 Ops/s | 701.3209 Ops/s | |
| test_dqn_speed[False-backward] | 2.0148ms | 1.9523ms | 512.2151 Ops/s | 515.0989 Ops/s | |
| test_dqn_speed[True-None] | 0.7014ms | 0.5490ms | 1.8216 KOps/s | 1.7412 KOps/s | |
| test_dqn_speed[True-backward] | 1.0732ms | 1.0171ms | 983.1785 Ops/s | 957.2890 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9403ms | 0.5474ms | 1.8268 KOps/s | 1.7879 KOps/s | |
| test_ddpg_speed[False-None] | 3.3145ms | 2.9062ms | 344.0876 Ops/s | 348.9422 Ops/s | |
| test_ddpg_speed[False-backward] | 4.2374ms | 4.1593ms | 240.4271 Ops/s | 243.1088 Ops/s | |
| test_ddpg_speed[True-None] | 1.8498ms | 1.4542ms | 687.6658 Ops/s | 680.0374 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5453ms | 2.4742ms | 404.1640 Ops/s | 391.5942 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.6135ms | 1.4659ms | 682.1592 Ops/s | 685.5397 Ops/s | |
| test_sac_speed[False-None] | 8.9094ms | 8.2628ms | 121.0237 Ops/s | 121.8032 Ops/s | |
| test_sac_speed[False-backward] | 11.8245ms | 11.5578ms | 86.5213 Ops/s | 86.8328 Ops/s | |
| test_sac_speed[True-None] | 2.4471ms | 2.2232ms | 449.8012 Ops/s | 445.8017 Ops/s | |
| test_sac_speed[True-backward] | 4.7009ms | 4.1856ms | 238.9125 Ops/s | 214.7470 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.6367ms | 2.2177ms | 450.9208 Ops/s | 442.5775 Ops/s | |
| test_redq_speed[False-None] | 11.6933ms | 10.8466ms | 92.1948 Ops/s | 90.4609 Ops/s | |
| test_redq_speed[False-backward] | 21.3497ms | 18.7420ms | 53.3561 Ops/s | 53.5300 Ops/s | |
| test_redq_speed[True-None] | 5.0817ms | 4.5852ms | 218.0933 Ops/s | 212.1106 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.9035ms | 4.5527ms | 219.6493 Ops/s | 221.0881 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.9812ms | 11.4227ms | 87.5447 Ops/s | 85.8973 Ops/s | |
| test_redq_deprec_speed[False-backward] | 17.8752ms | 16.3492ms | 61.1650 Ops/s | 59.5804 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.0977ms | 3.7510ms | 266.5939 Ops/s | 263.4999 Ops/s | |
| test_redq_deprec_speed[True-backward] | 8.0061ms | 7.7145ms | 129.6265 Ops/s | 123.9994 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.1342ms | 3.7016ms | 270.1550 Ops/s | 260.3523 Ops/s | |
| test_td3_speed[False-None] | 8.3259ms | 8.2243ms | 121.5902 Ops/s | 121.1529 Ops/s | |
| test_td3_speed[False-backward] | 11.3617ms | 11.1098ms | 90.0105 Ops/s | 90.2568 Ops/s | |
| test_td3_speed[True-None] | 1.9236ms | 1.8802ms | 531.8462 Ops/s | 532.5908 Ops/s | |
| test_td3_speed[True-backward] | 3.8861ms | 3.6760ms | 272.0321 Ops/s | 272.0295 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8975ms | 1.8453ms | 541.9123 Ops/s | 539.6771 Ops/s | |
| test_cql_speed[False-None] | 31.9625ms | 27.5571ms | 36.2883 Ops/s | 37.1357 Ops/s | |
| test_cql_speed[False-backward] | 37.6211ms | 36.6963ms | 27.2507 Ops/s | 27.1589 Ops/s | |
| test_cql_speed[True-None] | 13.6737ms | 13.1021ms | 76.3234 Ops/s | 77.6418 Ops/s | |
| test_cql_speed[True-backward] | 18.9705ms | 18.6165ms | 53.7157 Ops/s | 55.2597 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 17.0186ms | 13.0982ms | 76.3463 Ops/s | 77.5513 Ops/s | |
| test_a2c_speed[False-None] | 6.0164ms | 5.6227ms | 177.8504 Ops/s | 178.3288 Ops/s | |
| test_a2c_speed[False-backward] | 12.8185ms | 12.2104ms | 81.8973 Ops/s | 81.9290 Ops/s | |
| test_a2c_speed[True-None] | 4.4442ms | 3.9079ms | 255.8922 Ops/s | 259.1394 Ops/s | |
| test_a2c_speed[True-backward] | 8.9413ms | 8.7363ms | 114.4655 Ops/s | 113.7364 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.2894ms | 3.8950ms | 256.7398 Ops/s | 256.7407 Ops/s | |
| test_ppo_speed[False-None] | 6.4214ms | 6.0484ms | 165.3331 Ops/s | 161.2031 Ops/s | |
| test_ppo_speed[False-backward] | 13.2319ms | 12.9075ms | 77.4745 Ops/s | 75.9554 Ops/s | |
| test_ppo_speed[True-None] | 3.9904ms | 3.8168ms | 262.0019 Ops/s | 260.6291 Ops/s | |
| test_ppo_speed[True-backward] | 8.9585ms | 8.7252ms | 114.6099 Ops/s | 110.8398 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.0682ms | 3.7828ms | 264.3517 Ops/s | 264.0851 Ops/s | |
| test_reinforce_speed[False-None] | 4.9789ms | 4.7155ms | 212.0666 Ops/s | 214.8739 Ops/s | |
| test_reinforce_speed[False-backward] | 7.9383ms | 7.6556ms | 130.6226 Ops/s | 131.7556 Ops/s | |
| test_reinforce_speed[True-None] | 3.4493ms | 3.0200ms | 331.1229 Ops/s | 333.7912 Ops/s | |
| test_reinforce_speed[True-backward] | 8.1997ms | 7.9728ms | 125.4271 Ops/s | 119.2332 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.2070ms | 2.9943ms | 333.9625 Ops/s | 333.6264 Ops/s | |
| test_iql_speed[False-None] | 21.2184ms | 20.8190ms | 48.0330 Ops/s | 47.6478 Ops/s | |
| test_iql_speed[False-backward] | 36.2802ms | 31.9557ms | 31.2933 Ops/s | 31.2621 Ops/s | |
| test_iql_speed[True-None] | 9.2320ms | 8.8373ms | 113.1569 Ops/s | 99.2283 Ops/s | |
| test_iql_speed[True-backward] | 17.6912ms | 17.2408ms | 58.0019 Ops/s | 55.8637 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.1750ms | 8.8747ms | 112.6799 Ops/s | 108.2409 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2578ms | 6.0229ms | 166.0335 Ops/s | 165.2549 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.1724ms | 0.3658ms | 2.7341 KOps/s | 2.5307 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5974ms | 0.3520ms | 2.8408 KOps/s | 2.9705 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0904ms | 5.8078ms | 172.1820 Ops/s | 171.1770 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1304ms | 0.3399ms | 2.9418 KOps/s | 2.6606 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6638ms | 0.3256ms | 3.0710 KOps/s | 2.7385 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.7794ms | 1.4466ms | 691.2899 Ops/s | 679.3898 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6393ms | 1.3635ms | 733.3823 Ops/s | 716.1992 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.5029ms | 6.0851ms | 164.3362 Ops/s | 165.6255 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.9257ms | 0.5288ms | 1.8911 KOps/s | 1.9487 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7776ms | 0.5110ms | 1.9571 KOps/s | 2.0805 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9999ms | 5.8432ms | 171.1383 Ops/s | 169.0870 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.8632ms | 0.3423ms | 2.9218 KOps/s | 3.1961 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6005ms | 0.3652ms | 2.7385 KOps/s | 2.9074 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0549ms | 5.7823ms | 172.9401 Ops/s | 170.8234 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0208ms | 0.3411ms | 2.9314 KOps/s | 3.0477 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5535ms | 0.3418ms | 2.9261 KOps/s | 2.8930 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0886ms | 5.9456ms | 168.1904 Ops/s | 165.5206 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7258ms | 0.5258ms | 1.9020 KOps/s | 2.1277 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 7.2177ms | 0.5251ms | 1.9045 KOps/s | 2.3035 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.6660ms | 5.2917ms | 188.9740 Ops/s | 192.9879 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.1195ms | 2.0229ms | 494.3427 Ops/s | 478.0683 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.4925ms | 0.9765ms | 1.0240 KOps/s | 766.4525 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6494s | 18.4391ms | 54.2326 Ops/s | 36.3413 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 9.9765ms | 1.8922ms | 528.4747 Ops/s | 511.2541 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 9.5423ms | 1.2803ms | 781.0863 Ops/s | 806.5900 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 7.2639ms | 5.5171ms | 181.2550 Ops/s | 188.6284 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 10.0546ms | 2.0448ms | 489.0495 Ops/s | 526.2388 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.3317ms | 1.0572ms | 945.9272 Ops/s | 925.3513 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 41.2534ms | 38.1778ms | 26.1932 Ops/s | 25.3792 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.8374ms | 18.3237ms | 54.5741 Ops/s | 53.4679 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 43.9768ms | 39.1689ms | 25.5305 Ops/s | 24.4870 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.8826ms | 18.6117ms | 53.7296 Ops/s | 52.4370 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 42.4471ms | 40.9722ms | 24.4068 Ops/s | 23.2249 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.1408ms | 20.1070ms | 49.7338 Ops/s | 48.7621 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8749ms | 0.2209ms | 4.5275 KOps/s | 4.2790 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.8136ms | 1.5371ms | 650.5760 Ops/s | 634.9699 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.9052ms | 2.5062ms | 399.0176 Ops/s | 374.8167 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.5662ms | 3.1644ms | 316.0115 Ops/s | 308.6064 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.4154ms | 0.1368ms | 7.3074 KOps/s | 7.1290 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3230ms | 0.1861ms | 5.3720 KOps/s | 5.2573 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.2781ms | 1.9115ms | 523.1390 Ops/s | 526.2274 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.7387ms | 1.4248ms | 701.8670 Ops/s | 705.7421 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3358ms | 1.1285ms | 886.1029 Ops/s | 884.0787 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 4.1143ms | 3.6716ms | 272.3608 Ops/s | 269.2155 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 12.0197ms | 6.1059ms | 163.7763 Ops/s | 166.4702 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 15.4204ms | 7.4633ms | 133.9895 Ops/s | 133.6489 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4483ms | 0.2760ms | 3.6226 KOps/s | 3.4691 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.8600ms | 1.6398ms | 609.8471 Ops/s | 601.3924 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 3.2262ms | 2.6334ms | 379.7422 Ops/s | 359.8658 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.8062ms | 3.3929ms | 294.7369 Ops/s | 289.6547 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.5450ms | 33.9971ms | 29.4143 Ops/s | 29.2369 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.0800ms | 66.5561ms | 15.0249 Ops/s | 14.7586 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.3009ms | 38.6187ms | 25.8942 Ops/s | 25.8331 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.9006ms | 75.5987ms | 13.2277 Ops/s | 13.0317 Ops/s |
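
The Change column in the table above is empty in this capture. As a rough aid for reading the rows, the sketch below derives a relative change from the `Ops` and `Ops on Repo HEAD` cells. It assumes that "change" means `(Ops - Ops on Repo HEAD) / Ops on Repo HEAD`; that definition, and the helper names `parse_ops` and `relative_change`, are illustrative assumptions and not taken from the benchmark tooling.

```python
import re

# Multipliers for the throughput units that appear in the table cells.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '12.2636 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD (assumed definition)."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head


if __name__ == "__main__":
    # Values copied from the test_tensor_to_bytestream_speed[pickle] row.
    change = relative_change("12.2636 KOps/s", "12.4927 KOps/s")
    print(f"{change:+.2%}")
```

Parsing the unit matters because some rows mix scales, for example `1.0240 KOps/s` against `766.4525 Ops/s`. Single-run differences of a few percent are commonly within benchmark noise, so a value computed this way is better read as a hint than as evidence of a regression or speedup.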

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 83.1385μs | 81.5006μs | 12.2699 KOps/s | 12.4701 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1428ms | 0.1419ms | 7.0464 KOps/s | 7.1215 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1204s | 0.1199s | 8.3400 Ops/s | 8.0446 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5350μs | 2.5228μs | 396.3775 KOps/s | 407.6290 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.4013μs | 37.4170μs | 26.7258 KOps/s | 26.4363 KOps/s | |
| test_simple | 0.8159s | 0.8050s | 1.2422 Ops/s | 1.2233 Ops/s | |
| test_transformed | 1.4178s | 1.4004s | 0.7141 Ops/s | 0.7079 Ops/s | |
| test_serial | 2.3380s | 2.3324s | 0.4287 Ops/s | 0.4238 Ops/s | |
| test_parallel | 1.9186s | 1.8541s | 0.5394 Ops/s | 0.5459 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2405ms | 42.6652μs | 23.4383 KOps/s | 23.6662 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 43.4410μs | 23.1734μs | 43.1530 KOps/s | 43.7711 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 53.9910μs | 23.9510μs | 41.7520 KOps/s | 41.3991 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 43.4200μs | 12.8826μs | 77.6241 KOps/s | 78.6405 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 93.9320μs | 45.5505μs | 21.9537 KOps/s | 22.6274 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 55.7010μs | 25.9906μs | 38.4755 KOps/s | 40.1029 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 62.0010μs | 27.0411μs | 36.9807 KOps/s | 38.9001 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 39.7100μs | 15.5282μs | 64.3989 KOps/s | 65.7513 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 92.9120μs | 46.8787μs | 21.3317 KOps/s | 21.5121 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 59.0810μs | 28.0440μs | 35.6582 KOps/s | 35.3078 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 50.5210μs | 26.3241μs | 37.9880 KOps/s | 38.8483 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 47.1500μs | 15.3711μs | 65.0570 KOps/s | 65.6824 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1081ms | 49.6751μs | 20.1308 KOps/s | 20.4770 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 80.5620μs | 30.6268μs | 32.6511 KOps/s | 32.6996 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 64.7710μs | 28.4440μs | 35.1567 KOps/s | 35.1510 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 47.1910μs | 17.7238μs | 56.4212 KOps/s | 55.6720 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 80.2610μs | 46.9304μs | 21.3082 KOps/s | 21.3538 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 57.4710μs | 28.2787μs | 35.3623 KOps/s | 35.4964 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4382ms | 30.8866μs | 32.3765 KOps/s | 33.7529 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 58.9710μs | 17.0091μs | 58.7919 KOps/s | 59.0824 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 88.8010μs | 50.3180μs | 19.8736 KOps/s | 20.6255 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 57.3310μs | 30.4860μs | 32.8019 KOps/s | 33.2495 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 56.8810μs | 31.9479μs | 31.3010 KOps/s | 31.2330 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 55.3710μs | 19.5731μs | 51.0907 KOps/s | 51.7866 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 82.3210μs | 51.9445μs | 19.2513 KOps/s | 19.3672 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 58.7410μs | 33.3013μs | 30.0289 KOps/s | 30.5781 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 73.5320μs | 32.3517μs | 30.9103 KOps/s | 31.3128 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 46.8610μs | 19.7817μs | 50.5518 KOps/s | 51.6601 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 88.1720μs | 53.1943μs | 18.7990 KOps/s | 18.6970 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 61.4510μs | 35.5049μs | 28.1651 KOps/s | 28.1255 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 80.0210μs | 33.7517μs | 29.6281 KOps/s | 29.5959 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 86.2610μs | 21.7087μs | 46.0645 KOps/s | 45.5640 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7557s | 0.7454s | 1.3415 Ops/s | 1.3360 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7370s | 0.6345s | 1.5760 Ops/s | 1.6295 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7874s | 1.6882s | 0.5923 Ops/s | 0.6060 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5574s | 1.4656s | 0.6823 Ops/s | 0.6988 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0297s | 1.9301s | 0.5181 Ops/s | 0.5202 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7586s | 1.6778s | 0.5960 Ops/s | 0.5963 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7772s | 4.6279s | 0.2161 Ops/s | 0.2160 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5482s | 4.4649s | 0.2240 Ops/s | 0.2256 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9563s | 1.8831s | 0.5310 Ops/s | 0.5330 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6996s | 1.5996s | 0.6251 Ops/s | 0.6240 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 21.3194ms | 20.8821ms | 47.8879 Ops/s | 48.7839 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1281s | 3.4900ms | 286.5335 Ops/s | 281.6171 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1100ms | 83.7689μs | 11.9376 KOps/s | 12.1238 KOps/s | |
| test_values[td1_return_estimate-False-False] | 52.5008ms | 49.9453ms | 20.0219 Ops/s | 20.7409 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3698ms | 1.1004ms | 908.7562 Ops/s | 914.8068 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 85.7212ms | 81.9522ms | 12.2022 Ops/s | 12.6101 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3574ms | 1.0960ms | 912.3918 Ops/s | 915.1423 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 21.2060ms | 21.0217ms | 47.5698 Ops/s | 49.3973 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0822ms | 0.7678ms | 1.3025 KOps/s | 1.3102 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7835ms | 0.6871ms | 1.4553 KOps/s | 1.4163 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5506ms | 1.4987ms | 667.2422 Ops/s | 664.1985 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7925ms | 0.7042ms | 1.4200 KOps/s | 1.4357 KOps/s | |
| test_dqn_speed[False-None] | 1.7665ms | 1.6170ms | 618.4409 Ops/s | 627.8133 Ops/s | |
| test_dqn_speed[False-backward] | 2.3488ms | 2.2581ms | 442.8540 Ops/s | 445.6435 Ops/s | |
| test_dqn_speed[True-None] | 0.6826ms | 0.5931ms | 1.6861 KOps/s | 1.6944 KOps/s | |
| test_dqn_speed[True-backward] | 1.2862ms | 1.2332ms | 810.9074 Ops/s | 796.9822 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6671ms | 0.6050ms | 1.6528 KOps/s | 1.5847 KOps/s | |
| test_ddpg_speed[False-None] | 3.4684ms | 3.0885ms | 323.7845 Ops/s | 332.0451 Ops/s | |
| test_ddpg_speed[False-backward] | 4.8804ms | 4.4635ms | 224.0413 Ops/s | 225.1448 Ops/s | |
| test_ddpg_speed[True-None] | 1.5220ms | 1.3802ms | 724.5573 Ops/s | 732.2351 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5579ms | 2.5154ms | 397.5493 Ops/s | 390.0002 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4475ms | 1.3781ms | 725.6331 Ops/s | 713.9245 Ops/s | |
| test_sac_speed[False-None] | 8.9701ms | 8.5860ms | 116.4692 Ops/s | 117.4381 Ops/s | |
| test_sac_speed[False-backward] | 12.3433ms | 11.8999ms | 84.0343 Ops/s | 84.4302 Ops/s | |
| test_sac_speed[True-None] | 2.0765ms | 1.8800ms | 531.9170 Ops/s | 533.2166 Ops/s | |
| test_sac_speed[True-backward] | 3.7303ms | 3.6613ms | 273.1240 Ops/s | 271.6886 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 16.7088ms | 10.1502ms | 98.5207 Ops/s | 98.3303 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.5161ms | 9.6137ms | 104.0180 Ops/s | 103.6348 Ops/s | |
| test_redq_deprec_speed[False-backward] | 14.1706ms | 13.0716ms | 76.5017 Ops/s | 76.6229 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6619ms | 2.5728ms | 388.6848 Ops/s | 383.0442 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.5713ms | 4.2715ms | 234.1124 Ops/s | 233.6874 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.6136ms | 9.6570ms | 103.5523 Ops/s | 103.6972 Ops/s | |
| test_td3_speed[False-None] | 8.5438ms | 8.4237ms | 118.7121 Ops/s | 119.0333 Ops/s | |
| test_td3_speed[False-backward] | 11.5493ms | 11.1071ms | 90.0326 Ops/s | 90.9403 Ops/s | |
| test_td3_speed[True-None] | 1.7362ms | 1.6621ms | 601.6371 Ops/s | 576.2446 Ops/s | |
| test_td3_speed[True-backward] | 3.2502ms | 3.1661ms | 315.8418 Ops/s | 314.6931 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 84.2311ms | 25.7979ms | 38.7629 Ops/s | 37.8999 Ops/s | |
| test_cql_speed[False-None] | 18.2195ms | 17.9318ms | 55.7668 Ops/s | 54.3315 Ops/s | |
| test_cql_speed[False-backward] | 23.7387ms | 23.2958ms | 42.9262 Ops/s | 42.3853 Ops/s | |
| test_cql_speed[True-None] | 3.5404ms | 3.3376ms | 299.6126 Ops/s | 298.9628 Ops/s | |
| test_cql_speed[True-backward] | 5.8488ms | 5.4646ms | 182.9975 Ops/s | 181.7242 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 19.0456ms | 11.9025ms | 84.0159 Ops/s | 82.6856 Ops/s | |
| test_a2c_speed[False-None] | 3.4710ms | 3.3847ms | 295.4498 Ops/s | 296.1130 Ops/s | |
| test_a2c_speed[False-backward] | 6.8545ms | 6.3635ms | 157.1457 Ops/s | 152.3281 Ops/s | |
| test_a2c_speed[True-None] | 1.6089ms | 1.4033ms | 712.6160 Ops/s | 709.7726 Ops/s | |
| test_a2c_speed[True-backward] | 3.1286ms | 3.0325ms | 329.7627 Ops/s | 308.2206 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.1567ms | 1.0509ms | 951.6033 Ops/s | 957.1123 Ops/s | |
| test_ppo_speed[False-None] | 4.4034ms | 4.0615ms | 246.2153 Ops/s | 250.1731 Ops/s | |
| test_ppo_speed[False-backward] | 7.7548ms | 7.3245ms | 136.5281 Ops/s | 134.7924 Ops/s | |
| test_ppo_speed[True-None] | 1.9517ms | 1.5227ms | 656.7477 Ops/s | 654.3025 Ops/s | |
| test_ppo_speed[True-backward] | 3.2274ms | 3.1833ms | 314.1397 Ops/s | 295.7119 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.2804ms | 1.1105ms | 900.4734 Ops/s | 890.6279 Ops/s | |
| test_reinforce_speed[False-None] | 2.7494ms | 2.4552ms | 407.2998 Ops/s | 417.0241 Ops/s | |
| test_reinforce_speed[False-backward] | 4.0659ms | 3.5035ms | 285.4304 Ops/s | 292.4985 Ops/s | |
| test_reinforce_speed[True-None] | 1.5634ms | 1.3867ms | 721.1116 Ops/s | 733.2089 Ops/s | |
| test_reinforce_speed[True-backward] | 3.5980ms | 3.0796ms | 324.7163 Ops/s | 324.3300 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 0.6670s | 10.5176ms | 95.0791 Ops/s | 112.8671 Ops/s | |
| test_iql_speed[False-None] | 10.4424ms | 9.8705ms | 101.3121 Ops/s | 102.1841 Ops/s | |
| test_iql_speed[False-backward] | 14.1699ms | 13.7012ms | 72.9862 Ops/s | 73.9143 Ops/s | |
| test_iql_speed[True-None] | 2.4800ms | 2.2733ms | 439.8882 Ops/s | 433.5256 Ops/s | |
| test_iql_speed[True-backward] | 5.3056ms | 4.8360ms | 206.7835 Ops/s | 199.5077 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.3138ms | 10.0324ms | 99.6772 Ops/s | 100.1626 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1486ms | 5.9330ms | 168.5500 Ops/s | 166.2497 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8002ms | 0.3789ms | 2.6389 KOps/s | 2.6215 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7305ms | 0.3627ms | 2.7569 KOps/s | 2.6787 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9668ms | 5.7642ms | 173.4842 Ops/s | 172.6990 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.3081ms | 0.3732ms | 2.6796 KOps/s | 2.8816 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6503ms | 0.3539ms | 2.8260 KOps/s | 2.9859 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5872ms | 1.3100ms | 763.3433 Ops/s | 678.8185 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5260ms | 1.2263ms | 815.4289 Ops/s | 758.8821 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0264ms | 5.9003ms | 169.4823 Ops/s | 167.7356 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9233ms | 0.4922ms | 2.0315 KOps/s | 2.1371 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6638ms | 0.4379ms | 2.2836 KOps/s | 1.9808 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9611ms | 5.7985ms | 172.4585 Ops/s | 171.1523 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0442ms | 0.3416ms | 2.9274 KOps/s | 2.8018 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5425ms | 0.3149ms | 3.1754 KOps/s | 2.7668 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0027ms | 5.7356ms | 174.3487 Ops/s | 174.5731 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1850ms | 0.3974ms | 2.5162 KOps/s | 3.1921 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6101ms | 0.3895ms | 2.5673 KOps/s | 3.2238 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1479ms | 5.9713ms | 167.4684 Ops/s | 167.5928 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9884ms | 0.4648ms | 2.1515 KOps/s | 1.8650 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6903ms | 0.4784ms | 2.0905 KOps/s | 1.9527 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.5426ms | 5.0581ms | 197.7026 Ops/s | 36.0925 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 10.2154ms | 2.0833ms | 480.0180 Ops/s | 511.3197 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.5302ms | 1.0379ms | 963.4990 Ops/s | 727.8069 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 7.8783ms | 5.0990ms | 196.1152 Ops/s | 194.9859 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 4.0978ms | 1.9056ms | 524.7657 Ops/s | 503.6732 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.4125ms | 1.0051ms | 994.9074 Ops/s | 716.3584 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.7128s | 19.4882ms | 51.3131 Ops/s | 187.5825 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 12.7513ms | 2.2342ms | 447.5923 Ops/s | 464.2981 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.4684ms | 1.1779ms | 848.9349 Ops/s | 841.0059 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 41.4882ms | 39.5045ms | 25.3136 Ops/s | 25.9579 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.9928ms | 18.5344ms | 53.9538 Ops/s | 55.1862 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 44.2446ms | 40.3051ms | 24.8108 Ops/s | 24.9445 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.2331ms | 18.6531ms | 53.6105 Ops/s | 53.5958 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 43.5454ms | 41.6466ms | 24.0115 Ops/s | 23.8775 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.2802ms | 20.1447ms | 49.6409 Ops/s | 48.9858 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.9379ms | 0.2321ms | 4.3092 KOps/s | 4.3137 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.9009ms | 1.4357ms | 696.5212 Ops/s | 710.2852 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.6906ms | 2.4771ms | 403.6923 Ops/s | 412.0793 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.4146ms | 3.1563ms | 316.8250 Ops/s | 340.6237 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.3370ms | 0.1670ms | 5.9864 KOps/s | 6.1036 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3726ms | 0.2319ms | 4.3129 KOps/s | 3.5055 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.1013ms | 1.9142ms | 522.4037 Ops/s | 528.8546 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.6678ms | 1.4590ms | 685.3954 Ops/s | 735.5493 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3330ms | 1.1538ms | 866.7148 Ops/s | 858.0837 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 7.6857ms | 3.7817ms | 264.4336 Ops/s | 270.7174 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.3811ms | 5.8600ms | 170.6473 Ops/s | 164.1216 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.6926ms | 7.1713ms | 139.4441 Ops/s | 137.7755 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.7245ms | 0.2852ms | 3.5069 KOps/s | 3.5808 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.9999ms | 1.5466ms | 646.5817 Ops/s | 640.6801 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 3.0157ms | 2.4260ms | 412.2053 Ops/s | 382.2278 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.6543ms | 3.2764ms | 305.2103 Ops/s | 308.2788 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.4610ms | 33.7208ms | 29.6553 Ops/s | 29.9863 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.7161ms | 65.5094ms | 15.2650 Ops/s | 15.2307 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.8090ms | 38.0419ms | 26.2868 Ops/s | 26.0193 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 75.5724ms | 74.4618ms | 13.4297 Ops/s | 13.4693 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 58.8173ms | 58.3348ms | 17.1424 Ops/s | 17.8625 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1169s | 0.1156s | 8.6510 Ops/s | 8.9043 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 60.5829ms | 59.8044ms | 16.7212 Ops/s | 17.0941 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1203s | 0.1188s | 8.4167 Ops/s | 8.6533 Ops/s |
We don't want that.
The proper way to do auto-reset with torch.compile is to ALWAYS compute the reset and then mask the reset and non-reset entries in tensordict_ using torch.where. We should have a toy auto-reset env that we can compile as an example, and we should discuss how to make an extension point out of this, but we should NOT skip maybe_reset entirely. I would rather make it a no-op when auto-reset is used with the masking described above, or implement the masking in maybe_reset itself, which I would find more natural.
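To make the masking idea concrete, here is a minimal sketch on plain tensors. It is not TorchRL code: the function name `masked_auto_reset` and the tensor names are hypothetical, and a real version would apply the same selection to each entry of the tensordict. The point is only that the reset values are computed for every env, and `torch.where` picks between them without any data-dependent branching.

```python
import torch


def masked_auto_reset(next_obs, reset_obs, done):
    """Pick reset values where done is True and stepped values elsewhere.

    Hypothetical helper for illustration; not part of TorchRL.
    """
    # Add trailing singleton dims so the per-env flag broadcasts over
    # the observation dims.
    mask = done.reshape(done.shape + (1,) * (next_obs.ndim - done.ndim))
    return torch.where(mask, reset_obs, next_obs)


# Three envs with a 2-dim observation each.
next_obs = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# The reset is always computed for every env, even those that are not done.
reset_obs = torch.zeros_like(next_obs)
done = torch.tensor([False, True, False])

out = masked_auto_reset(next_obs, reset_obs, done)
```

Only the second env is done, so only its row is replaced by the reset observation. Because there is no Python-level `if done.any()`, the same graph runs at every step, which is what makes this pattern a better fit for compilation than a conditional reset.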
Stack from ghstack (oldest at bottom):
For environments that handle resets internally in _step() (e.g., GPU-batched
auto-resetting envs), the maybe_reset() call in step_and_maybe_reset() is
redundant overhead: it checks done flags, clones tensors, and potentially
calls reset() again. Adding _skip_maybe_reset = True on such envs skips
this entire codepath.
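For readers following the discussion, the mechanism described above can be sketched with a toy, pure-Python env. This is an illustration under stated assumptions, not the TorchRL implementation: `ToyEnv` and `SelfResettingEnv` are hypothetical classes, and only the attribute name `_skip_maybe_reset` and the method names `step_and_maybe_reset`, `maybe_reset`, `reset` and `_step` come from the description.

```python
class ToyEnv:
    # Default: keep the auto-reset pass after every step.
    _skip_maybe_reset = False

    def __init__(self):
        self.reset_calls = 0
        self.t = 0

    def reset(self):
        self.reset_calls += 1
        self.t = 0
        return self.t

    def _step(self):
        self.t += 1
        done = self.t >= 2  # toy episode length of 2 steps
        return self.t, done

    def maybe_reset(self, obs, done):
        # Checks the done flag and calls reset() when needed.
        return self.reset() if done else obs

    def step_and_maybe_reset(self):
        obs, done = self._step()
        if getattr(self, "_skip_maybe_reset", False):
            # The env resets itself in _step(), so skip the extra pass.
            return obs, obs
        return obs, self.maybe_reset(obs, done)


class SelfResettingEnv(ToyEnv):
    # Opt out of maybe_reset(): resets happen inside _step().
    _skip_maybe_reset = True

    def _step(self):
        obs, done = super()._step()
        if done:
            self.t = 0  # internal reset, no call to reset()
            obs = self.t
        return obs, done


plain, auto = ToyEnv(), SelfResettingEnv()
for _ in range(4):
    plain.step_and_maybe_reset()
    auto.step_and_maybe_reset()
```

After four steps the plain env has gone through `reset()` twice, while the self-resetting env never calls it.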
Made-with: Cursor