[Performance] Add out= parameter to _StepMDP for output buffer reuse #3561
Closed
vmoens wants to merge 2 commits into gh/vmoens/242/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3561
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV
There is 1 currently active SEV. If your PR is affected, please view it below:
⏳ No Failures, 14 Pending
As of commit bb2c911 with merge base a4301ee.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This was referenced Mar 23, 2026
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 85.7906μs | 84.0190μs | 11.9021 KOps/s | 11.7690 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1456ms | 0.1439ms | 6.9495 KOps/s | 6.8857 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1112s | 0.1108s | 9.0237 Ops/s | 8.8025 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.6645μs | 2.6502μs | 377.3291 KOps/s | 372.1609 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 39.1302μs | 38.0561μs | 26.2770 KOps/s | 25.5326 KOps/s | |
| test_simple | 0.6733s | 0.5728s | 1.7457 Ops/s | 1.7410 Ops/s | |
| test_transformed | 1.1008s | 1.0996s | 0.9094 Ops/s | 0.8941 Ops/s | |
| test_serial | 1.7108s | 1.7087s | 0.5852 Ops/s | 0.5809 Ops/s | |
| test_parallel | 1.0271s | 1.0238s | 0.9768 Ops/s | 0.9475 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2138ms | 41.8582μs | 23.8902 KOps/s | 23.9513 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 0.1050ms | 23.1687μs | 43.1616 KOps/s | 43.2266 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 94.2820μs | 24.0301μs | 41.6145 KOps/s | 41.6271 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 35.4510μs | 13.0172μs | 76.8214 KOps/s | 77.9363 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 73.3320μs | 45.2559μs | 22.0966 KOps/s | 22.7804 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 56.5610μs | 25.8053μs | 38.7518 KOps/s | 38.8720 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 52.8710μs | 26.8617μs | 37.2277 KOps/s | 37.4986 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 39.8010μs | 15.7633μs | 63.4386 KOps/s | 64.6036 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 79.3820μs | 47.9195μs | 20.8683 KOps/s | 21.2188 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 66.1210μs | 28.6203μs | 34.9402 KOps/s | 35.4003 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 57.5920μs | 26.9085μs | 37.1630 KOps/s | 38.1802 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 45.3310μs | 15.4691μs | 64.6450 KOps/s | 64.5625 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 90.6120μs | 49.3912μs | 20.2465 KOps/s | 20.5040 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 98.8520μs | 30.6574μs | 32.6186 KOps/s | 32.9387 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 58.7010μs | 29.3276μs | 34.0976 KOps/s | 34.2655 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 47.4410μs | 18.2212μs | 54.8812 KOps/s | 56.2947 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 84.0110μs | 47.6325μs | 20.9941 KOps/s | 21.1355 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 57.8010μs | 28.1871μs | 35.4772 KOps/s | 35.5675 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.3418ms | 30.7873μs | 32.4809 KOps/s | 33.3710 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 53.1820μs | 17.1514μs | 58.3044 KOps/s | 58.3944 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 84.9720μs | 49.5239μs | 20.1923 KOps/s | 20.2514 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 80.5620μs | 31.0168μs | 32.2406 KOps/s | 32.5933 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 67.6310μs | 32.5521μs | 30.7199 KOps/s | 30.8554 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 49.4910μs | 19.7472μs | 50.6402 KOps/s | 51.6852 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 86.8820μs | 53.8085μs | 18.5844 KOps/s | 19.4556 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.1056ms | 33.2920μs | 30.0373 KOps/s | 30.3288 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 65.0010μs | 32.2537μs | 31.0042 KOps/s | 31.4735 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 48.2410μs | 19.6279μs | 50.9480 KOps/s | 51.5356 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 87.0510μs | 55.6706μs | 17.9628 KOps/s | 18.4551 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 66.0310μs | 35.9124μs | 27.8455 KOps/s | 27.8330 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 70.4710μs | 34.2508μs | 29.1964 KOps/s | 28.3574 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 52.1210μs | 22.4166μs | 44.6097 KOps/s | 44.7355 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7339s | 0.7300s | 1.3698 Ops/s | 1.3250 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7250s | 0.6180s | 1.6180 Ops/s | 1.6201 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7543s | 1.6623s | 0.6016 Ops/s | 0.6011 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5264s | 1.4414s | 0.6938 Ops/s | 0.6921 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 2.0078s | 1.9217s | 0.5204 Ops/s | 0.5216 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7797s | 1.6940s | 0.5903 Ops/s | 0.5922 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7256s | 4.6674s | 0.2143 Ops/s | 0.2151 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5471s | 4.4136s | 0.2266 Ops/s | 0.2253 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0162s | 1.9117s | 0.5231 Ops/s | 0.5303 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6821s | 1.5892s | 0.6292 Ops/s | 0.5942 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.5604ms | 10.3470ms | 96.6466 Ops/s | 99.5685 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 19.8090ms | 17.6399ms | 56.6897 Ops/s | 56.6992 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2351ms | 0.1300ms | 7.6915 KOps/s | 7.6104 KOps/s | |
| test_values[td1_return_estimate-False-False] | 28.6411ms | 28.2552ms | 35.3917 Ops/s | 35.6798 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 20.6756ms | 17.9120ms | 55.8285 Ops/s | 56.0166 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 42.4328ms | 41.7016ms | 23.9799 Ops/s | 24.5479 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.1035ms | 17.7011ms | 56.4936 Ops/s | 56.5468 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.2857ms | 9.2057ms | 108.6287 Ops/s | 112.0663 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7812ms | 1.5576ms | 642.0233 Ops/s | 641.5771 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4974ms | 0.4319ms | 2.3156 KOps/s | 2.3150 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.7709ms | 34.8305ms | 28.7104 Ops/s | 28.3177 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1.8518ms | 1.7252ms | 579.6440 Ops/s | 580.0244 Ops/s | |
| test_dqn_speed[False-None] | 1.5059ms | 1.4111ms | 708.6610 Ops/s | 689.8803 Ops/s | |
| test_dqn_speed[False-backward] | 2.0390ms | 1.9517ms | 512.3798 Ops/s | 505.2917 Ops/s | |
| test_dqn_speed[True-None] | 0.8071ms | 0.5738ms | 1.7427 KOps/s | 1.7222 KOps/s | |
| test_dqn_speed[True-backward] | 1.1136ms | 1.0564ms | 946.6465 Ops/s | 840.8758 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7334ms | 0.5611ms | 1.7821 KOps/s | 1.7493 KOps/s | |
| test_ddpg_speed[False-None] | 3.2285ms | 2.8900ms | 346.0239 Ops/s | 350.1665 Ops/s | |
| test_ddpg_speed[False-backward] | 4.2350ms | 4.0975ms | 244.0538 Ops/s | 243.3542 Ops/s | |
| test_ddpg_speed[True-None] | 1.6233ms | 1.4746ms | 678.1558 Ops/s | 666.2942 Ops/s | |
| test_ddpg_speed[True-backward] | 2.5638ms | 2.5046ms | 399.2625 Ops/s | 393.7675 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.7383ms | 1.4677ms | 681.3341 Ops/s | 689.1228 Ops/s | |
| test_sac_speed[False-None] | 8.6895ms | 8.1651ms | 122.4729 Ops/s | 121.6897 Ops/s | |
| test_sac_speed[False-backward] | 12.0005ms | 11.4659ms | 87.2154 Ops/s | 85.5290 Ops/s | |
| test_sac_speed[True-None] | 2.3931ms | 2.2580ms | 442.8754 Ops/s | 451.6738 Ops/s | |
| test_sac_speed[True-backward] | 4.3135ms | 4.2130ms | 237.3602 Ops/s | 244.5113 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.4355ms | 2.2435ms | 445.7400 Ops/s | 454.1539 Ops/s | |
| test_redq_speed[False-None] | 15.8068ms | 10.8729ms | 91.9715 Ops/s | 91.6400 Ops/s | |
| test_redq_speed[False-backward] | 22.3640ms | 18.3574ms | 54.4739 Ops/s | 54.4496 Ops/s | |
| test_redq_speed[True-None] | 4.9442ms | 4.6792ms | 213.7127 Ops/s | 210.1151 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 5.0361ms | 4.6026ms | 217.2662 Ops/s | 207.3352 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.9427ms | 11.3576ms | 88.0464 Ops/s | 88.6460 Ops/s | |
| test_redq_deprec_speed[False-backward] | 17.0402ms | 16.4416ms | 60.8214 Ops/s | 61.3382 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.9795ms | 3.7516ms | 266.5523 Ops/s | 270.6746 Ops/s | |
| test_redq_deprec_speed[True-backward] | 8.1867ms | 7.7449ms | 129.1171 Ops/s | 130.9711 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.0777ms | 3.7150ms | 269.1776 Ops/s | 271.9465 Ops/s | |
| test_td3_speed[False-None] | 9.1648ms | 8.2837ms | 120.7194 Ops/s | 122.4444 Ops/s | |
| test_td3_speed[False-backward] | 11.4222ms | 11.1284ms | 89.8604 Ops/s | 90.7265 Ops/s | |
| test_td3_speed[True-None] | 1.9607ms | 1.8915ms | 528.6686 Ops/s | 533.7130 Ops/s | |
| test_td3_speed[True-backward] | 3.8928ms | 3.6689ms | 272.5633 Ops/s | 273.7127 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.9240ms | 1.8481ms | 541.0923 Ops/s | 550.2645 Ops/s | |
| test_cql_speed[False-None] | 30.6340ms | 26.8677ms | 37.2195 Ops/s | 37.7917 Ops/s | |
| test_cql_speed[False-backward] | 39.9192ms | 36.1357ms | 27.6734 Ops/s | 27.5047 Ops/s | |
| test_cql_speed[True-None] | 15.5284ms | 12.9687ms | 77.1085 Ops/s | 78.3082 Ops/s | |
| test_cql_speed[True-backward] | 18.8255ms | 18.2790ms | 54.7076 Ops/s | 55.9372 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 13.3764ms | 12.9476ms | 77.2344 Ops/s | 80.1344 Ops/s | |
| test_a2c_speed[False-None] | 5.9292ms | 5.5396ms | 180.5197 Ops/s | 183.2923 Ops/s | |
| test_a2c_speed[False-backward] | 12.5332ms | 12.0710ms | 82.8433 Ops/s | 83.9974 Ops/s | |
| test_a2c_speed[True-None] | 4.1635ms | 3.8724ms | 258.2390 Ops/s | 256.0750 Ops/s | |
| test_a2c_speed[True-backward] | 9.1611ms | 8.9084ms | 112.2542 Ops/s | 107.2174 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.1805ms | 3.8383ms | 260.5322 Ops/s | 259.7826 Ops/s | |
| test_ppo_speed[False-None] | 6.5488ms | 5.9663ms | 167.6082 Ops/s | 166.5271 Ops/s | |
| test_ppo_speed[False-backward] | 12.8723ms | 12.5977ms | 79.3798 Ops/s | 78.8960 Ops/s | |
| test_ppo_speed[True-None] | 4.0517ms | 3.8487ms | 259.8250 Ops/s | 263.3360 Ops/s | |
| test_ppo_speed[True-backward] | 9.1091ms | 8.8019ms | 113.6121 Ops/s | 112.5167 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.1814ms | 3.8205ms | 261.7435 Ops/s | 263.5204 Ops/s | |
| test_reinforce_speed[False-None] | 4.7722ms | 4.5906ms | 217.8376 Ops/s | 219.1425 Ops/s | |
| test_reinforce_speed[False-backward] | 8.0172ms | 7.5052ms | 133.2410 Ops/s | 133.8750 Ops/s | |
| test_reinforce_speed[True-None] | 3.5954ms | 3.0753ms | 325.1678 Ops/s | 332.1843 Ops/s | |
| test_reinforce_speed[True-backward] | 8.5095ms | 8.0786ms | 123.7831 Ops/s | 122.4166 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.2707ms | 3.0249ms | 330.5892 Ops/s | 325.9361 Ops/s | |
| test_iql_speed[False-None] | 21.2430ms | 20.2535ms | 49.3743 Ops/s | 47.9400 Ops/s | |
| test_iql_speed[False-backward] | 32.7698ms | 30.9613ms | 32.2983 Ops/s | 32.3227 Ops/s | |
| test_iql_speed[True-None] | 9.3127ms | 8.7911ms | 113.7511 Ops/s | 114.8077 Ops/s | |
| test_iql_speed[True-backward] | 17.9517ms | 17.1864ms | 58.1856 Ops/s | 59.2773 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.2170ms | 8.7074ms | 114.8454 Ops/s | 115.6028 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2448ms | 6.0963ms | 164.0345 Ops/s | 162.8184 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.9911ms | 0.3019ms | 3.3121 KOps/s | 2.9692 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5337ms | 0.2747ms | 3.6402 KOps/s | 3.1033 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1563ms | 5.8557ms | 170.7738 Ops/s | 169.1058 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.7058ms | 0.2917ms | 3.4281 KOps/s | 3.0454 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4918ms | 0.2779ms | 3.5990 KOps/s | 3.1749 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5111ms | 1.2910ms | 774.6128 Ops/s | 744.1092 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4105ms | 1.2128ms | 824.5263 Ops/s | 792.8441 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.1477ms | 6.1226ms | 163.3290 Ops/s | 165.3298 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9029ms | 0.4932ms | 2.0276 KOps/s | 2.2152 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8087ms | 0.4834ms | 2.0685 KOps/s | 2.3037 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9302ms | 5.8454ms | 171.0758 Ops/s | 170.4633 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.6260ms | 0.2948ms | 3.3918 KOps/s | 2.8039 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4652ms | 0.2726ms | 3.6687 KOps/s | 2.9738 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1477ms | 5.8101ms | 172.1143 Ops/s | 169.5301 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1162ms | 0.2900ms | 3.4477 KOps/s | 2.7117 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4785ms | 0.2696ms | 3.7089 KOps/s | 2.9817 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1764ms | 6.0170ms | 166.1958 Ops/s | 166.0060 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9248ms | 0.4911ms | 2.0364 KOps/s | 1.9455 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7363ms | 0.4613ms | 2.1677 KOps/s | 2.1488 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4637ms | 5.0767ms | 196.9765 Ops/s | 49.2032 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.0429ms | 1.9963ms | 500.9223 Ops/s | 480.3819 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.0968ms | 0.9092ms | 1.0998 KOps/s | 897.3809 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6334s | 17.7982ms | 56.1853 Ops/s | 194.4992 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.8095ms | 1.8542ms | 539.3207 Ops/s | 563.9204 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.0093ms | 1.0986ms | 910.2235 Ops/s | 775.0537 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.5067ms | 5.3420ms | 187.1944 Ops/s | 186.0174 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 8.7413ms | 2.0583ms | 485.8386 Ops/s | 532.8509 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.1607ms | 1.2436ms | 804.1253 Ops/s | 897.8005 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 45.1670ms | 40.2258ms | 24.8596 Ops/s | 24.6188 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.8061ms | 18.5335ms | 53.9564 Ops/s | 53.1865 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 46.3212ms | 41.4054ms | 24.1515 Ops/s | 23.9193 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.3752ms | 18.9234ms | 52.8447 Ops/s | 53.0760 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 45.8199ms | 43.7863ms | 22.8382 Ops/s | 22.8524 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.9346ms | 20.8060ms | 48.0631 Ops/s | 48.9276 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8864ms | 0.2342ms | 4.2702 KOps/s | 4.4341 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.8708ms | 1.5127ms | 661.0860 Ops/s | 702.5375 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.6190ms | 2.4136ms | 414.3249 Ops/s | 413.7479 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.4093ms | 3.1296ms | 319.5310 Ops/s | 337.1785 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2099ms | 0.1394ms | 7.1760 KOps/s | 7.1225 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3214ms | 0.1935ms | 5.1692 KOps/s | 5.4141 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.1604ms | 1.8841ms | 530.7639 Ops/s | 571.4324 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.6954ms | 1.4087ms | 709.8889 Ops/s | 767.6196 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2670ms | 1.1441ms | 874.0681 Ops/s | 879.1337 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.9096ms | 3.6872ms | 271.2087 Ops/s | 276.7493 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 10.2102ms | 5.8679ms | 170.4194 Ops/s | 178.3045 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 15.5368ms | 7.3857ms | 135.3959 Ops/s | 143.5517 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4688ms | 0.2920ms | 3.4243 KOps/s | 3.5774 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.8689ms | 1.6194ms | 617.5014 Ops/s | 641.6081 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.8441ms | 2.5588ms | 390.8061 Ops/s | 392.7740 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.5837ms | 3.2931ms | 303.6646 Ops/s | 315.0862 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.2180ms | 33.6906ms | 29.6819 Ops/s | 29.8793 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.9777ms | 66.2043ms | 15.1048 Ops/s | 15.2401 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.9950ms | 38.3620ms | 26.0675 Ops/s | 26.2389 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.8274ms | 75.0421ms | 13.3259 Ops/s | 13.3586 Ops/s |
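
The Change column is empty in this capture. As a rough aid for reading the table, the sketch below shows one way a relative change could be derived from the Ops and Ops on Repo HEAD columns. The helper name `relative_change` is hypothetical and not part of the benchmark tooling, and the formula (PR throughput relative to HEAD) is an assumption about what the column is meant to report. The sample values are copied from rows above, with KOps/s converted to Ops/s.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Fractional throughput change of the PR run relative to repo HEAD.

    Positive means the PR run was faster (more ops/s) than HEAD.
    Hypothetical helper: the real Change column may be computed differently.
    """
    if ops_head <= 0:
        raise ValueError("ops_head must be positive")
    return (ops_pr - ops_head) / ops_head


# (name, Ops on this PR, Ops on Repo HEAD), both in Ops/s, taken from the table above.
rows = [
    ("test_step_mdp_speed[True-True-True-True-True]", 23890.2, 23951.3),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]",
     196.9765, 49.2032),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]",
     56.1853, 194.4992),
]

for name, ops_pr, ops_head in rows:
    # Format as a signed percentage, e.g. "+1.2%" or "-3.4%".
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1%}")
```

A single benchmark run is noisy, so a large swing in one row is better treated as a prompt to re-run than as evidence of a regression or speedup.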
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.2061μs | 81.3948μs | 12.2858 KOps/s | 12.2276 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1506ms | 0.1487ms | 6.7248 KOps/s | 7.0173 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1188s | 0.1185s | 8.4361 Ops/s | 8.2795 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5125μs | 2.5084μs | 398.6571 KOps/s | 395.6212 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.1004μs | 36.8819μs | 27.1135 KOps/s | 27.0615 KOps/s | |
| test_simple | 0.8132s | 0.8002s | 1.2497 Ops/s | 1.2090 Ops/s | |
| test_transformed | 1.3998s | 1.3987s | 0.7149 Ops/s | 0.7080 Ops/s | |
| test_serial | 2.4574s | 2.3700s | 0.4219 Ops/s | 0.4192 Ops/s | |
| test_parallel | 1.9333s | 1.8792s | 0.5321 Ops/s | 0.5487 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1688ms | 42.0165μs | 23.8002 KOps/s | 23.6964 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 57.8740μs | 22.9129μs | 43.6435 KOps/s | 43.0469 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 59.8130μs | 23.1203μs | 43.2520 KOps/s | 42.8717 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 42.1330μs | 12.7176μs | 78.6310 KOps/s | 78.0322 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 78.2850μs | 44.2376μs | 22.6052 KOps/s | 22.1760 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 58.6640μs | 25.5491μs | 39.1403 KOps/s | 38.8396 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 54.0530μs | 25.7454μs | 38.8419 KOps/s | 38.2686 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 52.8230μs | 15.3020μs | 65.3511 KOps/s | 64.3435 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 79.7550μs | 46.0832μs | 21.6999 KOps/s | 21.5996 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 58.7840μs | 28.2841μs | 35.3555 KOps/s | 35.6327 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.2930μs | 25.5939μs | 39.0717 KOps/s | 37.9379 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 47.0730μs | 15.2494μs | 65.5762 KOps/s | 65.2903 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 79.6450μs | 49.1063μs | 20.3640 KOps/s | 20.4559 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 63.1640μs | 30.2491μs | 33.0589 KOps/s | 32.1612 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 59.2040μs | 28.3910μs | 35.2225 KOps/s | 34.8156 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 47.3130μs | 17.6171μs | 56.7632 KOps/s | 55.8781 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 81.9240μs | 46.9618μs | 21.2939 KOps/s | 21.1186 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 63.3230μs | 27.8784μs | 35.8701 KOps/s | 35.2236 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.5122ms | 29.7716μs | 33.5891 KOps/s | 33.1040 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 48.4630μs | 17.1018μs | 58.4733 KOps/s | 58.7743 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 79.7940μs | 49.0506μs | 20.3871 KOps/s | 20.2514 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 64.1740μs | 30.6387μs | 32.6384 KOps/s | 33.1177 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 61.6640μs | 31.7842μs | 31.4622 KOps/s | 31.2170 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 60.3040μs | 19.3638μs | 51.6427 KOps/s | 52.2289 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 84.9650μs | 51.6157μs | 19.3739 KOps/s | 19.1052 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 64.5240μs | 33.6370μs | 29.7292 KOps/s | 30.4645 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 71.1540μs | 32.0997μs | 31.1529 KOps/s | 30.9651 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 49.8930μs | 19.3643μs | 51.6415 KOps/s | 51.8894 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 91.2550μs | 55.1574μs | 18.1299 KOps/s | 18.6104 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 78.6450μs | 36.0440μs | 27.7439 KOps/s | 28.2113 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.1047ms | 34.2862μs | 29.1662 KOps/s | 29.0991 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 50.2530μs | 22.0058μs | 45.4425 KOps/s | 45.6486 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8527s | 0.7514s | 1.3309 Ops/s | 1.3405 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7103s | 0.6084s | 1.6438 Ops/s | 1.6404 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7432s | 1.6553s | 0.6041 Ops/s | 0.6055 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5092s | 1.4264s | 0.7011 Ops/s | 0.7006 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9813s | 1.8968s | 0.5272 Ops/s | 0.5274 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7579s | 1.6787s | 0.5957 Ops/s | 0.5987 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7145s | 4.5968s | 0.2175 Ops/s | 0.2159 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.4923s | 4.4425s | 0.2251 Ops/s | 0.2251 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9674s | 1.8967s | 0.5272 Ops/s | 0.5292 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7285s | 1.6273s | 0.6145 Ops/s | 0.6258 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 23.1400ms | 21.2675ms | 47.0200 Ops/s | 46.7034 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1337s | 3.6082ms | 277.1493 Ops/s | 280.2886 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1084ms | 84.6432μs | 11.8143 KOps/s | 11.8446 KOps/s | |
| test_values[td1_return_estimate-False-False] | 50.9713ms | 49.8219ms | 20.0715 Ops/s | 20.1528 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3829ms | 1.1152ms | 896.6628 Ops/s | 907.0700 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 84.1277ms | 81.9853ms | 12.1973 Ops/s | 12.3752 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3377ms | 1.1006ms | 908.6202 Ops/s | 909.8099 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 21.2771ms | 21.1250ms | 47.3373 Ops/s | 47.3050 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0496ms | 0.7731ms | 1.2935 KOps/s | 1.2938 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8390ms | 0.6916ms | 1.4460 KOps/s | 1.4441 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5354ms | 1.5044ms | 664.6974 Ops/s | 665.7213 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7852ms | 0.7081ms | 1.4123 KOps/s | 1.4130 KOps/s | |
| test_dqn_speed[False-None] | 1.7124ms | 1.6258ms | 615.0795 Ops/s | 621.5379 Ops/s | |
| test_dqn_speed[False-backward] | 2.6423ms | 2.2752ms | 439.5141 Ops/s | 439.6583 Ops/s | |
| test_dqn_speed[True-None] | 0.7681ms | 0.5925ms | 1.6879 KOps/s | 1.6765 KOps/s | |
| test_dqn_speed[True-backward] | 1.3881ms | 1.2438ms | 803.9649 Ops/s | 809.9802 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7677ms | 0.6365ms | 1.5711 KOps/s | 1.5894 KOps/s | |
| test_ddpg_speed[False-None] | 3.4718ms | 3.1772ms | 314.7428 Ops/s | 326.7278 Ops/s | |
| test_ddpg_speed[False-backward] | 5.0646ms | 4.6026ms | 217.2676 Ops/s | 222.2544 Ops/s | |
| test_ddpg_speed[True-None] | 1.6105ms | 1.4069ms | 710.8025 Ops/s | 732.1398 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6384ms | 2.5527ms | 391.7399 Ops/s | 391.7254 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.6330ms | 1.4127ms | 707.8465 Ops/s | 714.1584 Ops/s | |
| test_sac_speed[False-None] | 8.9446ms | 8.5939ms | 116.3614 Ops/s | 116.0574 Ops/s | |
| test_sac_speed[False-backward] | 12.2541ms | 11.8827ms | 84.1561 Ops/s | 84.0896 Ops/s | |
| test_sac_speed[True-None] | 2.2441ms | 1.8948ms | 527.7499 Ops/s | 534.1525 Ops/s | |
| test_sac_speed[True-backward] | 3.8486ms | 3.7457ms | 266.9703 Ops/s | 271.9216 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 16.6397ms | 10.1162ms | 98.8511 Ops/s | 98.7586 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.5189ms | 9.6618ms | 103.5005 Ops/s | 103.6336 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.6767ms | 13.1281ms | 76.1725 Ops/s | 76.4335 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.8770ms | 2.6321ms | 379.9308 Ops/s | 381.4294 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.7404ms | 4.3007ms | 232.5223 Ops/s | 229.2037 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.5449ms | 9.6486ms | 103.6421 Ops/s | 103.0209 Ops/s | |
| test_td3_speed[False-None] | 8.6628ms | 8.4954ms | 117.7105 Ops/s | 117.8534 Ops/s | |
| test_td3_speed[False-backward] | 12.0296ms | 11.1770ms | 89.4693 Ops/s | 89.7715 Ops/s | |
| test_td3_speed[True-None] | 1.7568ms | 1.7207ms | 581.1545 Ops/s | 607.5491 Ops/s | |
| test_td3_speed[True-backward] | 3.2956ms | 3.2006ms | 312.4406 Ops/s | 310.9313 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 98.6499ms | 25.9290ms | 38.5669 Ops/s | 38.1199 Ops/s | |
| test_cql_speed[False-None] | 18.2812ms | 17.9987ms | 55.5595 Ops/s | 55.6628 Ops/s | |
| test_cql_speed[False-backward] | 24.4449ms | 23.8655ms | 41.9015 Ops/s | 42.0936 Ops/s | |
| test_cql_speed[True-None] | 3.4891ms | 3.3650ms | 297.1789 Ops/s | 298.9559 Ops/s | |
| test_cql_speed[True-backward] | 5.8022ms | 5.6194ms | 177.9535 Ops/s | 175.4869 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 17.9892ms | 11.9073ms | 83.9819 Ops/s | 83.1174 Ops/s | |
| test_a2c_speed[False-None] | 3.5866ms | 3.4054ms | 293.6556 Ops/s | 291.2374 Ops/s | |
| test_a2c_speed[False-backward] | 7.3889ms | 6.6500ms | 150.3764 Ops/s | 149.5127 Ops/s | |
| test_a2c_speed[True-None] | 1.5435ms | 1.3825ms | 723.3134 Ops/s | 697.8725 Ops/s | |
| test_a2c_speed[True-backward] | 3.2817ms | 3.2303ms | 309.5673 Ops/s | 321.9697 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.2047ms | 1.0543ms | 948.5247 Ops/s | 944.4109 Ops/s | |
| test_ppo_speed[False-None] | 4.1914ms | 4.0769ms | 245.2830 Ops/s | 237.7262 Ops/s | |
| test_ppo_speed[False-backward] | 7.9712ms | 7.5877ms | 131.7921 Ops/s | 135.3200 Ops/s | |
| test_ppo_speed[True-None] | 1.6346ms | 1.5312ms | 653.0878 Ops/s | 653.8251 Ops/s | |
| test_ppo_speed[True-backward] | 3.4717ms | 3.4261ms | 291.8779 Ops/s | 310.5376 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1779ms | 1.1029ms | 906.7205 Ops/s | 900.8899 Ops/s | |
| test_reinforce_speed[False-None] | 3.2088ms | 2.4615ms | 406.2490 Ops/s | 410.2081 Ops/s | |
| test_reinforce_speed[False-backward] | 3.7660ms | 3.6105ms | 276.9692 Ops/s | 276.4654 Ops/s | |
| test_reinforce_speed[True-None] | 1.5068ms | 1.3885ms | 720.2065 Ops/s | 735.6074 Ops/s | |
| test_reinforce_speed[True-backward] | 3.3597ms | 3.2113ms | 311.4044 Ops/s | 323.8252 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 15.8549ms | 8.8956ms | 112.4156 Ops/s | 112.1263 Ops/s | |
| test_iql_speed[False-None] | 10.4712ms | 9.8516ms | 101.5060 Ops/s | 101.5894 Ops/s | |
| test_iql_speed[False-backward] | 14.7354ms | 13.9830ms | 71.5155 Ops/s | 72.9793 Ops/s | |
| test_iql_speed[True-None] | 2.4591ms | 2.2963ms | 435.4738 Ops/s | 434.0339 Ops/s | |
| test_iql_speed[True-backward] | 5.4792ms | 5.0681ms | 197.3122 Ops/s | 204.2989 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.1719ms | 10.0347ms | 99.6542 Ops/s | 99.2410 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.3358ms | 5.9394ms | 168.3662 Ops/s | 167.4721 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7269ms | 0.3535ms | 2.8289 KOps/s | 2.7398 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5595ms | 0.3095ms | 3.2313 KOps/s | 2.8734 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1228ms | 5.7385ms | 174.2604 Ops/s | 172.4803 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9937ms | 0.3508ms | 2.8508 KOps/s | 3.2107 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6096ms | 0.3015ms | 3.3166 KOps/s | 3.1084 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5588ms | 1.3294ms | 752.2010 Ops/s | 779.6286 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5536ms | 1.2408ms | 805.9571 Ops/s | 831.3161 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.2056ms | 6.0293ms | 165.8563 Ops/s | 166.6352 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2495ms | 0.4422ms | 2.2613 KOps/s | 2.2782 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6489ms | 0.4258ms | 2.3483 KOps/s | 2.3466 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9107ms | 5.8232ms | 171.7282 Ops/s | 171.9553 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9083ms | 0.2903ms | 3.4443 KOps/s | 2.9381 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4747ms | 0.2762ms | 3.6210 KOps/s | 2.6644 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0220ms | 5.7361ms | 174.3338 Ops/s | 173.8428 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6716ms | 0.3211ms | 3.1144 KOps/s | 3.3974 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5389ms | 0.3142ms | 3.1827 KOps/s | 3.3446 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.9060ms | 5.9217ms | 168.8695 Ops/s | 166.0937 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7794ms | 0.4790ms | 2.0875 KOps/s | 2.2445 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8650ms | 0.4831ms | 2.0700 KOps/s | 2.3560 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.9478s | 23.9421ms | 41.7674 Ops/s | 194.8081 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.9898ms | 1.9329ms | 517.3542 Ops/s | 552.7010 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.3539ms | 1.3064ms | 765.4332 Ops/s | 805.3738 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.5465ms | 5.0412ms | 198.3665 Ops/s | 160.5633 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9159ms | 1.8486ms | 540.9478 Ops/s | 477.7089 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.3134ms | 1.2703ms | 787.2191 Ops/s | 704.6573 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.6584s | 18.3065ms | 54.6253 Ops/s | 185.4539 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 6.2367ms | 2.1421ms | 466.8310 Ops/s | 467.6496 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.3919ms | 1.1848ms | 844.0400 Ops/s | 55.4431 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 41.0489ms | 38.9263ms | 25.6896 Ops/s | 25.7023 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.7881ms | 18.1774ms | 55.0135 Ops/s | 54.4410 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 44.5686ms | 40.1481ms | 24.9078 Ops/s | 24.3933 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.9574ms | 18.7929ms | 53.2117 Ops/s | 53.2255 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.5462ms | 42.4405ms | 23.5624 Ops/s | 23.6306 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.7735ms | 20.1644ms | 49.5923 Ops/s | 49.3862 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8743ms | 0.2301ms | 4.3461 KOps/s | 4.5899 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7055ms | 1.3671ms | 731.4710 Ops/s | 737.7719 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7545ms | 2.3102ms | 432.8572 Ops/s | 432.4771 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1109ms | 2.9199ms | 342.4726 Ops/s | 347.3819 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.5358ms | 0.1701ms | 5.8805 KOps/s | 6.0187 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3931ms | 0.2329ms | 4.2930 KOps/s | 4.2681 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9500ms | 1.7779ms | 562.4643 Ops/s | 543.3703 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5857ms | 1.3812ms | 723.9959 Ops/s | 717.5430 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3175ms | 1.1548ms | 865.9871 Ops/s | 870.0505 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8167ms | 3.6197ms | 276.2695 Ops/s | 278.8514 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 5.9553ms | 5.6735ms | 176.2594 Ops/s | 172.6535 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.1259ms | 6.9307ms | 144.2848 Ops/s | 141.5608 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4404ms | 0.2777ms | 3.6009 KOps/s | 3.6454 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7215ms | 1.5561ms | 642.6174 Ops/s | 656.0202 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.7135ms | 2.4281ms | 411.8408 Ops/s | 409.3859 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3980ms | 3.1191ms | 320.6074 Ops/s | 323.8943 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.4576ms | 33.4915ms | 29.8584 Ops/s | 30.2327 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.7240ms | 65.6280ms | 15.2374 Ops/s | 15.3268 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.8018ms | 37.8771ms | 26.4012 Ops/s | 26.6543 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.0888ms | 75.1861ms | 13.3003 Ops/s | 13.4810 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 58.2142ms | 57.2455ms | 17.4686 Ops/s | 17.7656 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1155s | 0.1126s | 8.8824 Ops/s | 8.9986 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 59.8349ms | 58.2246ms | 17.1749 Ops/s | 17.1434 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1195s | 0.1161s | 8.6098 Ops/s | 8.7055 Ops/s |
Stack from ghstack (oldest at bottom):
_StepMDP.__call__ now accepts an optional out parameter. When provided,
the output TensorDict is reused instead of allocating a new one each call.
This enables callers (collectors, rollout loops) to pre-allocate a buffer
and avoid per-step TensorDict creation overhead.
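
A minimal sketch of the buffer-reuse pattern described above, using plain dicts rather than TensorDict so it runs without torchrl. The name step_sketch and its signature are hypothetical and are not the _StepMDP implementation; the only assumption is that the step promotes the entries stored under "next" to the root and writes them into out when one is supplied.

```python
from typing import Any, Dict, Optional


def step_sketch(
    td: Dict[str, Any], out: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical stand-in: promote td["next"] to the root, reusing `out`."""
    if out is None:
        out = {}      # default path: a fresh container is allocated on every call
    else:
        out.clear()   # reuse path: drop stale entries but keep the same object
    out.update(td["next"])
    return out


# The caller pre-allocates one buffer and passes it on every step.
buffer: Dict[str, Any] = {}
td = {"observation": 0, "next": {"observation": 1, "done": False}}
for _ in range(3):
    result = step_sketch(td, out=buffer)
    assert result is buffer  # no new container was created for this step
    obs = result["observation"]
    td = {"observation": obs, "next": {"observation": obs + 1, "done": False}}
```

Because the same object is returned on every call, a caller that needs to keep a step's result beyond the next call has to copy it first.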
Also fixes the return type annotation of _exclude and ensures that it returns
the caller-provided out buffer even when no new keys are set.
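
The contract stated for _exclude can be illustrated with a small sketch, again on plain dicts. The helper exclude_sketch is hypothetical and only mirrors the described behavior: the object passed as out is what comes back, even if every key is filtered out and nothing is written.

```python
from typing import Any, Dict, Optional, Set


def exclude_sketch(
    src: Dict[str, Any],
    excluded: Set[str],
    out: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in: copy non-excluded keys of `src` into `out`."""
    if out is None:
        out = {}
    for key, value in src.items():
        if key not in excluded:
            out[key] = value
    # Return `out` unconditionally, including when no key was copied,
    # so the caller can rely on getting its own buffer back.
    return out


buf: Dict[str, Any] = {}
res = exclude_sketch({"reward": 1.0}, {"reward"}, out=buf)
assert res is buf   # same object, although nothing was written
assert res == {}
```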
Made-with: Cursor