[Performance] Add update_traj_ids flag to Collector to skip trajectory tracking #3563

Open
vmoens wants to merge 1 commit into gh/vmoens/244/base from gh/vmoens/244/head

@vmoens (Collaborator) commented Mar 23, 2026

[ghstack-poisoned]

@pytorch-bot (bot) commented Mar 23, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3563

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ab2608b with merge base 0a1aea6:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions (Contributor)

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 79.9240μs 78.7591μs 12.6969 KOps/s 12.5896 KOps/s $\color{#35bf28}+0.85\%$
test_tensor_to_bytestream_speed[torch.save] 0.1432ms 0.1406ms 7.1132 KOps/s 7.2841 KOps/s $\color{#d91a1a}-2.35\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1127s 0.1122s 8.9159 Ops/s 8.9127 Ops/s $\color{#35bf28}+0.04\%$
test_tensor_to_bytestream_speed[numpy] 2.5142μs 2.5045μs 399.2756 KOps/s 397.6395 KOps/s $\color{#35bf28}+0.41\%$
test_tensor_to_bytestream_speed[safetensors] 38.2736μs 37.6907μs 26.5317 KOps/s 27.1963 KOps/s $\color{#d91a1a}-2.44\%$
test_simple 0.7692s 0.7685s 1.3012 Ops/s 1.2456 Ops/s $\color{#35bf28}+4.47\%$
test_transformed 1.3832s 1.3691s 0.7304 Ops/s 0.7267 Ops/s $\color{#35bf28}+0.52\%$
test_serial 2.3868s 2.3051s 0.4338 Ops/s 0.4326 Ops/s $\color{#35bf28}+0.28\%$
test_parallel 1.8897s 1.8094s 0.5527 Ops/s 0.5537 Ops/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[True-True-True-True-True] 0.2109ms 41.0430μs 24.3647 KOps/s 24.5463 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[True-True-True-True-False] 44.9000μs 22.2278μs 44.9886 KOps/s 43.7777 KOps/s $\color{#35bf28}+2.77\%$
test_step_mdp_speed[True-True-True-False-True] 61.3110μs 23.0333μs 43.4155 KOps/s 42.5336 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[True-True-True-False-False] 39.1810μs 12.4489μs 80.3282 KOps/s 78.7574 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[True-True-False-True-True] 70.9410μs 44.9868μs 22.2287 KOps/s 22.4683 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-True-False-True-False] 58.7810μs 24.6361μs 40.5909 KOps/s 39.2218 KOps/s $\color{#35bf28}+3.49\%$
test_step_mdp_speed[True-True-False-False-True] 61.4110μs 25.8627μs 38.6657 KOps/s 39.3197 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[True-True-False-False-False] 51.2310μs 15.1141μs 66.1636 KOps/s 64.7739 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-False-True-True-True] 85.0610μs 47.6923μs 20.9677 KOps/s 21.4405 KOps/s $\color{#d91a1a}-2.20\%$
test_step_mdp_speed[True-False-True-True-False] 67.3010μs 27.5856μs 36.2508 KOps/s 35.8545 KOps/s $\color{#35bf28}+1.11\%$
test_step_mdp_speed[True-False-True-False-True] 50.2500μs 25.4170μs 39.3438 KOps/s 38.0816 KOps/s $\color{#35bf28}+3.31\%$
test_step_mdp_speed[True-False-True-False-False] 46.2700μs 14.7926μs 67.6014 KOps/s 65.4868 KOps/s $\color{#35bf28}+3.23\%$
test_step_mdp_speed[True-False-False-True-True] 84.9110μs 48.2687μs 20.7174 KOps/s 20.6037 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-False-False-True-False] 0.1021ms 29.7397μs 33.6251 KOps/s 32.8099 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[True-False-False-False-True] 64.9810μs 28.3511μs 35.2720 KOps/s 36.1659 KOps/s $\color{#d91a1a}-2.47\%$
test_step_mdp_speed[True-False-False-False-False] 53.4000μs 17.6539μs 56.6448 KOps/s 56.9583 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[False-True-True-True-True] 86.3710μs 46.6602μs 21.4315 KOps/s 21.5221 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[False-True-True-True-False] 94.2910μs 27.2603μs 36.6834 KOps/s 35.5530 KOps/s $\color{#35bf28}+3.18\%$
test_step_mdp_speed[False-True-True-False-True] 2.5117ms 30.0684μs 33.2575 KOps/s 33.5670 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-True-True-False-False] 53.7100μs 17.0262μs 58.7330 KOps/s 59.1592 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[False-True-False-True-True] 92.0010μs 49.5990μs 20.1617 KOps/s 20.6402 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[False-True-False-True-False] 82.6710μs 30.3953μs 32.8998 KOps/s 32.9300 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-False-True] 66.0210μs 31.0342μs 32.2225 KOps/s 31.3161 KOps/s $\color{#35bf28}+2.89\%$
test_step_mdp_speed[False-True-False-False-False] 51.5900μs 19.4051μs 51.5328 KOps/s 51.5474 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[False-False-True-True-True] 88.8020μs 50.7374μs 19.7093 KOps/s 19.2795 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[False-False-True-True-False] 64.4610μs 32.8716μs 30.4214 KOps/s 30.6900 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[False-False-True-False-True] 67.1610μs 31.9822μs 31.2674 KOps/s 31.7862 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[False-False-True-False-False] 50.2110μs 19.2771μs 51.8750 KOps/s 51.4204 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-False-False-True-True] 85.5220μs 53.2252μs 18.7881 KOps/s 18.9991 KOps/s $\color{#d91a1a}-1.11\%$
test_step_mdp_speed[False-False-False-True-False] 0.1122ms 35.2145μs 28.3974 KOps/s 28.9134 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[False-False-False-False-True] 63.9310μs 33.6432μs 29.7237 KOps/s 30.3404 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-False-False-False-False] 45.7900μs 21.8319μs 45.8046 KOps/s 46.2363 KOps/s $\color{#d91a1a}-0.93\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8442s 0.7458s 1.3409 Ops/s 1.3648 Ops/s $\color{#d91a1a}-1.75\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7096s 0.6117s 1.6348 Ops/s 1.6638 Ops/s $\color{#d91a1a}-1.74\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7347s 1.6516s 0.6055 Ops/s 0.6112 Ops/s $\color{#d91a1a}-0.94\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5033s 1.4211s 0.7037 Ops/s 0.7065 Ops/s $\color{#d91a1a}-0.40\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9904s 1.9168s 0.5217 Ops/s 0.5344 Ops/s $\color{#d91a1a}-2.37\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7524s 1.6654s 0.6004 Ops/s 0.6058 Ops/s $\color{#d91a1a}-0.88\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6995s 4.5694s 0.2188 Ops/s 0.2178 Ops/s $\color{#35bf28}+0.46\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.4358s 4.3690s 0.2289 Ops/s 0.2274 Ops/s $\color{#35bf28}+0.64\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9271s 1.8455s 0.5419 Ops/s 0.5351 Ops/s $\color{#35bf28}+1.27\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7846s 1.5993s 0.6253 Ops/s 0.6295 Ops/s $\color{#d91a1a}-0.67\%$
test_values[generalized_advantage_estimate-True-True] 20.3983ms 19.6190ms 50.9710 Ops/s 47.9451 Ops/s $\textbf{\color{#35bf28}+6.31\%}$
test_values[vec_generalized_advantage_estimate-True-True] 0.1285s 3.4824ms 287.1599 Ops/s 281.7498 Ops/s $\color{#35bf28}+1.92\%$
test_values[td0_return_estimate-False-False] 0.1053ms 80.6410μs 12.4006 KOps/s 12.4293 KOps/s $\color{#d91a1a}-0.23\%$
test_values[td1_return_estimate-False-False] 48.4932ms 46.8995ms 21.3222 Ops/s 20.7121 Ops/s $\color{#35bf28}+2.95\%$
test_values[vec_td1_return_estimate-False-False] 1.3522ms 1.0795ms 926.3828 Ops/s 934.3779 Ops/s $\color{#d91a1a}-0.86\%$
test_values[td_lambda_return_estimate-True-False] 77.8107ms 76.8202ms 13.0174 Ops/s 12.7246 Ops/s $\color{#35bf28}+2.30\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2922ms 1.0659ms 938.1677 Ops/s 931.2379 Ops/s $\color{#35bf28}+0.74\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.1374ms 19.6941ms 50.7766 Ops/s 49.9105 Ops/s $\color{#35bf28}+1.74\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0053ms 0.7388ms 1.3535 KOps/s 1.3615 KOps/s $\color{#d91a1a}-0.58\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7083ms 0.6613ms 1.5122 KOps/s 1.5088 KOps/s $\color{#35bf28}+0.23\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5374ms 1.4745ms 678.2067 Ops/s 679.0253 Ops/s $\color{#d91a1a}-0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7183ms 0.6777ms 1.4756 KOps/s 1.4438 KOps/s $\color{#35bf28}+2.20\%$
test_dqn_speed[False-None] 1.6851ms 1.5632ms 639.7179 Ops/s 634.9880 Ops/s $\color{#35bf28}+0.74\%$
test_dqn_speed[False-backward] 2.3798ms 2.2177ms 450.9240 Ops/s 454.8908 Ops/s $\color{#d91a1a}-0.87\%$
test_dqn_speed[True-None] 1.1431ms 0.5881ms 1.7005 KOps/s 1.7371 KOps/s $\color{#d91a1a}-2.10\%$
test_dqn_speed[True-backward] 1.2614ms 1.2077ms 828.0275 Ops/s 827.8399 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[reduce-overhead-None] 0.6659ms 0.5876ms 1.7020 KOps/s 1.6379 KOps/s $\color{#35bf28}+3.91\%$
test_ddpg_speed[False-None] 3.5095ms 3.0880ms 323.8347 Ops/s 340.1636 Ops/s $\color{#d91a1a}-4.80\%$
test_ddpg_speed[False-backward] 4.8676ms 4.4236ms 226.0593 Ops/s 232.6687 Ops/s $\color{#d91a1a}-2.84\%$
test_ddpg_speed[True-None] 1.4314ms 1.3161ms 759.7934 Ops/s 756.1637 Ops/s $\color{#35bf28}+0.48\%$
test_ddpg_speed[True-backward] 2.5183ms 2.4386ms 410.0780 Ops/s 408.2763 Ops/s $\color{#35bf28}+0.44\%$
test_ddpg_speed[reduce-overhead-None] 1.4319ms 1.3455ms 743.2190 Ops/s 741.6509 Ops/s $\color{#35bf28}+0.21\%$
test_sac_speed[False-None] 8.7199ms 8.3628ms 119.5776 Ops/s 120.1779 Ops/s $\color{#d91a1a}-0.50\%$
test_sac_speed[False-backward] 12.0758ms 11.5969ms 86.2302 Ops/s 86.2798 Ops/s $\color{#d91a1a}-0.06\%$
test_sac_speed[True-None] 2.1738ms 1.8155ms 550.8012 Ops/s 554.1788 Ops/s $\color{#d91a1a}-0.61\%$
test_sac_speed[True-backward] 3.9769ms 3.5536ms 281.4075 Ops/s 279.2363 Ops/s $\color{#35bf28}+0.78\%$
test_sac_speed[reduce-overhead-None] 16.8598ms 10.1175ms 98.8383 Ops/s 99.0458 Ops/s $\color{#d91a1a}-0.21\%$
test_redq_deprec_speed[False-None] 9.5211ms 9.3567ms 106.8751 Ops/s 106.8532 Ops/s $\color{#35bf28}+0.02\%$
test_redq_deprec_speed[False-backward] 13.0594ms 12.7212ms 78.6086 Ops/s 78.6565 Ops/s $\color{#d91a1a}-0.06\%$
test_redq_deprec_speed[True-None] 2.6264ms 2.4987ms 400.2009 Ops/s 380.6235 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_redq_deprec_speed[True-backward] 4.3415ms 4.1299ms 242.1372 Ops/s 240.5665 Ops/s $\color{#35bf28}+0.65\%$
test_redq_deprec_speed[reduce-overhead-None] 14.1794ms 9.3122ms 107.3855 Ops/s 104.4393 Ops/s $\color{#35bf28}+2.82\%$
test_td3_speed[False-None] 8.6322ms 8.2446ms 121.2922 Ops/s 121.7477 Ops/s $\color{#d91a1a}-0.37\%$
test_td3_speed[False-backward] 11.8190ms 10.9437ms 91.3769 Ops/s 92.0317 Ops/s $\color{#d91a1a}-0.71\%$
test_td3_speed[True-None] 1.6756ms 1.5921ms 628.1118 Ops/s 633.2692 Ops/s $\color{#d91a1a}-0.81\%$
test_td3_speed[True-backward] 3.2093ms 3.0792ms 324.7635 Ops/s 323.3547 Ops/s $\color{#35bf28}+0.44\%$
test_td3_speed[reduce-overhead-None] 83.9550ms 25.3631ms 39.4274 Ops/s 39.0447 Ops/s $\color{#35bf28}+0.98\%$
test_cql_speed[False-None] 17.8310ms 17.5083ms 57.1156 Ops/s 57.3232 Ops/s $\color{#d91a1a}-0.36\%$
test_cql_speed[False-backward] 23.5365ms 23.0936ms 43.3020 Ops/s 43.2932 Ops/s $\color{#35bf28}+0.02\%$
test_cql_speed[True-None] 3.2941ms 3.2318ms 309.4209 Ops/s 309.2400 Ops/s $\color{#35bf28}+0.06\%$
test_cql_speed[True-backward] 5.8341ms 5.4386ms 183.8692 Ops/s 187.4562 Ops/s $\color{#d91a1a}-1.91\%$
test_cql_speed[reduce-overhead-None] 0.8402s 16.8894ms 59.2086 Ops/s 84.9018 Ops/s $\textbf{\color{#d91a1a}-30.26\%}$
test_a2c_speed[False-None] 3.3748ms 3.2985ms 303.1709 Ops/s 302.3649 Ops/s $\color{#35bf28}+0.27\%$
test_a2c_speed[False-backward] 6.9264ms 6.5061ms 153.7009 Ops/s 161.7388 Ops/s $\color{#d91a1a}-4.97\%$
test_a2c_speed[True-None] 1.4315ms 1.3592ms 735.7264 Ops/s 731.1010 Ops/s $\color{#35bf28}+0.63\%$
test_a2c_speed[True-backward] 3.1941ms 3.1010ms 322.4779 Ops/s 320.2121 Ops/s $\color{#35bf28}+0.71\%$
test_a2c_speed[reduce-overhead-None] 1.0872ms 1.0206ms 979.8391 Ops/s 959.3841 Ops/s $\color{#35bf28}+2.13\%$
test_ppo_speed[False-None] 4.0457ms 3.9265ms 254.6821 Ops/s 248.7485 Ops/s $\color{#35bf28}+2.39\%$
test_ppo_speed[False-backward] 7.6736ms 7.2692ms 137.5659 Ops/s 135.4415 Ops/s $\color{#35bf28}+1.57\%$
test_ppo_speed[True-None] 1.5709ms 1.4726ms 679.0584 Ops/s 669.4854 Ops/s $\color{#35bf28}+1.43\%$
test_ppo_speed[True-backward] 3.4083ms 3.2605ms 306.6984 Ops/s 319.8625 Ops/s $\color{#d91a1a}-4.12\%$
test_ppo_speed[reduce-overhead-None] 1.1897ms 1.0622ms 941.4522 Ops/s 904.1757 Ops/s $\color{#35bf28}+4.12\%$
test_reinforce_speed[False-None] 2.4829ms 2.3640ms 423.0149 Ops/s 418.0762 Ops/s $\color{#35bf28}+1.18\%$
test_reinforce_speed[False-backward] 3.5318ms 3.4849ms 286.9554 Ops/s 284.0643 Ops/s $\color{#35bf28}+1.02\%$
test_reinforce_speed[True-None] 1.5230ms 1.3312ms 751.2080 Ops/s 726.0417 Ops/s $\color{#35bf28}+3.47\%$
test_reinforce_speed[True-backward] 3.1614ms 3.0682ms 325.9229 Ops/s 337.5102 Ops/s $\color{#d91a1a}-3.43\%$
test_reinforce_speed[reduce-overhead-None] 15.6636ms 8.9274ms 112.0142 Ops/s 112.6109 Ops/s $\color{#d91a1a}-0.53\%$
test_iql_speed[False-None] 10.3925ms 9.5309ms 104.9216 Ops/s 104.2661 Ops/s $\color{#35bf28}+0.63\%$
test_iql_speed[False-backward] 14.0543ms 13.5183ms 73.9737 Ops/s 75.1781 Ops/s $\color{#d91a1a}-1.60\%$
test_iql_speed[True-None] 2.3498ms 2.2102ms 452.4573 Ops/s 449.6840 Ops/s $\color{#35bf28}+0.62\%$
test_iql_speed[True-backward] 5.1762ms 4.8097ms 207.9117 Ops/s 212.8982 Ops/s $\color{#d91a1a}-2.34\%$
test_iql_speed[reduce-overhead-None] 17.0308ms 9.9116ms 100.8914 Ops/s 101.0899 Ops/s $\color{#d91a1a}-0.20\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1402ms 5.7926ms 172.6328 Ops/s 171.2296 Ops/s $\color{#35bf28}+0.82\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9328ms 0.3365ms 2.9719 KOps/s 2.9774 KOps/s $\color{#d91a1a}-0.19\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6481ms 0.3361ms 2.9754 KOps/s 3.2969 KOps/s $\textbf{\color{#d91a1a}-9.75\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.8673ms 5.5803ms 179.2014 Ops/s 176.6530 Ops/s $\color{#35bf28}+1.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0034ms 0.3532ms 2.8312 KOps/s 3.0961 KOps/s $\textbf{\color{#d91a1a}-8.55\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6375ms 0.3400ms 2.9414 KOps/s 3.0037 KOps/s $\color{#d91a1a}-2.08\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6905ms 1.4402ms 694.3692 Ops/s 760.3490 Ops/s $\textbf{\color{#d91a1a}-8.68\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5928ms 1.3476ms 742.0450 Ops/s 821.3266 Ops/s $\textbf{\color{#d91a1a}-9.65\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.7860ms 5.9034ms 169.3933 Ops/s 171.6976 Ops/s $\color{#d91a1a}-1.34\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1109ms 0.4472ms 2.2360 KOps/s 2.2341 KOps/s $\color{#35bf28}+0.09\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8636ms 0.4510ms 2.2173 KOps/s 2.2251 KOps/s $\color{#d91a1a}-0.35\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.6475ms 5.5710ms 179.5002 Ops/s 177.6402 Ops/s $\color{#35bf28}+1.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6882ms 0.2937ms 3.4054 KOps/s 3.3500 KOps/s $\color{#35bf28}+1.65\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4904ms 0.2747ms 3.6403 KOps/s 3.6085 KOps/s $\color{#35bf28}+0.88\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7502ms 5.5454ms 180.3299 Ops/s 179.8050 Ops/s $\color{#35bf28}+0.29\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9035ms 0.3793ms 2.6364 KOps/s 3.2679 KOps/s $\textbf{\color{#d91a1a}-19.32\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7353ms 0.3889ms 2.5715 KOps/s 3.4785 KOps/s $\textbf{\color{#d91a1a}-26.07\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2223ms 5.7101ms 175.1276 Ops/s 170.4185 Ops/s $\color{#35bf28}+2.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4949ms 0.4815ms 2.0767 KOps/s 2.1971 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7106ms 0.4747ms 2.1064 KOps/s 2.0644 KOps/s $\color{#35bf28}+2.04\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.9589s 24.1032ms 41.4882 Ops/s 198.4924 Ops/s $\textbf{\color{#d91a1a}-79.10\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.9699ms 2.0088ms 497.8119 Ops/s 557.0729 Ops/s $\textbf{\color{#d91a1a}-10.64\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.8729ms 1.3303ms 751.7012 Ops/s 986.4560 Ops/s $\textbf{\color{#d91a1a}-23.80\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.8283ms 5.0129ms 199.4862 Ops/s 196.7971 Ops/s $\color{#35bf28}+1.37\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 4.0192ms 1.8274ms 547.2132 Ops/s 483.1660 Ops/s $\textbf{\color{#35bf28}+13.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.2738ms 0.9736ms 1.0271 KOps/s 975.5821 Ops/s $\textbf{\color{#35bf28}+5.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.5221ms 5.2051ms 192.1177 Ops/s 45.3686 Ops/s $\textbf{\color{#35bf28}+323.46\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.2333ms 2.1507ms 464.9674 Ops/s 425.4826 Ops/s $\textbf{\color{#35bf28}+9.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.5343ms 1.2167ms 821.8844 Ops/s 841.0828 Ops/s $\color{#d91a1a}-2.28\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 39.8672ms 38.1012ms 26.2459 Ops/s 25.9854 Ops/s $\color{#35bf28}+1.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.5096ms 18.1693ms 55.0378 Ops/s 55.0398 Ops/s $-0.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 42.9960ms 39.3875ms 25.3888 Ops/s 25.1321 Ops/s $\color{#35bf28}+1.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.7756ms 18.3738ms 54.4252 Ops/s 53.0208 Ops/s $\color{#35bf28}+2.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 42.4177ms 41.0100ms 24.3843 Ops/s 24.0680 Ops/s $\color{#35bf28}+1.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.0258ms 19.7445ms 50.6470 Ops/s 49.8962 Ops/s $\color{#35bf28}+1.50\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8676ms 0.2197ms 4.5522 KOps/s 4.5192 KOps/s $\color{#35bf28}+0.73\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7097ms 1.3783ms 725.5333 Ops/s 721.3034 Ops/s $\color{#35bf28}+0.59\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.9098ms 2.3734ms 421.3360 Ops/s 413.6138 Ops/s $\color{#35bf28}+1.87\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.2552ms 2.9676ms 336.9770 Ops/s 332.5092 Ops/s $\color{#35bf28}+1.34\%$
test_storage_write_contiguous[50-img_shape0-small] 0.4878ms 0.1625ms 6.1538 KOps/s 6.0304 KOps/s $\color{#35bf28}+2.05\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3816ms 0.2329ms 4.2934 KOps/s 4.4083 KOps/s $\color{#d91a1a}-2.61\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.1511ms 1.9507ms 512.6330 Ops/s 527.0610 Ops/s $\color{#d91a1a}-2.74\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5871ms 1.3728ms 728.4579 Ops/s 674.3416 Ops/s $\textbf{\color{#35bf28}+8.03\%}$
test_collector_stack_then_write[50-img_shape0-small] 1.1815ms 1.1230ms 890.4785 Ops/s 890.0946 Ops/s $\color{#35bf28}+0.04\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.9043ms 3.6263ms 275.7645 Ops/s 280.7268 Ops/s $\color{#d91a1a}-1.77\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.3651ms 5.8359ms 171.3529 Ops/s 170.4763 Ops/s $\color{#35bf28}+0.51\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.2530ms 6.9523ms 143.8371 Ops/s 136.3865 Ops/s $\textbf{\color{#35bf28}+5.46\%}$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4184ms 0.2750ms 3.6368 KOps/s 3.5605 KOps/s $\color{#35bf28}+2.14\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6746ms 1.4840ms 673.8438 Ops/s 671.2446 Ops/s $\color{#35bf28}+0.39\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.7329ms 2.5018ms 399.7049 Ops/s 396.1273 Ops/s $\color{#35bf28}+0.90\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4933ms 3.1986ms 312.6403 Ops/s 312.5044 Ops/s $\color{#35bf28}+0.04\%$
test_collector_without_rb[100-img_shape0-atari] 33.8878ms 32.7020ms 30.5792 Ops/s 30.5745 Ops/s $\color{#35bf28}+0.02\%$
test_collector_without_rb[200-img_shape1-large_batch] 64.4432ms 63.9884ms 15.6278 Ops/s 15.6343 Ops/s $\color{#d91a1a}-0.04\%$
test_collector_with_rb[100-img_shape0-atari] 37.7857ms 37.0376ms 26.9996 Ops/s 26.8061 Ops/s $\color{#35bf28}+0.72\%$
test_collector_with_rb[200-img_shape1-large_batch] 73.0444ms 72.4247ms 13.8074 Ops/s 13.6384 Ops/s $\color{#35bf28}+1.24\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 54.5560ms 53.9975ms 18.5194 Ops/s 18.4526 Ops/s $\color{#35bf28}+0.36\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1117s 0.1084s 9.2225 Ops/s 9.3036 Ops/s $\color{#d91a1a}-0.87\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 57.0927ms 56.0483ms 17.8417 Ops/s 17.8546 Ops/s $\color{#d91a1a}-0.07\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1123s 0.1114s 8.9752 Ops/s 8.9766 Ops/s $\color{#d91a1a}-0.01\%$
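A note on reading the table above: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD, and Ops appears to be the reciprocal of Mean. The sketch below checks both relations against a few rows copied from the table. The function names are hypothetical and are not part of the benchmark tooling, and both formulas are assumptions inferred from the reported numbers rather than taken from the workflow source.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed relation: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops on this PR, Ops on Repo HEAD) in Ops/s, copied from rows of the GPU table.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": (12.6969e3, 12.5896e3),
    "test_cql_speed[reduce-overhead-None]": (59.2086, 84.9018),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (192.1177, 45.3686),
}

changes = {
    name: round(relative_change(ops_pr, ops_head), 2)
    for name, (ops_pr, ops_head) in rows.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")

# Mean of the pickle row is 78.7591 microseconds; convert to KOps/s.
pickle_kops = ops_per_second(78.7591e-6) / 1e3
print(f"pickle throughput from mean: {pickle_kops:.4f} KOps/s")
```

Under these assumptions the three rows reproduce the reported +0.85%, -30.26%, and +323.46%, and the pickle row's mean reproduces its 12.6969 KOps/s.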

@github-actions (Contributor)

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}7$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 84.6260μs 83.2032μs 12.0188 KOps/s 12.2950 KOps/s $\color{#d91a1a}-2.25\%$
test_tensor_to_bytestream_speed[torch.save] 0.1455ms 0.1448ms 6.9079 KOps/s 7.1700 KOps/s $\color{#d91a1a}-3.65\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1124s 0.1119s 8.9330 Ops/s 8.9760 Ops/s $\color{#d91a1a}-0.48\%$
test_tensor_to_bytestream_speed[numpy] 2.4223μs 2.4050μs 415.8070 KOps/s 406.7866 KOps/s $\color{#35bf28}+2.22\%$
test_tensor_to_bytestream_speed[safetensors] 38.3496μs 38.1614μs 26.2045 KOps/s 26.7945 KOps/s $\color{#d91a1a}-2.20\%$
test_simple 0.5466s 0.5443s 1.8372 Ops/s 1.7428 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_transformed 1.0912s 1.0887s 0.9185 Ops/s 0.8938 Ops/s $\color{#35bf28}+2.77\%$
test_serial 1.7006s 1.6826s 0.5943 Ops/s 0.5817 Ops/s $\color{#35bf28}+2.16\%$
test_parallel 1.1478s 1.0540s 0.9488 Ops/s 0.9673 Ops/s $\color{#d91a1a}-1.91\%$
test_step_mdp_speed[True-True-True-True-True] 0.1464ms 42.9525μs 23.2816 KOps/s 23.6251 KOps/s $\color{#d91a1a}-1.45\%$
test_step_mdp_speed[True-True-True-True-False] 61.6210μs 24.5118μs 40.7967 KOps/s 43.2124 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_step_mdp_speed[True-True-True-False-True] 55.8110μs 24.7430μs 40.4154 KOps/s 43.0400 KOps/s $\textbf{\color{#d91a1a}-6.10\%}$
test_step_mdp_speed[True-True-True-False-False] 37.9410μs 12.7307μs 78.5502 KOps/s 77.0907 KOps/s $\color{#35bf28}+1.89\%$
test_step_mdp_speed[True-True-False-True-True] 80.5110μs 46.7674μs 21.3824 KOps/s 22.3730 KOps/s $\color{#d91a1a}-4.43\%$
test_step_mdp_speed[True-True-False-True-False] 51.9210μs 25.4238μs 39.3333 KOps/s 39.1334 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-True-False-False-True] 63.0810μs 25.7857μs 38.7812 KOps/s 38.1265 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-True-False-False-False] 44.0510μs 15.2997μs 65.3606 KOps/s 63.7713 KOps/s $\color{#35bf28}+2.49\%$
test_step_mdp_speed[True-False-True-True-True] 81.1210μs 48.3061μs 20.7013 KOps/s 21.2648 KOps/s $\color{#d91a1a}-2.65\%$
test_step_mdp_speed[True-False-True-True-False] 57.2510μs 27.6826μs 36.1237 KOps/s 35.8260 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[True-False-True-False-True] 56.0510μs 26.6084μs 37.5821 KOps/s 37.2286 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-False-True-False-False] 41.2410μs 15.6278μs 63.9884 KOps/s 64.6421 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[True-False-False-True-True] 84.6710μs 49.1478μs 20.3468 KOps/s 20.0438 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[True-False-False-True-False] 67.7810μs 30.3456μs 32.9537 KOps/s 32.2535 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[True-False-False-False-True] 77.0310μs 29.3694μs 34.0490 KOps/s 34.5655 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-False-False-False-False] 47.4610μs 17.9703μs 55.6473 KOps/s 54.0941 KOps/s $\color{#35bf28}+2.87\%$
test_step_mdp_speed[False-True-True-True-True] 87.7020μs 47.0962μs 21.2331 KOps/s 20.9126 KOps/s $\color{#35bf28}+1.53\%$
test_step_mdp_speed[False-True-True-True-False] 55.2210μs 28.0972μs 35.5908 KOps/s 35.4654 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[False-True-True-False-True] 2.3364ms 31.3069μs 31.9418 KOps/s 33.8776 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_step_mdp_speed[False-True-True-False-False] 48.8210μs 17.2572μs 57.9469 KOps/s 58.7598 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[False-True-False-True-True] 0.1218ms 48.6046μs 20.5742 KOps/s 20.2274 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-True-False-True-False] 64.5710μs 30.8471μs 32.4180 KOps/s 32.5755 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-True-False-False-True] 65.5510μs 32.5994μs 30.6754 KOps/s 31.0623 KOps/s $\color{#d91a1a}-1.25\%$
test_step_mdp_speed[False-True-False-False-False] 55.6810μs 19.2811μs 51.8644 KOps/s 50.9165 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[False-False-True-True-True] 89.5020μs 53.2023μs 18.7962 KOps/s 19.1068 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[False-False-True-True-False] 79.1710μs 32.9980μs 30.3049 KOps/s 30.2023 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-False-True-False-True] 92.3720μs 31.9299μs 31.3187 KOps/s 31.5349 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-False-True-False-False] 49.2010μs 19.2618μs 51.9162 KOps/s 51.0899 KOps/s $\color{#35bf28}+1.62\%$
test_step_mdp_speed[False-False-False-True-True] 91.0920μs 54.1344μs 18.4725 KOps/s 18.5779 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[False-False-False-True-False] 64.6410μs 35.1039μs 28.4868 KOps/s 28.1833 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-False-False-True] 66.4010μs 33.9056μs 29.4937 KOps/s 29.0135 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[False-False-False-False-False] 55.4410μs 21.9802μs 45.4955 KOps/s 45.3452 KOps/s $\color{#35bf28}+0.33\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8712s 0.7694s 1.2998 Ops/s 1.3157 Ops/s $\color{#d91a1a}-1.21\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7238s 0.6273s 1.5941 Ops/s 1.6223 Ops/s $\color{#d91a1a}-1.74\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.8001s 1.7063s 0.5861 Ops/s 0.5987 Ops/s $\color{#d91a1a}-2.11\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5510s 1.4721s 0.6793 Ops/s 0.6948 Ops/s $\color{#d91a1a}-2.23\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0329s 1.9590s 0.5105 Ops/s 0.5218 Ops/s $\color{#d91a1a}-2.18\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.8207s 1.7326s 0.5772 Ops/s 0.5921 Ops/s $\color{#d91a1a}-2.52\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6568s 4.5852s 0.2181 Ops/s 0.2177 Ops/s $\color{#35bf28}+0.18\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5264s 4.4139s 0.2266 Ops/s 0.2254 Ops/s $\color{#35bf28}+0.53\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9563s 1.8898s 0.5292 Ops/s 0.5349 Ops/s $\color{#d91a1a}-1.08\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6797s 1.5904s 0.6288 Ops/s 0.6254 Ops/s $\color{#35bf28}+0.54\%$
test_values[generalized_advantage_estimate-True-True] 10.1795ms 10.0226ms 99.7742 Ops/s 96.5731 Ops/s $\color{#35bf28}+3.31\%$
test_values[vec_generalized_advantage_estimate-True-True] 20.0407ms 17.7722ms 56.2677 Ops/s 56.1273 Ops/s $\color{#35bf28}+0.25\%$
test_values[td0_return_estimate-False-False] 0.2111ms 0.1253ms 7.9780 KOps/s 7.2243 KOps/s $\textbf{\color{#35bf28}+10.43\%}$
test_values[td1_return_estimate-False-False] 28.2851ms 26.8611ms 37.2286 Ops/s 35.8128 Ops/s $\color{#35bf28}+3.95\%$
test_values[vec_td1_return_estimate-False-False] 18.4078ms 17.9009ms 55.8633 Ops/s 54.8701 Ops/s $\color{#35bf28}+1.81\%$
test_values[td_lambda_return_estimate-True-False] 40.1563ms 39.7649ms 25.1478 Ops/s 24.3122 Ops/s $\color{#35bf28}+3.44\%$
test_values[vec_td_lambda_return_estimate-True-False] 19.8582ms 17.9101ms 55.8345 Ops/s 55.4703 Ops/s $\color{#35bf28}+0.66\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.9200ms 8.8217ms 113.3574 Ops/s 111.4085 Ops/s $\color{#35bf28}+1.75\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7408ms 1.5102ms 662.1784 Ops/s 655.8901 Ops/s $\color{#35bf28}+0.96\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5839ms 0.4269ms 2.3427 KOps/s 2.3725 KOps/s $\color{#d91a1a}-1.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 34.5945ms 34.3310ms 29.1282 Ops/s 28.9696 Ops/s $\color{#35bf28}+0.55\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1109ms 1.7476ms 572.2065 Ops/s 566.8116 Ops/s $\color{#35bf28}+0.95\%$
test_dqn_speed[False-None] 1.8002ms 1.4178ms 705.2969 Ops/s 687.6796 Ops/s $\color{#35bf28}+2.56\%$
test_dqn_speed[False-backward] 2.0432ms 1.9534ms 511.9381 Ops/s 514.5658 Ops/s $\color{#d91a1a}-0.51\%$
test_dqn_speed[True-None] 0.8420ms 0.5432ms 1.8411 KOps/s 1.8003 KOps/s $\color{#35bf28}+2.27\%$
test_dqn_speed[True-backward] 1.0364ms 0.9917ms 1.0084 KOps/s 883.9092 Ops/s $\textbf{\color{#35bf28}+14.08\%}$
test_dqn_speed[reduce-overhead-None] 0.7759ms 0.5342ms 1.8721 KOps/s 1.7353 KOps/s $\textbf{\color{#35bf28}+7.89\%}$
test_ddpg_speed[False-None] 3.2187ms 2.8816ms 347.0343 Ops/s 347.0665 Ops/s $-0.01\%$
test_ddpg_speed[False-backward] 4.4279ms 4.0794ms 245.1340 Ops/s 244.4477 Ops/s $\color{#35bf28}+0.28\%$
test_ddpg_speed[True-None] 1.4974ms 1.4125ms 707.9632 Ops/s 690.8567 Ops/s $\color{#35bf28}+2.48\%$
test_ddpg_speed[True-backward] 2.4778ms 2.3982ms 416.9733 Ops/s 337.7637 Ops/s $\textbf{\color{#35bf28}+23.45\%}$
test_ddpg_speed[reduce-overhead-None] 1.8090ms 1.4095ms 709.4563 Ops/s 679.6372 Ops/s $\color{#35bf28}+4.39\%$
test_sac_speed[False-None] 8.7217ms 8.0655ms 123.9845 Ops/s 123.4574 Ops/s $\color{#35bf28}+0.43\%$
test_sac_speed[False-backward] 11.9085ms 11.2827ms 88.6316 Ops/s 87.9681 Ops/s $\color{#35bf28}+0.75\%$
test_sac_speed[True-None] 2.4255ms 2.1562ms 463.7739 Ops/s 457.9312 Ops/s $\color{#35bf28}+1.28\%$
test_sac_speed[True-backward] 4.1235ms 4.0150ms 249.0681 Ops/s 226.7471 Ops/s $\textbf{\color{#35bf28}+9.84\%}$
test_sac_speed[reduce-overhead-None] 2.5341ms 2.1585ms 463.2833 Ops/s 461.8180 Ops/s $\color{#35bf28}+0.32\%$
test_redq_speed[False-None] 11.3869ms 10.4970ms 95.2656 Ops/s 92.2015 Ops/s $\color{#35bf28}+3.32\%$
test_redq_speed[False-backward] 21.3475ms 17.9622ms 55.6724 Ops/s 55.5977 Ops/s $\color{#35bf28}+0.13\%$
test_redq_speed[True-None] 4.7803ms 4.4502ms 224.7084 Ops/s 228.6407 Ops/s $\color{#d91a1a}-1.72\%$
test_redq_speed[reduce-overhead-None] 4.7612ms 4.3955ms 227.5058 Ops/s 230.1547 Ops/s $\color{#d91a1a}-1.15\%$
test_redq_deprec_speed[False-None] 11.6587ms 11.1941ms 89.3324 Ops/s 90.2166 Ops/s $\color{#d91a1a}-0.98\%$
test_redq_deprec_speed[False-backward] 16.5394ms 16.1325ms 61.9868 Ops/s 63.3564 Ops/s $\color{#d91a1a}-2.16\%$
test_redq_deprec_speed[True-None] 3.7830ms 3.6605ms 273.1848 Ops/s 282.6140 Ops/s $\color{#d91a1a}-3.34\%$
test_redq_deprec_speed[True-backward] 7.5472ms 7.2922ms 137.1331 Ops/s 140.1764 Ops/s $\color{#d91a1a}-2.17\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9873ms 3.5772ms 279.5515 Ops/s 281.2084 Ops/s $\color{#d91a1a}-0.59\%$
test_td3_speed[False-None] 8.1421ms 8.0387ms 124.3977 Ops/s 121.7735 Ops/s $\color{#35bf28}+2.15\%$
test_td3_speed[False-backward] 11.5714ms 10.9322ms 91.4730 Ops/s 90.4251 Ops/s $\color{#35bf28}+1.16\%$
test_td3_speed[True-None] 1.8287ms 1.7945ms 557.2428 Ops/s 549.2150 Ops/s $\color{#35bf28}+1.46\%$
test_td3_speed[True-backward] 3.7321ms 3.5445ms 282.1276 Ops/s 237.0740 Ops/s $\textbf{\color{#35bf28}+19.00\%}$
test_td3_speed[reduce-overhead-None] 1.8343ms 1.7793ms 562.0115 Ops/s 555.6179 Ops/s $\color{#35bf28}+1.15\%$
test_cql_speed[False-None] 29.8867ms 26.6138ms 37.5745 Ops/s 38.3975 Ops/s $\color{#d91a1a}-2.14\%$
test_cql_speed[False-backward] 41.1055ms 35.8625ms 27.8843 Ops/s 28.0294 Ops/s $\color{#d91a1a}-0.52\%$
test_cql_speed[True-None] 12.9793ms 12.4784ms 80.1383 Ops/s 81.8677 Ops/s $\color{#d91a1a}-2.11\%$
test_cql_speed[True-backward] 18.1184ms 17.8549ms 56.0069 Ops/s 58.0787 Ops/s $\color{#d91a1a}-3.57\%$
test_cql_speed[reduce-overhead-None] 12.9089ms 12.4522ms 80.3068 Ops/s 80.9305 Ops/s $\color{#d91a1a}-0.77\%$
test_a2c_speed[False-None] 5.9380ms 5.4963ms 181.9408 Ops/s 186.2243 Ops/s $\color{#d91a1a}-2.30\%$
test_a2c_speed[False-backward] 12.2736ms 11.9634ms 83.5886 Ops/s 84.5799 Ops/s $\color{#d91a1a}-1.17\%$
test_a2c_speed[True-None] 3.9546ms 3.7816ms 264.4367 Ops/s 262.6704 Ops/s $\color{#35bf28}+0.67\%$
test_a2c_speed[True-backward] 9.1596ms 8.7014ms 114.9246 Ops/s 116.3481 Ops/s $\color{#d91a1a}-1.22\%$
test_a2c_speed[reduce-overhead-None] 4.7408ms 3.7573ms 266.1512 Ops/s 264.0602 Ops/s $\color{#35bf28}+0.79\%$
test_ppo_speed[False-None] 6.3086ms 6.0350ms 165.6990 Ops/s 164.2613 Ops/s $\color{#35bf28}+0.88\%$
test_ppo_speed[False-backward] 12.9066ms 12.6674ms 78.9425 Ops/s 78.3725 Ops/s $\color{#35bf28}+0.73\%$
test_ppo_speed[True-None] 4.1037ms 3.7369ms 267.6035 Ops/s 263.8676 Ops/s $\color{#35bf28}+1.42\%$
test_ppo_speed[True-backward] 9.0318ms 8.6037ms 116.2290 Ops/s 115.6493 Ops/s $\color{#35bf28}+0.50\%$
test_ppo_speed[reduce-overhead-None] 4.2741ms 3.6569ms 273.4571 Ops/s 269.1882 Ops/s $\color{#35bf28}+1.59\%$
test_reinforce_speed[False-None] 4.8621ms 4.6389ms 215.5674 Ops/s 213.1984 Ops/s $\color{#35bf28}+1.11\%$
test_reinforce_speed[False-backward] 7.7272ms 7.4825ms 133.6459 Ops/s 133.7404 Ops/s $\color{#d91a1a}-0.07\%$
test_reinforce_speed[True-None] 3.4700ms 2.9558ms 338.3138 Ops/s 326.6794 Ops/s $\color{#35bf28}+3.56\%$
test_reinforce_speed[True-backward] 8.1407ms 7.8007ms 128.1939 Ops/s 126.2882 Ops/s $\color{#35bf28}+1.51\%$
test_reinforce_speed[reduce-overhead-None] 3.3808ms 2.9306ms 341.2317 Ops/s 330.9982 Ops/s $\color{#35bf28}+3.09\%$
test_iql_speed[False-None] 26.5401ms 20.4375ms 48.9297 Ops/s 48.1271 Ops/s $\color{#35bf28}+1.67\%$
test_iql_speed[False-backward] 31.6706ms 30.6235ms 32.6547 Ops/s 32.5329 Ops/s $\color{#35bf28}+0.37\%$
test_iql_speed[True-None] 8.9598ms 8.4766ms 117.9715 Ops/s 116.2996 Ops/s $\color{#35bf28}+1.44\%$
test_iql_speed[True-backward] 17.1730ms 16.6560ms 60.0382 Ops/s 57.9040 Ops/s $\color{#35bf28}+3.69\%$
test_iql_speed[reduce-overhead-None] 9.0507ms 8.5524ms 116.9256 Ops/s 114.2915 Ops/s $\color{#35bf28}+2.30\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2701ms 6.1289ms 163.1611 Ops/s 164.6232 Ops/s $\color{#d91a1a}-0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.0654ms 0.4062ms 2.4617 KOps/s 2.9526 KOps/s $\textbf{\color{#d91a1a}-16.63\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6866ms 0.3194ms 3.1310 KOps/s 2.8500 KOps/s $\textbf{\color{#35bf28}+9.86\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0575ms 5.8041ms 172.2934 Ops/s 170.4943 Ops/s $\color{#35bf28}+1.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1815ms 0.3731ms 2.6804 KOps/s 3.0212 KOps/s $\textbf{\color{#d91a1a}-11.28\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7646ms 0.3202ms 3.1235 KOps/s 2.8652 KOps/s $\textbf{\color{#35bf28}+9.01\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7041ms 1.4191ms 704.6816 Ops/s 700.0985 Ops/s $\color{#35bf28}+0.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5481ms 1.3400ms 746.2713 Ops/s 743.1915 Ops/s $\color{#35bf28}+0.41\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.4080ms 6.1262ms 163.2329 Ops/s 166.3862 Ops/s $\color{#d91a1a}-1.90\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8014ms 0.5104ms 1.9594 KOps/s 1.9586 KOps/s $\color{#35bf28}+0.04\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7712ms 0.5047ms 1.9814 KOps/s 2.0130 KOps/s $\color{#d91a1a}-1.57\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2061ms 5.8357ms 171.3588 Ops/s 169.1974 Ops/s $\color{#35bf28}+1.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8102ms 0.2916ms 3.4298 KOps/s 3.0315 KOps/s $\textbf{\color{#35bf28}+13.14\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4695ms 0.2688ms 3.7202 KOps/s 3.3165 KOps/s $\textbf{\color{#35bf28}+12.17\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0510ms 5.7694ms 173.3290 Ops/s 172.1176 Ops/s $\color{#35bf28}+0.70\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7305ms 0.3234ms 3.0926 KOps/s 3.1950 KOps/s $\color{#d91a1a}-3.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4789ms 0.2657ms 3.7635 KOps/s 3.6576 KOps/s $\color{#35bf28}+2.90\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0426ms 5.9488ms 168.1024 Ops/s 165.9707 Ops/s $\color{#35bf28}+1.28\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2522ms 0.5217ms 1.9168 KOps/s 2.2076 KOps/s $\textbf{\color{#d91a1a}-13.17\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6384ms 0.4525ms 2.2100 KOps/s 2.1785 KOps/s $\color{#35bf28}+1.45\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.4505ms 4.9664ms 201.3550 Ops/s 194.5272 Ops/s $\color{#35bf28}+3.51\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.7689ms 2.0111ms 497.2345 Ops/s 446.7817 Ops/s $\textbf{\color{#35bf28}+11.29\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0856ms 0.8929ms 1.1200 KOps/s 787.3779 Ops/s $\textbf{\color{#35bf28}+42.24\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.6589s 18.1839ms 54.9937 Ops/s 37.2064 Ops/s $\textbf{\color{#35bf28}+47.81\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9310ms 1.7669ms 565.9484 Ops/s 534.2475 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.8464ms 0.8905ms 1.1229 KOps/s 905.2376 Ops/s $\textbf{\color{#35bf28}+24.05\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.4343ms 5.2707ms 189.7299 Ops/s 189.0284 Ops/s $\color{#35bf28}+0.37\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.9913ms 1.8662ms 535.8460 Ops/s 511.8553 Ops/s $\color{#35bf28}+4.69\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.6278ms 1.3079ms 764.6024 Ops/s 649.4503 Ops/s $\textbf{\color{#35bf28}+17.73\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 40.8712ms 38.7864ms 25.7823 Ops/s 25.7202 Ops/s $\color{#35bf28}+0.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.0476ms 18.3925ms 54.3700 Ops/s 53.6672 Ops/s $\color{#35bf28}+1.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 45.4743ms 39.8780ms 25.0765 Ops/s 24.8854 Ops/s $\color{#35bf28}+0.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.4601ms 18.7460ms 53.3448 Ops/s 52.5717 Ops/s $\color{#35bf28}+1.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 43.3492ms 41.3822ms 24.1650 Ops/s 23.9841 Ops/s $\color{#35bf28}+0.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.6497ms 19.9367ms 50.1588 Ops/s 49.3254 Ops/s $\color{#35bf28}+1.69\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8555ms 0.2179ms 4.5893 KOps/s 4.3524 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_storage_write_lazystack[100-img_shape1-atari] 1.7052ms 1.3743ms 727.6469 Ops/s 698.4424 Ops/s $\color{#35bf28}+4.18\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.8075ms 2.3453ms 426.3936 Ops/s 427.8650 Ops/s $\color{#d91a1a}-0.34\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0987ms 2.8772ms 347.5546 Ops/s 335.9752 Ops/s $\color{#35bf28}+3.45\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2189ms 0.1398ms 7.1553 KOps/s 7.1147 KOps/s $\color{#35bf28}+0.57\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.4652ms 0.1937ms 5.1632 KOps/s 5.3849 KOps/s $\color{#d91a1a}-4.12\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.8809ms 1.7652ms 566.5062 Ops/s 569.4238 Ops/s $\color{#d91a1a}-0.51\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5357ms 1.3647ms 732.7664 Ops/s 758.7065 Ops/s $\color{#d91a1a}-3.42\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2313ms 1.1178ms 894.6534 Ops/s 891.4781 Ops/s $\color{#35bf28}+0.36\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.6797ms 3.4739ms 287.8580 Ops/s 279.8242 Ops/s $\color{#35bf28}+2.87\%$
test_collector_stack_then_write[100-img_shape2-large_img] 5.8636ms 5.7194ms 174.8442 Ops/s 176.2766 Ops/s $\color{#d91a1a}-0.81\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.4849ms 7.2974ms 137.0347 Ops/s 143.1313 Ops/s $\color{#d91a1a}-4.26\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4706ms 0.2832ms 3.5317 KOps/s 3.5978 KOps/s $\color{#d91a1a}-1.84\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6288ms 1.5146ms 660.2510 Ops/s 654.4794 Ops/s $\color{#35bf28}+0.88\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.6219ms 2.4569ms 407.0180 Ops/s 412.4483 Ops/s $\color{#d91a1a}-1.32\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3526ms 3.0931ms 323.2980 Ops/s 314.4927 Ops/s $\color{#35bf28}+2.80\%$
test_collector_without_rb[100-img_shape0-atari] 0.5772s 50.1121ms 19.9553 Ops/s 30.2757 Ops/s $\textbf{\color{#d91a1a}-34.09\%}$
test_collector_without_rb[200-img_shape1-large_batch] 64.8920ms 64.5469ms 15.4926 Ops/s 15.2638 Ops/s $\color{#35bf28}+1.50\%$
test_collector_with_rb[100-img_shape0-atari] 39.0515ms 37.6902ms 26.5321 Ops/s 26.3223 Ops/s $\color{#35bf28}+0.80\%$
test_collector_with_rb[200-img_shape1-large_batch] 75.4763ms 73.8052ms 13.5492 Ops/s 13.5442 Ops/s $\color{#35bf28}+0.04\%$

Collaborator Author

I don't like this; this PR should be dropped entirely.

Labels

CLA Signed, Collectors, Performance
