[BugFix] Fix stale model reference in MultiCollector weight sync after device-cast #3587

Open
vmoens wants to merge 3 commits into main from fix-update-weights

Collaborator

@vmoens vmoens commented Mar 30, 2026

Summary

  • Fixes a bug where update_policy_weights_() silently fails to update one worker's policy in MultiAsyncCollector/MultiSyncCollector when workers use different policy_device values
  • Root cause: _make_policy_factory calls scheme.init_on_receiver(model=policy) storing a weakref to the original policy, but _get_policy_and_device later deepcopies the policy to place it on the target device. The scheme's model reference becomes stale — weight updates go to the original (unused) object
  • After register_scheme_receiver, we now check if the scheme's model matches the collector's actual policy and fix it if they diverge
  • Adds a non-regression test that zeros weights and verifies all workers produce zero actions
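The root cause and the fix described above can be illustrated with a minimal, library-free sketch. `ToyPolicy`, `ToyScheme`, and `to_device_copy` are hypothetical stand-ins invented for this illustration; they are not TorchRL classes, and the real scheme and collector code will differ. Only the pattern is the point: a weak reference taken before a deepcopy keeps pointing at the original object.

```python
import copy
import weakref


class ToyPolicy:
    """Hypothetical stand-in for a policy; `weights` plays the role of parameters."""

    def __init__(self, weights, device="cpu"):
        self.weights = list(weights)
        self.device = device

    def to_device_copy(self, device):
        # Mimics a device cast that returns a deep copy instead of moving in place.
        clone = copy.deepcopy(self)
        clone.device = device
        return clone


class ToyScheme:
    """Hypothetical weight-sync scheme holding a weak reference to its model."""

    def __init__(self):
        self._model_ref = None

    def init_on_receiver(self, model):
        self._model_ref = weakref.ref(model)

    @property
    def model(self):
        return self._model_ref() if self._model_ref is not None else None

    def apply(self, new_weights):
        # Writes the new weights in place into whatever object the scheme references.
        self.model.weights[:] = new_weights


original = ToyPolicy([1.0, 2.0])
scheme = ToyScheme()
scheme.init_on_receiver(model=original)      # reference taken before the copy
actual = original.to_device_copy("cuda:0")   # device name is only a label here

scheme.apply([0.0, 0.0])
stale_result = list(actual.weights)          # the policy actually in use is untouched

# The kind of check the fix performs: re-point the scheme if the models diverge.
if scheme.model is not actual:
    scheme.init_on_receiver(model=actual)
scheme.apply([0.0, 0.0])
fixed_result = list(actual.weights)
```

In this toy version the first `apply` only reaches `original`, which mirrors the silent failure: no error is raised, the update simply lands on an object nobody uses.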

Test plan

  • New test test_weight_update_after_device_cast passes (4 variants: Sync/Async × MP/SharedMem)
  • All existing TestPolicyFactory tests still pass
  • CI

🤖 Generated with Claude Code

…eepcopy

When policy_device differs from the policy's native device,
_get_policy_and_device creates a deepcopy on the target device. However,
the weight sync scheme's model reference was set before the deepcopy
(in _make_policy_factory), so subsequent weight updates via the background
thread would silently update the original (unused) object instead of the
collector's actual policy. This caused one worker to never receive weight
updates in MultiAsyncCollector when workers had heterogeneous devices.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pytorch-bot bot commented Mar 30, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3587

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 6784bce with merge base 79e6135:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the "CLA Signed" label Mar 30, 2026. (Label description: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Adds logging at key points in the weight sync pipeline to diagnose
why one async worker may not be receiving weight updates:

- MultiAsyncCollector: log which worker produced each batch, and
  weight param fingerprint (sum) when update_policy_weights_ is called
- _runner.py: log policy param fingerprint at rollout start, and
  whether the stale-model-reference fix fires
- _mp.py send(): log number of transports and weight fingerprint
- _mp.py _background_receive_loop(): log param fingerprint BEFORE and
  AFTER weight application per worker, plus model identity

All gated behind DEBUG level (torchrl_logger.isEnabledFor(10)).
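A rough sketch of what DEBUG-gated fingerprint logging of this kind can look like, using only the standard `logging` module. The logger name, `param_fingerprint`, and `log_fingerprint` are hypothetical names chosen for illustration and are not taken from the TorchRL source; parameters are plain lists of floats here rather than tensors.

```python
import logging

# Hypothetical logger; the commit uses torchrl_logger instead.
logger = logging.getLogger("weight_sync_demo")


def param_fingerprint(params):
    """Cheap fingerprint: the sum of all parameter values."""
    return sum(sum(p) for p in params)


def log_fingerprint(tag, params, log=logger):
    """Log a fingerprint only when DEBUG is enabled; return whether it was logged."""
    # logging.DEBUG == 10, so this matches an isEnabledFor(10) check.
    if log.isEnabledFor(logging.DEBUG):
        # The reduction is computed only inside the guard, so it costs nothing
        # when DEBUG logging is off.
        log.debug("%s fingerprint=%.6f", tag, param_fingerprint(params))
        return True
    return False


logger.setLevel(logging.INFO)
logged_at_info = log_fingerprint("BEFORE", [[1.0, 2.0], [3.0]])

logger.setLevel(logging.DEBUG)
logged_at_debug = log_fingerprint("AFTER", [[0.0, 0.0], [0.0]])
```

Comparing the BEFORE and AFTER fingerprints per worker is what makes a missed update visible: a worker whose fingerprint never changes is not receiving weights.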

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

github-actions bot commented Mar 30, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}17$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 81.6171μs 80.3397μs 12.4471 KOps/s 12.0971 KOps/s $\color{#35bf28}+2.89\%$
test_tensor_to_bytestream_speed[torch.save] 0.1414ms 0.1411ms 7.0865 KOps/s 7.0504 KOps/s $\color{#35bf28}+0.51\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1370s 0.1363s 7.3346 Ops/s 7.2124 Ops/s $\color{#35bf28}+1.69\%$
test_tensor_to_bytestream_speed[numpy] 2.8431μs 2.8362μs 352.5785 KOps/s 347.3869 KOps/s $\color{#35bf28}+1.49\%$
test_tensor_to_bytestream_speed[safetensors] 37.8101μs 37.5384μs 26.6394 KOps/s 26.7029 KOps/s $\color{#d91a1a}-0.24\%$
test_simple 0.5566s 0.5518s 1.8123 Ops/s 1.7131 Ops/s $\textbf{\color{#35bf28}+5.79\%}$
test_transformed 1.1041s 1.1013s 0.9080 Ops/s 0.8718 Ops/s $\color{#35bf28}+4.16\%$
test_serial 1.7369s 1.7340s 0.5767 Ops/s 0.5587 Ops/s $\color{#35bf28}+3.22\%$
test_parallel 1.0426s 1.0374s 0.9639 Ops/s 0.9295 Ops/s $\color{#35bf28}+3.71\%$
test_step_mdp_speed[True-True-True-True-True] 0.2442ms 45.2169μs 22.1156 KOps/s 24.0368 KOps/s $\textbf{\color{#d91a1a}-7.99\%}$
test_step_mdp_speed[True-True-True-True-False] 53.9210μs 23.6441μs 42.2939 KOps/s 42.1124 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-True-True-False-True] 55.5900μs 23.7322μs 42.1368 KOps/s 43.2781 KOps/s $\color{#d91a1a}-2.64\%$
test_step_mdp_speed[True-True-True-False-False] 43.1510μs 13.0778μs 76.4655 KOps/s 78.0419 KOps/s $\color{#d91a1a}-2.02\%$
test_step_mdp_speed[True-True-False-True-True] 83.9910μs 47.0439μs 21.2568 KOps/s 22.5862 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_step_mdp_speed[True-True-False-True-False] 61.3610μs 26.0540μs 38.3818 KOps/s 38.8819 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[True-True-False-False-True] 55.8310μs 25.7433μs 38.8451 KOps/s 38.5517 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[True-True-False-False-False] 47.2810μs 15.3888μs 64.9825 KOps/s 65.1655 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-True-True-True] 81.4010μs 48.9588μs 20.4254 KOps/s 21.0245 KOps/s $\color{#d91a1a}-2.85\%$
test_step_mdp_speed[True-False-True-True-False] 55.5910μs 27.9446μs 35.7850 KOps/s 34.5406 KOps/s $\color{#35bf28}+3.60\%$
test_step_mdp_speed[True-False-True-False-True] 56.3100μs 27.9846μs 35.7339 KOps/s 38.4835 KOps/s $\textbf{\color{#d91a1a}-7.14\%}$
test_step_mdp_speed[True-False-True-False-False] 45.3700μs 15.4706μs 64.6389 KOps/s 64.0486 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[True-False-False-True-True] 78.0610μs 49.8056μs 20.0781 KOps/s 20.3079 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[True-False-False-True-False] 59.8600μs 30.4466μs 32.8444 KOps/s 32.2304 KOps/s $\color{#35bf28}+1.91\%$
test_step_mdp_speed[True-False-False-False-True] 64.9300μs 29.0883μs 34.3781 KOps/s 35.5198 KOps/s $\color{#d91a1a}-3.21\%$
test_step_mdp_speed[True-False-False-False-False] 60.7200μs 18.0048μs 55.5406 KOps/s 56.0737 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-True-True-True] 84.3510μs 47.2214μs 21.1769 KOps/s 21.3313 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[False-True-True-True-False] 61.9710μs 29.7110μs 33.6575 KOps/s 35.9196 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_step_mdp_speed[False-True-True-False-True] 2.6497ms 31.6654μs 31.5802 KOps/s 33.4356 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_step_mdp_speed[False-True-True-False-False] 51.1410μs 17.9901μs 55.5860 KOps/s 58.2389 KOps/s $\color{#d91a1a}-4.56\%$
test_step_mdp_speed[False-True-False-True-True] 87.0210μs 49.6968μs 20.1220 KOps/s 20.5773 KOps/s $\color{#d91a1a}-2.21\%$
test_step_mdp_speed[False-True-False-True-False] 60.3800μs 30.9018μs 32.3605 KOps/s 32.5882 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[False-True-False-False-True] 56.5110μs 32.0966μs 31.1559 KOps/s 30.7161 KOps/s $\color{#35bf28}+1.43\%$
test_step_mdp_speed[False-True-False-False-False] 46.1610μs 19.6405μs 50.9151 KOps/s 51.5120 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-False-True-True-True] 85.1310μs 52.8909μs 18.9068 KOps/s 19.2048 KOps/s $\color{#d91a1a}-1.55\%$
test_step_mdp_speed[False-False-True-True-False] 59.4610μs 34.0091μs 29.4039 KOps/s 29.6414 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-False-True-False-True] 75.2310μs 32.3988μs 30.8653 KOps/s 30.8769 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-False-True-False-False] 50.9800μs 19.5633μs 51.1160 KOps/s 51.5320 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-False-False-True-True] 84.8110μs 54.8175μs 18.2424 KOps/s 18.4484 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[False-False-False-True-False] 65.5710μs 35.3206μs 28.3121 KOps/s 27.5660 KOps/s $\color{#35bf28}+2.71\%$
test_step_mdp_speed[False-False-False-False-True] 68.2110μs 34.5016μs 28.9841 KOps/s 29.1773 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-False-False-False-False] 51.7010μs 22.0329μs 45.3867 KOps/s 45.6183 KOps/s $\color{#d91a1a}-0.51\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8547s 0.7555s 1.3236 Ops/s 1.3253 Ops/s $\color{#d91a1a}-0.13\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7199s 0.6131s 1.6310 Ops/s 1.6303 Ops/s $\color{#35bf28}+0.04\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7628s 1.6671s 0.5999 Ops/s 0.6007 Ops/s $\color{#d91a1a}-0.14\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5183s 1.4332s 0.6977 Ops/s 0.6959 Ops/s $\color{#35bf28}+0.26\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0044s 1.9185s 0.5212 Ops/s 0.5199 Ops/s $\color{#35bf28}+0.26\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7782s 1.6912s 0.5913 Ops/s 0.5862 Ops/s $\color{#35bf28}+0.86\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7057s 4.6414s 0.2155 Ops/s 0.2126 Ops/s $\color{#35bf28}+1.33\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5760s 4.4343s 0.2255 Ops/s 0.2226 Ops/s $\color{#35bf28}+1.30\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9619s 1.8685s 0.5352 Ops/s 0.5273 Ops/s $\color{#35bf28}+1.50\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6937s 1.5907s 0.6287 Ops/s 0.6203 Ops/s $\color{#35bf28}+1.35\%$
test_values[generalized_advantage_estimate-True-True] 11.7037ms 11.4988ms 86.9653 Ops/s 86.1374 Ops/s $\color{#35bf28}+0.96\%$
test_values[vec_generalized_advantage_estimate-True-True] 18.8674ms 16.8285ms 59.4228 Ops/s 55.6695 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_values[td0_return_estimate-False-False] 0.2266ms 0.1376ms 7.2682 KOps/s 7.4941 KOps/s $\color{#d91a1a}-3.01\%$
test_values[td1_return_estimate-False-False] 31.5985ms 31.0765ms 32.1786 Ops/s 32.0601 Ops/s $\color{#35bf28}+0.37\%$
test_values[vec_td1_return_estimate-False-False] 18.5965ms 18.1686ms 55.0399 Ops/s 55.5372 Ops/s $\color{#d91a1a}-0.90\%$
test_values[td_lambda_return_estimate-True-False] 46.8070ms 45.8978ms 21.7875 Ops/s 21.5848 Ops/s $\color{#35bf28}+0.94\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.3903ms 18.1272ms 55.1658 Ops/s 55.6362 Ops/s $\color{#d91a1a}-0.85\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.4745ms 10.2112ms 97.9313 Ops/s 96.6126 Ops/s $\color{#35bf28}+1.36\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8080ms 1.5644ms 639.2152 Ops/s 645.8442 Ops/s $\color{#d91a1a}-1.03\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5543ms 0.4411ms 2.2668 KOps/s 2.1915 KOps/s $\color{#35bf28}+3.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 30.3121ms 29.4197ms 33.9908 Ops/s 28.7007 Ops/s $\textbf{\color{#35bf28}+18.43\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.0103ms 1.7888ms 559.0200 Ops/s 562.1534 Ops/s $\color{#d91a1a}-0.56\%$
test_dqn_speed[False-None] 1.7082ms 1.4365ms 696.1250 Ops/s 696.4642 Ops/s $\color{#d91a1a}-0.05\%$
test_dqn_speed[False-backward] 2.0392ms 1.9635ms 509.2910 Ops/s 509.7779 Ops/s $\color{#d91a1a}-0.10\%$
test_dqn_speed[True-None] 0.6651ms 0.5708ms 1.7519 KOps/s 1.7277 KOps/s $\color{#35bf28}+1.40\%$
test_dqn_speed[True-backward] 1.0662ms 1.0296ms 971.2825 Ops/s 891.4622 Ops/s $\textbf{\color{#35bf28}+8.95\%}$
test_dqn_speed[reduce-overhead-None] 1.1406ms 0.5552ms 1.8011 KOps/s 1.7459 KOps/s $\color{#35bf28}+3.16\%$
test_ddpg_speed[False-None] 3.2515ms 2.8742ms 347.9217 Ops/s 343.5534 Ops/s $\color{#35bf28}+1.27\%$
test_ddpg_speed[False-backward] 4.2599ms 4.1081ms 243.4226 Ops/s 242.2984 Ops/s $\color{#35bf28}+0.46\%$
test_ddpg_speed[True-None] 1.6021ms 1.4745ms 678.1976 Ops/s 664.4796 Ops/s $\color{#35bf28}+2.06\%$
test_ddpg_speed[True-backward] 2.5676ms 2.5059ms 399.0521 Ops/s 332.5599 Ops/s $\textbf{\color{#35bf28}+19.99\%}$
test_ddpg_speed[reduce-overhead-None] 1.6232ms 1.4754ms 677.7601 Ops/s 669.7014 Ops/s $\color{#35bf28}+1.20\%$
test_sac_speed[False-None] 8.9331ms 8.2482ms 121.2385 Ops/s 120.6880 Ops/s $\color{#35bf28}+0.46\%$
test_sac_speed[False-backward] 11.9812ms 11.5102ms 86.8798 Ops/s 85.9037 Ops/s $\color{#35bf28}+1.14\%$
test_sac_speed[True-None] 2.3500ms 2.2296ms 448.5137 Ops/s 439.6260 Ops/s $\color{#35bf28}+2.02\%$
test_sac_speed[True-backward] 4.2884ms 4.1878ms 238.7912 Ops/s 218.6229 Ops/s $\textbf{\color{#35bf28}+9.23\%}$
test_sac_speed[reduce-overhead-None] 2.3914ms 2.2320ms 448.0205 Ops/s 441.5127 Ops/s $\color{#35bf28}+1.47\%$
test_redq_speed[False-None] 14.5879ms 10.9728ms 91.1346 Ops/s 91.8490 Ops/s $\color{#d91a1a}-0.78\%$
test_redq_speed[False-backward] 23.4740ms 18.5585ms 53.8836 Ops/s 54.4412 Ops/s $\color{#d91a1a}-1.02\%$
test_redq_speed[True-None] 4.7426ms 4.5692ms 218.8579 Ops/s 212.5116 Ops/s $\color{#35bf28}+2.99\%$
test_redq_speed[reduce-overhead-None] 4.6975ms 4.5152ms 221.4728 Ops/s 218.0831 Ops/s $\color{#35bf28}+1.55\%$
test_redq_deprec_speed[False-None] 11.9410ms 11.4423ms 87.3947 Ops/s 88.5173 Ops/s $\color{#d91a1a}-1.27\%$
test_redq_deprec_speed[False-backward] 17.2236ms 16.5839ms 60.2993 Ops/s 61.8759 Ops/s $\color{#d91a1a}-2.55\%$
test_redq_deprec_speed[True-None] 4.6107ms 3.7727ms 265.0613 Ops/s 251.4972 Ops/s $\textbf{\color{#35bf28}+5.39\%}$
test_redq_deprec_speed[True-backward] 8.0163ms 7.6451ms 130.8023 Ops/s 130.2709 Ops/s $\color{#35bf28}+0.41\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9021ms 3.6705ms 272.4437 Ops/s 265.0777 Ops/s $\color{#35bf28}+2.78\%$
test_td3_speed[False-None] 8.4151ms 8.2468ms 121.2586 Ops/s 118.5140 Ops/s $\color{#35bf28}+2.32\%$
test_td3_speed[False-backward] 11.5320ms 11.1388ms 89.7765 Ops/s 87.5026 Ops/s $\color{#35bf28}+2.60\%$
test_td3_speed[True-None] 1.9170ms 1.8836ms 530.9004 Ops/s 526.7198 Ops/s $\color{#35bf28}+0.79\%$
test_td3_speed[True-backward] 3.8206ms 3.6570ms 273.4485 Ops/s 250.6461 Ops/s $\textbf{\color{#35bf28}+9.10\%}$
test_td3_speed[reduce-overhead-None] 1.8633ms 1.8370ms 544.3692 Ops/s 561.8707 Ops/s $\color{#d91a1a}-3.11\%$
test_cql_speed[False-None] 27.7698ms 26.6946ms 37.4608 Ops/s 38.6548 Ops/s $\color{#d91a1a}-3.09\%$
test_cql_speed[False-backward] 36.9478ms 36.1531ms 27.6601 Ops/s 28.1349 Ops/s $\color{#d91a1a}-1.69\%$
test_cql_speed[True-None] 13.1470ms 12.7144ms 78.6513 Ops/s 79.8972 Ops/s $\color{#d91a1a}-1.56\%$
test_cql_speed[True-backward] 18.5429ms 18.2503ms 54.7938 Ops/s 56.3087 Ops/s $\color{#d91a1a}-2.69\%$
test_cql_speed[reduce-overhead-None] 13.2662ms 12.8121ms 78.0513 Ops/s 77.4567 Ops/s $\color{#35bf28}+0.77\%$
test_a2c_speed[False-None] 5.7741ms 5.5841ms 179.0787 Ops/s 177.1000 Ops/s $\color{#35bf28}+1.12\%$
test_a2c_speed[False-backward] 12.5184ms 12.2168ms 81.8548 Ops/s 81.0775 Ops/s $\color{#35bf28}+0.96\%$
test_a2c_speed[True-None] 4.2185ms 3.7919ms 263.7201 Ops/s 253.7586 Ops/s $\color{#35bf28}+3.93\%$
test_a2c_speed[True-backward] 9.0238ms 8.8038ms 113.5869 Ops/s 114.3049 Ops/s $\color{#d91a1a}-0.63\%$
test_a2c_speed[reduce-overhead-None] 3.9830ms 3.8436ms 260.1731 Ops/s 278.9603 Ops/s $\textbf{\color{#d91a1a}-6.73\%}$
test_ppo_speed[False-None] 7.2889ms 6.1262ms 163.2331 Ops/s 174.6394 Ops/s $\textbf{\color{#d91a1a}-6.53\%}$
test_ppo_speed[False-backward] 13.3759ms 13.0904ms 76.3920 Ops/s 78.7291 Ops/s $\color{#d91a1a}-2.97\%$
test_ppo_speed[True-None] 4.1506ms 3.7930ms 263.6469 Ops/s 285.7538 Ops/s $\textbf{\color{#d91a1a}-7.74\%}$
test_ppo_speed[True-backward] 9.2808ms 8.9447ms 111.7983 Ops/s 118.5044 Ops/s $\textbf{\color{#d91a1a}-5.66\%}$
test_ppo_speed[reduce-overhead-None] 4.1470ms 3.7666ms 265.4925 Ops/s 284.8313 Ops/s $\textbf{\color{#d91a1a}-6.79\%}$
test_reinforce_speed[False-None] 5.2146ms 4.7535ms 210.3709 Ops/s 207.4315 Ops/s $\color{#35bf28}+1.42\%$
test_reinforce_speed[False-backward] 8.2564ms 7.7607ms 128.8549 Ops/s 132.8197 Ops/s $\color{#d91a1a}-2.99\%$
test_reinforce_speed[True-None] 3.6808ms 3.0909ms 323.5286 Ops/s 323.7869 Ops/s $\color{#d91a1a}-0.08\%$
test_reinforce_speed[True-backward] 8.3798ms 8.0084ms 124.8686 Ops/s 123.6423 Ops/s $\color{#35bf28}+0.99\%$
test_reinforce_speed[reduce-overhead-None] 3.1148ms 2.9725ms 336.4153 Ops/s 334.2193 Ops/s $\color{#35bf28}+0.66\%$
test_iql_speed[False-None] 20.7221ms 20.0678ms 49.8311 Ops/s 45.8398 Ops/s $\textbf{\color{#35bf28}+8.71\%}$
test_iql_speed[False-backward] 31.4770ms 30.8429ms 32.4224 Ops/s 31.4061 Ops/s $\color{#35bf28}+3.24\%$
test_iql_speed[True-None] 9.0443ms 8.6761ms 115.2592 Ops/s 110.7446 Ops/s $\color{#35bf28}+4.08\%$
test_iql_speed[True-backward] 17.6052ms 17.1044ms 58.4646 Ops/s 56.3397 Ops/s $\color{#35bf28}+3.77\%$
test_iql_speed[reduce-overhead-None] 9.2354ms 8.7599ms 114.1560 Ops/s 108.2226 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3124ms 6.0884ms 164.2469 Ops/s 161.7902 Ops/s $\color{#35bf28}+1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.1962ms 0.3010ms 3.3219 KOps/s 2.7179 KOps/s $\textbf{\color{#35bf28}+22.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7744ms 0.2804ms 3.5658 KOps/s 2.8612 KOps/s $\textbf{\color{#35bf28}+24.63\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1159ms 5.8839ms 169.9558 Ops/s 169.8851 Ops/s $\color{#35bf28}+0.04\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0760ms 0.2904ms 3.4429 KOps/s 3.0179 KOps/s $\textbf{\color{#35bf28}+14.08\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7167ms 0.2736ms 3.6545 KOps/s 3.2290 KOps/s $\textbf{\color{#35bf28}+13.18\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5828ms 1.3370ms 747.9529 Ops/s 737.7404 Ops/s $\color{#35bf28}+1.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6723ms 1.2549ms 796.8845 Ops/s 795.5255 Ops/s $\color{#35bf28}+0.17\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.0799ms 6.0774ms 164.5429 Ops/s 164.8314 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9037ms 0.4490ms 2.2273 KOps/s 2.0518 KOps/s $\textbf{\color{#35bf28}+8.55\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9206ms 0.4325ms 2.3122 KOps/s 2.2408 KOps/s $\color{#35bf28}+3.19\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9914ms 5.8212ms 171.7873 Ops/s 167.5242 Ops/s $\color{#35bf28}+2.54\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.4058ms 0.3793ms 2.6361 KOps/s 2.7784 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5122ms 0.2836ms 3.5255 KOps/s 2.9898 KOps/s $\textbf{\color{#35bf28}+17.92\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9541ms 5.7091ms 175.1580 Ops/s 170.0292 Ops/s $\color{#35bf28}+3.02\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1447ms 0.3782ms 2.6438 KOps/s 2.9395 KOps/s $\textbf{\color{#d91a1a}-10.06\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6110ms 0.3828ms 2.6122 KOps/s 3.4881 KOps/s $\textbf{\color{#d91a1a}-25.11\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1454ms 5.9225ms 168.8474 Ops/s 165.4211 Ops/s $\color{#35bf28}+2.07\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8510ms 0.5289ms 1.8907 KOps/s 2.2149 KOps/s $\textbf{\color{#d91a1a}-14.64\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7110ms 0.5110ms 1.9568 KOps/s 2.3319 KOps/s $\textbf{\color{#d91a1a}-16.09\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 1.0192s 25.3497ms 39.4482 Ops/s 47.3628 Ops/s $\textbf{\color{#d91a1a}-16.71\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 12.1261ms 2.0603ms 485.3726 Ops/s 493.7770 Ops/s $\color{#d91a1a}-1.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2906ms 0.9264ms 1.0795 KOps/s 810.0415 Ops/s $\textbf{\color{#35bf28}+33.26\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 10.1072ms 5.1438ms 194.4099 Ops/s 194.2527 Ops/s $\color{#35bf28}+0.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9876ms 1.7899ms 558.6832 Ops/s 487.4504 Ops/s $\textbf{\color{#35bf28}+14.61\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.2105ms 0.9169ms 1.0906 KOps/s 877.8488 Ops/s $\textbf{\color{#35bf28}+24.24\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.6402s 18.0124ms 55.5173 Ops/s 185.1125 Ops/s $\textbf{\color{#d91a1a}-70.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2478ms 1.9679ms 508.1444 Ops/s 514.3521 Ops/s $\color{#d91a1a}-1.21\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1034ms 1.1029ms 906.6803 Ops/s 918.9297 Ops/s $\color{#d91a1a}-1.33\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 42.7326ms 39.2884ms 25.4528 Ops/s 25.2529 Ops/s $\color{#35bf28}+0.79\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 24.5931ms 19.0755ms 52.4233 Ops/s 53.6775 Ops/s $\color{#d91a1a}-2.34\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 45.0947ms 40.6560ms 24.5966 Ops/s 24.3553 Ops/s $\color{#35bf28}+0.99\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 21.1913ms 19.2373ms 51.9824 Ops/s 53.2450 Ops/s $\color{#d91a1a}-2.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 44.6426ms 42.4565ms 23.5535 Ops/s 23.2954 Ops/s $\color{#35bf28}+1.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.4536ms 21.0971ms 47.3999 Ops/s 48.6895 Ops/s $\color{#d91a1a}-2.65\%$
test_storage_write_lazystack[50-img_shape0-small] 1.0033ms 0.2339ms 4.2760 KOps/s 4.3140 KOps/s $\color{#d91a1a}-0.88\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7531ms 1.4886ms 671.7675 Ops/s 675.9874 Ops/s $\color{#d91a1a}-0.62\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7675ms 2.4681ms 405.1683 Ops/s 391.0582 Ops/s $\color{#35bf28}+3.61\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.3526ms 3.0760ms 325.0965 Ops/s 321.0458 Ops/s $\color{#35bf28}+1.26\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2199ms 0.1447ms 6.9087 KOps/s 7.1697 KOps/s $\color{#d91a1a}-3.64\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3520ms 0.1983ms 5.0428 KOps/s 5.0926 KOps/s $\color{#d91a1a}-0.98\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.2055ms 1.8657ms 535.9910 Ops/s 525.3081 Ops/s $\color{#35bf28}+2.03\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.6409ms 1.4054ms 711.5177 Ops/s 727.1865 Ops/s $\color{#d91a1a}-2.15\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2968ms 1.1496ms 869.8430 Ops/s 877.2183 Ops/s $\color{#d91a1a}-0.84\%$
test_collector_stack_then_write[100-img_shape1-atari] 4.0876ms 3.7969ms 263.3730 Ops/s 266.3938 Ops/s $\color{#d91a1a}-1.13\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.5065ms 5.8744ms 170.2294 Ops/s 167.3271 Ops/s $\color{#35bf28}+1.73\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.9877ms 7.2588ms 137.7646 Ops/s 131.5080 Ops/s $\color{#35bf28}+4.76\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4524ms 0.2867ms 3.4881 KOps/s 3.4796 KOps/s $\color{#35bf28}+0.24\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.8743ms 1.6028ms 623.9071 Ops/s 628.6266 Ops/s $\color{#d91a1a}-0.75\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8552ms 2.5884ms 386.3326 Ops/s 373.9048 Ops/s $\color{#35bf28}+3.32\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.7303ms 3.3080ms 302.3008 Ops/s 296.4395 Ops/s $\color{#35bf28}+1.98\%$
test_collector_without_rb[100-img_shape0-atari] 35.6171ms 34.5057ms 28.9808 Ops/s 28.9026 Ops/s $\color{#35bf28}+0.27\%$
test_collector_without_rb[200-img_shape1-large_batch] 69.5685ms 68.3074ms 14.6397 Ops/s 14.6975 Ops/s $\color{#d91a1a}-0.39\%$
test_collector_with_rb[100-img_shape0-atari] 40.1256ms 39.0293ms 25.6218 Ops/s 25.5134 Ops/s $\color{#35bf28}+0.42\%$
test_collector_with_rb[200-img_shape1-large_batch] 78.2764ms 77.1463ms 12.9624 Ops/s 12.9627 Ops/s $-0.00\%$
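The Change column in the table above is consistent with the relative difference between the Ops column and the Ops on Repo HEAD column, after converting units. The sketch below reproduces that arithmetic for a few rows. The helper names are hypothetical, and the 5% threshold for counting a benchmark as improved or worsened is inferred from which rows are bolded in the table, not stated by the bot.

```python
# Unit multipliers for the two throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text):
    """Convert a cell such as '12.4471 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr, ops_head):
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


def classify(change, threshold=5.0):
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# Rows taken from the CPU table: pickle bytestream, LazyTensorStorage populate
# (mixed units), and the prioritized ListStorage populate regression.
pickle_change = percent_change("12.4471 KOps/s", "12.0971 KOps/s")
mixed_change = percent_change("1.0795 KOps/s", "810.0415 Ops/s")
regression = percent_change("55.5173 Ops/s", "185.1125 Ops/s")
```

Rounded to two decimals, these give +2.89%, +33.26%, and -70.01%, matching the corresponding rows. Under the assumed threshold, the first counts as unchanged, the second as improved, and the third as worsened.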

Contributor

github-actions bot commented Mar 30, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}7$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 87.8761μs 86.4700μs 11.5647 KOps/s 11.5777 KOps/s $\color{#d91a1a}-0.11\%$
test_tensor_to_bytestream_speed[torch.save] 0.1421ms 0.1414ms 7.0716 KOps/s 7.0023 KOps/s $\color{#35bf28}+0.99\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1125s 0.1121s 8.9173 Ops/s 8.8551 Ops/s $\color{#35bf28}+0.70\%$
test_tensor_to_bytestream_speed[numpy] 2.5963μs 2.5842μs 386.9670 KOps/s 388.9144 KOps/s $\color{#d91a1a}-0.50\%$
test_tensor_to_bytestream_speed[safetensors] 39.6138μs 39.1469μs 25.5448 KOps/s 26.1960 KOps/s $\color{#d91a1a}-2.49\%$
test_simple 0.8047s 0.7868s 1.2710 Ops/s 1.2376 Ops/s $\color{#35bf28}+2.71\%$
test_transformed 1.4176s 1.4033s 0.7126 Ops/s 0.7134 Ops/s $\color{#d91a1a}-0.12\%$
test_serial 2.3614s 2.3305s 0.4291 Ops/s 0.4299 Ops/s $\color{#d91a1a}-0.19\%$
test_parallel 1.9073s 1.8234s 0.5484 Ops/s 0.5516 Ops/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-True-True-True] 0.2034ms 43.0472μs 23.2303 KOps/s 23.5891 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[True-True-True-True-False] 0.4509ms 22.7203μs 44.0135 KOps/s 42.9919 KOps/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[True-True-True-False-True] 0.4516ms 24.0509μs 41.5784 KOps/s 42.5970 KOps/s $\color{#d91a1a}-2.39\%$
test_step_mdp_speed[True-True-True-False-False] 40.0010μs 12.6641μs 78.9632 KOps/s 76.8914 KOps/s $\color{#35bf28}+2.69\%$
test_step_mdp_speed[True-True-False-True-True] 0.4840ms 44.6145μs 22.4142 KOps/s 22.1750 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[True-True-False-True-False] 0.4500ms 25.5768μs 39.0980 KOps/s 39.1047 KOps/s $\color{#d91a1a}-0.02\%$
test_step_mdp_speed[True-True-False-False-True] 99.5720μs 26.0377μs 38.4059 KOps/s 37.6182 KOps/s $\color{#35bf28}+2.09\%$
test_step_mdp_speed[True-True-False-False-False] 0.4497ms 15.2715μs 65.4815 KOps/s 63.5123 KOps/s $\color{#35bf28}+3.10\%$
test_step_mdp_speed[True-False-True-True-True] 0.4698ms 46.7323μs 21.3985 KOps/s 20.9808 KOps/s $\color{#35bf28}+1.99\%$
test_step_mdp_speed[True-False-True-True-False] 58.5920μs 27.8920μs 35.8526 KOps/s 34.9149 KOps/s $\color{#35bf28}+2.69\%$
test_step_mdp_speed[True-False-True-False-True] 0.4603ms 25.8715μs 38.6525 KOps/s 38.5320 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-False-True-False-False] 0.4568ms 15.2285μs 65.6662 KOps/s 65.0480 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-False-False-True-True] 81.4410μs 48.3240μs 20.6937 KOps/s 20.2611 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[True-False-False-True-False] 0.4467ms 30.4695μs 32.8197 KOps/s 32.2826 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[True-False-False-False-True] 0.4705ms 28.8389μs 34.6753 KOps/s 35.7326 KOps/s $\color{#d91a1a}-2.96\%$
test_step_mdp_speed[True-False-False-False-False] 42.8610μs 17.9431μs 55.7318 KOps/s 55.2285 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[False-True-True-True-True] 0.4786ms 46.6291μs 21.4458 KOps/s 20.7111 KOps/s $\color{#35bf28}+3.55\%$
test_step_mdp_speed[False-True-True-True-False] 0.4486ms 27.9393μs 35.7919 KOps/s 34.9336 KOps/s $\color{#35bf28}+2.46\%$
test_step_mdp_speed[False-True-True-False-True] 2.2780ms 30.1447μs 33.1733 KOps/s 32.7478 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[False-True-True-False-False] 0.4560ms 17.1469μs 58.3197 KOps/s 57.9628 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[False-True-False-True-True] 0.4939ms 48.7120μs 20.5288 KOps/s 19.7073 KOps/s $\color{#35bf28}+4.17\%$
test_step_mdp_speed[False-True-False-True-False] 0.4552ms 30.5262μs 32.7587 KOps/s 31.9739 KOps/s $\color{#35bf28}+2.45\%$
test_step_mdp_speed[False-True-False-False-True] 56.7310μs 32.0208μs 31.2297 KOps/s 30.6224 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-True-False-False-False] 0.4477ms 19.4217μs 51.4889 KOps/s 50.3761 KOps/s $\color{#35bf28}+2.21\%$
test_step_mdp_speed[False-False-True-True-True] 0.4776ms 53.1399μs 18.8183 KOps/s 18.9249 KOps/s $\color{#d91a1a}-0.56\%$
test_step_mdp_speed[False-False-True-True-False] 0.4559ms 33.0708μs 30.2382 KOps/s 29.8296 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-False-True] 62.7410μs 32.1503μs 31.1039 KOps/s 30.6618 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[False-False-True-False-False] 0.4489ms 19.2445μs 51.9628 KOps/s 49.9584 KOps/s $\color{#35bf28}+4.01\%$
test_step_mdp_speed[False-False-False-True-True] 0.4817ms 53.5598μs 18.6707 KOps/s 18.1843 KOps/s $\color{#35bf28}+2.68\%$
test_step_mdp_speed[False-False-False-True-False] 0.4593ms 35.4262μs 28.2277 KOps/s 27.0769 KOps/s $\color{#35bf28}+4.25\%$
test_step_mdp_speed[False-False-False-False-True] 70.3410μs 34.2905μs 29.1626 KOps/s 28.4465 KOps/s $\color{#35bf28}+2.52\%$
test_step_mdp_speed[False-False-False-False-False] 0.4533ms 22.0928μs 45.2636 KOps/s 44.7521 KOps/s $\color{#35bf28}+1.14\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7375s 0.7318s 1.3665 Ops/s 1.3384 Ops/s $\color{#35bf28}+2.10\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7120s 0.6142s 1.6280 Ops/s 1.6407 Ops/s $\color{#d91a1a}-0.77\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7718s 1.6717s 0.5982 Ops/s 0.6033 Ops/s $\color{#d91a1a}-0.84\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5341s 1.4567s 0.6865 Ops/s 0.6969 Ops/s $\color{#d91a1a}-1.49\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0185s 1.9469s 0.5136 Ops/s 0.5280 Ops/s $\color{#d91a1a}-2.73\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7566s 1.6837s 0.5939 Ops/s 0.5952 Ops/s $\color{#d91a1a}-0.22\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7709s 4.6329s 0.2158 Ops/s 0.2166 Ops/s $\color{#d91a1a}-0.33\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6206s 4.5086s 0.2218 Ops/s 0.2255 Ops/s $\color{#d91a1a}-1.64\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9509s 1.8818s 0.5314 Ops/s 0.5233 Ops/s $\color{#35bf28}+1.54\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6761s 1.5894s 0.6292 Ops/s 0.6190 Ops/s $\color{#35bf28}+1.64\%$
test_values[generalized_advantage_estimate-True-True] 20.0294ms 19.4662ms 51.3710 Ops/s 50.6130 Ops/s $\color{#35bf28}+1.50\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1347s 3.6064ms 277.2874 Ops/s 253.2625 Ops/s $\textbf{\color{#35bf28}+9.49\%}$
test_values[td0_return_estimate-False-False] 0.1061ms 80.4719μs 12.4267 KOps/s 12.1750 KOps/s $\color{#35bf28}+2.07\%$
test_values[td1_return_estimate-False-False] 47.2985ms 46.8284ms 21.3545 Ops/s 21.1182 Ops/s $\color{#35bf28}+1.12\%$
test_values[vec_td1_return_estimate-False-False] 1.3296ms 1.0679ms 936.3977 Ops/s 931.5202 Ops/s $\color{#35bf28}+0.52\%$
test_values[td_lambda_return_estimate-True-False] 77.1968ms 76.1776ms 13.1272 Ops/s 12.9764 Ops/s $\color{#35bf28}+1.16\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2555ms 1.0677ms 936.6329 Ops/s 931.4399 Ops/s $\color{#35bf28}+0.56\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.1798ms 19.7146ms 50.7239 Ops/s 50.3691 Ops/s $\color{#35bf28}+0.70\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0103ms 0.7391ms 1.3530 KOps/s 1.3640 KOps/s $\color{#d91a1a}-0.80\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7511ms 0.6591ms 1.5173 KOps/s 1.5021 KOps/s $\color{#35bf28}+1.02\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5372ms 1.4798ms 675.7484 Ops/s 673.9993 Ops/s $\color{#35bf28}+0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7932ms 0.6764ms 1.4784 KOps/s 1.4715 KOps/s $\color{#35bf28}+0.47\%$
test_dqn_speed[False-None] 1.6781ms 1.5660ms 638.5631 Ops/s 630.6820 Ops/s $\color{#35bf28}+1.25\%$
test_dqn_speed[False-backward] 2.3980ms 2.1776ms 459.2185 Ops/s 455.5152 Ops/s $\color{#35bf28}+0.81\%$
test_dqn_speed[True-None] 1.2622ms 0.5962ms 1.6772 KOps/s 1.5952 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_dqn_speed[True-backward] 1.1988ms 1.1513ms 868.5858 Ops/s 805.1471 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_dqn_speed[reduce-overhead-None] 0.6817ms 0.6190ms 1.6156 KOps/s 1.5823 KOps/s $\color{#35bf28}+2.10\%$
test_ddpg_speed[False-None] 3.5535ms 3.0294ms 330.0945 Ops/s 337.4547 Ops/s $\color{#d91a1a}-2.18\%$
test_ddpg_speed[False-backward] 4.7440ms 4.2381ms 235.9538 Ops/s 230.6380 Ops/s $\color{#35bf28}+2.30\%$
test_ddpg_speed[True-None] 1.4491ms 1.3683ms 730.8433 Ops/s 720.4128 Ops/s $\color{#35bf28}+1.45\%$
test_ddpg_speed[True-backward] 2.5186ms 2.3948ms 417.5761 Ops/s 393.7779 Ops/s $\textbf{\color{#35bf28}+6.04\%}$
test_ddpg_speed[reduce-overhead-None] 1.5151ms 1.4022ms 713.1600 Ops/s 711.9272 Ops/s $\color{#35bf28}+0.17\%$
test_sac_speed[False-None] 8.8219ms 8.4248ms 118.6974 Ops/s 120.0774 Ops/s $\color{#d91a1a}-1.15\%$
test_sac_speed[False-backward] 11.9354ms 11.2453ms 88.9262 Ops/s 87.3264 Ops/s $\color{#35bf28}+1.83\%$
test_sac_speed[True-None] 2.0217ms 1.9299ms 518.1558 Ops/s 502.2436 Ops/s $\color{#35bf28}+3.17\%$
test_sac_speed[True-backward] 4.1394ms 3.6874ms 271.1947 Ops/s 282.1939 Ops/s $\color{#d91a1a}-3.90\%$
test_sac_speed[reduce-overhead-None] 17.0384ms 10.2915ms 97.1680 Ops/s 97.2557 Ops/s $\color{#d91a1a}-0.09\%$
test_redq_deprec_speed[False-None] 10.2452ms 9.3967ms 106.4207 Ops/s 106.2804 Ops/s $\color{#35bf28}+0.13\%$
test_redq_deprec_speed[False-backward] 13.4211ms 12.6535ms 79.0293 Ops/s 81.2341 Ops/s $\color{#d91a1a}-2.71\%$
test_redq_deprec_speed[True-None] 2.7871ms 2.7084ms 369.2213 Ops/s 366.8435 Ops/s $\color{#35bf28}+0.65\%$
test_redq_deprec_speed[True-backward] 4.7557ms 4.3474ms 230.0241 Ops/s 229.1098 Ops/s $\color{#35bf28}+0.40\%$
test_redq_deprec_speed[reduce-overhead-None] 14.8124ms 9.7859ms 102.1882 Ops/s 101.5381 Ops/s $\color{#35bf28}+0.64\%$
test_td3_speed[False-None] 8.7488ms 8.2213ms 121.6353 Ops/s 121.6352 Ops/s $+0.00\%$
test_td3_speed[False-backward] 11.3607ms 10.6816ms 93.6193 Ops/s 93.5423 Ops/s $\color{#35bf28}+0.08\%$
test_td3_speed[True-None] 1.7128ms 1.6849ms 593.5105 Ops/s 582.0257 Ops/s $\color{#35bf28}+1.97\%$
test_td3_speed[True-backward] 3.6504ms 3.2171ms 310.8419 Ops/s 307.7485 Ops/s $\color{#35bf28}+1.01\%$
test_td3_speed[reduce-overhead-None] 0.1006s 26.5860ms 37.6138 Ops/s 37.9397 Ops/s $\color{#d91a1a}-0.86\%$
test_cql_speed[False-None] 17.8446ms 17.5530ms 56.9705 Ops/s 56.7858 Ops/s $\color{#35bf28}+0.33\%$
test_cql_speed[False-backward] 23.2329ms 22.8103ms 43.8399 Ops/s 43.7197 Ops/s $\color{#35bf28}+0.27\%$
test_cql_speed[True-None] 3.5115ms 3.3782ms 296.0119 Ops/s 281.1122 Ops/s $\textbf{\color{#35bf28}+5.30\%}$
test_cql_speed[True-backward] 6.0522ms 5.6153ms 178.0856 Ops/s 180.0059 Ops/s $\color{#d91a1a}-1.07\%$
test_cql_speed[reduce-overhead-None] 0.8460s 17.5440ms 56.9995 Ops/s 81.7328 Ops/s $\textbf{\color{#d91a1a}-30.26\%}$
test_a2c_speed[False-None] 3.3927ms 3.2794ms 304.9321 Ops/s 296.6518 Ops/s $\color{#35bf28}+2.79\%$
test_a2c_speed[False-backward] 6.6966ms 6.2854ms 159.0979 Ops/s 158.1387 Ops/s $\color{#35bf28}+0.61\%$
test_a2c_speed[True-None] 1.5159ms 1.4089ms 709.7781 Ops/s 704.0820 Ops/s $\color{#35bf28}+0.81\%$
test_a2c_speed[True-backward] 3.2539ms 3.1995ms 312.5480 Ops/s 308.3205 Ops/s $\color{#35bf28}+1.37\%$
test_a2c_speed[reduce-overhead-None] 1.1033ms 1.0362ms 965.0680 Ops/s 946.8833 Ops/s $\color{#35bf28}+1.92\%$
test_ppo_speed[False-None] 4.0083ms 3.8878ms 257.2170 Ops/s 250.2556 Ops/s $\color{#35bf28}+2.78\%$
test_ppo_speed[False-backward] 7.6437ms 7.1788ms 139.2997 Ops/s 138.8661 Ops/s $\color{#35bf28}+0.31\%$
test_ppo_speed[True-None] 1.6597ms 1.5277ms 654.5754 Ops/s 649.8319 Ops/s $\color{#35bf28}+0.73\%$
test_ppo_speed[True-backward] 3.3736ms 3.3268ms 300.5907 Ops/s 293.8159 Ops/s $\color{#35bf28}+2.31\%$
test_ppo_speed[reduce-overhead-None] 1.2577ms 1.1020ms 907.4700 Ops/s 893.1358 Ops/s $\color{#35bf28}+1.60\%$
test_reinforce_speed[False-None] 2.5227ms 2.3463ms 426.1952 Ops/s 418.6126 Ops/s $\color{#35bf28}+1.81\%$
test_reinforce_speed[False-backward] 3.9136ms 3.4636ms 288.7152 Ops/s 292.6482 Ops/s $\color{#d91a1a}-1.34\%$
test_reinforce_speed[True-None] 2.0897ms 1.3825ms 723.3025 Ops/s 714.3732 Ops/s $\color{#35bf28}+1.25\%$
test_reinforce_speed[True-backward] 3.1044ms 3.0265ms 330.4160 Ops/s 320.9467 Ops/s $\color{#35bf28}+2.95\%$
test_reinforce_speed[reduce-overhead-None] 16.5509ms 9.0967ms 109.9294 Ops/s 109.2826 Ops/s $\color{#35bf28}+0.59\%$
test_iql_speed[False-None] 10.1445ms 9.6239ms 103.9082 Ops/s 103.9648 Ops/s $\color{#d91a1a}-0.05\%$
test_iql_speed[False-backward] 13.9184ms 13.1247ms 76.1925 Ops/s 76.3577 Ops/s $\color{#d91a1a}-0.22\%$
test_iql_speed[True-None] 2.3882ms 2.2903ms 436.6227 Ops/s 423.6187 Ops/s $\color{#35bf28}+3.07\%$
test_iql_speed[True-backward] 5.3957ms 4.9789ms 200.8494 Ops/s 198.9778 Ops/s $\color{#35bf28}+0.94\%$
test_iql_speed[reduce-overhead-None] 16.5587ms 10.1710ms 98.3189 Ops/s 97.7149 Ops/s $\color{#35bf28}+0.62\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4565ms 6.0345ms 165.7127 Ops/s 165.5984 Ops/s $\color{#35bf28}+0.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7259ms 0.3787ms 2.6407 KOps/s 2.3452 KOps/s $\textbf{\color{#35bf28}+12.60\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5983ms 0.3583ms 2.7908 KOps/s 2.4737 KOps/s $\textbf{\color{#35bf28}+12.82\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1252ms 5.8655ms 170.4894 Ops/s 170.1819 Ops/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3436ms 0.3433ms 2.9129 KOps/s 3.0896 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6081ms 0.3703ms 2.7006 KOps/s 3.3022 KOps/s $\textbf{\color{#d91a1a}-18.22\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5508ms 1.2756ms 783.9425 Ops/s 768.3979 Ops/s $\color{#35bf28}+2.02\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4739ms 1.2238ms 817.1536 Ops/s 825.8726 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.8250ms 6.2022ms 161.2336 Ops/s 166.8532 Ops/s $\color{#d91a1a}-3.37\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2982ms 0.4497ms 2.2237 KOps/s 2.2423 KOps/s $\color{#d91a1a}-0.83\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6590ms 0.4231ms 2.3634 KOps/s 1.9834 KOps/s $\textbf{\color{#35bf28}+19.16\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0104ms 5.9173ms 168.9974 Ops/s 169.8268 Ops/s $\color{#d91a1a}-0.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7450ms 0.3788ms 2.6401 KOps/s 2.5140 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6512ms 0.3593ms 2.7834 KOps/s 2.6095 KOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0663ms 5.7937ms 172.6002 Ops/s 170.9727 Ops/s $\color{#35bf28}+0.95\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8709ms 0.3449ms 2.8990 KOps/s 2.8068 KOps/s $\color{#35bf28}+3.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5435ms 0.3364ms 2.9723 KOps/s 2.8013 KOps/s $\textbf{\color{#35bf28}+6.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3994ms 6.0516ms 165.2464 Ops/s 167.5507 Ops/s $\color{#d91a1a}-1.38\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9936ms 0.5154ms 1.9404 KOps/s 2.2429 KOps/s $\textbf{\color{#d91a1a}-13.49\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7665ms 0.5151ms 1.9413 KOps/s 2.3320 KOps/s $\textbf{\color{#d91a1a}-16.75\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.7339ms 5.1216ms 195.2517 Ops/s 34.8958 Ops/s $\textbf{\color{#35bf28}+459.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9752ms 1.8809ms 531.6726 Ops/s 514.1057 Ops/s $\color{#35bf28}+3.42\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2691ms 0.9713ms 1.0295 KOps/s 860.1341 Ops/s $\textbf{\color{#35bf28}+19.69\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.5711ms 5.0683ms 197.3050 Ops/s 189.3217 Ops/s $\color{#35bf28}+4.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9201ms 1.8308ms 546.1949 Ops/s 496.3091 Ops/s $\textbf{\color{#35bf28}+10.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.3550ms 0.9899ms 1.0102 KOps/s 969.0616 Ops/s $\color{#35bf28}+4.25\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.7140s 19.5911ms 51.0435 Ops/s 181.9367 Ops/s $\textbf{\color{#d91a1a}-71.94\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0689ms 2.0538ms 486.9133 Ops/s 458.6131 Ops/s $\textbf{\color{#35bf28}+6.17\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 12.4052ms 1.5248ms 655.8283 Ops/s 828.8971 Ops/s $\textbf{\color{#d91a1a}-20.88\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 42.2320ms 39.1292ms 25.5564 Ops/s 25.4239 Ops/s $\color{#35bf28}+0.52\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.1077ms 18.6410ms 53.6452 Ops/s 53.8509 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 45.3635ms 40.6523ms 24.5989 Ops/s 23.8602 Ops/s $\color{#35bf28}+3.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.7576ms 18.3092ms 54.6173 Ops/s 52.5696 Ops/s $\color{#35bf28}+3.90\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 43.7056ms 42.1602ms 23.7191 Ops/s 22.9639 Ops/s $\color{#35bf28}+3.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.0347ms 19.8691ms 50.3293 Ops/s 50.3392 Ops/s $\color{#d91a1a}-0.02\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8538ms 0.2204ms 4.5377 KOps/s 4.5159 KOps/s $\color{#35bf28}+0.48\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7866ms 1.3837ms 722.6945 Ops/s 685.9459 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_storage_write_lazystack[100-img_shape2-large_img] 2.5723ms 2.3764ms 420.7962 Ops/s 432.5904 Ops/s $\color{#d91a1a}-2.73\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1576ms 2.9269ms 341.6629 Ops/s 334.8991 Ops/s $\color{#35bf28}+2.02\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2667ms 0.1636ms 6.1124 KOps/s 6.0882 KOps/s $\color{#35bf28}+0.40\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3752ms 0.2330ms 4.2921 KOps/s 4.2649 KOps/s $\color{#35bf28}+0.64\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.0844ms 1.8976ms 526.9746 Ops/s 534.9127 Ops/s $\color{#d91a1a}-1.48\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5848ms 1.3895ms 719.6845 Ops/s 744.2503 Ops/s $\color{#d91a1a}-3.30\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2876ms 1.1737ms 851.9729 Ops/s 853.8876 Ops/s $\color{#d91a1a}-0.22\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.8610ms 3.6197ms 276.2662 Ops/s 274.2993 Ops/s $\color{#35bf28}+0.72\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.2019ms 5.9344ms 168.5084 Ops/s 169.4569 Ops/s $\color{#d91a1a}-0.56\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.6209ms 7.3982ms 135.1680 Ops/s 130.4157 Ops/s $\color{#35bf28}+3.64\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4503ms 0.2814ms 3.5536 KOps/s 3.6046 KOps/s $\color{#d91a1a}-1.42\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6980ms 1.5050ms 664.4396 Ops/s 633.1448 Ops/s $\color{#35bf28}+4.94\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.6386ms 2.4698ms 404.8843 Ops/s 403.1057 Ops/s $\color{#35bf28}+0.44\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4414ms 3.1282ms 319.6757 Ops/s 310.3202 Ops/s $\color{#35bf28}+3.01\%$
test_collector_without_rb[100-img_shape0-atari] 34.0251ms 33.1046ms 30.2073 Ops/s 29.3147 Ops/s $\color{#35bf28}+3.04\%$
test_collector_without_rb[200-img_shape1-large_batch] 67.4577ms 65.1559ms 15.3478 Ops/s 14.9810 Ops/s $\color{#35bf28}+2.45\%$
test_collector_with_rb[100-img_shape0-atari] 38.7358ms 37.5312ms 26.6445 Ops/s 25.9406 Ops/s $\color{#35bf28}+2.71\%$
test_collector_with_rb[200-img_shape1-large_batch] 75.6043ms 74.0700ms 13.5007 Ops/s 13.1333 Ops/s $\color{#35bf28}+2.80\%$
test_collector_without_rb_cuda[100-img_shape0-atari] 57.0875ms 55.2974ms 18.0840 Ops/s 17.2463 Ops/s $\color{#35bf28}+4.86\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1129s 0.1110s 9.0124 Ops/s 8.6676 Ops/s $\color{#35bf28}+3.98\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 59.0874ms 57.4881ms 17.3949 Ops/s 16.7027 Ops/s $\color{#35bf28}+4.14\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1157s 0.1127s 8.8747 Ops/s 8.4521 Ops/s $\color{#35bf28}+5.00\%$

Adds worker startup logging (INFO level) showing:
- policy id, wrapped_policy id, and whether they match
- scheme model id and whether it matches policy/wrapped_policy
- param fingerprint at end of rollout (not just start)

This will reveal whether the scheme updates the same object the
collector uses for inference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
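
The identity check this logging performs can be illustrated with a minimal standard-library sketch. It is not the collector code: `TinyPolicy`, `param_fingerprint`, `log_identity`, and the logger name are hypothetical stand-ins, and `copy.deepcopy` stands in for the device-cast copy described in the PR summary. It only shows why comparing object identity and a parameter fingerprint exposes a stale reference.

```python
import copy
import hashlib
import logging
import struct
import weakref

logger = logging.getLogger("collector_debug")  # hypothetical logger name


class TinyPolicy:
    """Stand-in for a policy module: just a flat list of float weights."""

    def __init__(self, weights):
        self.weights = list(weights)


def param_fingerprint(policy):
    """Short hash of the weights; changes whenever any weight changes."""
    raw = struct.pack(f"{len(policy.weights)}d", *policy.weights)
    return hashlib.sha256(raw).hexdigest()[:12]


def log_identity(scheme_model_ref, collector_policy):
    """Log both ids and return True when they point at the same object."""
    scheme_model = scheme_model_ref()
    same = scheme_model is collector_policy
    logger.info(
        "scheme model id=%s policy id=%s match=%s fingerprint=%s",
        id(scheme_model), id(collector_policy), same,
        param_fingerprint(collector_policy),
    )
    return same


original = TinyPolicy([0.5, -1.0, 2.0])
scheme_ref = weakref.ref(original)       # reference taken before the copy
worker_policy = copy.deepcopy(original)  # stands in for the device-cast copy

before = param_fingerprint(worker_policy)
stale = not log_identity(scheme_ref, worker_policy)

# A "weight update" routed through the stale reference misses the worker copy.
scheme_ref().weights = [0.0, 0.0, 0.0]
after_stale_update = param_fingerprint(worker_policy)

# Re-pointing the reference at the object actually used fixes the update path.
scheme_ref = weakref.ref(worker_policy)
scheme_ref().weights = [0.0, 0.0, 0.0]
after_fixed_update = param_fingerprint(worker_policy)
```

In this sketch the fingerprint is unchanged after the update through the stale reference and changes only once the reference is re-pointed, which is the signal the end-of-rollout fingerprint in the log is meant to surface.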

Labels

BugFix, CLA Signed, Collectors, WeightUpdate
