@vmoens (Collaborator) commented Feb 7, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Feb 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2767

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens pushed a commit that referenced this pull request Feb 7, 2025
ghstack-source-id: 4cb5741
Pull Request resolved: #2767
@facebook-github-bot added the CLA Signed label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Feb 7, 2025

github-actions bot commented Feb 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.6083s 0.5221s 1.9154 Ops/s 1.8755 Ops/s $\color{#35bf28}+2.13\%$
test_transformed 1.1106s 1.0167s 0.9836 Ops/s 0.9621 Ops/s $\color{#35bf28}+2.24\%$
test_serial 1.6236s 1.5306s 0.6533 Ops/s 0.6483 Ops/s $\color{#35bf28}+0.78\%$
test_parallel 1.3824s 1.3094s 0.7637 Ops/s 0.7409 Ops/s $\color{#35bf28}+3.09\%$
test_step_mdp_speed[True-True-True-True-True] 0.1919ms 30.1518μs 33.1655 KOps/s 33.2117 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-True-True-True-False] 41.9690μs 17.9221μs 55.7972 KOps/s 55.5984 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-True-True-False-True] 48.2600μs 17.1021μs 58.4724 KOps/s 58.1506 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[True-True-True-False-False] 30.9890μs 9.9969μs 100.0313 KOps/s 99.8509 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[True-True-False-True-True] 85.8820μs 32.2128μs 31.0435 KOps/s 31.0183 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[True-True-False-True-False] 48.2510μs 19.8549μs 50.3653 KOps/s 50.7264 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-False-False-True] 96.5510μs 19.1189μs 52.3042 KOps/s 53.6995 KOps/s $\color{#d91a1a}-2.60\%$
test_step_mdp_speed[True-True-False-False-False] 38.8730μs 11.9227μs 83.8738 KOps/s 84.4108 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-True-True-True] 72.9170μs 34.1962μs 29.2430 KOps/s 29.1019 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[True-False-True-True-False] 49.0520μs 21.6921μs 46.0997 KOps/s 46.8435 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[True-False-True-False-True] 42.5090μs 18.7835μs 53.2382 KOps/s 53.2443 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[True-False-True-False-False] 36.0780μs 11.9703μs 83.5399 KOps/s 84.8338 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[True-False-False-True-True] 79.9400μs 35.1857μs 28.4206 KOps/s 27.7808 KOps/s $\color{#35bf28}+2.30\%$
test_step_mdp_speed[True-False-False-True-False] 63.9000μs 23.5298μs 42.4992 KOps/s 42.9934 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[True-False-False-False-True] 60.7440μs 20.2585μs 49.3621 KOps/s 48.0235 KOps/s $\color{#35bf28}+2.79\%$
test_step_mdp_speed[True-False-False-False-False] 46.8480μs 13.6701μs 73.1523 KOps/s 74.1862 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-True-True-True-True] 66.0240μs 33.6211μs 29.7432 KOps/s 29.2987 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[False-True-True-True-False] 48.0800μs 21.4746μs 46.5666 KOps/s 46.2623 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[False-True-True-False-True] 43.8930μs 21.3857μs 46.7601 KOps/s 45.8186 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[False-True-True-False-False] 39.5350μs 13.0955μs 76.3622 KOps/s 74.6246 KOps/s $\color{#35bf28}+2.33\%$
test_step_mdp_speed[False-True-False-True-True] 74.7700μs 35.0090μs 28.5641 KOps/s 27.6376 KOps/s $\color{#35bf28}+3.35\%$
test_step_mdp_speed[False-True-False-True-False] 54.4830μs 23.2099μs 43.0851 KOps/s 42.4248 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-True-False-False-True] 2.8648ms 23.3345μs 42.8549 KOps/s 43.2090 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[False-True-False-False-False] 40.3660μs 15.0975μs 66.2362 KOps/s 67.1574 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[False-False-True-True-True] 74.8500μs 36.8585μs 27.1307 KOps/s 26.3924 KOps/s $\color{#35bf28}+2.80\%$
test_step_mdp_speed[False-False-True-True-False] 54.0710μs 25.1248μs 39.8014 KOps/s 39.7839 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[False-False-True-False-True] 50.9050μs 22.8884μs 43.6903 KOps/s 43.5529 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[False-False-True-False-False] 55.4240μs 15.4013μs 64.9297 KOps/s 66.6170 KOps/s $\color{#d91a1a}-2.53\%$
test_step_mdp_speed[False-False-False-True-True] 94.7570μs 38.6818μs 25.8519 KOps/s 25.3439 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[False-False-False-True-False] 55.9150μs 26.8598μs 37.2303 KOps/s 37.4970 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[False-False-False-False-True] 50.0340μs 24.7584μs 40.3903 KOps/s 40.6465 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[False-False-False-False-False] 47.9300μs 16.6941μs 59.9013 KOps/s 60.9425 KOps/s $\color{#d91a1a}-1.71\%$
test_values[generalized_advantage_estimate-True-True] 10.1497ms 9.8901ms 101.1113 Ops/s 104.1373 Ops/s $\color{#d91a1a}-2.91\%$
test_values[vec_generalized_advantage_estimate-True-True] 25.8456ms 24.1556ms 41.3983 Ops/s 40.3074 Ops/s $\color{#35bf28}+2.71\%$
test_values[td0_return_estimate-False-False] 0.2595ms 0.1861ms 5.3738 KOps/s 5.5389 KOps/s $\color{#d91a1a}-2.98\%$
test_values[td1_return_estimate-False-False] 28.9386ms 24.9266ms 40.1178 Ops/s 41.0751 Ops/s $\color{#d91a1a}-2.33\%$
test_values[vec_td1_return_estimate-False-False] 26.6954ms 24.0406ms 41.5963 Ops/s 40.9714 Ops/s $\color{#35bf28}+1.53\%$
test_values[td_lambda_return_estimate-True-False] 36.7035ms 35.4386ms 28.2178 Ops/s 28.6696 Ops/s $\color{#d91a1a}-1.58\%$
test_values[vec_td_lambda_return_estimate-True-False] 26.3985ms 24.1648ms 41.3825 Ops/s 41.1810 Ops/s $\color{#35bf28}+0.49\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.4625ms 8.5105ms 117.5014 Ops/s 118.2493 Ops/s $\color{#d91a1a}-0.63\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2698ms 1.9459ms 513.8992 Ops/s 508.3250 Ops/s $\color{#35bf28}+1.10\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5809ms 0.3686ms 2.7126 KOps/s 2.7112 KOps/s $\color{#35bf28}+0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.8458ms 45.2290ms 22.1097 Ops/s 24.9586 Ops/s $\textbf{\color{#d91a1a}-11.41\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.1019ms 3.4300ms 291.5479 Ops/s 290.8762 Ops/s $\color{#35bf28}+0.23\%$
test_dqn_speed[False-None] 5.8624ms 1.4296ms 699.4929 Ops/s 702.0721 Ops/s $\color{#d91a1a}-0.37\%$
test_dqn_speed[False-backward] 1.9973ms 1.9126ms 522.8356 Ops/s 522.0233 Ops/s $\color{#35bf28}+0.16\%$
test_dqn_speed[True-None] 0.7822ms 0.4995ms 2.0019 KOps/s 1.9773 KOps/s $\color{#35bf28}+1.24\%$
test_dqn_speed[True-backward] 1.0007ms 0.9254ms 1.0806 KOps/s 1.0659 KOps/s $\color{#35bf28}+1.38\%$
test_dqn_speed[reduce-overhead-None] 0.7773ms 0.5007ms 1.9971 KOps/s 1.9792 KOps/s $\color{#35bf28}+0.90\%$
test_dqn_speed[reduce-overhead-backward] 0.9658ms 0.9266ms 1.0793 KOps/s 1.0763 KOps/s $\color{#35bf28}+0.28\%$
test_ddpg_speed[False-None] 3.2710ms 2.9186ms 342.6352 Ops/s 336.6318 Ops/s $\color{#35bf28}+1.78\%$
test_ddpg_speed[False-backward] 4.7110ms 4.0895ms 244.5269 Ops/s 245.0350 Ops/s $\color{#d91a1a}-0.21\%$
test_ddpg_speed[True-None] 2.8530ms 1.2575ms 795.2150 Ops/s 800.7237 Ops/s $\color{#d91a1a}-0.69\%$
test_ddpg_speed[True-backward] 2.2199ms 2.1575ms 463.4965 Ops/s 463.3028 Ops/s $\color{#35bf28}+0.04\%$
test_ddpg_speed[reduce-overhead-None] 2.1235ms 1.2828ms 779.5273 Ops/s 787.9968 Ops/s $\color{#d91a1a}-1.07\%$
test_ddpg_speed[reduce-overhead-backward] 2.2146ms 2.1588ms 463.2302 Ops/s 461.7667 Ops/s $\color{#35bf28}+0.32\%$
test_sac_speed[False-None] 11.1065ms 8.3135ms 120.2862 Ops/s 120.6562 Ops/s $\color{#d91a1a}-0.31\%$
test_sac_speed[False-backward] 12.0305ms 10.8740ms 91.9624 Ops/s 91.6696 Ops/s $\color{#35bf28}+0.32\%$
test_sac_speed[True-None] 2.7301ms 2.1369ms 467.9645 Ops/s 466.9305 Ops/s $\color{#35bf28}+0.22\%$
test_sac_speed[True-backward] 4.0331ms 3.8414ms 260.3211 Ops/s 248.0575 Ops/s $\color{#35bf28}+4.94\%$
test_sac_speed[reduce-overhead-None] 2.6641ms 2.1667ms 461.5344 Ops/s 458.6876 Ops/s $\color{#35bf28}+0.62\%$
test_sac_speed[reduce-overhead-backward] 3.9266ms 3.8416ms 260.3072 Ops/s 260.9920 Ops/s $\color{#d91a1a}-0.26\%$
test_redq_speed[False-None] 14.8965ms 12.9424ms 77.2655 Ops/s 75.7953 Ops/s $\color{#35bf28}+1.94\%$
test_redq_speed[False-backward] 23.5382ms 22.3326ms 44.7775 Ops/s 44.1698 Ops/s $\color{#35bf28}+1.38\%$
test_redq_speed[True-None] 6.4720ms 4.9683ms 201.2773 Ops/s 189.6937 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_redq_speed[True-backward] 13.8249ms 12.5586ms 79.6264 Ops/s 79.4045 Ops/s $\color{#35bf28}+0.28\%$
test_redq_speed[reduce-overhead-None] 6.1393ms 4.9679ms 201.2927 Ops/s 197.1904 Ops/s $\color{#35bf28}+2.08\%$
test_redq_speed[reduce-overhead-backward] 13.7940ms 12.5079ms 79.9498 Ops/s 77.4308 Ops/s $\color{#35bf28}+3.25\%$
test_redq_deprec_speed[False-None] 14.4954ms 12.9259ms 77.3642 Ops/s 75.6339 Ops/s $\color{#35bf28}+2.29\%$
test_redq_deprec_speed[False-backward] 20.2577ms 18.8698ms 52.9947 Ops/s 52.9809 Ops/s $\color{#35bf28}+0.03\%$
test_redq_deprec_speed[True-None] 4.6874ms 3.9366ms 254.0231 Ops/s 257.0214 Ops/s $\color{#d91a1a}-1.17\%$
test_redq_deprec_speed[True-backward] 8.9537ms 8.4118ms 118.8812 Ops/s 118.1057 Ops/s $\color{#35bf28}+0.66\%$
test_redq_deprec_speed[reduce-overhead-None] 4.5130ms 4.0039ms 249.7545 Ops/s 255.8883 Ops/s $\color{#d91a1a}-2.40\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.0225ms 8.4159ms 118.8226 Ops/s 109.7761 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_td3_speed[False-None] 9.8077ms 8.2254ms 121.5748 Ops/s 120.7200 Ops/s $\color{#35bf28}+0.71\%$
test_td3_speed[False-backward] 12.8480ms 10.6450ms 93.9407 Ops/s 93.4499 Ops/s $\color{#35bf28}+0.53\%$
test_td3_speed[True-None] 2.1998ms 1.8507ms 540.3474 Ops/s 542.4417 Ops/s $\color{#d91a1a}-0.39\%$
test_td3_speed[True-backward] 3.5522ms 3.4572ms 289.2529 Ops/s 276.3958 Ops/s $\color{#35bf28}+4.65\%$
test_td3_speed[reduce-overhead-None] 2.1605ms 1.8659ms 535.9483 Ops/s 541.9883 Ops/s $\color{#d91a1a}-1.11\%$
test_td3_speed[reduce-overhead-backward] 4.1270ms 3.4907ms 286.4723 Ops/s 288.1425 Ops/s $\color{#d91a1a}-0.58\%$
test_cql_speed[False-None] 41.4383ms 36.6818ms 27.2615 Ops/s 26.5018 Ops/s $\color{#35bf28}+2.87\%$
test_cql_speed[False-backward] 48.7880ms 46.9858ms 21.2830 Ops/s 21.0123 Ops/s $\color{#35bf28}+1.29\%$
test_cql_speed[True-None] 17.2363ms 16.1766ms 61.8177 Ops/s 61.6651 Ops/s $\color{#35bf28}+0.25\%$
test_cql_speed[True-backward] 25.4110ms 23.3458ms 42.8343 Ops/s 43.2166 Ops/s $\color{#d91a1a}-0.88\%$
test_cql_speed[reduce-overhead-None] 17.0664ms 16.3865ms 61.0258 Ops/s 61.4734 Ops/s $\color{#d91a1a}-0.73\%$
test_cql_speed[reduce-overhead-backward] 26.4501ms 23.4234ms 42.6923 Ops/s 42.4903 Ops/s $\color{#35bf28}+0.48\%$
test_a2c_speed[False-None] 8.3778ms 7.2690ms 137.5711 Ops/s 137.7597 Ops/s $\color{#d91a1a}-0.14\%$
test_a2c_speed[False-backward] 15.1749ms 14.6139ms 68.4282 Ops/s 67.5399 Ops/s $\color{#35bf28}+1.32\%$
test_a2c_speed[True-None] 4.1242ms 3.7527ms 266.4779 Ops/s 265.8674 Ops/s $\color{#35bf28}+0.23\%$
test_a2c_speed[True-backward] 11.2160ms 10.1972ms 98.0657 Ops/s 96.5868 Ops/s $\color{#35bf28}+1.53\%$
test_a2c_speed[reduce-overhead-None] 4.6830ms 3.7550ms 266.3135 Ops/s 265.0103 Ops/s $\color{#35bf28}+0.49\%$
test_a2c_speed[reduce-overhead-backward] 12.0063ms 10.4849ms 95.3749 Ops/s 97.9115 Ops/s $\color{#d91a1a}-2.59\%$
test_ppo_speed[False-None] 8.4915ms 7.5904ms 131.7460 Ops/s 131.7066 Ops/s $\color{#35bf28}+0.03\%$
test_ppo_speed[False-backward] 16.1682ms 15.0674ms 66.3683 Ops/s 66.7310 Ops/s $\color{#d91a1a}-0.54\%$
test_ppo_speed[True-None] 4.4084ms 4.1225ms 242.5706 Ops/s 237.3191 Ops/s $\color{#35bf28}+2.21\%$
test_ppo_speed[True-backward] 10.8322ms 10.0542ms 99.4606 Ops/s 98.9101 Ops/s $\color{#35bf28}+0.56\%$
test_ppo_speed[reduce-overhead-None] 4.4971ms 4.1666ms 240.0030 Ops/s 240.4194 Ops/s $\color{#d91a1a}-0.17\%$
test_ppo_speed[reduce-overhead-backward] 10.7729ms 10.0099ms 99.9009 Ops/s 99.3660 Ops/s $\color{#35bf28}+0.54\%$
test_reinforce_speed[False-None] 7.5023ms 6.5830ms 151.9075 Ops/s 147.8534 Ops/s $\color{#35bf28}+2.74\%$
test_reinforce_speed[False-backward] 11.7373ms 9.9154ms 100.8529 Ops/s 98.9602 Ops/s $\color{#35bf28}+1.91\%$
test_reinforce_speed[True-None] 3.8207ms 3.1307ms 319.4186 Ops/s 315.4961 Ops/s $\color{#35bf28}+1.24\%$
test_reinforce_speed[True-backward] 10.2916ms 9.0912ms 109.9961 Ops/s 105.3294 Ops/s $\color{#35bf28}+4.43\%$
test_reinforce_speed[reduce-overhead-None] 3.8657ms 3.1115ms 321.3856 Ops/s 314.7082 Ops/s $\color{#35bf28}+2.12\%$
test_reinforce_speed[reduce-overhead-backward] 10.6583ms 9.1330ms 109.4931 Ops/s 110.7833 Ops/s $\color{#d91a1a}-1.16\%$
test_iql_speed[False-None] 35.0250ms 32.5960ms 30.6786 Ops/s 30.0187 Ops/s $\color{#35bf28}+2.20\%$
test_iql_speed[False-backward] 0.3693s 51.5339ms 19.4047 Ops/s 21.5246 Ops/s $\textbf{\color{#d91a1a}-9.85\%}$
test_iql_speed[True-None] 12.8808ms 11.3269ms 88.2857 Ops/s 85.0131 Ops/s $\color{#35bf28}+3.85\%$
test_iql_speed[True-backward] 24.5947ms 22.2941ms 44.8550 Ops/s 43.6831 Ops/s $\color{#35bf28}+2.68\%$
test_iql_speed[reduce-overhead-None] 12.5674ms 11.2807ms 88.6471 Ops/s 86.3139 Ops/s $\color{#35bf28}+2.70\%$
test_iql_speed[reduce-overhead-backward] 24.0326ms 22.3177ms 44.8076 Ops/s 44.5332 Ops/s $\color{#35bf28}+0.62\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.7026ms 4.8392ms 206.6464 Ops/s 203.0679 Ops/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.4665ms 0.5191ms 1.9265 KOps/s 1.9087 KOps/s $\color{#35bf28}+0.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8376ms 0.4966ms 2.0136 KOps/s 2.0195 KOps/s $\color{#d91a1a}-0.29\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3111ms 4.6340ms 215.7946 Ops/s 215.1751 Ops/s $\color{#35bf28}+0.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3280ms 0.5120ms 1.9531 KOps/s 1.9825 KOps/s $\color{#d91a1a}-1.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7089ms 0.4824ms 2.0730 KOps/s 2.0612 KOps/s $\color{#35bf28}+0.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4728ms 1.6732ms 597.6654 Ops/s 598.7603 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2201ms 1.5849ms 630.9404 Ops/s 631.8070 Ops/s $\color{#d91a1a}-0.14\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1941ms 4.7676ms 209.7495 Ops/s 198.1804 Ops/s $\textbf{\color{#35bf28}+5.84\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0679ms 0.6636ms 1.5068 KOps/s 1.5192 KOps/s $\color{#d91a1a}-0.82\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8664ms 0.6246ms 1.6011 KOps/s 1.5694 KOps/s $\color{#35bf28}+2.02\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.3779ms 4.6594ms 214.6196 Ops/s 206.5799 Ops/s $\color{#35bf28}+3.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2511ms 0.5238ms 1.9090 KOps/s 1.8851 KOps/s $\color{#35bf28}+1.26\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6795ms 0.4880ms 2.0490 KOps/s 2.0327 KOps/s $\color{#35bf28}+0.80\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.0703ms 4.6622ms 214.4902 Ops/s 211.1828 Ops/s $\color{#35bf28}+1.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.9613ms 0.5139ms 1.9459 KOps/s 1.9523 KOps/s $\color{#d91a1a}-0.33\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7998ms 0.4915ms 2.0345 KOps/s 2.0152 KOps/s $\color{#35bf28}+0.96\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.0254ms 4.8801ms 204.9157 Ops/s 205.4284 Ops/s $\color{#d91a1a}-0.25\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4164ms 0.6639ms 1.5062 KOps/s 1.5123 KOps/s $\color{#d91a1a}-0.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8383ms 0.6266ms 1.5958 KOps/s 1.5679 KOps/s $\color{#35bf28}+1.78\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5777ms 4.3031ms 232.3930 Ops/s 220.6408 Ops/s $\textbf{\color{#35bf28}+5.33\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.5246ms 2.2436ms 445.7125 Ops/s 417.0745 Ops/s $\textbf{\color{#35bf28}+6.87\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.5025ms 1.4201ms 704.1976 Ops/s 729.8764 Ops/s $\color{#d91a1a}-3.52\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.7021ms 4.3391ms 230.4626 Ops/s 230.9440 Ops/s $\color{#d91a1a}-0.21\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4743s 11.7867ms 84.8411 Ops/s 427.1290 Ops/s $\textbf{\color{#d91a1a}-80.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.1115ms 1.3938ms 717.4739 Ops/s 772.6173 Ops/s $\textbf{\color{#d91a1a}-7.14\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.9309ms 4.5424ms 220.1489 Ops/s 224.5331 Ops/s $\color{#d91a1a}-1.95\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.0929ms 2.5885ms 386.3265 Ops/s 395.6458 Ops/s $\color{#d91a1a}-2.36\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.2092ms 1.5641ms 639.3502 Ops/s 696.9755 Ops/s $\textbf{\color{#d91a1a}-8.27\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.2261ms 12.0152ms 83.2282 Ops/s 79.6553 Ops/s $\color{#35bf28}+4.49\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.5215ms 14.8367ms 67.4005 Ops/s 68.2201 Ops/s $\color{#d91a1a}-1.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.9265ms 20.8756ms 47.9028 Ops/s 45.9646 Ops/s $\color{#35bf28}+4.22\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 23.0393ms 14.9733ms 66.7856 Ops/s 66.2272 Ops/s $\color{#35bf28}+0.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.2465ms 20.7643ms 48.1596 Ops/s 46.4886 Ops/s $\color{#35bf28}+3.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.5161ms 16.1915ms 61.7609 Ops/s 59.8830 Ops/s $\color{#35bf28}+3.14\%$
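For readers checking the table by hand, here is a minimal sketch of how the Change column appears to relate to the two throughput columns. It assumes the columns read Name, Max, Mean, Ops, Ops on Repo HEAD, Change, that Ops is roughly the reciprocal of Mean, and that Change is the relative difference of Ops against Ops on Repo HEAD. The 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from the bolded rows (the smallest bolded change is +5.33%, the largest unbolded one is +4.94%); it is not taken from the benchmark workflow itself. The function names are hypothetical.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput against repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table above.
rows = {
    "test_simple": (1.9154, 1.8755),
    "test_gae_speed[vec_generalized_advantage_estimate-True-32-512]": (22.1097, 24.9586),
    "test_redq_speed[True-None]": (201.2773, 189.6937),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows reproduce the reported +2.13%, -11.41% and +6.11%, and applying the same cutoff to the whole CPU table gives 5 improved and 5 worsened benchmarks, matching the summary line.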

@vmoens added the BE label (Better errors, logs, docs or test utils) Feb 10, 2025
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: e09b03b
Pull Request resolved: #2767

github-actions bot commented Feb 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}14$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.9065s 0.8195s 1.2202 Ops/s 1.1844 Ops/s $\color{#35bf28}+3.03\%$
test_transformed 1.5245s 1.4395s 0.6947 Ops/s 0.6906 Ops/s $\color{#35bf28}+0.59\%$
test_serial 2.3013s 2.3007s 0.4346 Ops/s 0.4255 Ops/s $\color{#35bf28}+2.16\%$
test_parallel 1.9243s 1.8697s 0.5348 Ops/s 0.5238 Ops/s $\color{#35bf28}+2.11\%$
test_step_mdp_speed[True-True-True-True-True] 0.1844ms 40.5221μs 24.6779 KOps/s 24.9496 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-True-True-False] 67.6010μs 23.4662μs 42.6144 KOps/s 42.1089 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[True-True-True-False-True] 60.2510μs 21.9366μs 45.5858 KOps/s 43.3870 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_step_mdp_speed[True-True-True-False-False] 92.7810μs 13.0647μs 76.5419 KOps/s 75.9258 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-False-True-True] 74.1610μs 42.1012μs 23.7523 KOps/s 23.4243 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-True-False-True-False] 0.1058ms 25.3373μs 39.4676 KOps/s 38.4572 KOps/s $\color{#35bf28}+2.63\%$
test_step_mdp_speed[True-True-False-False-True] 51.5110μs 24.7710μs 40.3698 KOps/s 40.7126 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-True-False-False-False] 41.6110μs 15.5028μs 64.5046 KOps/s 63.9775 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[True-False-True-True-True] 81.7910μs 45.1219μs 22.1622 KOps/s 21.9509 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-False-True-True-False] 92.6620μs 27.7452μs 36.0423 KOps/s 35.4085 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[True-False-True-False-True] 88.9710μs 24.7049μs 40.4778 KOps/s 39.9616 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-False-True-False-False] 42.7210μs 15.4165μs 64.8654 KOps/s 63.9681 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[True-False-False-True-True] 77.1810μs 47.7743μs 20.9318 KOps/s 21.1052 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-False-False-True-False] 60.3510μs 30.8019μs 32.4655 KOps/s 32.6343 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[True-False-False-False-True] 58.5110μs 27.2783μs 36.6592 KOps/s 36.9739 KOps/s $\color{#d91a1a}-0.85\%$
test_step_mdp_speed[True-False-False-False-False] 43.2500μs 17.8588μs 55.9949 KOps/s 56.2055 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-True-True-True-True] 77.3910μs 45.6191μs 21.9207 KOps/s 22.2379 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[False-True-True-True-False] 63.4210μs 28.5569μs 35.0178 KOps/s 36.0495 KOps/s $\color{#d91a1a}-2.86\%$
test_step_mdp_speed[False-True-True-False-True] 2.5667ms 29.1804μs 34.2696 KOps/s 35.3223 KOps/s $\color{#d91a1a}-2.98\%$
test_step_mdp_speed[False-True-True-False-False] 48.5310μs 17.3924μs 57.4965 KOps/s 57.2778 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[False-True-False-True-True] 83.8520μs 47.7056μs 20.9619 KOps/s 21.2397 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-True-False-True-False] 70.2610μs 31.0020μs 32.2560 KOps/s 32.9669 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[False-True-False-False-True] 57.9910μs 31.0268μs 32.2302 KOps/s 32.5782 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[False-True-False-False-False] 48.7510μs 19.4289μs 51.4696 KOps/s 51.2994 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-True-True-True] 80.7710μs 48.7043μs 20.5321 KOps/s 20.0412 KOps/s $\color{#35bf28}+2.45\%$
test_step_mdp_speed[False-False-True-True-False] 71.2910μs 33.1205μs 30.1928 KOps/s 30.7865 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[False-False-True-False-True] 56.7210μs 31.4874μs 31.7587 KOps/s 32.0628 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-False-True-False-False] 0.1604ms 19.5911μs 51.0435 KOps/s 50.9652 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-False-False-True-True] 0.1110ms 50.1713μs 19.9317 KOps/s 19.4426 KOps/s $\color{#35bf28}+2.52\%$
test_step_mdp_speed[False-False-False-True-False] 64.9010μs 35.3699μs 28.2726 KOps/s 28.3492 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[False-False-False-False-True] 60.3810μs 32.8997μs 30.3954 KOps/s 30.5602 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[False-False-False-False-False] 55.0510μs 21.9121μs 45.6369 KOps/s 46.2130 KOps/s $\color{#d91a1a}-1.25\%$
test_values[generalized_advantage_estimate-True-True] 26.0450ms 24.9723ms 40.0444 Ops/s 39.8114 Ops/s $\color{#35bf28}+0.59\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1051s 2.9979ms 333.5702 Ops/s 332.6954 Ops/s $\color{#35bf28}+0.26\%$
test_values[td0_return_estimate-False-False] 0.1045ms 78.7124μs 12.7045 KOps/s 12.3993 KOps/s $\color{#35bf28}+2.46\%$
test_values[td1_return_estimate-False-False] 55.6031ms 55.2240ms 18.1081 Ops/s 17.9769 Ops/s $\color{#35bf28}+0.73\%$
test_values[vec_td1_return_estimate-False-False] 1.3622ms 1.0831ms 923.2978 Ops/s 918.3690 Ops/s $\color{#35bf28}+0.54\%$
test_values[td_lambda_return_estimate-True-False] 90.9978ms 87.2552ms 11.4606 Ops/s 11.3489 Ops/s $\color{#35bf28}+0.98\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4261ms 1.0872ms 919.7571 Ops/s 921.0586 Ops/s $\color{#d91a1a}-0.14\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.7364ms 24.5448ms 40.7419 Ops/s 40.1826 Ops/s $\color{#35bf28}+1.39\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0256ms 0.7534ms 1.3273 KOps/s 1.3136 KOps/s $\color{#35bf28}+1.04\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7520ms 0.6670ms 1.4993 KOps/s 1.4886 KOps/s $\color{#35bf28}+0.72\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5572ms 1.4845ms 673.6464 Ops/s 669.6658 Ops/s $\color{#35bf28}+0.59\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7245ms 0.6810ms 1.4683 KOps/s 1.4548 KOps/s $\color{#35bf28}+0.93\%$
test_dqn_speed[False-None] 6.9460ms 1.5327ms 652.4260 Ops/s 649.6275 Ops/s $\color{#35bf28}+0.43\%$
test_dqn_speed[False-backward] 2.3279ms 2.1404ms 467.1918 Ops/s 465.9230 Ops/s $\color{#35bf28}+0.27\%$
test_dqn_speed[True-None] 0.1529s 0.6619ms 1.5107 KOps/s 1.7107 KOps/s $\textbf{\color{#d91a1a}-11.69\%}$
test_dqn_speed[True-backward] 1.1958ms 1.1420ms 875.6553 Ops/s 796.5717 Ops/s $\textbf{\color{#35bf28}+9.93\%}$
test_dqn_speed[reduce-overhead-None] 0.6754ms 0.5964ms 1.6767 KOps/s 1.6357 KOps/s $\color{#35bf28}+2.51\%$
test_dqn_speed[reduce-overhead-backward] 1.0634ms 0.9835ms 1.0167 KOps/s 915.0131 Ops/s $\textbf{\color{#35bf28}+11.12\%}$
test_ddpg_speed[False-None] 3.1995ms 2.9009ms 344.7205 Ops/s 343.0502 Ops/s $\color{#35bf28}+0.49\%$
test_ddpg_speed[False-backward] 4.5953ms 4.1520ms 240.8471 Ops/s 231.8091 Ops/s $\color{#35bf28}+3.90\%$
test_ddpg_speed[True-None] 1.5251ms 1.3688ms 730.5864 Ops/s 720.1539 Ops/s $\color{#35bf28}+1.45\%$
test_ddpg_speed[True-backward] 2.5480ms 2.4650ms 405.6843 Ops/s 372.4156 Ops/s $\textbf{\color{#35bf28}+8.93\%}$
test_ddpg_speed[reduce-overhead-None] 1.4634ms 1.3831ms 723.0250 Ops/s 716.1467 Ops/s $\color{#35bf28}+0.96\%$
test_ddpg_speed[reduce-overhead-backward] 2.0041ms 1.9180ms 521.3664 Ops/s 479.8766 Ops/s $\textbf{\color{#35bf28}+8.65\%}$
test_sac_speed[False-None] 8.3983ms 8.0378ms 124.4119 Ops/s 122.2373 Ops/s $\color{#35bf28}+1.78\%$
test_sac_speed[False-backward] 11.7881ms 11.0395ms 90.5841 Ops/s 87.8869 Ops/s $\color{#35bf28}+3.07\%$
test_sac_speed[True-None] 2.0788ms 1.9245ms 519.6247 Ops/s 518.7192 Ops/s $\color{#35bf28}+0.17\%$
test_sac_speed[True-backward] 3.9405ms 3.8089ms 262.5453 Ops/s 259.6737 Ops/s $\color{#35bf28}+1.11\%$
test_sac_speed[reduce-overhead-None] 20.8763ms 11.8922ms 84.0884 Ops/s 83.2746 Ops/s $\color{#35bf28}+0.98\%$
test_sac_speed[reduce-overhead-backward] 1.9132ms 1.8507ms 540.3387 Ops/s 589.2742 Ops/s $\textbf{\color{#d91a1a}-8.30\%}$
test_redq_speed[False-None] 8.3307ms 7.8774ms 126.9456 Ops/s 130.6074 Ops/s $\color{#d91a1a}-2.80\%$
test_redq_speed[False-backward] 12.5558ms 11.8876ms 84.1216 Ops/s 86.6710 Ops/s $\color{#d91a1a}-2.94\%$
test_redq_speed[True-None] 2.4952ms 2.3625ms 423.2880 Ops/s 413.9854 Ops/s $\color{#35bf28}+2.25\%$
test_redq_speed[True-backward] 4.3059ms 4.2645ms 234.4951 Ops/s 237.9444 Ops/s $\color{#d91a1a}-1.45\%$
test_redq_speed[reduce-overhead-None] 2.5065ms 2.3951ms 417.5137 Ops/s 409.3661 Ops/s $\color{#35bf28}+1.99\%$
test_redq_speed[reduce-overhead-backward] 4.8716ms 4.3580ms 229.4645 Ops/s 228.6126 Ops/s $\color{#35bf28}+0.37\%$
test_redq_deprec_speed[False-None] 9.7801ms 9.3932ms 106.4602 Ops/s 108.9788 Ops/s $\color{#d91a1a}-2.31\%$
test_redq_deprec_speed[False-backward] 13.1625ms 12.5834ms 79.4696 Ops/s 79.9831 Ops/s $\color{#d91a1a}-0.64\%$
test_redq_deprec_speed[True-None] 2.8330ms 2.6962ms 370.8868 Ops/s 364.6516 Ops/s $\color{#35bf28}+1.71\%$
test_redq_deprec_speed[True-backward] 4.9812ms 4.5237ms 221.0602 Ops/s 214.6295 Ops/s $\color{#35bf28}+3.00\%$
test_redq_deprec_speed[reduce-overhead-None] 2.7400ms 2.6722ms 374.2216 Ops/s 365.3120 Ops/s $\color{#35bf28}+2.44\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6209ms 4.5262ms 220.9348 Ops/s 216.1778 Ops/s $\color{#35bf28}+2.20\%$
test_td3_speed[False-None] 7.9979ms 7.9549ms 125.7087 Ops/s 124.4603 Ops/s $\color{#35bf28}+1.00\%$
test_td3_speed[False-backward] 11.0020ms 10.4735ms 95.4793 Ops/s 94.8641 Ops/s $\color{#35bf28}+0.65\%$
test_td3_speed[True-None] 1.7918ms 1.7056ms 586.3198 Ops/s 573.2037 Ops/s $\color{#35bf28}+2.29\%$
test_td3_speed[True-backward] 3.6583ms 3.4902ms 286.5159 Ops/s 288.2067 Ops/s $\color{#d91a1a}-0.59\%$
test_td3_speed[reduce-overhead-None] 52.6431ms 26.7051ms 37.4460 Ops/s 38.5602 Ops/s $\color{#d91a1a}-2.89\%$
test_td3_speed[reduce-overhead-backward] 1.5813ms 1.5322ms 652.6702 Ops/s 644.1579 Ops/s $\color{#35bf28}+1.32\%$
test_cql_speed[False-None] 17.3303ms 16.7522ms 59.6938 Ops/s 57.4458 Ops/s $\color{#35bf28}+3.91\%$
test_cql_speed[False-backward] 23.1233ms 22.4057ms 44.6316 Ops/s 44.2826 Ops/s $\color{#35bf28}+0.79\%$
test_cql_speed[True-None] 3.5702ms 3.3503ms 298.4794 Ops/s 292.5054 Ops/s $\color{#35bf28}+2.04\%$
test_cql_speed[True-backward] 6.4184ms 5.9570ms 167.8702 Ops/s 175.1567 Ops/s $\color{#d91a1a}-4.16\%$
test_cql_speed[reduce-overhead-None] 21.1618ms 13.2024ms 75.7435 Ops/s 74.3376 Ops/s $\color{#35bf28}+1.89\%$
test_cql_speed[reduce-overhead-backward] 2.1894ms 2.0533ms 487.0308 Ops/s 487.4001 Ops/s $\color{#d91a1a}-0.08\%$
test_a2c_speed[False-None] 3.4139ms 3.2497ms 307.7242 Ops/s 305.8265 Ops/s $\color{#35bf28}+0.62\%$
test_a2c_speed[False-backward] 6.8631ms 6.3380ms 157.7780 Ops/s 157.0733 Ops/s $\color{#35bf28}+0.45\%$
test_a2c_speed[True-None] 1.4252ms 1.3711ms 729.3434 Ops/s 717.4123 Ops/s $\color{#35bf28}+1.66\%$
test_a2c_speed[True-backward] 3.1445ms 3.0744ms 325.2701 Ops/s 332.9585 Ops/s $\color{#d91a1a}-2.31\%$
test_a2c_speed[reduce-overhead-None] 15.9256ms 8.9561ms 111.6556 Ops/s 111.8258 Ops/s $\color{#d91a1a}-0.15\%$
test_a2c_speed[reduce-overhead-backward] 1.7534ms 1.6328ms 612.4478 Ops/s 654.2428 Ops/s $\textbf{\color{#d91a1a}-6.39\%}$
test_ppo_speed[False-None] 3.8324ms 3.6899ms 271.0114 Ops/s 255.2771 Ops/s $\textbf{\color{#35bf28}+6.16\%}$
test_ppo_speed[False-backward] 7.4361ms 7.0405ms 142.0347 Ops/s 140.8838 Ops/s $\color{#35bf28}+0.82\%$
test_ppo_speed[True-None] 1.5421ms 1.4314ms 698.6375 Ops/s 679.9503 Ops/s $\color{#35bf28}+2.75\%$
test_ppo_speed[True-backward] 3.2748ms 3.2373ms 308.8980 Ops/s 300.8698 Ops/s $\color{#35bf28}+2.67\%$
test_ppo_speed[reduce-overhead-None] 1.3349ms 0.9979ms 1.0021 KOps/s 978.0749 Ops/s $\color{#35bf28}+2.46\%$
test_ppo_speed[reduce-overhead-backward] 1.7534ms 1.5838ms 631.3811 Ops/s 667.2244 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_reinforce_speed[False-None] 2.3732ms 2.2684ms 440.8329 Ops/s 425.3326 Ops/s $\color{#35bf28}+3.64\%$
test_reinforce_speed[False-backward] 3.7689ms 3.3727ms 296.4986 Ops/s 296.3111 Ops/s $\color{#35bf28}+0.06\%$
test_reinforce_speed[True-None] 1.4148ms 1.3183ms 758.5254 Ops/s 733.5228 Ops/s $\color{#35bf28}+3.41\%$
test_reinforce_speed[True-backward] 3.2240ms 3.0870ms 323.9436 Ops/s 332.6395 Ops/s $\color{#d91a1a}-2.61\%$
test_reinforce_speed[reduce-overhead-None] 18.0530ms 10.0092ms 99.9080 Ops/s 100.7416 Ops/s $\color{#d91a1a}-0.83\%$
test_reinforce_speed[reduce-overhead-backward] 1.7645ms 1.6629ms 601.3507 Ops/s 635.4900 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_iql_speed[False-None] 9.6150ms 9.1875ms 108.8436 Ops/s 105.5441 Ops/s $\color{#35bf28}+3.13\%$
test_iql_speed[False-backward] 13.5928ms 13.0751ms 76.4814 Ops/s 74.8952 Ops/s $\color{#35bf28}+2.12\%$
test_iql_speed[True-None] 2.4356ms 2.2735ms 439.8512 Ops/s 420.3618 Ops/s $\color{#35bf28}+4.64\%$
test_iql_speed[True-backward] 4.8863ms 4.8144ms 207.7111 Ops/s 190.5781 Ops/s $\textbf{\color{#35bf28}+8.99\%}$
test_iql_speed[reduce-overhead-None] 0.4728s 12.7832ms 78.2275 Ops/s 89.3181 Ops/s $\textbf{\color{#d91a1a}-12.42\%}$
test_iql_speed[reduce-overhead-backward] 2.1711ms 1.9888ms 502.8089 Ops/s 494.5381 Ops/s $\color{#35bf28}+1.67\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8670ms 6.3882ms 156.5377 Ops/s 154.7447 Ops/s $\color{#35bf28}+1.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6790ms 0.3264ms 3.0633 KOps/s 2.8522 KOps/s $\textbf{\color{#35bf28}+7.40\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6730ms 0.3370ms 2.9673 KOps/s 2.9771 KOps/s $\color{#d91a1a}-0.33\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3922ms 6.1264ms 163.2293 Ops/s 162.9571 Ops/s $\color{#35bf28}+0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7548ms 0.2989ms 3.3454 KOps/s 3.0561 KOps/s $\textbf{\color{#35bf28}+9.46\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6495ms 0.3153ms 3.1720 KOps/s 2.9294 KOps/s $\textbf{\color{#35bf28}+8.28\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6048ms 1.3554ms 737.7968 Ops/s 783.0644 Ops/s $\textbf{\color{#d91a1a}-5.78\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5594ms 1.3134ms 761.3789 Ops/s 745.0355 Ops/s $\color{#35bf28}+2.19\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3995ms 6.2806ms 159.2198 Ops/s 158.3259 Ops/s $\color{#35bf28}+0.56\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0965ms 0.4505ms 2.2197 KOps/s 2.2773 KOps/s $\color{#d91a1a}-2.53\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7173ms 0.4562ms 2.1919 KOps/s 2.4354 KOps/s $\textbf{\color{#d91a1a}-10.00\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2485ms 6.1543ms 162.4886 Ops/s 161.8795 Ops/s $\color{#35bf28}+0.38\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8326ms 0.3596ms 2.7812 KOps/s 3.2415 KOps/s $\textbf{\color{#d91a1a}-14.20\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.3846ms 0.3322ms 3.0104 KOps/s 3.1253 KOps/s $\color{#d91a1a}-3.68\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.4305ms 6.1170ms 163.4783 Ops/s 162.9895 Ops/s $\color{#35bf28}+0.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0382ms 0.3229ms 3.0968 KOps/s 3.6739 KOps/s $\textbf{\color{#d91a1a}-15.71\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5088ms 0.2711ms 3.6887 KOps/s 4.1085 KOps/s $\textbf{\color{#d91a1a}-10.22\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4739ms 6.2995ms 158.7438 Ops/s 157.7522 Ops/s $\color{#35bf28}+0.63\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9750ms 0.4638ms 2.1560 KOps/s 2.2114 KOps/s $\color{#d91a1a}-2.51\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6483ms 0.4471ms 2.2365 KOps/s 2.2605 KOps/s $\color{#d91a1a}-1.06\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9867ms 5.4218ms 184.4418 Ops/s 178.3239 Ops/s $\color{#35bf28}+3.43\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.6662ms 2.2240ms 449.6471 Ops/s 422.9596 Ops/s $\textbf{\color{#35bf28}+6.31\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.1683ms 1.1199ms 892.9022 Ops/s 843.9157 Ops/s $\textbf{\color{#35bf28}+5.80\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4452s 14.3631ms 69.6227 Ops/s 181.8967 Ops/s $\textbf{\color{#d91a1a}-61.72\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.8536ms 1.7247ms 579.8254 Ops/s 458.4102 Ops/s $\textbf{\color{#35bf28}+26.49\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.8216ms 1.2968ms 771.1416 Ops/s 779.9706 Ops/s $\color{#d91a1a}-1.13\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.9816ms 5.6830ms 175.9620 Ops/s 31.5587 Ops/s $\textbf{\color{#35bf28}+457.57\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.3816ms 2.1931ms 455.9697 Ops/s 463.7493 Ops/s $\color{#d91a1a}-1.68\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2381ms 1.3964ms 716.1182 Ops/s 839.2593 Ops/s $\textbf{\color{#d91a1a}-14.67\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2842ms 13.0834ms 76.4326 Ops/s 69.7064 Ops/s $\textbf{\color{#35bf28}+9.65\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.9995ms 17.0267ms 58.7312 Ops/s 57.0860 Ops/s $\color{#35bf28}+2.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2843ms 17.8548ms 56.0073 Ops/s 53.1751 Ops/s $\textbf{\color{#35bf28}+5.33\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.2180ms 17.6484ms 56.6622 Ops/s 57.0305 Ops/s $\color{#d91a1a}-0.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.0815ms 17.7540ms 56.3253 Ops/s 54.0897 Ops/s $\color{#35bf28}+4.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 0.3888s 26.3214ms 37.9919 Ops/s 54.3565 Ops/s $\textbf{\color{#d91a1a}-30.11\%}$
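
For readers checking the table: the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The sketch below recomputes it for three rows copied from the table above. The helper name `relative_change` is hypothetical, and the formula is inferred from the reported numbers rather than taken from the benchmark tooling.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Assumption: Change = (Ops / Ops on Repo HEAD - 1) * 100, inferred
    from the table values, not from the benchmark bot's source.
    """
    return (ops / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the table, in Ops/s.
rows = {
    "test_cql_speed[True-backward]": (167.8702, 175.1567),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (69.6227, 181.8967),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (175.9620, 31.5587),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

The largest swings (for example the two `ListStorage` populate rows) coincide with a Max time far above the Mean on one side of the comparison, so they may reflect a single slow iteration rather than a real regression or speed-up.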

vmoens pushed a commit that referenced this pull request Feb 14, 2025
ghstack-source-id: e09b03b
Pull Request resolved: #2767
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 14, 2025
ghstack-source-id: e3d10c6
Pull Request resolved: #2767
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 17, 2025
ghstack-source-id: 2b165f4
Pull Request resolved: #2767
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Feb 17, 2025
ghstack-source-id: 717bb31
Pull Request resolved: #2767
vmoens merged commit e996c2f into gh/vmoens/85/base Feb 17, 2025
25 of 42 checks passed
vmoens deleted the gh/vmoens/85/head branch February 17, 2025 11:13
vmoens added the "Suitable for minor" label (suitable to be integrated in a minor release, no new feature) Feb 17, 2025
vmoens pushed a commit that referenced this pull request Feb 17, 2025
ghstack-source-id: 717bb31
Pull Request resolved: #2767

(cherry picked from commit 27a8ecc)
kylelevy pushed a commit to kylelevy/rl that referenced this pull request Aug 4, 2025
ghstack-source-id: 717bb31
Pull Request resolved: pytorch#2767

(cherry picked from commit 27a8ecc)

Labels

BE: Better errors, logs, docs or test utils
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Suitable for minor: Suitable to be integrated in minor release (no new feature)

2 participants
