@vmoens (Collaborator) commented Feb 10, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Feb 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2778

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens pushed a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: f33b49b
Pull Request resolved: #2778
@facebook-github-bot added the CLA Signed label Feb 10, 2025
@vmoens merged commit 4148190 into gh/vmoens/88/base Feb 10, 2025
54 of 58 checks passed
@vmoens deleted the gh/vmoens/88/head branch February 10, 2025 17:40
vmoens pushed a commit that referenced this pull request Feb 10, 2025
ghstack-source-id: f33b49b
Pull Request resolved: #2778

(cherry picked from commit c2a149d)

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}47$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.6016s 0.5051s 1.9799 Ops/s 2.0943 Ops/s $\textbf{\color{#d91a1a}-5.46\%}$
test_transformed 1.0666s 0.9804s 1.0200 Ops/s 1.0159 Ops/s $\color{#35bf28}+0.40\%$
test_serial 1.5938s 1.4947s 0.6690 Ops/s 0.6561 Ops/s $\color{#35bf28}+1.97\%$
test_parallel 1.4031s 1.2974s 0.7708 Ops/s 0.7798 Ops/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[True-True-True-True-True] 0.1973ms 32.2481μs 31.0096 KOps/s 32.8160 KOps/s $\textbf{\color{#d91a1a}-5.50\%}$
test_step_mdp_speed[True-True-True-True-False] 0.5323ms 19.0655μs 52.4506 KOps/s 56.1487 KOps/s $\textbf{\color{#d91a1a}-6.59\%}$
test_step_mdp_speed[True-True-True-False-True] 79.8890μs 18.1584μs 55.0709 KOps/s 58.2405 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_step_mdp_speed[True-True-True-False-False] 29.6750μs 10.6002μs 94.3379 KOps/s 99.9729 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_step_mdp_speed[True-True-False-True-True] 80.0200μs 34.9105μs 28.6446 KOps/s 29.7501 KOps/s $\color{#d91a1a}-3.72\%$
test_step_mdp_speed[True-True-False-True-False] 47.2390μs 21.4815μs 46.5516 KOps/s 49.2838 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_step_mdp_speed[True-True-False-False-True] 0.1920ms 20.5273μs 48.7156 KOps/s 50.6142 KOps/s $\color{#d91a1a}-3.75\%$
test_step_mdp_speed[True-True-False-False-False] 38.6920μs 12.7977μs 78.1390 KOps/s 84.0368 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_step_mdp_speed[True-False-True-True-True] 77.4150μs 36.6921μs 27.2539 KOps/s 29.2959 KOps/s $\textbf{\color{#d91a1a}-6.97\%}$
test_step_mdp_speed[True-False-True-True-False] 57.6080μs 23.5755μs 42.4168 KOps/s 46.1649 KOps/s $\textbf{\color{#d91a1a}-8.12\%}$
test_step_mdp_speed[True-False-True-False-True] 50.8750μs 20.3821μs 49.0626 KOps/s 52.9318 KOps/s $\textbf{\color{#d91a1a}-7.31\%}$
test_step_mdp_speed[True-False-True-False-False] 70.2550μs 12.6878μs 78.8159 KOps/s 83.7426 KOps/s $\textbf{\color{#d91a1a}-5.88\%}$
test_step_mdp_speed[True-False-False-True-True] 86.3110μs 39.1135μs 25.5666 KOps/s 27.9192 KOps/s $\textbf{\color{#d91a1a}-8.43\%}$
test_step_mdp_speed[True-False-False-True-False] 0.1519ms 26.4250μs 37.8430 KOps/s 42.9338 KOps/s $\textbf{\color{#d91a1a}-11.86\%}$
test_step_mdp_speed[True-False-False-False-True] 52.4980μs 22.5538μs 44.3384 KOps/s 48.0097 KOps/s $\textbf{\color{#d91a1a}-7.65\%}$
test_step_mdp_speed[True-False-False-False-False] 39.1030μs 15.0244μs 66.5586 KOps/s 73.5137 KOps/s $\textbf{\color{#d91a1a}-9.46\%}$
test_step_mdp_speed[False-True-True-True-True] 0.2097ms 38.1992μs 26.1785 KOps/s 29.3658 KOps/s $\textbf{\color{#d91a1a}-10.85\%}$
test_step_mdp_speed[False-True-True-True-False] 0.2881ms 23.6957μs 42.2017 KOps/s 46.7382 KOps/s $\textbf{\color{#d91a1a}-9.71\%}$
test_step_mdp_speed[False-True-True-False-True] 0.6308ms 24.0212μs 41.6300 KOps/s 45.9774 KOps/s $\textbf{\color{#d91a1a}-9.46\%}$
test_step_mdp_speed[False-True-True-False-False] 41.4370μs 14.3711μs 69.5842 KOps/s 76.4505 KOps/s $\textbf{\color{#d91a1a}-8.98\%}$
test_step_mdp_speed[False-True-False-True-True] 76.7330μs 39.0024μs 25.6395 KOps/s 27.4296 KOps/s $\textbf{\color{#d91a1a}-6.53\%}$
test_step_mdp_speed[False-True-False-True-False] 60.4830μs 25.4700μs 39.2619 KOps/s 43.1433 KOps/s $\textbf{\color{#d91a1a}-9.00\%}$
test_step_mdp_speed[False-True-False-False-True] 2.7608ms 25.6966μs 38.9157 KOps/s 42.5131 KOps/s $\textbf{\color{#d91a1a}-8.46\%}$
test_step_mdp_speed[False-True-False-False-False] 0.1877ms 16.4440μs 60.8123 KOps/s 66.5775 KOps/s $\textbf{\color{#d91a1a}-8.66\%}$
test_step_mdp_speed[False-False-True-True-True] 78.5170μs 41.1676μs 24.2910 KOps/s 26.7424 KOps/s $\textbf{\color{#d91a1a}-9.17\%}$
test_step_mdp_speed[False-False-True-True-False] 0.2512ms 27.7226μs 36.0716 KOps/s 39.9057 KOps/s $\textbf{\color{#d91a1a}-9.61\%}$
test_step_mdp_speed[False-False-True-False-True] 54.1010μs 25.4635μs 39.2720 KOps/s 43.1170 KOps/s $\textbf{\color{#d91a1a}-8.92\%}$
test_step_mdp_speed[False-False-True-False-False] 52.9490μs 16.2342μs 61.5983 KOps/s 66.7239 KOps/s $\textbf{\color{#d91a1a}-7.68\%}$
test_step_mdp_speed[False-False-False-True-True] 82.0730μs 42.9541μs 23.2806 KOps/s 25.6187 KOps/s $\textbf{\color{#d91a1a}-9.13\%}$
test_step_mdp_speed[False-False-False-True-False] 83.5230μs 29.1208μs 34.3398 KOps/s 37.7227 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_step_mdp_speed[False-False-False-False-True] 60.2730μs 27.0935μs 36.9093 KOps/s 40.5089 KOps/s $\textbf{\color{#d91a1a}-8.89\%}$
test_step_mdp_speed[False-False-False-False-False] 48.4510μs 18.1577μs 55.0729 KOps/s 60.2621 KOps/s $\textbf{\color{#d91a1a}-8.61\%}$
test_values[generalized_advantage_estimate-True-True] 13.2488ms 9.9060ms 100.9489 Ops/s 102.4238 Ops/s $\color{#d91a1a}-1.44\%$
test_values[vec_generalized_advantage_estimate-True-True] 26.1589ms 24.1135ms 41.4705 Ops/s 41.1818 Ops/s $\color{#35bf28}+0.70\%$
test_values[td0_return_estimate-False-False] 0.2396ms 0.1771ms 5.6454 KOps/s 5.6624 KOps/s $\color{#d91a1a}-0.30\%$
test_values[td1_return_estimate-False-False] 32.9126ms 24.8856ms 40.1839 Ops/s 41.0840 Ops/s $\color{#d91a1a}-2.19\%$
test_values[vec_td1_return_estimate-False-False] 25.9869ms 24.1803ms 41.3560 Ops/s 41.4584 Ops/s $\color{#d91a1a}-0.25\%$
test_values[td_lambda_return_estimate-True-False] 36.1772ms 35.4335ms 28.2219 Ops/s 28.4293 Ops/s $\color{#d91a1a}-0.73\%$
test_values[vec_td_lambda_return_estimate-True-False] 26.0617ms 24.2909ms 41.1678 Ops/s 41.3400 Ops/s $\color{#d91a1a}-0.42\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.6748ms 8.5239ms 117.3174 Ops/s 116.4538 Ops/s $\color{#35bf28}+0.74\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2219ms 1.9128ms 522.7814 Ops/s 504.3376 Ops/s $\color{#35bf28}+3.66\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6287ms 0.3716ms 2.6910 KOps/s 2.6924 KOps/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.2340ms 43.6593ms 22.9046 Ops/s 23.4599 Ops/s $\color{#d91a1a}-2.37\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 6.3628ms 3.4426ms 290.4764 Ops/s 291.4713 Ops/s $\color{#d91a1a}-0.34\%$
test_dqn_speed[False-None] 6.0027ms 1.4018ms 713.3882 Ops/s 699.2643 Ops/s $\color{#35bf28}+2.02\%$
test_dqn_speed[False-backward] 1.9336ms 1.8887ms 529.4738 Ops/s 523.6090 Ops/s $\color{#35bf28}+1.12\%$
test_dqn_speed[True-None] 0.8262ms 0.4817ms 2.0759 KOps/s 2.0361 KOps/s $\color{#35bf28}+1.95\%$
test_dqn_speed[True-backward] 0.9775ms 0.9217ms 1.0850 KOps/s 1.0971 KOps/s $\color{#d91a1a}-1.11\%$
test_dqn_speed[reduce-overhead-None] 1.2117ms 0.4900ms 2.0407 KOps/s 2.0061 KOps/s $\color{#35bf28}+1.72\%$
test_dqn_speed[reduce-overhead-backward] 0.9691ms 0.9113ms 1.0973 KOps/s 1.0940 KOps/s $\color{#35bf28}+0.31\%$
test_ddpg_speed[False-None] 3.7156ms 2.9346ms 340.7635 Ops/s 338.7049 Ops/s $\color{#35bf28}+0.61\%$
test_ddpg_speed[False-backward] 5.1186ms 4.1052ms 243.5949 Ops/s 243.9213 Ops/s $\color{#d91a1a}-0.13\%$
test_ddpg_speed[True-None] 1.9052ms 1.2397ms 806.6167 Ops/s 808.9756 Ops/s $\color{#d91a1a}-0.29\%$
test_ddpg_speed[True-backward] 2.1332ms 2.1036ms 475.3673 Ops/s 455.0322 Ops/s $\color{#35bf28}+4.47\%$
test_ddpg_speed[reduce-overhead-None] 1.7116ms 1.2516ms 799.0008 Ops/s 798.8970 Ops/s $\color{#35bf28}+0.01\%$
test_ddpg_speed[reduce-overhead-backward] 2.4324ms 2.1652ms 461.8405 Ops/s 460.7026 Ops/s $\color{#35bf28}+0.25\%$
test_sac_speed[False-None] 9.7796ms 8.2499ms 121.2140 Ops/s 119.7764 Ops/s $\color{#35bf28}+1.20\%$
test_sac_speed[False-backward] 11.7745ms 10.9712ms 91.1477 Ops/s 89.2404 Ops/s $\color{#35bf28}+2.14\%$
test_sac_speed[True-None] 2.5792ms 2.0928ms 477.8214 Ops/s 470.1018 Ops/s $\color{#35bf28}+1.64\%$
test_sac_speed[True-backward] 3.8360ms 3.7544ms 266.3567 Ops/s 264.2625 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[reduce-overhead-None] 2.5423ms 2.0945ms 477.4356 Ops/s 471.2849 Ops/s $\color{#35bf28}+1.31\%$
test_sac_speed[reduce-overhead-backward] 4.3921ms 4.0219ms 248.6403 Ops/s 257.4824 Ops/s $\color{#d91a1a}-3.43\%$
test_redq_speed[False-None] 14.8689ms 13.2990ms 75.1934 Ops/s 75.0357 Ops/s $\color{#35bf28}+0.21\%$
test_redq_speed[False-backward] 30.0553ms 22.9279ms 43.6150 Ops/s 43.6197 Ops/s $\color{#d91a1a}-0.01\%$
test_redq_speed[True-None] 5.4469ms 4.7320ms 211.3290 Ops/s 173.7527 Ops/s $\textbf{\color{#35bf28}+21.63\%}$
test_redq_speed[True-backward] 12.5105ms 11.9498ms 83.6831 Ops/s 76.2481 Ops/s $\textbf{\color{#35bf28}+9.75\%}$
test_redq_speed[reduce-overhead-None] 7.2900ms 5.9548ms 167.9326 Ops/s 193.7734 Ops/s $\textbf{\color{#d91a1a}-13.34\%}$
test_redq_speed[reduce-overhead-backward] 14.3174ms 13.7126ms 72.9257 Ops/s 81.0425 Ops/s $\textbf{\color{#d91a1a}-10.02\%}$
test_redq_deprec_speed[False-None] 15.5863ms 13.4499ms 74.3502 Ops/s 77.8229 Ops/s $\color{#d91a1a}-4.46\%$
test_redq_deprec_speed[False-backward] 21.0238ms 18.9553ms 52.7556 Ops/s 53.4968 Ops/s $\color{#d91a1a}-1.39\%$
test_redq_deprec_speed[True-None] 4.1658ms 3.8310ms 261.0283 Ops/s 260.1608 Ops/s $\color{#35bf28}+0.33\%$
test_redq_deprec_speed[True-backward] 10.2464ms 9.1802ms 108.9304 Ops/s 121.4666 Ops/s $\textbf{\color{#d91a1a}-10.32\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.2858ms 3.8635ms 258.8357 Ops/s 246.9651 Ops/s $\color{#35bf28}+4.81\%$
test_redq_deprec_speed[reduce-overhead-backward] 10.3928ms 8.9202ms 112.1045 Ops/s 121.4105 Ops/s $\textbf{\color{#d91a1a}-7.66\%}$
test_td3_speed[False-None] 8.5626ms 8.0776ms 123.7995 Ops/s 121.8791 Ops/s $\color{#35bf28}+1.58\%$
test_td3_speed[False-backward] 11.9323ms 10.9774ms 91.0966 Ops/s 94.8234 Ops/s $\color{#d91a1a}-3.93\%$
test_td3_speed[True-None] 1.9315ms 1.7980ms 556.1805 Ops/s 553.8176 Ops/s $\color{#35bf28}+0.43\%$
test_td3_speed[True-backward] 3.7841ms 3.4009ms 294.0428 Ops/s 289.8537 Ops/s $\color{#35bf28}+1.45\%$
test_td3_speed[reduce-overhead-None] 1.9812ms 1.7854ms 560.1099 Ops/s 550.3500 Ops/s $\color{#35bf28}+1.77\%$
test_td3_speed[reduce-overhead-backward] 3.9029ms 3.6955ms 270.6024 Ops/s 293.2281 Ops/s $\textbf{\color{#d91a1a}-7.72\%}$
test_cql_speed[False-None] 41.7732ms 38.3383ms 26.0836 Ops/s 27.0632 Ops/s $\color{#d91a1a}-3.62\%$
test_cql_speed[False-backward] 53.7224ms 48.0836ms 20.7971 Ops/s 21.1051 Ops/s $\color{#d91a1a}-1.46\%$
test_cql_speed[True-None] 18.5371ms 17.1247ms 58.3953 Ops/s 62.7165 Ops/s $\textbf{\color{#d91a1a}-6.89\%}$
test_cql_speed[True-backward] 25.8893ms 24.6220ms 40.6140 Ops/s 42.0943 Ops/s $\color{#d91a1a}-3.52\%$
test_cql_speed[reduce-overhead-None] 17.4542ms 16.7676ms 59.6388 Ops/s 62.1518 Ops/s $\color{#d91a1a}-4.04\%$
test_cql_speed[reduce-overhead-backward] 24.8621ms 24.2373ms 41.2587 Ops/s 41.9581 Ops/s $\color{#d91a1a}-1.67\%$
test_a2c_speed[False-None] 9.0095ms 8.0553ms 124.1415 Ops/s 132.8601 Ops/s $\textbf{\color{#d91a1a}-6.56\%}$
test_a2c_speed[False-backward] 16.1444ms 15.5203ms 64.4317 Ops/s 66.3516 Ops/s $\color{#d91a1a}-2.89\%$
test_a2c_speed[True-None] 5.4238ms 4.3573ms 229.4985 Ops/s 267.4522 Ops/s $\textbf{\color{#d91a1a}-14.19\%}$
test_a2c_speed[True-backward] 11.4777ms 10.9720ms 91.1410 Ops/s 97.5147 Ops/s $\textbf{\color{#d91a1a}-6.54\%}$
test_a2c_speed[reduce-overhead-None] 4.7890ms 3.8673ms 258.5765 Ops/s 268.1514 Ops/s $\color{#d91a1a}-3.57\%$
test_a2c_speed[reduce-overhead-backward] 11.0950ms 10.5177ms 95.0778 Ops/s 97.5324 Ops/s $\color{#d91a1a}-2.52\%$
test_ppo_speed[False-None] 8.9321ms 7.7459ms 129.1006 Ops/s 131.2797 Ops/s $\color{#d91a1a}-1.66\%$
test_ppo_speed[False-backward] 17.5174ms 15.8006ms 63.2888 Ops/s 68.4650 Ops/s $\textbf{\color{#d91a1a}-7.56\%}$
test_ppo_speed[True-None] 4.5544ms 4.1718ms 239.7075 Ops/s 243.6661 Ops/s $\color{#d91a1a}-1.62\%$
test_ppo_speed[True-backward] 11.1508ms 10.5357ms 94.9151 Ops/s 98.8084 Ops/s $\color{#d91a1a}-3.94\%$
test_ppo_speed[reduce-overhead-None] 5.0836ms 4.1881ms 238.7729 Ops/s 243.1842 Ops/s $\color{#d91a1a}-1.81\%$
test_ppo_speed[reduce-overhead-backward] 18.0212ms 11.0092ms 90.8335 Ops/s 98.7870 Ops/s $\textbf{\color{#d91a1a}-8.05\%}$
test_reinforce_speed[False-None] 7.9635ms 6.7026ms 149.1960 Ops/s 149.0059 Ops/s $\color{#35bf28}+0.13\%$
test_reinforce_speed[False-backward] 11.8708ms 10.6682ms 93.7361 Ops/s 102.0499 Ops/s $\textbf{\color{#d91a1a}-8.15\%}$
test_reinforce_speed[True-None] 5.4792ms 3.8023ms 263.0015 Ops/s 322.7623 Ops/s $\textbf{\color{#d91a1a}-18.52\%}$
test_reinforce_speed[True-backward] 11.3424ms 9.6383ms 103.7523 Ops/s 110.5996 Ops/s $\textbf{\color{#d91a1a}-6.19\%}$
test_reinforce_speed[reduce-overhead-None] 4.5187ms 3.4629ms 288.7756 Ops/s 318.1196 Ops/s $\textbf{\color{#d91a1a}-9.22\%}$
test_reinforce_speed[reduce-overhead-backward] 10.0716ms 9.3975ms 106.4111 Ops/s 108.3912 Ops/s $\color{#d91a1a}-1.83\%$
test_iql_speed[False-None] 35.2027ms 32.9959ms 30.3068 Ops/s 30.2324 Ops/s $\color{#35bf28}+0.25\%$
test_iql_speed[False-backward] 63.1165ms 46.8637ms 21.3385 Ops/s 21.4939 Ops/s $\color{#d91a1a}-0.72\%$
test_iql_speed[True-None] 13.2460ms 11.7154ms 85.3578 Ops/s 85.5262 Ops/s $\color{#d91a1a}-0.20\%$
test_iql_speed[True-backward] 24.2210ms 22.1562ms 45.1340 Ops/s 42.9113 Ops/s $\textbf{\color{#35bf28}+5.18\%}$
test_iql_speed[reduce-overhead-None] 12.2781ms 11.5800ms 86.3556 Ops/s 83.3304 Ops/s $\color{#35bf28}+3.63\%$
test_iql_speed[reduce-overhead-backward] 24.3549ms 23.1061ms 43.2786 Ops/s 42.7048 Ops/s $\color{#35bf28}+1.34\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.7558ms 5.1575ms 193.8939 Ops/s 197.4456 Ops/s $\color{#d91a1a}-1.80\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8048ms 0.5097ms 1.9618 KOps/s 1.9062 KOps/s $\color{#35bf28}+2.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6994ms 0.4917ms 2.0337 KOps/s 1.9749 KOps/s $\color{#35bf28}+2.98\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3724ms 4.8615ms 205.6971 Ops/s 206.0365 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4268ms 0.5084ms 1.9668 KOps/s 1.9572 KOps/s $\color{#35bf28}+0.49\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7719ms 0.4819ms 2.0753 KOps/s 2.0236 KOps/s $\color{#35bf28}+2.55\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9039ms 1.6574ms 603.3589 Ops/s 592.5155 Ops/s $\color{#35bf28}+1.83\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.4754ms 1.5837ms 631.4484 Ops/s 621.7403 Ops/s $\color{#35bf28}+1.56\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.2901ms 5.1578ms 193.8829 Ops/s 203.3058 Ops/s $\color{#d91a1a}-4.63\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.7296ms 0.6674ms 1.4983 KOps/s 1.5190 KOps/s $\color{#d91a1a}-1.36\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0838ms 0.6430ms 1.5551 KOps/s 1.5132 KOps/s $\color{#35bf28}+2.77\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0271ms 4.7995ms 208.3566 Ops/s 205.1619 Ops/s $\color{#35bf28}+1.56\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8974ms 0.5257ms 1.9022 KOps/s 1.9037 KOps/s $\color{#d91a1a}-0.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7212ms 0.4937ms 2.0254 KOps/s 1.9839 KOps/s $\color{#35bf28}+2.09\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.4094ms 4.8081ms 207.9809 Ops/s 205.7237 Ops/s $\color{#35bf28}+1.10\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9221ms 0.5247ms 1.9057 KOps/s 1.9446 KOps/s $\color{#d91a1a}-2.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7561ms 0.5029ms 1.9884 KOps/s 1.9782 KOps/s $\color{#35bf28}+0.52\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.5259ms 5.0829ms 196.7365 Ops/s 198.8454 Ops/s $\color{#d91a1a}-1.06\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.6920ms 0.6606ms 1.5137 KOps/s 1.4881 KOps/s $\color{#35bf28}+1.72\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.3716ms 0.6631ms 1.5081 KOps/s 1.5383 KOps/s $\color{#d91a1a}-1.96\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.8338ms 4.3379ms 230.5254 Ops/s 214.5297 Ops/s $\textbf{\color{#35bf28}+7.46\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 5.0974ms 2.2783ms 438.9302 Ops/s 433.8921 Ops/s $\color{#35bf28}+1.16\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.4638ms 1.3641ms 733.0666 Ops/s 694.7019 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4722s 13.7077ms 72.9516 Ops/s 223.4278 Ops/s $\textbf{\color{#d91a1a}-67.35\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.3130ms 2.1526ms 464.5526 Ops/s 419.1090 Ops/s $\textbf{\color{#35bf28}+10.84\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.1982ms 1.3364ms 748.2870 Ops/s 708.0284 Ops/s $\textbf{\color{#35bf28}+5.69\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.9183ms 4.5711ms 218.7662 Ops/s 217.5198 Ops/s $\color{#35bf28}+0.57\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.1430ms 2.5073ms 398.8384 Ops/s 406.9261 Ops/s $\color{#d91a1a}-1.99\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.7322ms 1.5428ms 648.1677 Ops/s 673.1424 Ops/s $\color{#d91a1a}-3.71\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3119ms 11.9085ms 83.9739 Ops/s 75.2746 Ops/s $\textbf{\color{#35bf28}+11.56\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.5066ms 14.3528ms 69.6727 Ops/s 68.1151 Ops/s $\color{#35bf28}+2.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.5715ms 20.6411ms 48.4471 Ops/s 46.0975 Ops/s $\textbf{\color{#35bf28}+5.10\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.3690ms 14.8228ms 67.4638 Ops/s 67.1383 Ops/s $\color{#35bf28}+0.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.1226ms 20.9359ms 47.7648 Ops/s 46.4635 Ops/s $\color{#35bf28}+2.80\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.8529ms 16.0023ms 62.4910 Ops/s 61.7242 Ops/s $\color{#35bf28}+1.24\%$
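
The Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The bolded entries appear to be those whose magnitude is at least 5%: the number of bolded improvements (9) and bolded regressions (47) in this table matches the Improved and Worsened totals. Neither rule is stated in the report, so the sketch below treats both as assumptions. The helper names (`to_ops`, `relative_change`, `classify`) and the `threshold` parameter are hypothetical and are not part of the benchmark tooling.

```python
# Scale factors for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}


def to_ops(value: float, unit: str) -> float:
    """Convert a throughput reading to plain operations per second."""
    return value * UNIT_SCALE[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumption: this is how the Change column is derived.
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change as improved, worsened, or unchanged.

    Assumption: the 5% threshold is inferred from which rows are bolded.
    """
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # test_simple (CPU): 1.9799 Ops/s on the PR vs 2.0943 Ops/s on HEAD.
    simple = relative_change(to_ops(1.9799, "Ops/s"), to_ops(2.0943, "Ops/s"))
    print(f"test_simple: {simple:.2f}% ({classify(simple)})")

    # test_transformed (CPU): a small positive change below the threshold.
    transformed = relative_change(to_ops(1.0200, "Ops/s"), to_ops(1.0159, "Ops/s"))
    print(f"test_transformed: {transformed:.2f}% ({classify(transformed)})")
```

Units have to be normalized before comparing, because some rows mix `Ops/s` and `KOps/s` between the two throughput columns. Recomputed values can differ from the reported ones in the last digit, since the table shows rounded throughput figures.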

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.8668s 0.7805s 1.2813 Ops/s 1.2779 Ops/s $\color{#35bf28}+0.27\%$
test_transformed 1.4413s 1.3545s 0.7383 Ops/s 0.7410 Ops/s $\color{#d91a1a}-0.36\%$
test_serial 2.3456s 2.2565s 0.4432 Ops/s 0.4411 Ops/s $\color{#35bf28}+0.48\%$
test_parallel 2.0599s 1.8865s 0.5301 Ops/s 0.5372 Ops/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[True-True-True-True-True] 0.1665ms 40.3853μs 24.7615 KOps/s 25.2511 KOps/s $\color{#d91a1a}-1.94\%$
test_step_mdp_speed[True-True-True-True-False] 58.1900μs 23.3485μs 42.8294 KOps/s 43.4132 KOps/s $\color{#d91a1a}-1.34\%$
test_step_mdp_speed[True-True-True-False-True] 85.1710μs 22.1036μs 45.2415 KOps/s 45.7422 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-True-False-False] 40.0210μs 13.0471μs 76.6451 KOps/s 78.5039 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-True-False-True-True] 79.0710μs 43.0768μs 23.2143 KOps/s 23.6765 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[True-True-False-True-False] 62.2200μs 25.7791μs 38.7911 KOps/s 39.5653 KOps/s $\color{#d91a1a}-1.96\%$
test_step_mdp_speed[True-True-False-False-True] 60.9310μs 24.7424μs 40.4164 KOps/s 41.3097 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[True-True-False-False-False] 79.4620μs 15.3414μs 65.1830 KOps/s 66.4835 KOps/s $\color{#d91a1a}-1.96\%$
test_step_mdp_speed[True-False-True-True-True] 87.7710μs 45.2855μs 22.0821 KOps/s 22.7967 KOps/s $\color{#d91a1a}-3.13\%$
test_step_mdp_speed[True-False-True-True-False] 67.9210μs 27.8226μs 35.9420 KOps/s 36.4317 KOps/s $\color{#d91a1a}-1.34\%$
test_step_mdp_speed[True-False-True-False-True] 64.9310μs 24.4321μs 40.9298 KOps/s 41.4986 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-False-True-False-False] 51.3710μs 15.3657μs 65.0799 KOps/s 67.6467 KOps/s $\color{#d91a1a}-3.79\%$
test_step_mdp_speed[True-False-False-True-True] 86.8010μs 47.5120μs 21.0473 KOps/s 21.2390 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[True-False-False-True-False] 64.5010μs 29.9162μs 33.4267 KOps/s 33.0962 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[True-False-False-False-True] 58.6010μs 26.4265μs 37.8408 KOps/s 37.8989 KOps/s $\color{#d91a1a}-0.15\%$
test_step_mdp_speed[True-False-False-False-False] 53.3900μs 17.4031μs 57.4610 KOps/s 57.1427 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[False-True-True-True-True] 84.4610μs 44.8287μs 22.3072 KOps/s 22.4145 KOps/s $\color{#d91a1a}-0.48\%$
test_step_mdp_speed[False-True-True-True-False] 63.2310μs 28.1763μs 35.4908 KOps/s 35.9053 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-True-True-False-True] 71.3110μs 28.4237μs 35.1819 KOps/s 37.0047 KOps/s $\color{#d91a1a}-4.93\%$
test_step_mdp_speed[False-True-True-False-False] 47.9810μs 17.1289μs 58.3809 KOps/s 59.8522 KOps/s $\color{#d91a1a}-2.46\%$
test_step_mdp_speed[False-True-False-True-True] 0.1333ms 46.4057μs 21.5491 KOps/s 21.3991 KOps/s $\color{#35bf28}+0.70\%$
test_step_mdp_speed[False-True-False-True-False] 55.0300μs 30.4598μs 32.8302 KOps/s 32.9705 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-False-False-True] 3.1431ms 31.2964μs 31.9525 KOps/s 32.0817 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[False-True-False-False-False] 49.0710μs 19.5061μs 51.2660 KOps/s 52.1298 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-False-True-True-True] 84.2710μs 49.5668μs 20.1748 KOps/s 20.5252 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[False-False-True-True-False] 0.1020ms 32.9009μs 30.3943 KOps/s 31.1229 KOps/s $\color{#d91a1a}-2.34\%$
test_step_mdp_speed[False-False-True-False-True] 69.2210μs 30.8928μs 32.3700 KOps/s 33.4292 KOps/s $\color{#d91a1a}-3.17\%$
test_step_mdp_speed[False-False-True-False-False] 49.0610μs 19.4839μs 51.3245 KOps/s 54.3031 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_step_mdp_speed[False-False-False-True-True] 85.3510μs 51.3867μs 19.4603 KOps/s 19.7190 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-False-False-True-False] 65.3910μs 35.0386μs 28.5399 KOps/s 28.5479 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[False-False-False-False-True] 75.0210μs 32.4184μs 30.8466 KOps/s 30.8732 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-False-False-False-False] 55.1410μs 21.6508μs 46.1876 KOps/s 47.4331 KOps/s $\color{#d91a1a}-2.63\%$
test_values[generalized_advantage_estimate-True-True] 24.2596ms 23.9595ms 41.7371 Ops/s 39.1633 Ops/s $\textbf{\color{#35bf28}+6.57\%}$
test_values[vec_generalized_advantage_estimate-True-True] 0.1025s 2.9394ms 340.2045 Ops/s 320.6511 Ops/s $\textbf{\color{#35bf28}+6.10\%}$
test_values[td0_return_estimate-False-False] 0.1053ms 78.7199μs 12.7033 KOps/s 12.5907 KOps/s $\color{#35bf28}+0.89\%$
test_values[td1_return_estimate-False-False] 53.5986ms 53.1484ms 18.8152 Ops/s 17.9872 Ops/s $\color{#35bf28}+4.60\%$
test_values[vec_td1_return_estimate-False-False] 1.2726ms 1.0701ms 934.4929 Ops/s 930.0704 Ops/s $\color{#35bf28}+0.48\%$
test_values[td_lambda_return_estimate-True-False] 85.0589ms 84.4878ms 11.8360 Ops/s 11.5089 Ops/s $\color{#35bf28}+2.84\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2801ms 1.0669ms 937.3067 Ops/s 935.1985 Ops/s $\color{#35bf28}+0.23\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.2990ms 23.9750ms 41.7102 Ops/s 38.9043 Ops/s $\textbf{\color{#35bf28}+7.21\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0091ms 0.7358ms 1.3590 KOps/s 1.3383 KOps/s $\color{#35bf28}+1.55\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7406ms 0.6555ms 1.5256 KOps/s 1.5112 KOps/s $\color{#35bf28}+0.95\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5360ms 1.4729ms 678.9206 Ops/s 676.8213 Ops/s $\color{#35bf28}+0.31\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7292ms 0.6703ms 1.4919 KOps/s 1.4845 KOps/s $\color{#35bf28}+0.50\%$
test_dqn_speed[False-None] 6.9382ms 1.5023ms 665.6540 Ops/s 662.9154 Ops/s $\color{#35bf28}+0.41\%$
test_dqn_speed[False-backward] 2.1744ms 2.0968ms 476.9112 Ops/s 471.5557 Ops/s $\color{#35bf28}+1.14\%$
test_dqn_speed[True-None] 0.6470ms 0.5439ms 1.8384 KOps/s 1.8157 KOps/s $\color{#35bf28}+1.25\%$
test_dqn_speed[True-backward] 1.2534ms 1.2079ms 827.8596 Ops/s 882.8864 Ops/s $\textbf{\color{#d91a1a}-6.23\%}$
test_dqn_speed[reduce-overhead-None] 0.6567ms 0.5644ms 1.7718 KOps/s 1.7737 KOps/s $\color{#d91a1a}-0.11\%$
test_dqn_speed[reduce-overhead-backward] 1.1014ms 1.0557ms 947.2225 Ops/s 1.0449 KOps/s $\textbf{\color{#d91a1a}-9.35\%}$
test_ddpg_speed[False-None] 3.1365ms 2.8178ms 354.8851 Ops/s 349.6496 Ops/s $\color{#35bf28}+1.50\%$
test_ddpg_speed[False-backward] 4.6327ms 4.2211ms 236.9067 Ops/s 242.0639 Ops/s $\color{#d91a1a}-2.13\%$
test_ddpg_speed[True-None] 1.3882ms 1.3250ms 754.7107 Ops/s 754.2544 Ops/s $\color{#35bf28}+0.06\%$
test_ddpg_speed[True-backward] 2.7538ms 2.5950ms 385.3532 Ops/s 412.0127 Ops/s $\textbf{\color{#d91a1a}-6.47\%}$
test_ddpg_speed[reduce-overhead-None] 1.4490ms 1.3762ms 726.6606 Ops/s 745.9037 Ops/s $\color{#d91a1a}-2.58\%$
test_ddpg_speed[reduce-overhead-backward] 2.0725ms 2.0222ms 494.5029 Ops/s 526.7932 Ops/s $\textbf{\color{#d91a1a}-6.13\%}$
test_sac_speed[False-None] 8.3108ms 7.8913ms 126.7211 Ops/s 124.2552 Ops/s $\color{#35bf28}+1.98\%$
test_sac_speed[False-backward] 11.5811ms 11.1031ms 90.0652 Ops/s 91.2021 Ops/s $\color{#d91a1a}-1.25\%$
test_sac_speed[True-None] 1.9051ms 1.8213ms 549.0463 Ops/s 544.9726 Ops/s $\color{#35bf28}+0.75\%$
test_sac_speed[True-backward] 4.1331ms 3.7150ms 269.1786 Ops/s 277.0160 Ops/s $\color{#d91a1a}-2.83\%$
test_sac_speed[reduce-overhead-None] 21.3288ms 12.0603ms 82.9170 Ops/s 85.8106 Ops/s $\color{#d91a1a}-3.37\%$
test_sac_speed[reduce-overhead-backward] 1.8208ms 1.7745ms 563.5322 Ops/s 596.5271 Ops/s $\textbf{\color{#d91a1a}-5.53\%}$
test_redq_speed[False-None] 7.8526ms 7.3507ms 136.0412 Ops/s 131.6041 Ops/s $\color{#35bf28}+3.37\%$
test_redq_speed[False-backward] 12.1162ms 11.5491ms 86.5865 Ops/s 87.4418 Ops/s $\color{#d91a1a}-0.98\%$
test_redq_speed[True-None] 2.3956ms 2.2988ms 435.0095 Ops/s 428.8774 Ops/s $\color{#35bf28}+1.43\%$
test_redq_speed[True-backward] 4.3724ms 4.2024ms 237.9607 Ops/s 248.2448 Ops/s $\color{#d91a1a}-4.14\%$
test_redq_speed[reduce-overhead-None] 2.4572ms 2.3312ms 428.9566 Ops/s 424.8418 Ops/s $\color{#35bf28}+0.97\%$
test_redq_speed[reduce-overhead-backward] 4.6586ms 4.2110ms 237.4708 Ops/s 238.9069 Ops/s $\color{#d91a1a}-0.60\%$
test_redq_deprec_speed[False-None] 9.5844ms 8.9958ms 111.1631 Ops/s 110.7328 Ops/s $\color{#35bf28}+0.39\%$
test_redq_deprec_speed[False-backward] 12.7058ms 12.2009ms 81.9612 Ops/s 81.3345 Ops/s $\color{#35bf28}+0.77\%$
test_redq_deprec_speed[True-None] 2.7129ms 2.6103ms 383.0967 Ops/s 375.6551 Ops/s $\color{#35bf28}+1.98\%$
test_redq_deprec_speed[True-backward] 4.9001ms 4.4652ms 223.9543 Ops/s 227.3044 Ops/s $\color{#d91a1a}-1.47\%$
test_redq_deprec_speed[reduce-overhead-None] 2.6783ms 2.6158ms 382.2928 Ops/s 377.9991 Ops/s $\color{#35bf28}+1.14\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.8166ms 4.4420ms 225.1219 Ops/s 227.8216 Ops/s $\color{#d91a1a}-1.19\%$
test_td3_speed[False-None] 8.0846ms 7.8720ms 127.0318 Ops/s 126.1872 Ops/s $\color{#35bf28}+0.67\%$
test_td3_speed[False-backward] 11.0959ms 10.4105ms 96.0565 Ops/s 97.6214 Ops/s $\color{#d91a1a}-1.60\%$
test_td3_speed[True-None] 1.7180ms 1.6383ms 610.4016 Ops/s 597.2819 Ops/s $\color{#35bf28}+2.20\%$
test_td3_speed[True-backward] 3.3843ms 3.3224ms 300.9881 Ops/s 309.6906 Ops/s $\color{#d91a1a}-2.81\%$
test_td3_speed[reduce-overhead-None] 51.1828ms 26.0749ms 38.3511 Ops/s 38.6738 Ops/s $\color{#d91a1a}-0.83\%$
test_td3_speed[reduce-overhead-backward] 1.5243ms 1.4793ms 675.9910 Ops/s 729.1283 Ops/s $\textbf{\color{#d91a1a}-7.29\%}$
test_cql_speed[False-None] 17.0930ms 16.5618ms 60.3799 Ops/s 59.6971 Ops/s $\color{#35bf28}+1.14\%$
test_cql_speed[False-backward] 22.9608ms 22.0348ms 45.3828 Ops/s 45.7995 Ops/s $\color{#d91a1a}-0.91\%$
test_cql_speed[True-None] 3.3786ms 3.2624ms 306.5204 Ops/s 298.0616 Ops/s $\color{#35bf28}+2.84\%$
test_cql_speed[True-backward] 6.1185ms 5.6681ms 176.4245 Ops/s 180.3437 Ops/s $\color{#d91a1a}-2.17\%$
test_cql_speed[reduce-overhead-None] 21.3320ms 13.1043ms 76.3110 Ops/s 76.0933 Ops/s $\color{#35bf28}+0.29\%$
test_cql_speed[reduce-overhead-backward] 1.9850ms 1.9159ms 521.9602 Ops/s 535.0573 Ops/s $\color{#d91a1a}-2.45\%$
test_a2c_speed[False-None] 3.2821ms 3.1431ms 318.1544 Ops/s 311.2862 Ops/s $\color{#35bf28}+2.21\%$
test_a2c_speed[False-backward] 6.8956ms 6.2580ms 159.7956 Ops/s 159.8970 Ops/s $\color{#d91a1a}-0.06\%$
test_a2c_speed[True-None] 1.4367ms 1.3429ms 744.6554 Ops/s 741.7609 Ops/s $\color{#35bf28}+0.39\%$
test_a2c_speed[True-backward] 2.9304ms 2.8806ms 347.1534 Ops/s 322.3978 Ops/s $\textbf{\color{#35bf28}+7.68\%}$
test_a2c_speed[reduce-overhead-None] 15.6966ms 8.9381ms 111.8812 Ops/s 112.4974 Ops/s $\color{#d91a1a}-0.55\%$
test_a2c_speed[reduce-overhead-backward] 1.5281ms 1.4607ms 684.6241 Ops/s 618.5315 Ops/s $\textbf{\color{#35bf28}+10.69\%}$
test_ppo_speed[False-None] 3.7443ms 3.6374ms 274.9198 Ops/s 270.3820 Ops/s $\color{#35bf28}+1.68\%$
test_ppo_speed[False-backward] 7.2279ms 6.7666ms 147.7851 Ops/s 141.6650 Ops/s $\color{#35bf28}+4.32\%$
test_ppo_speed[True-None] 1.5261ms 1.4085ms 709.9706 Ops/s 703.4189 Ops/s $\color{#35bf28}+0.93\%$
test_ppo_speed[True-backward] 3.1132ms 3.0475ms 328.1355 Ops/s 305.4509 Ops/s $\textbf{\color{#35bf28}+7.43\%}$
test_ppo_speed[reduce-overhead-None] 1.1747ms 0.9602ms 1.0415 KOps/s 1.0316 KOps/s $\color{#35bf28}+0.96\%$
test_ppo_speed[reduce-overhead-backward] 1.5230ms 1.4139ms 707.2868 Ops/s 616.6940 Ops/s $\textbf{\color{#35bf28}+14.69\%}$
test_reinforce_speed[False-None] 2.3363ms 2.2557ms 443.3205 Ops/s 435.5687 Ops/s $\color{#35bf28}+1.78\%$
test_reinforce_speed[False-backward] 3.5109ms 3.2565ms 307.0790 Ops/s 293.3500 Ops/s $\color{#35bf28}+4.68\%$
test_reinforce_speed[True-None] 1.3924ms 1.2912ms 774.4878 Ops/s 762.2343 Ops/s $\color{#35bf28}+1.61\%$
test_reinforce_speed[True-backward] 2.9665ms 2.9095ms 343.7032 Ops/s 340.8696 Ops/s $\color{#35bf28}+0.83\%$
test_reinforce_speed[reduce-overhead-None] 18.0567ms 10.0072ms 99.9285 Ops/s 99.9358 Ops/s $-0.01\%$
test_reinforce_speed[reduce-overhead-backward] 1.5713ms 1.4989ms 667.1452 Ops/s 652.3304 Ops/s $\color{#35bf28}+2.27\%$
test_iql_speed[False-None] 9.6338ms 9.0978ms 109.9164 Ops/s 107.7623 Ops/s $\color{#35bf28}+2.00\%$
test_iql_speed[False-backward] 13.0823ms 12.6581ms 79.0010 Ops/s 76.9645 Ops/s $\color{#35bf28}+2.65\%$
test_iql_speed[True-None] 2.3935ms 2.2158ms 451.2950 Ops/s 442.9562 Ops/s $\color{#35bf28}+1.88\%$
test_iql_speed[True-backward] 4.9481ms 4.7535ms 210.3721 Ops/s 199.1278 Ops/s $\textbf{\color{#35bf28}+5.65\%}$
test_iql_speed[reduce-overhead-None] 19.3702ms 10.7507ms 93.0171 Ops/s 89.5257 Ops/s $\color{#35bf28}+3.90\%$
test_iql_speed[reduce-overhead-backward] 1.9838ms 1.9004ms 526.1968 Ops/s 470.2510 Ops/s $\textbf{\color{#35bf28}+11.90\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6990ms 6.3278ms 158.0338 Ops/s 155.9150 Ops/s $\color{#35bf28}+1.36\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5981ms 0.3612ms 2.7683 KOps/s 3.4316 KOps/s $\textbf{\color{#d91a1a}-19.33\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4862ms 0.2461ms 4.0627 KOps/s 3.8651 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3006ms 6.0257ms 165.9555 Ops/s 164.9459 Ops/s $\color{#35bf28}+0.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9048ms 0.2823ms 3.5424 KOps/s 3.1188 KOps/s $\textbf{\color{#35bf28}+13.58\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4940ms 0.2626ms 3.8087 KOps/s 4.2524 KOps/s $\textbf{\color{#d91a1a}-10.43\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8292ms 1.2831ms 779.3610 Ops/s 802.9565 Ops/s $\color{#d91a1a}-2.94\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7018ms 1.1861ms 843.0964 Ops/s 861.6634 Ops/s $\color{#d91a1a}-2.15\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6440ms 6.2334ms 160.4272 Ops/s 159.1904 Ops/s $\color{#35bf28}+0.78\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7129ms 0.4551ms 2.1975 KOps/s 2.4362 KOps/s $\textbf{\color{#d91a1a}-9.80\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7704ms 0.4271ms 2.3412 KOps/s 2.3398 KOps/s $\color{#35bf28}+0.06\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2020ms 6.0305ms 165.8241 Ops/s 162.0465 Ops/s $\color{#35bf28}+2.33\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.6776ms 0.2703ms 3.6996 KOps/s 3.6454 KOps/s $\color{#35bf28}+1.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4790ms 0.2826ms 3.5386 KOps/s 3.5924 KOps/s $\color{#d91a1a}-1.50\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9531ms 6.0308ms 165.8161 Ops/s 163.8816 Ops/s $\color{#35bf28}+1.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8820ms 0.3145ms 3.1798 KOps/s 3.3728 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5161ms 0.2803ms 3.5681 KOps/s 3.5605 KOps/s $\color{#35bf28}+0.21\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4179ms 6.1824ms 161.7492 Ops/s 158.7559 Ops/s $\color{#35bf28}+1.89\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8036ms 0.4037ms 2.4772 KOps/s 2.2128 KOps/s $\textbf{\color{#35bf28}+11.95\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5887ms 0.3841ms 2.6032 KOps/s 2.4791 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0603ms 5.4793ms 182.5038 Ops/s 179.8696 Ops/s $\color{#35bf28}+1.46\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.8566ms 2.0262ms 493.5276 Ops/s 423.7249 Ops/s $\textbf{\color{#35bf28}+16.47\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.0472ms 0.9384ms 1.0657 KOps/s 851.6349 Ops/s $\textbf{\color{#35bf28}+25.14\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.2576ms 5.5892ms 178.9154 Ops/s 182.3380 Ops/s $\color{#d91a1a}-1.88\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.1953ms 2.0706ms 482.9534 Ops/s 417.1184 Ops/s $\textbf{\color{#35bf28}+15.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.5188ms 1.1684ms 855.8680 Ops/s 815.3392 Ops/s $\color{#35bf28}+4.97\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5069s 15.8568ms 63.0646 Ops/s 176.7788 Ops/s $\textbf{\color{#d91a1a}-64.33\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2070ms 1.8708ms 534.5334 Ops/s 35.3954 Ops/s $\textbf{\color{#35bf28}+1410.18\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2541ms 1.2338ms 810.4986 Ops/s 815.3902 Ops/s $\color{#d91a1a}-0.60\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2583ms 13.0084ms 76.8736 Ops/s 75.3236 Ops/s $\color{#35bf28}+2.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.1877ms 16.7095ms 59.8463 Ops/s 59.9513 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.8970ms 17.6437ms 56.6774 Ops/s 55.3413 Ops/s $\color{#35bf28}+2.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.8391ms 16.9017ms 59.1656 Ops/s 59.3056 Ops/s $\color{#d91a1a}-0.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.8638ms 17.5810ms 56.8797 Ops/s 55.4461 Ops/s $\color{#35bf28}+2.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 24.3253ms 18.1201ms 55.1872 Ops/s 54.1841 Ops/s $\color{#35bf28}+1.85\%$
