From 4a225bc4d7714e6479f40acb85384a1fe9e80217 Mon Sep 17 00:00:00 2001
From: chy <308604256@qq.com>
Date: Sat, 23 Apr 2022 16:05:51 +0800
Subject: [PATCH 1/4] page

---
 docs/_static/images/mujoco_comparison.svg | 1317 +++++++++++++++++++++
 docs/_static/images/mujoco_time.svg       |  287 +++++
 docs/tutorials/benchmark.rst              |   13 +-
 3 files changed, 1615 insertions(+), 2 deletions(-)
 create mode 100644 docs/_static/images/mujoco_comparison.svg
 create mode 100644 docs/_static/images/mujoco_time.svg

diff --git a/docs/_static/images/mujoco_comparison.svg b/docs/_static/images/mujoco_comparison.svg
new file mode 100644
index 000000000..631604c1f
--- /dev/null
+++ b/docs/_static/images/mujoco_comparison.svg
@@ -0,0 +1,1317 @@
[1317 added lines of SVG markup not preserved]
diff --git a/docs/_static/images/mujoco_time.svg b/docs/_static/images/mujoco_time.svg
new file mode 100644
index 000000000..ac4981770
--- /dev/null
+++ b/docs/_static/images/mujoco_time.svg
@@ -0,0 +1,287 @@
[287 added lines of SVG markup not preserved]
diff --git a/docs/tutorials/benchmark.rst b/docs/tutorials/benchmark.rst
index c3cb0676a..8a50ff273 100644
--- a/docs/tutorials/benchmark.rst
+++ b/docs/tutorials/benchmark.rst
@@ -5,9 +5,9 @@ Benchmark
 Mujoco Benchmark
 ----------------
 
-Tianshou's Mujoco benchmark contains state-of-the-art results (even better than `SpinningUp `_!).
+Tianshou's Mujoco benchmark contains state-of-the-art results.
 
-Please refer to https://github.com/thu-ml/tianshou/tree/master/examples/mujoco
+Every experiment is conducted under 10 random seeds for 1-10M steps. Please refer to https://github.com/thu-ml/tianshou/tree/master/examples/mujoco for source code and detailed results.
 
 .. raw:: html
 
@@ -18,6 +18,15 @@ Please refer to https://github.com/thu-ml/tianshou/tree/master/examples/mujoco
+The table below compares the performance of Tianshou against published results on OpenAI Gym MuJoCo benchmarks. We use max average return in 1M timesteps as the reward metric. ~ means the result is approximated from the plots because quantitative results are not provided. - means results are not provided. The best-performing baseline on each task is highlighted in boldface. Referenced baselines include `TD3 paper `_, `SAC paper `_, `PPO paper `_, `ACKTR paper `_, `OpenAI Baselines `_ and `Spinning Up `_.
+
+.. image:: /_static/images/mujoco_comparison.svg
+
+Runtime averaged on 8 benchmarked MuJoCo tasks is listed below. All results are obtained using a single Nvidia TITAN X GPU and
+up to 48 CPU cores (at most one CPU core for each thread).
+
+.. image:: /_static/images/mujoco_time.svg
+
 
 Atari Benchmark
 ---------------

From ede6b811cea5b705c383461163f0ea1ff142a908 Mon Sep 17 00:00:00 2001
From: Jiayi Weng
Date: Sat, 23 Apr 2022 10:27:47 -0400
Subject: [PATCH 2/4] add rst table

---
 docs/_static/images/mujoco_comparison.svg | 1317 ---------------------
 docs/_static/images/mujoco_time.svg       |  287 -----
 docs/spelling_wordlist.txt                |    4 +
 docs/tutorials/benchmark.rst              |   71 +-
 examples/mujoco/README.md                 |    2 +-
 5 files changed, 72 insertions(+), 1609 deletions(-)
 delete mode 100644 docs/_static/images/mujoco_comparison.svg
 delete mode 100644 docs/_static/images/mujoco_time.svg

diff --git a/docs/_static/images/mujoco_comparison.svg b/docs/_static/images/mujoco_comparison.svg
deleted file mode 100644
index 631604c1f..000000000
--- a/docs/_static/images/mujoco_comparison.svg
+++ /dev/null
@@ -1,1317 +0,0 @@
[1317 deleted lines of SVG markup not preserved]
diff --git a/docs/_static/images/mujoco_time.svg b/docs/_static/images/mujoco_time.svg
deleted file mode 100644
index ac4981770..000000000
--- a/docs/_static/images/mujoco_time.svg
+++ /dev/null
@@ -1,287 +0,0 @@
[287 deleted lines of SVG markup not preserved]
diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt
index b334e1db9..cf78b00b8 100644
--- a/docs/spelling_wordlist.txt
+++ b/docs/spelling_wordlist.txt
@@ -150,3 +150,7 @@ ppo
 Jupyter
 Colab
 Colaboratory
+IPendulum
+Reacher
+Runtime
+Nvidia
diff --git a/docs/tutorials/benchmark.rst b/docs/tutorials/benchmark.rst
index 8a50ff273..9c12e62c1 100644
--- a/docs/tutorials/benchmark.rst
+++ b/docs/tutorials/benchmark.rst
@@ -20,12 +20,75 @@ Every experiment is conducted under 10 random seeds for 1-10M steps. Please refe
 The table below compares the performance of Tianshou against published results on OpenAI Gym MuJoCo benchmarks. We use max average return in 1M timesteps as the reward metric. ~ means the result is approximated from the plots because quantitative results are not provided. - means results are not provided. The best-performing baseline on each task is highlighted in boldface. Referenced baselines include `TD3 paper `_, `SAC paper `_, `PPO paper `_, `ACKTR paper `_, `OpenAI Baselines `_ and `Spinning Up `_.
 
-.. image:: /_static/images/mujoco_comparison.svg
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|Task                      |Ant       |HalfCheetah|Hopper    |Walker2d  |Swimmer  |Humanoid  |Reacher |IPendulum |IDPendulum|
++=========+================+==========+===========+==========+==========+=========+==========+========+==========+==========+
+|DDPG     |Tianshou        |990.4     |**11718.7**|**2197.0**|1400.6    |**144.1**|**177.3** |**-3.3**|**1000.0**|8364.3    |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 Paper       |**1005.3**|3305.6     |**2020.5**|1843.6    |/        |/         |-6.51   |**1000.0**|**9355.5**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 Paper (Our) |888.8     |8577.3     |1860.0    |**3098.1**|/        |/         |-4.01   |**1000.0**|8370.0    |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |Spinning Up     |~840      |~11000     |~1800     |~1950     |~137     |/         |/       |/         |/         |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|TD3      |Tianshou        |**5116.4**|**10201.2**|3472.2    |3982.4    |**104.2**|**5189.5**|**-2.7**|**1000.0**|**9349.2**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 Paper       |4372.4    |9637.0     |**3564.1**|**4682.8**|/        |/         |-3.6    |**1000.0**|**9337.5**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |Spinning Up     |~3800     |~9750      |~2860     |~4000     |~78      |/         |/       |/         |/         |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|SAC      |Tianshou        |**5850.2**|**12138.8**|**3542.2**|**5007.0**|**44.4** |**5488.5**|**-2.6**|**1000.0**|**9359.5**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |SAC Paper       |~3720     |~10400     |~3370     |~3740     |/        |~5200     |/       |/         |/         |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 Paper       |655.4     |2347.2     |2996.7    |1283.7    |/        |/         |-4.4    |**1000.0**|8487.2    |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |Spinning Up     |~3980     |~11520     |~3150     |~4250     |~41.7    |/         |/       |/         |/         |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|A2C      |Tianshou        |**3485.4**|**1829.9** |**1253.2**|**1091.6**|**36.6** |**1726.0**|**-6.7**|**1000.0**|**9257.7**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |PPO Paper       |/         |~1000      |~900      |~850      |~31      |/         |~-24    |**1000.0**|~7100     |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |PPO Paper (TR)  |/         |~930       |~1220     |~700      |**~36**  |/         |~-27    |**1000.0**|~8100     |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|PPO      |Tianshou        |**3258.4**|**5783.9** |**2609.3**|3588.5    |66.7     |**787.1** |**-4.1**|**1000.0**|**9231.3**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |PPO Paper       |/         |~1800      |~2330     |~3460     |~108     |/         |~-7     |**1000.0**|~8000     |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 Paper       |1083.2    |1795.4     |2164.7    |3317.7    |/        |/         |-6.2    |**1000.0**|8977.9    |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |OpenAI Baselines|/         |~1700      |~2400     |~3510     |~111     |/         |~-6     |~940      |~7350     |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |Spinning Up     |~650      |~1670      |~1850     |~1230     |**~120** |/         |/       |/         |/         |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|TRPO     |Tianshou        |**2866.7**|**4471.2** |2046.0    |**3826.7**|40.9     |**810.1** |**-5.1**|**1000.0**|**8435.2**|
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |ACKTR paper     |~0        |~400       |~1400     |~550      |~40      |/         |-8      |**1000.0**|~800      |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |PPO Paper       |/         |~0         |~2100     |~1100     |**~121** |/         |~-115   |**1000.0**|~200      |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |TD3 paper       |-75.9     |-15.6      |**2471.3**|2321.5    |/        |/         |-111.4  |985.4     |205.9     |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |OpenAI Baselines|/         |~1350      |**~2200** |~2350     |~95      |/         |**~-5** |~910      |~7000     |
++         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
+|         |Spinning Up (TF)|~150      |~850       |~1200     |~600      |~85      |/         |/       |/         |/         |
++---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 
-Runtime averaged on 8 benchmarked MuJoCo tasks is listed below. All results are obtained using a single Nvidia TITAN X GPU and
-up to 48 CPU cores (at most one CPU core for each thread).
+Runtime averaged on 8 MuJoCo benchmark tasks is listed below. All results are obtained using a single Nvidia TITAN X GPU and
+up to 48 CPU cores (at most one CPU core for each thread).
 
-.. image:: /_static/images/mujoco_time.svg
+========= ========= ============ ============== ============ ============== ==========
+Algorithm # of Envs 1M timesteps Collecting (%) Updating (%) Evaluating (%) Others (%)
+========= ========= ============ ============== ============ ============== ==========
+DDPG      1         2.9h         12.0           80.2         2.4            5.4
+TD3       1         3.3h         11.4           81.7         1.7            5.2
+SAC       1         5.2h         10.9           83.8         1.8            3.5
+REINFORCE 64        4min         84.9           1.8          12.5           0.8
+A2C       16        7min         62.5           28.0         6.6            2.9
+PPO       64        24min        11.4           85.3         3.2            0.2
+NPG       16        7min         65.1           24.9         9.5            0.6
+TRPO      16        7min         62.9           26.5         10.1           0.6
+========= ========= ============ ============== ============ ============== ==========
 
 
 Atari Benchmark
diff --git a/examples/mujoco/README.md b/examples/mujoco/README.md
index ff37db4b2..8890466f8 100644
--- a/examples/mujoco/README.md
+++ b/examples/mujoco/README.md
@@ -247,7 +247,7 @@ For pretrained agents, detailed graphs (single agent, single game) and log detai
 
 ### TRPO
 
-| Environment | Tianshou (1M) | [ACKTR paper](https://arxiv.org/pdf/1708.05144.pdf) | [PPO paper](https://arxiv.org/pdf/1707.06347.pdf) | [OpenAI Baselines](https://github.com/openai/baselines/blob/master/benchmarks_mujoco1M.htm) | [Spinning Up (PyTorch)](https://spinningup.openai.com/en/latest/spinningup/bench.html) |
+| Environment | Tianshou (1M) | [ACKTR paper](https://arxiv.org/pdf/1708.05144.pdf) | [PPO paper](https://arxiv.org/pdf/1707.06347.pdf) | [OpenAI Baselines](https://github.com/openai/baselines/blob/master/benchmarks_mujoco1M.htm) | [Spinning Up (Tensorflow)](https://spinningup.openai.com/en/latest/spinningup/bench.html) |
 | :--------------------: | :---------------: | :-------------------------------------------------: | :-----------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
 | Ant | **2866.7±707.9** | ~0 | N | N | ~150 |
 | HalfCheetah | **4471.2±804.9** | ~400 | ~0 | ~1350 | ~850 |

From 95fd8f1704e4c1063e562dbec86f60119f6151d2 Mon Sep 17 00:00:00 2001
From: Jiayi Weng
Date: Sat, 23 Apr 2022 22:34:32 +0800
Subject: [PATCH 3/4] Update docs/tutorials/benchmark.rst

---
 docs/tutorials/benchmark.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/tutorials/benchmark.rst b/docs/tutorials/benchmark.rst
index 9c12e62c1..96e0b7a58 100644
--- a/docs/tutorials/benchmark.rst
+++ b/docs/tutorials/benchmark.rst
@@ -78,7 +78,7 @@ Runtime averaged on 8 MuJoCo benchmark tasks is listed below. All results are ob
 up to 48 CPU cores (at most one CPU core for each thread).
 
 ========= ========= ============ ============== ============ ============== ==========
-Algorithm # of Envs 1M timesteps Collecting (%) Updating (%) Evaluating (%) Others (%)
+Algorithm # of Envs 1M timesteps Collecting (%) Updating (%) Evaluating (%) Others (%)
 ========= ========= ============ ============== ============ ============== ==========
 DDPG      1         2.9h         12.0           80.2         2.4            5.4
 TD3       1         3.3h         11.4           81.7         1.7            5.2

From 5587c5ce74e5d0e7d639db680cacfb73b02ec640 Mon Sep 17 00:00:00 2001
From: Jiayi Weng
Date: Sat, 23 Apr 2022 11:55:40 -0400
Subject: [PATCH 4/4] fix table

---
 docs/tutorials/benchmark.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/tutorials/benchmark.rst b/docs/tutorials/benchmark.rst
index 9c12e62c1..457101eea 100644
--- a/docs/tutorials/benchmark.rst
+++ b/docs/tutorials/benchmark.rst
@@ -25,15 +25,15 @@ The table below compares the performance of Tianshou against published results o
 +=========+================+==========+===========+==========+==========+=========+==========+========+==========+==========+
 |DDPG     |Tianshou        |990.4     |**11718.7**|**2197.0**|1400.6    |**144.1**|**177.3** |**-3.3**|**1000.0**|8364.3    |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |TD3 Paper       |**1005.3**|3305.6     |**2020.5**|1843.6    |/        |/         |-6.51   |**1000.0**|**9355.5**|
+|         |TD3 Paper       |**1005.3**|3305.6     |**2020.5**|1843.6    |/        |/         |-6.5    |**1000.0**|**9355.5**|
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |TD3 Paper (Our) |888.8     |8577.3     |1860.0    |**3098.1**|/        |/         |-4.01   |**1000.0**|8370.0    |
+|         |TD3 Paper (Our) |888.8     |8577.3     |1860.0    |**3098.1**|/        |/         |-4.0    |**1000.0**|8370.0    |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |         |Spinning Up     |~840      |~11000     |~1800     |~1950     |~137     |/         |/       |/         |/         |
 +---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |TD3      |Tianshou        |**5116.4**|**10201.2**|3472.2    |3982.4    |**104.2**|**5189.5**|**-2.7**|**1000.0**|**9349.2**|
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |TD3 Paper       |4372.4    |9637.0     |**3564.1**|**4682.8**|/        |/         |-3.6    |**1000.0**|**9337.5**|
+|         |TD3 Paper       |4372.4    |9637.0     |**3564.1**|**4682.8**|/        |/         |-3.6    |**1000.0**|9337.5    |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |         |Spinning Up     |~3800     |~9750      |~2860     |~4000     |~78      |/         |/       |/         |/         |
 +---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
@@ -47,13 +47,13 @@ The table below compares the performance of Tianshou against published results o
 +---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |A2C      |Tianshou        |**3485.4**|**1829.9** |**1253.2**|**1091.6**|**36.6** |**1726.0**|**-6.7**|**1000.0**|**9257.7**|
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |PPO Paper       |/         |~1000      |~900      |~850      |~31      |/         |~-24    |**1000.0**|~7100     |
+|         |PPO Paper       |/         |~1000      |~900      |~850      |~31      |/         |~-24    |**~1000** |~7100     |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |PPO Paper (TR)  |/         |~930       |~1220     |~700      |**~36**  |/         |~-27    |**1000.0**|~8100     |
+|         |PPO Paper (TR)  |/         |~930       |~1220     |~700      |**~36**  |/         |~-27    |**~1000** |~8100     |
 +---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |PPO      |Tianshou        |**3258.4**|**5783.9** |**2609.3**|3588.5    |66.7     |**787.1** |**-4.1**|**1000.0**|**9231.3**|
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |PPO Paper       |/         |~1800      |~2330     |~3460     |~108     |/         |~-7     |**1000.0**|~8000     |
+|         |PPO Paper       |/         |~1800      |~2330     |~3460     |~108     |/         |~-7     |**~1000** |~8000     |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |         |TD3 Paper       |1083.2    |1795.4     |2164.7    |3317.7    |/        |/         |-6.2    |**1000.0**|8977.9    |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
@@ -63,9 +63,9 @@ The table below compares the performance of Tianshou against published results o
 +---------+----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |TRPO     |Tianshou        |**2866.7**|**4471.2** |2046.0    |**3826.7**|40.9     |**810.1** |**-5.1**|**1000.0**|**8435.2**|
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |ACKTR paper     |~0        |~400       |~1400     |~550      |~40      |/         |-8      |**1000.0**|~800      |
+|         |ACKTR paper     |~0        |~400       |~1400     |~550      |~40      |/         |-8      |**~1000** |~800      |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
-|         |PPO Paper       |/         |~0         |~2100     |~1100     |**~121** |/         |~-115   |**1000.0**|~200      |
+|         |PPO Paper       |/         |~0         |~2100     |~1100     |**~121** |/         |~-115   |**~1000** |~200      |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+
 |         |TD3 paper       |-75.9     |-15.6      |**2471.3**|2321.5    |/        |/         |-111.4  |985.4     |205.9     |
 +         +----------------+----------+-----------+----------+----------+---------+----------+--------+----------+----------+