[question] Abnormal running speed of off-policy algorithms on Mujoco environments. #866

@muchvo

Description

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    About my environment:
    System: Ubuntu 20.04
    CPU: AMD EPYC 7H12 64-Core Processor
    GPU: GeForce RTX 3090 x 8
_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             5.1                       1_gnu    defaults
absl-py                   1.4.0                    pypi_0    pypi
ca-certificates           2023.01.10           h06a4308_0    defaults
cachetools                5.3.0                    pypi_0    pypi
certifi                   2022.12.7        py38h06a4308_0    defaults
charset-normalizer        3.1.0                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
dm-env                    1.6                      pypi_0    pypi
dm-tree                   0.1.8                    pypi_0    pypi
envpool                   0.8.2                    pypi_0    pypi
glfw                      2.5.6                    pypi_0    pypi
google-auth               2.16.2                   pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
grpcio                    1.51.3                   pypi_0    pypi
gym                       0.26.2                   pypi_0    pypi
gym-notices               0.0.8                    pypi_0    pypi
gymnasium                 0.26.3                   pypi_0    pypi
gymnasium-notices         0.0.1                    pypi_0    pypi
h5py                      3.8.0                    pypi_0    pypi
idna                      3.4                      pypi_0    pypi
imageio                   2.25.0                   pypi_0    pypi
importlib-metadata        6.0.0                    pypi_0    pypi
jax-jumpy                 0.2.0                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1    defaults
libffi                    3.4.2                h6a678d5_6    defaults
libgcc-ng                 11.2.0               h1234567_1    defaults
libgomp                   11.2.0               h1234567_1    defaults
libstdcxx-ng              11.2.0               h1234567_1    defaults
llvmlite                  0.39.1                   pypi_0    pypi
markdown                  3.4.1                    pypi_0    pypi
markupsafe                2.1.2                    pypi_0    pypi
mujoco                    2.3.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    defaults
numba                     0.56.4                   pypi_0    pypi
numpy                     1.23.5                   pypi_0    pypi
nvidia-cublas-cu11        11.10.3.66               pypi_0    pypi
nvidia-cuda-nvrtc-cu11    11.7.99                  pypi_0    pypi
nvidia-cuda-runtime-cu11  11.7.99                  pypi_0    pypi
nvidia-cudnn-cu11         8.5.0.96                 pypi_0    pypi
oauthlib                  3.2.2                    pypi_0    pypi
openssl                   1.1.1t               h7f8727e_0    defaults
optree                    0.9.0                    pypi_0    pypi
packaging                 23.0                     pypi_0    pypi
pettingzoo                1.22.3                   pypi_0    pypi
pillow                    9.4.0                    pypi_0    pypi
pip                       23.0.1           py38h06a4308_0    defaults
protobuf                  4.22.1                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pygame                    2.1.0                    pypi_0    pypi
pyopengl                  3.1.6                    pypi_0    pypi
python                    3.8.16               h7a1cb2a_3    defaults
pyyaml                    6.0                      pypi_0    pypi
readline                  8.2                  h5eee18b_0    defaults
requests                  2.28.2                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rsa                       4.9                      pypi_0    pypi
safety-gymnasium          0.1.1                    pypi_0    pypi
setuptools                65.6.3           py38h06a4308_0    defaults
six                       1.16.0                   pypi_0    pypi
sqlite                    3.40.1               h5082296_0    defaults
tensorboard               2.12.0                   pypi_0    pypi
tensorboard-data-server   0.7.0                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tianshou                  0.4.11                    dev_0    <develop>
tk                        8.6.12               h1ccaba5_0    defaults
torch                     1.13.1                   pypi_0    pypi
tqdm                      4.65.0                   pypi_0    pypi
types-protobuf            4.22.0.2                 pypi_0    pypi
typing-extensions         4.5.0                    pypi_0    pypi
urllib3                   1.26.15                  pypi_0    pypi
werkzeug                  2.2.3                    pypi_0    pypi
wheel                     0.38.4           py38h06a4308_0    defaults
xmltodict                 0.13.0                   pypi_0    pypi
xz                        5.2.10               h5eee18b_1    defaults
zipp                      3.15.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0    defaults
import tianshou, gymnasium as gym, torch, numpy, sys
print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

When I run the off-policy examples in https://github.com/thu-ml/tianshou/blob/master/examples/mujoco/, such as TD3, training is unacceptably slow for a 200-epoch run. I ran into this while trying to integrate Tianshou's API into OmniSafe.

Observations shape: (111,)
Actions shape: (8,)
Action range: -1.0 1.0
Epoch #1:   0%| | 5/5000 [00:07<1:57:17,  1.41s/it, env_step=4, len=0, loss/actor=0.142, loss/crit
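
For scale, here is a rough back-of-the-envelope estimate based on the progress bar above. It assumes the 1.41 s/it rate stays constant for the whole run, which may not hold because the rate was measured over only the first 5 iterations. The function name `estimate_hours` is just a helper for this illustration and is not part of Tianshou.

```python
# Figures read from the progress bar and the description above.
SECONDS_PER_ITER = 1.41   # "1.41s/it"
ITERS_PER_EPOCH = 5000    # total shown in "5/5000"
EPOCHS = 200              # planned number of epochs


def estimate_hours(sec_per_iter: float, iters_per_epoch: int, epochs: int) -> float:
    """Wall-clock hours, assuming a constant per-iteration rate."""
    return sec_per_iter * iters_per_epoch * epochs / 3600.0


per_epoch_hours = estimate_hours(SECONDS_PER_ITER, ITERS_PER_EPOCH, 1)
total_hours = estimate_hours(SECONDS_PER_ITER, ITERS_PER_EPOCH, EPOCHS)
print(f"per epoch: {per_epoch_hours:.2f} h, total: {total_hours:.1f} h")
```

At this rate one epoch takes roughly two hours, which is consistent with the 1:57:17 remaining time in the progress bar, and 200 epochs would take on the order of 390 hours.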

This seems strange. Any help would be appreciated. Thanks!

Labels

  • not reproduced yet: Not yet tested or reproduced by a reviewer
  • performance issues: Slow execution or poor-quality results
