Potential Bug  #162

@HOLMEScdk

Description

  • I have marked all applicable categories:
    • [x] exception-raising bug
    • [ ] RL algorithm bug
    • [ ] documentation request (i.e. "X is missing from the documentation.")
    • [ ] new feature request
  • [x] I have visited the source website
  • [x] I have searched through the issue tracker for duplicates
  • [x] I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, sys
    print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
    # 0.2.4 1.4.0 3.7.6 | packaged by conda-forge | (default, Jun  1 2020, 18:57:50) [GCC 7.5.0] linux

Hi, first of all, thanks for the excellent work!

I found something that may be a bug.
When using the a2c policy, I noticed that a_loss is computed by multiplying dist.log_prob(a) with (r-v). This may not work if the action has multiple dimensions: the output of dist.log_prob(a) then has shape [bsz, n], while (r-v) has shape [bsz, 1] or [bsz], and these shapes may not line up for the multiplication with dist.log_prob(a). In my opinion, it would be better to use dist.log_prob(a).transpose(0,1), so that the multiplication happens along the matching batch dimension.
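The shape problem described above can be reproduced outside tianshou. The following is only a minimal sketch under stated assumptions: it uses a diagonal `Normal` distribution, made-up sizes `bsz = 4` and `n = 3`, and a random tensor standing in for (r-v). The final `a_loss` line is an assumed form of the loss, not copied from the library. Under PyTorch broadcasting, the [bsz] case fails, while the [bsz, 1] case and the transposed version both go through.

```python
import torch
from torch.distributions import Normal

bsz, n = 4, 3  # hypothetical batch size and action dimension

# Diagonal Gaussian over n action dimensions; log_prob keeps the last axis.
dist = Normal(torch.zeros(bsz, n), torch.ones(bsz, n))
a = dist.sample()
log_prob = dist.log_prob(a)        # shape [bsz, n]

adv_flat = torch.randn(bsz)        # stand-in for (r - v), shape [bsz]
adv_col = adv_flat.unsqueeze(-1)   # the same values, shape [bsz, 1]

# [bsz, n] * [bsz] aligns n with bsz from the right, so it fails when n != bsz.
try:
    log_prob * adv_flat
    flat_ok = True
except RuntimeError:
    flat_ok = False

# Transposing to [n, bsz] lets the [bsz] advantage broadcast over action dims.
weighted_t = log_prob.transpose(0, 1) * adv_flat   # shape [n, bsz]

# A [bsz, 1] advantage broadcasts against [bsz, n] without a transpose.
weighted_col = log_prob * adv_col                  # shape [bsz, n]

# Assumed form of the actor loss; both variants give the same mean.
a_loss = -weighted_t.mean()
```

Which variant fits best depends on the shape that (r-v) actually has at that point in the policy code. Summing `log_prob` over the last axis before the multiplication would be another option for independent action dimensions.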

Metadata

Labels

bug: Something isn't working
