
Loss Divergence #425

@fentuoli

Description

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, numpy, sys
    print(tianshou.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

During training, I found that the loss value becomes really large. Is this normal? I also found that my reward, after some fluctuation, gradually decreased and finally stabilized at a relatively low value. I suspect that solving the loss-divergence problem would also fix the reward.
(Attached: a screenshot of the training output showing the large loss values.)
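For reference, two common knobs when a loss like this diverges are gradient-norm clipping and a smaller learning rate. Below is a minimal, generic PyTorch sketch of gradient-norm clipping; the network, batch shapes, learning rate, and the 0.5 clip value are made-up example values, not taken from this issue or from Tianshou's internal update code. If the policy class in use exposes a gradient-clipping argument (e.g. a max_grad_norm option), that is usually the more direct place to set it.

    import torch
    import torch.nn as nn

    # Toy value network and optimizer; sizes and learning rate are
    # illustration values only.
    net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
    optim = torch.optim.Adam(net.parameters(), lr=1e-4)  # try a smaller lr if the loss diverges

    def update(batch_obs: torch.Tensor, batch_returns: torch.Tensor) -> float:
        """One value-regression update with gradient-norm clipping."""
        pred = net(batch_obs).squeeze(-1)
        loss = nn.functional.mse_loss(pred, batch_returns)
        optim.zero_grad()
        loss.backward()
        # Clip the global gradient norm so one bad batch cannot blow up the
        # parameters; 0.5 is an arbitrary example value.
        torch.nn.utils.clip_grad_norm_(net.parameters(), max_norm=0.5)
        optim.step()
        return loss.item()

    # Example call with random data, just to show the shapes involved.
    obs = torch.randn(32, 4)
    returns = torch.randn(32)
    print(update(obs, returns))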

Labels: bug (Something isn't working)
