Learning starts

- [ ] I have marked all applicable categories:
    + [ ] exception-raising bug
    + [ ] RL algorithm bug
    + [ ] documentation request (i.e. "X is missing from the documentation.")
    + [ ] new feature request
- [x] I have visited the [source website], and in particular read the [known issues]
- [x] I have searched through the [issue tracker] and [issue categories] for duplicates
- [ ] I have mentioned version numbers, operating system and environment, where applicable:
  ```python
  import tianshou, torch, sys
  print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
  ```

  [source website]: https://github.com/thu-ml/tianshou/
  [known issues]: https://github.com/thu-ml/tianshou/#faq-and-known-issues
  [issue categories]: https://github.com/thu-ml/tianshou/projects/2
  [issue tracker]: https://github.com/thu-ml/tianshou/issues?q=

I was wondering how to handle a "learning starts" warm-up period. As far as I can tell from the code, actions are sampled from the policy from the very first step. How would one begin training with, e.g., 1000 timesteps of uniformly sampled actions?
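To make the question concrete, here is a minimal, library-independent sketch of the behaviour I have in mind. It does not use any tianshou API: `make_warmup_selector`, `policy_act`, `n_actions`, and `learning_starts` are hypothetical names chosen for illustration, and a discrete action space is assumed.

```python
import random


def make_warmup_selector(policy_act, n_actions, learning_starts=1000, rng=None):
    """Return an action selector that acts uniformly at random for the
    first `learning_starts` calls, then defers to `policy_act`.

    All names here are hypothetical; this is not part of tianshou.
    """
    rng = rng or random.Random()
    step = 0  # number of actions selected so far

    def select(obs):
        nonlocal step
        step += 1
        if step <= learning_starts:
            # Warm-up phase: ignore the policy, sample a discrete action uniformly.
            return rng.randrange(n_actions)
        # After warm-up: use the learned policy.
        return policy_act(obs)

    return select


# Usage with a dummy policy that always picks action 0.
select_action = make_warmup_selector(lambda obs: 0, n_actions=4, learning_starts=1000)
first_action = select_action(obs=None)  # uniformly sampled during warm-up
```

The counter lives in the closure so the warm-up length is tracked per selector; whether something equivalent can be plugged into the collector is exactly what I am asking.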

Learning starts #78
