Can collector/venv be aware of the environment state?

- [ ] I have marked all applicable categories:
    + [ ] exception-raising bug
    + [ ] RL algorithm bug
    + [ ] documentation request (i.e. "X is missing from the documentation.")
    + [x] new feature request
- [x] I have visited the [source website](https://github.com/thu-ml/tianshou/)
- [x] I have searched through the [issue tracker](https://github.com/thu-ml/tianshou/issues) for duplicates
- [x] I have mentioned version numbers, operating system and environment, where applicable:
  ```python
  import tianshou, torch, sys
  print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
  ```

```
0.4.0 1.8.0 3.8.8 (default, Feb 24 2021, 21:46:12) 
[GCC 7.3.0] linux
```

Hi there, thanks for this great project.

I want to know whether the collector can be made aware of the environment state. For example, when an environment depends on a generator that can run out (e.g., streaming data, one-epoch data, a countable gaming scenario), the env can no longer be reset after some number of episodes. We therefore cannot always know in advance how many episodes the collector needs to collect. In such cases, when the collector should stop depends entirely on when the env itself runs out.

In our use case, we are testing our agent on a fixed dataset, where each sample in the dataset is one episode for the agent, and we split the dataset across the envs in a venv to sample in parallel. The problem is that the collector cannot accurately stop the envs that need to be stopped. Even with `n_episode`, the collector could wrongly stop envs that still have samples left while keeping envs that have none.
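To make the use case concrete, here is a minimal sketch in plain Python. It does not use tianshou's API. `FiniteDataEnv`, `collect_until_exhausted`, and the convention of `reset()` returning `None` when the data runs out are all hypothetical names and assumptions made only for illustration. Each env owns one shard of the dataset, and the loop stops each env individually once its shard is exhausted instead of relying on a fixed episode count.

```python
from typing import Iterator, List, Optional, Tuple


class FiniteDataEnv:
    """Hypothetical env: every sample in its shard is one single-step episode."""

    def __init__(self, samples: List[int]) -> None:
        self._iter: Iterator[int] = iter(samples)

    def reset(self) -> Optional[int]:
        # Assumed convention: return None once the shard has no samples left.
        return next(self._iter, None)

    def step(self, action: int) -> Tuple[Optional[int], float, bool, dict]:
        # Single-step episode: it is always done after one action.
        return None, 0.0, True, {}


def collect_until_exhausted(envs: List[FiniteDataEnv]) -> List[int]:
    """Run every env until its own data runs out; return episodes per env."""
    episodes = [0] * len(envs)
    # Only envs whose first reset succeeds are active at all.
    active = [i for i, env in enumerate(envs) if env.reset() is not None]
    while active:
        for i in list(active):
            _, _, done, _ = envs[i].step(action=0)
            if done:
                episodes[i] += 1
                # Deactivate exactly the env that ran out, not an arbitrary one.
                if envs[i].reset() is None:
                    active.remove(i)
    return episodes


shards = [[10, 11, 12], [], [20]]
episode_counts = collect_until_exhausted([FiniteDataEnv(s) for s in shards])
print(episode_counts)
```

With uneven shards like these, the number of episodes each env contributes is decided by the env itself, which is the behavior a single global `n_episode` cannot express.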

I can propose a design for this feature, but it involves at least the env, venv and collector, and will be a pretty large change. Therefore, I think it is worth a discussion first to check whether people consider this a valid use case.

