Can collector/venv be aware of the environment state? #322

@ultmaster

Description

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, torch, sys
    print(tianshou.__version__, torch.__version__, sys.version, sys.platform)
0.4.0 1.8.0 3.8.8 (default, Feb 24 2021, 21:46:12) 
[GCC 7.3.0] linux

Hi there, thanks for this great project.

I would like to know whether the collector can be made aware of the environment state. For example, when an environment depends on a generator that can run out (e.g., streaming data, one-epoch data, a countable gaming scenario), the env can no longer be reset after some number of episodes. We therefore cannot always know in advance how many episodes the collector needs to collect. In such cases, when the collector should stop depends entirely on when the env itself runs out.
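To make the scenario concrete, here is a minimal sketch of such an env in plain Python. It does not use tianshou or gym; `FiniteEnv`, `EnvExhausted`, and `run_until_exhausted` are hypothetical names made up for illustration, and raising an exception from `reset()` is only one possible way for an env to report that it has run out.

```python
from typing import Iterable, Iterator, Optional, Tuple


class EnvExhausted(Exception):
    """Hypothetical signal: the data source has no more episodes."""


class FiniteEnv:
    """Toy gym-style env; each item of `source` is one single-step episode."""

    def __init__(self, source: Iterable[int]) -> None:
        self._it: Iterator[int] = iter(source)
        self._current: Optional[int] = None

    def reset(self) -> int:
        try:
            self._current = next(self._it)
        except StopIteration:
            # The generator ran out, so this env can never be reset again.
            raise EnvExhausted from None
        return self._current  # the sample doubles as the observation

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        # Reward 1.0 if the action matches the sample; the episode ends at once.
        reward = float(action == self._current)
        return self._current, reward, True, {}


def run_until_exhausted(env: FiniteEnv) -> int:
    """Collect episodes until the env itself says it is out of data."""
    episodes = 0
    while True:
        try:
            obs = env.reset()
        except EnvExhausted:
            return episodes
        env.step(obs)  # trivial policy: echo the observation
        episodes += 1
```

The point is that the caller never passes an episode count: the loop length is decided by the env, which is the behaviour I would like the collector to support.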

In our use case, we test our agent on a fixed dataset, where each sample is one episode for the agent, and we split the dataset across the envs in a venv to sample in parallel. The problem is that the collector cannot accurately stop the envs that need to be stopped. Even with n_episode, the collector may wrongly stop envs that still have samples left while keeping envs that have none.
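The behaviour we are after can be sketched roughly as follows. This is plain Python, not tianshou's Collector or venv; `shard` and `collect_all` are hypothetical helpers, and pulling the next item from an iterator stands in for `env.reset()` plus running one episode.

```python
from typing import Dict, List, Sequence


def shard(dataset: Sequence[int], n_envs: int) -> List[List[int]]:
    """Round-robin split; shard lengths may differ by one."""
    return [list(dataset[i::n_envs]) for i in range(n_envs)]


def collect_all(shards: List[List[int]]) -> Dict[int, List[int]]:
    """Visit live shards in turn, retiring each one only when it runs out."""
    iterators = [iter(s) for s in shards]
    alive = set(range(len(shards)))
    seen: Dict[int, List[int]] = {i: [] for i in alive}
    while alive:
        # sorted() copies the set, so discarding inside the loop is safe.
        for env_id in sorted(alive):
            try:
                sample = next(iterators[env_id])  # stands in for env.reset()
            except StopIteration:
                alive.discard(env_id)  # this env is done; the others go on
                continue
            seen[env_id].append(sample)  # stands in for one full episode
    return seen


if __name__ == "__main__":
    # 7 samples over 3 envs: one shard has 3 samples, the other two have 2.
    print(collect_all(shard(list(range(7)), 3)))
```

With uneven shards, a single global episode count cannot express "stop each env exactly when its own shard is empty", which is why the stopping decision here is made per env.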

I can propose a design for this feature, but it involves at least env, venv, and collector, and would be a fairly large change. I therefore think it is worth discussing first whether people consider this a valid use case.

Labels: question (Further information is requested)