这是indexloc提供的服务,不要输入任何密码
Skip to content

code refactor for venv #179

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 78 commits into from
Aug 19, 2020
Merged

code refactor for venv #179

merged 78 commits into from
Aug 19, 2020

Conversation

youkaichao
Copy link
Collaborator

@youkaichao youkaichao commented Aug 4, 2020

  • Refacor code to remove duplicate code

  • Enable async simulation for all vector envs

  • Remove collector.close and rename VectorEnv to DummyVectorEnv

The abstraction of vector env changed.

Prior to this pr, each vector env is almost independent.

After this pr, each env is wrapped into a worker, and vector envs differ with their worker type. In fact, users can just use BaseVectorEnv with different workers, I keep SubprocVectorEnv, ShmemVectorEnv for backward compatibility.

@youkaichao youkaichao changed the title code refactor for env WIP: code refactor for env Aug 4, 2020
@Trinkle23897 Trinkle23897 linked an issue Aug 4, 2020 that may be closed by this pull request
8 tasks
@youkaichao youkaichao requested a review from Trinkle23897 August 4, 2020 16:19
@youkaichao youkaichao changed the title WIP: code refactor for env code refactor for env Aug 4, 2020
@duburcqa
Copy link
Collaborator

duburcqa commented Aug 16, 2020

It is true that timeout should be supported for dynamic environments. In fact, action distribution depends on policy, which may change the time cost distribution of step functions.

I would add that in m'y opinion it is usually more common to deal with such dynamic environment than a static one. For example in my case I have to deal with dynamic environments, never static. It comes from the way the state is integrated over time. Some state requires to refine the computation to guarantee the accuracy, while for others it is not necessary.

@duburcqa
Copy link
Collaborator

Is it ready for review ? I didn't spent much time thinking to a better way to handle methods such as wait to avoid relying on static methods of the workers themself. I'm suite confident that using metaclass (in a way similar to metaprograming in c++) could do the trick but it is tricky to design and lack for clarity even for devs. So I don't know what to think about this.

@Trinkle23897
Copy link
Collaborator

Trinkle23897 commented Aug 16, 2020

Is it ready for review?

Yes, of course!

And I have another concern: can we reduce the async overhead as much as possible? I tested with wait_num == env_num for both sync and async mode (use test_drqn.py), and the result shows that the speed of async is 80% of sync. This could be the future work.

@youkaichao
Copy link
Collaborator Author

I would add that in m'y opinion it is usually more common to deal with such dynamic environment than a static one.

Agree. Only toy environments are static. Almost all realistic environments are dynamic.

@Trinkle23897 Trinkle23897 requested a review from duburcqa August 17, 2020 14:52
@duburcqa
Copy link
Collaborator

I'll have a look later today

duburcqa
duburcqa previously approved these changes Aug 19, 2020
@duburcqa
Copy link
Collaborator

As I said, I should be clear from the documentation that the vector env are not usual pool, a sense that it does not really handle parallelism directly. Being asynchronous or not is to the resort of the sole worker and at its discretion, with no explicit mechanism for the pool to know this information in advance (hence the unusual wait mechanism).

@youkaichao
Copy link
Collaborator Author

Being asynchronous or not is to the resort of the sole worker and at its discretion, with no explicit mechanism for the pool to know this information in advance

I don't agree. The workers have send_action / get_results interfaces so that they support asynchronous simulation. It is up to the user / vec env whether to use asynchronous or not. I don't think "the unusual wait mechanism" exists.

@Trinkle23897 Trinkle23897 merged commit a9f9940 into thu-ml:dev Aug 19, 2020
@duburcqa
Copy link
Collaborator

The workers have send_action / get_results interfaces so that they support asynchronous simulation. It is up to the user / vec env whether to use asynchronous or not.

No it is not. It requires the user to use a worker that support async behavior. It is not asynchronous by itself.

I don't think "the unusual wait mechanism" exists.

Are you kidding me ? Show me a single well-known library using the same "calling static method from workers" strategy to implement wait method.

@youkaichao
Copy link
Collaborator Author

youkaichao commented Aug 19, 2020

Are you kidding me ? Show me a single well-known library using the same "calling static method from workers" strategy to implement wait method.

I admit the current implementation requires the hack of calling static method from workers.

To remove the hack, each worker should implement a ready interface, which returns True if it is ready to return results. But I don't know how to implement it for ray, and I think it would be difficult to implement. Therefore, I rely on the readily implemented connection.wait and ray.wait.

The ready interface enables waiting by while-loop. An efficient wait should rely on something like semaphore, i.e. sleep on a set of semaphores. But I doubt something general like sleeping on a set of semaphores exists. Most likely, sleeping on some signal is worker-dependent.

@youkaichao youkaichao deleted the worker branch August 19, 2020 07:47
@duburcqa
Copy link
Collaborator

I agree it is tricky to implement and I'm fine with the current implementation. It could be refactor later if necessary, or if we have time for that.

BFAnas pushed a commit to BFAnas/tianshou that referenced this pull request May 5, 2024
- Refacor code to remove duplicate code

- Enable async simulation for all vector envs

- Remove `collector.close` and rename `VectorEnv` to `DummyVectorEnv`

The abstraction of vector env changed.

Prior to this pr, each vector env is almost independent.

After this pr, each env is wrapped into a worker, and vector envs differ with their worker type. In fact, users can just use `BaseVectorEnv` with different workers, I keep `SubprocVectorEnv`, `ShmemVectorEnv` for backward compatibility.

Co-authored-by: n+e <463003665@qq.com>
Co-authored-by: magicly <magicly007@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SubprocVectorEnv is not released when collector is closed Async Sampling
6 participants