ChainerRL is a deep reinforcement learning library that implements various state-of-the-art deep reinforcement learning algorithms in Python using Chainer, a flexible deep learning framework.
ChainerRL can be installed via PyPI:
```
pip install chainerrl
```
It can also be installed from the source code:
```
python setup.py install
```
You can try the ChainerRL Quickstart Guide first, or check the examples prepared for Atari 2600 and OpenAI Gym.
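As a taste of the library, here is a minimal sketch of training a DQN-family agent on CartPole, loosely following the Quickstart Guide. The constructor arguments and hyperparameters below are illustrative assumptions; please refer to the Quickstart Guide for the exact, up-to-date API.

```python
# Minimal training sketch (illustrative; see the Quickstart Guide for the exact API).
import chainer
import chainerrl
import gym
import numpy as np

env = gym.make('CartPole-v0')
obs_size = env.observation_space.shape[0]
n_actions = env.action_space.n

# A fully connected Q-function for discrete actions.
q_func = chainerrl.q_functions.FCStateQFunctionWithDiscreteAction(
    obs_size, n_actions, n_hidden_channels=50, n_hidden_layers=2)

optimizer = chainer.optimizers.Adam(eps=1e-2)
optimizer.setup(q_func)

# Epsilon-greedy exploration and an experience replay buffer.
explorer = chainerrl.explorers.ConstantEpsilonGreedy(
    epsilon=0.3, random_action_func=env.action_space.sample)
replay_buffer = chainerrl.replay_buffer.ReplayBuffer(capacity=10 ** 6)

# Cast observations to float32 for Chainer.
phi = lambda x: x.astype(np.float32, copy=False)

agent = chainerrl.agents.DoubleDQN(
    q_func, optimizer, replay_buffer, 0.95, explorer,
    replay_start_size=500, target_update_interval=100, phi=phi)

# Interaction loop: act, observe, and train.
for episode in range(50):
    obs = env.reset()
    reward = 0.0
    done = False
    while not done:
        action = agent.act_and_train(obs, reward)
        obs, reward, done, _ = env.step(action)
    agent.stop_episode_and_train(obs, reward, done)
```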
Algorithm | Discrete Action | Continuous Action | Recurrent Model | CPU Async Training |
---|---|---|---|---|
DQN (including DoubleDQN etc.) | o | o (NAF) | o | x |
DDPG | x | o | o | x |
A3C | o | o | o | o |
ACER | o | x | o | o |
NSQ (N-step Q-learning) | o | o (NAF) | o | o |
The following algorithms have been implemented in ChainerRL:
- A3C (Asynchronous Advantage Actor-Critic)
- ACER (Actor-Critic with Experience Replay) (only the discrete-action version for now)
- Asynchronous N-step Q-learning
- DQN (including Double DQN, Persistent Advantage Learning (PAL), Double PAL, Dynamic Policy Programming (DPP))
- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))
- PGT (Policy Gradient Theorem)
Q-function-based algorithms such as DQN can use a Normalized Advantage Function (NAF) to tackle continuous-action problems, in addition to DQN-like networks with discrete outputs.
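To make the NAF idea concrete, here is a small NumPy sketch of the quadratic advantage parameterization Q(s, a) = V(s) - 0.5 (a - mu(s))^T P(s) (a - mu(s)). This only illustrates the functional form; it is not ChainerRL's internal implementation, and the helper name is hypothetical.

```python
# Illustration of the NAF quadratic advantage form (not ChainerRL's own code):
#   Q(s, a) = V(s) - 0.5 * (a - mu(s))^T P(s) (a - mu(s))
# where P(s) = L(s) L(s)^T is positive semi-definite, so Q is maximized at a = mu(s).
import numpy as np

def naf_q_value(v, mu, L, action):
    """Compute Q(s, a) from the state-dependent outputs v, mu and L.

    v:      scalar state value V(s)
    mu:     greedy action mu(s), shape (action_dim,)
    L:      lower-triangular matrix, shape (action_dim, action_dim)
    action: action a, shape (action_dim,)
    """
    P = L @ L.T                      # positive semi-definite matrix
    diff = action - mu
    advantage = -0.5 * diff @ P @ diff
    return v + advantage

# The advantage is zero (its maximum) when the action equals mu(s).
v, mu = 1.2, np.array([0.3, -0.5])
L = np.tril(np.array([[0.8, 0.0], [0.1, 0.6]]))
print(naf_q_value(v, mu, L, mu))                     # 1.2
print(naf_q_value(v, mu, L, np.array([1.0, 1.0])))   # less than 1.2
```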
Environments that support the subset of OpenAI Gym's interface (`reset` and `step` methods) can be used.
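As an illustration of that interface, here is a hypothetical toy environment that implements only `reset` and `step` in the Gym style; any environment shaped like this can, in principle, be plugged into the training loops.

```python
# A hypothetical toy environment implementing just the Gym-style subset
# (reset and step) that agents interact with.
import numpy as np

class ToyCorridorEnv(object):
    """Agent starts at position 0 and must reach position 4; actions step left/right."""

    def reset(self):
        self.pos = 0
        return np.array([self.pos], dtype=np.float32)

    def step(self, action):
        # action 0 moves left, action 1 moves right
        self.pos += 1 if action == 1 else -1
        done = self.pos >= 4
        reward = 1.0 if done else 0.0
        return np.array([self.pos], dtype=np.float32), reward, done, {}
```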
To test chainerrl modules, install `nose` and run `nosetests`.
To test examples, run `test_examples.sh`.
Any kind of contribution to ChainerRL would be highly appreciated! If you are interested in contributing to ChainerRL, please read CONTRIBUTING.md.