Changes for release 1.2.0 (v1 release with fixes/improvements backported from v2) #1258

Merged 58 commits on Jun 23, 2025

Commits
bbc36d0
Fix missing reset in discrete_dqn
opcode81 Mar 4, 2025
b5665e3
ActorFactoryDefault: Fix hidden sizes and activation not being passed…
opcode81 Mar 4, 2025
decb416
ExperimentConfig: Do not inherit from anything (breaks jsonargparse a…
opcode81 Mar 4, 2025
76a25d1
AutoAlphaFactoryDefault: Differentiate discrete and continuous action…
opcode81 Mar 7, 2025
eeb6610
Use DummyVectorEnv instead of Subproc in test_a2c_with_il
opcode81 Mar 7, 2025
8dbf0bf
Fix misleading docstring and corresponding errors pertaining to optim…
opcode81 Mar 17, 2025
1713331
Update changelog
opcode81 Apr 22, 2025
528fd2c
`BaseTrainer.run` and `__iter__`: Resetting was never optional prior …
opcode81 Apr 22, 2025
4f17673
Add basic implementation for determinism tests
opcode81 Apr 22, 2025
d78f0ed
Log parameters of ActorCritic components separately
opcode81 Apr 22, 2025
5061c22
Fix failure message
opcode81 Apr 23, 2025
c88f844
Add TorchDeterministicModeContext
opcode81 Apr 23, 2025
9fbfd99
Devcontainer
MischaPanch Apr 23, 2025
cf0e0d8
Update sensai-utils to 1.4.0
opcode81 May 5, 2025
3ed3c20
Support new mode of operation determinism tests, where each developer is
opcode81 May 5, 2025
364814d
Add determinism test for DiscreteBCQ
opcode81 May 5, 2025
7a8902a
Fix message assignment
opcode81 May 5, 2025
60e8cea
Log TrainingStats with TraceLogger after every training step
opcode81 May 5, 2025
57ec496
Log sampled batch indices with TraceLogger when performing update
opcode81 May 5, 2025
4c0699b
Formatting
opcode81 May 5, 2025
c1f580e
ReplayBuffer: Establish determinism by using a well-defined RandomState
opcode81 May 5, 2025
5f515a1
Improve change log entry pertaining to the breaking change in the tra…
opcode81 May 12, 2025
c05294f
Add determinism tests for virtually all algorithms
opcode81 May 12, 2025
61c9fa3
Fix determinism test name
opcode81 May 12, 2025
2816d04
Fix test name
opcode81 May 13, 2025
c7d48a3
Add more trace log messages for context
opcode81 May 13, 2025
63c5e95
Configure training eps value for initial data collection (DQN, BDQ)
opcode81 May 13, 2025
b735e0b
Fix test names
opcode81 May 13, 2025
809279b
Configure training eps value for initial data collection (C51, FQF, I…
opcode81 May 13, 2025
790dbb3
TraceLogger: Add flag 'verbose'
opcode81 May 14, 2025
3fc484a
v1: Removed unused and failing test
MischaPanch May 14, 2025
5b46038
v1: minor type validation
MischaPanch May 14, 2025
547f0a5
Merge branch 'dev-v1' of github.com:thu-ml/tianshou into dev-v1
MischaPanch May 14, 2025
f73b247
Fix test name
opcode81 May 14, 2025
b6fe90e
Use trainer run instead of direct iteration
opcode81 May 14, 2025
744561e
Improve trace log message
opcode81 May 14, 2025
a89cb14
Improve change log
opcode81 May 15, 2025
cd57fa7
Fix mypy issues
opcode81 May 15, 2025
2b57654
Relax determinism tests:
opcode81 May 16, 2025
0c385f9
test_drqn: Collect initial data in training mode
opcode81 May 16, 2025
a4e81ea
Formatting
opcode81 May 16, 2025
eaa7f96
Fix assertion (stats can be None)
opcode81 May 16, 2025
af0a959
Fix create_toc_py not accounting for spaces in paths
opcode81 May 19, 2025
619051c
Fix unquoted maths in docstring
opcode81 May 19, 2025
cf22adf
v1: improvement in doc-build commands
MischaPanch May 17, 2025
3192dbf
Fix ruff complaint
opcode81 May 19, 2025
de78ecb
Document determinism test usage
opcode81 May 19, 2025
802fb83
Mentioned determinism tests in PR template
MischaPanch May 19, 2025
2cd40cb
Allow collection of empty episodes (done on reset)
MischaPanch May 19, 2025
981e649
High-level API: Change the way in which seeding is handled
opcode81 May 19, 2025
5f5bab9
Fix syntax issue
opcode81 May 19, 2025
ffdc9d4
Merge remote-tracking branch 'thuml/master' into dev-v1
opcode81 Jun 5, 2025
a86e246
AtariEnvFactory: Fix super call
opcode81 Jun 19, 2025
856e2b8
v1: adjust range for seed to be compatible with envpool
MischaPanch Jun 21, 2025
d8daab2
v1: disable buffer hasnull checks by default
MischaPanch Jun 21, 2025
0db2e74
v1: fixes in rliable eval data loading, better logging
MischaPanch Jun 21, 2025
fd93ab3
v1: replace all isinstance checks from BatchProtocol to Batch
MischaPanch Jun 23, 2025
812afc8
v1: changelog [ci skip]
MischaPanch Jun 23, 2025
22 changes: 22 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,22 @@
{
"name": "Tianshou",
"dockerFile": "../Dockerfile",
"workspaceFolder": "/workspaces/tianshou",
"runArgs": ["--shm-size=1g"],
"customizations": {
"vscode": {
"settings": {
"terminal.integrated.shell.linux": "/bin/bash",
"python.pythonPath": "/usr/local/bin/python"
},
"extensions": [
"ms-python.python",
"ms-toolsai.jupyter",
"ms-python.vscode-pylance"
]
}
},
"forwardPorts": [],
"postCreateCommand": "poetry install --with dev",
"remoteUser": "root"
}
14 changes: 14 additions & 0 deletions .dockerignore
@@ -0,0 +1,14 @@
data
logs
test/log
docs/jupyter_execute
docs/.jupyter_cache
.lsp
.clj-kondo
docs/_build
coverage*
__pycache__
*.egg-info
*.egg
.*cache
dist
1 change: 1 addition & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -2,6 +2,7 @@
- [ ] I have provided a description of the changes in this Pull Request
- [ ] I have added documentation for my changes and have listed relevant changes in CHANGELOG.md
- [ ] If applicable, I have added tests to cover my changes.
- [ ] If applicable, I have made sure that the determinism tests run through, meaning that my changes haven't influenced any aspect of training. See info in the contributing documentation.
- [ ] I have reformatted the code using `poe format`
- [ ] I have checked style and types with `poe lint` and `poe type-check`
- [ ] (Optional) I ran tests locally with `poe test`
5 changes: 4 additions & 1 deletion .gitignore
@@ -158,4 +158,7 @@ docs/conf.py

# temporary scripts (for ad-hoc testing), temp folder
/temp
/temp*.py
/temp*.py

# determinism test snapshots
/test/resources/determinism/
55 changes: 47 additions & 8 deletions CHANGELOG.md
@@ -1,23 +1,62 @@
# Changelog

## Unreleased
## Upcoming Release 1.2.0

### Changes/Improvements

- trainer:
- `trainer`:
- Custom scoring now supported for selecting the best model. #1202
- highlevel:
- `highlevel`:
- `DiscreteSACExperimentBuilder`: Expose method `with_actor_factory_default` #1248 #1250

- `ActorFactoryDefault`: Fix parameters for hidden sizes and activation not being
passed on in the discrete case (affects `with_actor_factory_default` method of experiment builders)
- `ExperimentConfig`: Do not inherit from other classes, as this breaks automatic handling by
`jsonargparse` when the class is used to define interfaces (as in high-level API examples)
- `AutoAlphaFactoryDefault`: Differentiate discrete and continuous action spaces
and allow coefficient to be modified, adding an informative docstring
(previous implementation was reasonable only for continuous action spaces)
- Adjust usage in `atari_sac_hl` example accordingly.
- `NPGAgentFactory`, `TRPOAgentFactory`: Fix optimizer instantiation including the actor parameters
(which was misleadingly suggested in the docstring in the respective policy classes; docstrings were fixed),
as the actor parameters are intended to be handled via natural gradients internally
- `data`:
- `ReplayBuffer`: Fix collection of empty episodes being disallowed
- Collection was slow due to `isinstance` checks on Protocols and due to buffer integrity validation. This was solved
by no longer performing `isinstance` checks on Protocols and by disabling the integrity validation by default.
- Tests:
- We have introduced extensive **determinism tests**, which make it possible to validate whether
training processes deterministically compute the same results across different development branches.
This is an important step towards ensuring reproducibility and consistency, which will be
instrumental in supporting Tianshou developers in their work, especially in the context of
algorithm development and evaluation.

### Breaking Changes

- data:
- stats:
- `InfoStats` has a new non-optional field `best_score` which is used
for selecting the best model. #1202
- `trainer`:
- `BaseTrainer.run` and `__iter__`: Resetting was never optional prior to running the trainer,
yet the recently introduced parameter `reset_prior_to_run` of `run` suggested that it _was_ optional.
The parameter was ultimately not respected, however, because `__iter__` would always call `reset(reset_collectors=True, reset_buffer=False)`
regardless. The parameter was removed; instead, the parameters of `run` now mirror the parameters of `reset`,
and the implicit `reset` call in `__iter__` was removed.
This aligns with upcoming changes in Tianshou v2.0.0.
* NOTE: If you have been using a trainer without calling `run` but by directly iterating over it, you
will need to call `reset` on the trainer explicitly before iterating over the trainer.
* Using a trainer as an iterator is considered deprecated and support for this will be removed in Tianshou v2.0.0.
- `data`:
- `InfoStats` has a new non-optional field `best_score` which is used
for selecting the best model. #1202
- `highlevel`:
- Change the way in which seeding is handled: The mechanism introduced in v1.1.0
was completely revised:
- The `train_seed` and `test_seed` attributes were removed from `SamplingConfig`.
Instead, the seeds are derived from the seed defined in `ExperimentConfig`.
- Seed attributes of `EnvFactory` classes were removed.
Instead, seeds are passed to methods of `EnvFactory`.
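
The trainer change listed under Breaking Changes can be illustrated with a toy sketch. `ToyTrainer` is a hypothetical stand-in written for this illustration and is not Tianshou's implementation; only the parameter names `reset_collectors` and `reset_buffer` are taken from the entry above. It shows the pattern described there: `__iter__` no longer resets implicitly, and `run` mirrors the parameters of `reset`.

```python
from collections.abc import Iterator
from typing import Optional


class ToyTrainer:
    """Hypothetical stand-in for a trainer; NOT Tianshou's implementation."""

    def __init__(self, max_epoch: int) -> None:
        self.max_epoch = max_epoch
        self.epoch: Optional[int] = None  # None means "reset() was never called"

    def reset(self, reset_collectors: bool = True, reset_buffer: bool = False) -> None:
        # A real trainer would also reset its collectors and, optionally, the buffer.
        self.epoch = 0

    def __iter__(self) -> Iterator[int]:
        # No implicit reset here: the caller is responsible for calling reset().
        if self.epoch is None:
            raise RuntimeError("call reset() before iterating over the trainer")
        while self.epoch < self.max_epoch:
            self.epoch += 1
            yield self.epoch

    def run(self, reset_collectors: bool = True, reset_buffer: bool = False) -> int:
        # run() mirrors the parameters of reset() and always resets first.
        self.reset(reset_collectors=reset_collectors, reset_buffer=reset_buffer)
        last_epoch = 0
        for last_epoch in self:
            pass
        return last_epoch


trainer = ToyTrainer(max_epoch=3)
trainer.reset()              # required before direct iteration
epochs_seen = list(trainer)  # direct iteration (the deprecated style)

final_epoch = ToyTrainer(max_epoch=3).run()  # preferred: run() resets by itself
```

Code that iterates over a trainer directly would, under this pattern, fail fast if `reset` was forgotten, whereas code that calls `run` needs no change.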

## Release 1.1.0

**NOTE**: This release introduced (potentially severe) performance regressions in data collection, please switch to a newer release for better performance.

### Highlights

#### Evaluation Package
42 changes: 42 additions & 0 deletions Dockerfile
@@ -0,0 +1,42 @@
# Use the official Python image for the base image.
FROM --platform=linux/amd64 python:3.11-slim

# Set environment variables to make Python print directly to the terminal and avoid .pyc files.
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

# Install system dependencies required for the project.
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
build-essential \
git \
wget \
unzip \
libvips-dev \
gnupg2 \
&& rm -rf /var/lib/apt/lists/*


# Install pipx.
RUN python3 -m pip install --no-cache-dir pipx \
&& pipx ensurepath

# Add poetry to the path
ENV PATH="${PATH}:/root/.local/bin"

# Install the latest version of Poetry using pipx.
RUN pipx install poetry

# Set the working directory. IMPORTANT: this cannot be changed, as it must match the directory
# into which the project is cloned in the codespace.
WORKDIR /workspaces/tianshou

# Copy the pyproject.toml and poetry.lock files (if available) into the image.
COPY pyproject.toml poetry.lock* README.md /workspaces/tianshou/

RUN poetry config virtualenvs.create false
RUN poetry install --no-root --with dev

# The entrypoint performs an editable install; the code is expected to be mounted in the container.
# If you don't want to mount the code, you should override the entrypoint.
ENTRYPOINT ["/bin/bash", "-c", "poetry install --with dev && poetry run jupyter trust notebooks/*.ipynb docs/02_notebooks/*.ipynb && $0 $@"]
45 changes: 35 additions & 10 deletions docs/04_contributing/04_contributing.rst
@@ -2,11 +2,12 @@ Contributing to Tianshou
========================


Install Develop Version
-----------------------
Install Development Environment
-------------------------------

Tianshou is built and managed by `poetry <https://python-poetry.org/>`_. For example,
to install all relevant requirements in editable mode you can simply call
to install all relevant requirements (and install Tianshou itself in editable mode)
you can simply call

.. code-block:: bash

@@ -36,9 +37,9 @@ Please set up pre-commit by running
in the main directory. This should make sure that your contribution is properly
formatted before every commit.

The code is inspected and formatted by `black` and `ruff`. They are executed as
pre-commit hooks. In addition, `poe the poet` tasks are configured.
Simply run `poe` to see the available tasks.
The code is inspected and formatted by ``black`` and ``ruff``. They are executed as
pre-commit hooks. In addition, ``poe the poet`` tasks are configured.
Simply run ``poe`` to see the available tasks.
E.g., to format and check the linting manually you can run:

.. code-block:: bash
@@ -47,8 +48,8 @@ E.g, to format and check the linting manually you can run:
$ poe lint


Type Check
----------
Type Checks
-----------

We use `mypy <https://github.com/python/mypy/>`_ to check the type annotations. To check, in the main directory, run:

@@ -57,8 +58,8 @@ We use `mypy <https://github.com/python/mypy/>`_ to check the type annotations.
$ poe type-check


Test Locally
------------
Testing Locally
---------------

This command will run automatic tests in the main directory

@@ -67,6 +68,30 @@ This command will run automatic tests in the main directory
$ poe test


Determinism Tests
~~~~~~~~~~~~~~~~~

We implemented "determinism tests" for Tianshou's algorithms, which allow us to determine
whether algorithms still compute exactly the same results even after large refactorings.
These tests are applied by

1. creating a behavior snapshot in the old code branch before the changes and then
2. running the test in the new branch to ensure that the behavior is the same.

Unfortunately, full determinism is difficult to achieve across different platforms and even across different
machines using the same platform and Python environment.
Therefore, these tests are not carried out in the CI pipeline.
Instead, it is up to the developer to run them locally and check the results whenever a change
is made to the code base that could affect algorithm behavior.

Technically, the two steps are handled by setting static flags in class ``AlgorithmDeterminismTest`` and then
running either the full test suite or a specific determinism test (``test_*_determinism``, e.g. ``test_ddpg_determinism``)
in the two branches to be compared.

1. On the old branch: (Temporarily) set ``ENABLED=True`` and ``FORCE_SNAPSHOT_UPDATE=True`` and run the test(s).
2. On the new branch: (Temporarily) set ``ENABLED=True`` and ``FORCE_SNAPSHOT_UPDATE=False`` and run the test(s).
3. Inspect the test results; find a summary in ``determinism_tests.log``
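
The snapshot workflow above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code of ``AlgorithmDeterminismTest``: the class ``ToyDeterminismTest``, the function ``toy_training_run`` and the JSON snapshot format are hypothetical; only the flag names ``ENABLED`` and ``FORCE_SNAPSHOT_UPDATE`` are taken from the text.

```python
import json
import random
import tempfile
from pathlib import Path


class ToyDeterminismTest:
    """Hypothetical snapshot test; only the two flag names come from the docs above."""

    ENABLED = True
    FORCE_SNAPSHOT_UPDATE = False

    def __init__(self, name: str, snapshot_dir: Path) -> None:
        self.snapshot_path = snapshot_dir / f"{name}.json"

    def check(self, trace: list[float]) -> str:
        if not self.ENABLED:
            return "skipped"
        if self.FORCE_SNAPSHOT_UPDATE or not self.snapshot_path.exists():
            # "Old branch" step: record the current behavior as the reference.
            self.snapshot_path.write_text(json.dumps(trace))
            return "snapshot written"
        # "New branch" step: compare the current behavior against the reference.
        expected = json.loads(self.snapshot_path.read_text())
        return "match" if expected == trace else "mismatch"


def toy_training_run(seed: int, steps: int = 5) -> list[float]:
    # Stand-in for a seeded training process that emits one value per step.
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


with tempfile.TemporaryDirectory() as tmp:
    test = ToyDeterminismTest("toy_algo", Path(tmp))

    ToyDeterminismTest.FORCE_SNAPSHOT_UPDATE = True   # step 1: old branch
    step_1 = test.check(toy_training_run(seed=0))

    ToyDeterminismTest.FORCE_SNAPSHOT_UPDATE = False  # step 2: new branch
    step_2 = test.check(toy_training_run(seed=0))

    # A behavior change (simulated here by a different seed) is reported as a mismatch.
    changed = test.check(toy_training_run(seed=1))
```

The comparison here is exact equality on the recorded values, which is the strictest possible criterion; a real test may need to record different quantities or tolerate small deviations.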

Test by GitHub Actions
----------------------

4 changes: 2 additions & 2 deletions docs/autogen_rst.py
@@ -114,8 +114,8 @@ def make_rst(src_root, rst_root, clean=False, overwrite=False, package_prefix=""
for f in files_in_dir
if os.path.isdir(os.path.join(root, dirname, f)) and not f.startswith("_")
]
if not module_names:
log.debug(f"Skipping {dirname} as it does not contain any .py files")
if not module_names and "__init__.py" not in files_in_dir:
log.debug(f"Skipping {dirname} as it does not contain any modules or __init__.py")
continue
package_qualname = f"{base_package_qualname}.{dirname}"
package_index_rst_path = os.path.join(
2 changes: 1 addition & 1 deletion docs/create_toc.py
@@ -3,6 +3,6 @@

# This script provides a platform-independent way of making the jupyter-book call (used in pyproject.toml)
toc_file = Path(__file__).parent / "_toc.yml"
cmd = f"jupyter-book toc from-project docs -e .rst -e .md -e .ipynb >{toc_file}"
cmd = f'jupyter-book toc from-project docs -e .rst -e .md -e .ipynb >"{toc_file}"'
print(cmd)
os.system(cmd)
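
The effect of the added quotes in `docs/create_toc.py` can be checked without running a shell. The sketch below uses `shlex.split`, which follows POSIX shell splitting rules, as an approximation of how the command line is tokenized; the path is a hypothetical example containing a space, and the command is shortened for readability.

```python
import shlex

# Hypothetical project location with a space in the path.
toc_file = "/home/dev/my projects/tianshou/docs/_toc.yml"

unquoted = f"jupyter-book toc from-project docs >{toc_file}"
quoted = f'jupyter-book toc from-project docs >"{toc_file}"'

# Without quotes, the path is split at the space into two separate tokens,
# so the redirection target would be truncated to "/home/dev/my".
unquoted_tokens = shlex.split(unquoted)

# With quotes, the redirection target stays a single token.
quoted_tokens = shlex.split(quoted)
```

Double quotes are also understood by the Windows command interpreter, which is presumably why they were chosen for a script that is meant to be platform-independent.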