Trainer relies on Logger for some values

[Here](https://github.com/thu-ml/tianshou/blob/18d2f25efff81561f3b47682227bc80d3787889d/tianshou/trainer/onpolicy.py#L103), the on-policy trainer relies on the value `rew`, which is the mean reward from the collector:
```python
best_reward, best_reward_std = test_result["rew"], test_result["rew_std"]
```
but this value is only computed by the logger [here](https://github.com/thu-ml/tianshou/blob/ebaca6f8da91e18e0192184c24f5d13e3a5d0092/tianshou/utils/log_tools.py#L131):
```python
collect_result["rew"] = collect_result["rews"].mean()
```

So it seems to me that switching to another logger that does not compute the mean will break the trainer, because `rew` would then be missing from the result the trainer reads.
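To illustrate the coupling, here is a minimal sketch of one possible way around it: derive the statistics on the trainer side whenever they are missing. This is not tianshou code. `ensure_reward_stats`, `SilentLogger`, and its `log_test_data` method are hypothetical names made up for this example, and plain lists with the standard `statistics` module stand in for the arrays used in the real code. It also assumes `rews` is non-empty.

```python
from statistics import fmean, pstdev


def ensure_reward_stats(collect_result):
    """Hypothetical helper: fill in `rew` and `rew_std` if they are missing.

    Assumes `collect_result["rews"]` is a non-empty sequence of numbers.
    """
    rews = [float(r) for r in collect_result["rews"]]
    result = dict(collect_result)  # copy, so the caller's dict is not mutated
    result.setdefault("rew", fmean(rews))
    # Population standard deviation, matching the default of an array's .std()
    result.setdefault("rew_std", pstdev(rews))
    return result


class SilentLogger:
    """Hypothetical logger that records nothing and computes no statistics."""

    def log_test_data(self, collect_result, step):
        pass  # unlike the logger linked above, this never sets "rew"


raw_result = {"rews": [1.0, 2.0, 3.0]}
SilentLogger().log_test_data(raw_result, step=0)

# Reading raw_result["rew"] here would raise KeyError with this logger,
# so the statistics are derived before the trainer-style unpacking.
test_result = ensure_reward_stats(raw_result)
best_reward, best_reward_std = test_result["rew"], test_result["rew_std"]
```

With a helper like this, the logger only has to record values, and the trainer no longer depends on which logger is plugged in.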

Trainer relies on Logger for some values #431