W&B: add artifacts support #441
Merged
31 commits
3a46660 add artifacts support (AyushExel)
32b3507 remove print (AyushExel)
1cc306e Update tianshou/utils/logger/wandb.py (AyushExel)
5cc42b4 monitor gym (AyushExel)
5617fe2 monitor gym (AyushExel)
c751375 Merge branch 'master' into wandb (Trinkle23897)
94de701 Update logger (AyushExel)
0ac506c repo label (AyushExel)
00e4afb add test (AyushExel)
e8616ff update gym req. (AyushExel)
8e5c71f ignore mypy checks (AyushExel)
acc0d1b flake8 (AyushExel)
3f4e9f8 update ci file (Trinkle23897)
67943e3 try to fix ci (Trinkle23897)
c7ad697 fix ci (AyushExel)
86a274b try to fix ci (AyushExel)
7d423cd try to fix ci (AyushExel)
761c40b try ci fix (AyushExel)
f3fc3ea try ci fix (AyushExel)
3c09be7 try ci fix (AyushExel)
0931a08 update docs (Trinkle23897)
0bd9134 Update wandb.py (AyushExel)
261ee46 unify logger test on psrl (Trinkle23897)
8d7e423 Merge branch 'wandb' of github.com:AyushExel/tianshou into wandb (Trinkle23897)
b1698c0 Update wandb.py (AyushExel)
684afbb fix format (Trinkle23897)
ebf6015 add config logging (AyushExel)
0853f6b Merge branch 'wandb' of https://github.com/AyushExel/tianshou into wandb (AyushExel)
397de8a update (AyushExel)
1635000 merge atari_wandb into original file (Trinkle23897)
dac8958 fix format (Trinkle23897)
@@ -1,3 +1,7 @@
import argparse
import os
from typing import Callable, Optional, Tuple

from tianshou.utils import BaseLogger
from tianshou.utils.logger.base import LOG_DATA_TYPE

@@ -7,10 +11,10 @@
     pass


-class WandBLogger(BaseLogger):
-    """Weights and Biases logger that sends data to Weights and Biases.
+class WandbLogger(BaseLogger):
+    """Weights and Biases logger that sends data to https://wandb.ai/.

-    Creates three panels with plots: train, test, and update.
+    This logger creates three panels with plots: train, test, and update.
     Make sure to select the correct access for each panel in weights and biases:

Review comment: maybe we could show here where the example script is?
     - ``train/env_step`` for train plots

@@ -29,16 +33,97 @@ class WandBLogger(BaseLogger):
    :param int test_interval: the log interval in log_test_data(). Default to 1.
    :param int update_interval: the log interval in log_update_data().
        Default to 1000.
    :param str project: W&B project name. Default to "tianshou".
    :param str name: W&B run name. Default to None. If None, random name is assigned.
    :param str entity: W&B team/organization name. Default to None.
    :param str run_id: run id of W&B run to be resumed. Default to None.
    :param argparse.Namespace config: experiment configurations. Default to None.
    """

    def __init__(
        self,
        train_interval: int = 1000,
        test_interval: int = 1,
        update_interval: int = 1000,
        save_interval: int = 1000,
        project: str = 'tianshou',
        name: Optional[str] = None,
        entity: Optional[str] = None,
        run_id: Optional[str] = None,
        config: Optional[argparse.Namespace] = None,
    ) -> None:
        super().__init__(train_interval, test_interval, update_interval)
        self.last_save_step = -1
        self.save_interval = save_interval
        self.restored = False

        self.wandb_run = wandb.init(
            project=project,
            name=name,
            id=run_id,
            resume="allow",
            entity=entity,
            monitor_gym=True,
            config=config,  # type: ignore
        ) if not wandb.run else wandb.run
        self.wandb_run._label(repo="tianshou")  # type: ignore

    def write(self, step_type: str, step: int, data: LOG_DATA_TYPE) -> None:
        data[step_type] = step
        wandb.log(data)

    def save_data(
        self,
        epoch: int,
        env_step: int,
        gradient_step: int,
        save_checkpoint_fn: Optional[Callable[[int, int, int], None]] = None,
    ) -> None:
        """Use writer to log metadata when calling ``save_checkpoint_fn`` in trainer.

        :param int epoch: the epoch in trainer.
        :param int env_step: the env_step in trainer.
        :param int gradient_step: the gradient_step in trainer.
        :param function save_checkpoint_fn: a hook defined by user, see trainer
            documentation for detail.
        """
        if save_checkpoint_fn and epoch - self.last_save_step >= self.save_interval:
            self.last_save_step = epoch
            checkpoint_path = save_checkpoint_fn(epoch, env_step, gradient_step)

            checkpoint_artifact = wandb.Artifact(
                'run_' + self.wandb_run.id + '_checkpoint',  # type: ignore
                type='model',
                metadata={
                    "save/epoch": epoch,
                    "save/env_step": env_step,
                    "save/gradient_step": gradient_step,
                    "checkpoint_path": str(checkpoint_path)
                }
            )
            checkpoint_artifact.add_file(str(checkpoint_path))
            self.wandb_run.log_artifact(checkpoint_artifact)  # type: ignore

    def restore_data(self) -> Tuple[int, int, int]:
        checkpoint_artifact = self.wandb_run.use_artifact(  # type: ignore
            'run_' + self.wandb_run.id + '_checkpoint:latest'  # type: ignore
        )
        assert checkpoint_artifact is not None, "W&B dataset artifact doesn't exist"

        checkpoint_artifact.download(
            os.path.dirname(checkpoint_artifact.metadata['checkpoint_path'])
        )

        try:  # epoch / gradient_step
            epoch = checkpoint_artifact.metadata["save/epoch"]
            self.last_save_step = self.last_log_test_step = epoch
            gradient_step = checkpoint_artifact.metadata["save/gradient_step"]
            self.last_log_update_step = gradient_step
        except KeyError:
            epoch, gradient_step = 0, 0
        try:  # offline trainer doesn't have env_step
            env_step = checkpoint_artifact.metadata["save/env_step"]
            self.last_log_train_step = env_step
        except KeyError:
            env_step = 0
        return epoch, env_step, gradient_step
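The bookkeeping in `save_data` and `restore_data` can be exercised without W&B. The following is a minimal, stdlib-only sketch of that logic, not the PR's implementation: `FakeArtifact` and `CheckpointTracker` are hypothetical stand-ins that only keep metadata in memory, and the sketch assumes the user hook returns the checkpoint path.

```python
from typing import Callable, Dict, Optional, Tuple


class FakeArtifact:
    """Hypothetical stand-in for a W&B artifact: it only carries a name and metadata."""

    def __init__(self, name: str, metadata: Dict[str, object]) -> None:
        self.name = name
        self.metadata = metadata


class CheckpointTracker:
    """Mirrors the save/restore bookkeeping from the diff, without any W&B calls."""

    def __init__(self, run_id: str, save_interval: int = 1) -> None:
        self.run_id = run_id
        self.save_interval = save_interval
        self.last_save_step = -1
        self.latest: Optional[FakeArtifact] = None  # plays the role of ":latest"

    def save_data(
        self,
        epoch: int,
        env_step: int,
        gradient_step: int,
        save_checkpoint_fn: Optional[Callable[[int, int, int], str]] = None,
    ) -> None:
        # Same gate as in the diff: a hook must exist and enough epochs must have passed.
        if save_checkpoint_fn and epoch - self.last_save_step >= self.save_interval:
            self.last_save_step = epoch
            checkpoint_path = save_checkpoint_fn(epoch, env_step, gradient_step)
            self.latest = FakeArtifact(
                "run_" + self.run_id + "_checkpoint",
                {
                    "save/epoch": epoch,
                    "save/env_step": env_step,
                    "save/gradient_step": gradient_step,
                    "checkpoint_path": str(checkpoint_path),
                },
            )

    def restore_data(self) -> Tuple[int, int, int]:
        assert self.latest is not None, "no checkpoint artifact has been logged"
        metadata = self.latest.metadata
        try:  # epoch / gradient_step
            epoch = metadata["save/epoch"]
            self.last_save_step = epoch
            gradient_step = metadata["save/gradient_step"]
        except KeyError:
            epoch, gradient_step = 0, 0
        try:  # metadata without env_step falls back to 0
            env_step = metadata["save/env_step"]
        except KeyError:
            env_step = 0
        return epoch, env_step, gradient_step


def hook(epoch: int, env_step: int, gradient_step: int) -> str:
    # Hypothetical hook: a real one would write the file before returning its path.
    return f"checkpoint_{epoch}.pth"


tracker = CheckpointTracker(run_id="abc123", save_interval=2)
for epoch in range(1, 6):
    tracker.save_data(epoch, epoch * 100, epoch * 10, save_checkpoint_fn=hook)
restored = tracker.restore_data()
```

With `save_interval=2` and `last_save_step` starting at -1, the gate opens at epochs 1, 3, and 5, so the restored values come from epoch 5. Note that the diff annotates `save_checkpoint_fn` as returning `None` while `save_data` uses its return value as a path; the sketch types the hook as returning `str` for that reason.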
Maybe we can also add some instructions on how to use wandb (including resume) in examples/atari/README.md instead of tensorboard?
Can we add the instructions to the docs as well? I can make a separate PR tomorrow that adds detailed instructions to examples/atari/README.md as well as the docs.
Just append it to this PR.
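As a rough illustration of the usage instructions requested above, the sketch below maps command-line flags onto the `WandbLogger.__init__` keywords shown in the diff (`save_interval`, `project`, `name`, `run_id`, `config`). The flag names (`--logger`, `--wandb-project`, `--resume-id`), the default task, and the helper `wandb_logger_kwargs` are hypothetical, and the logger construction itself is left as a comment so that the block runs without wandb installed.

```python
import argparse
from typing import Any, Dict, List, Optional


def get_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # All flag names and defaults here are illustrative assumptions.
    parser.add_argument("--task", type=str, default="PongNoFrameskip-v4")
    parser.add_argument(
        "--logger", type=str, default="tensorboard", choices=["tensorboard", "wandb"]
    )
    parser.add_argument("--wandb-project", type=str, default="tianshou")
    parser.add_argument("--resume-id", type=str, default=None)
    return parser.parse_args(argv)


def wandb_logger_kwargs(args: argparse.Namespace) -> Dict[str, Any]:
    """Build keyword arguments matching the __init__ signature in the diff."""
    return {
        "save_interval": 1,
        "project": args.wandb_project,
        "name": args.task,
        "run_id": args.resume_id,  # None starts a fresh run; an id resumes that run
        "config": args,
    }


args = get_args(["--logger", "wandb", "--resume-id", "abc123"])
kwargs = wandb_logger_kwargs(args)

# With tianshou and wandb installed, the logger would then be created roughly as
# follows (module path assumed from the file name in the commit list):
#   from tianshou.utils.logger.wandb import WandbLogger
#   logger = WandbLogger(**kwargs)
```

Because the diff passes `resume="allow"` together with `id=run_id` to `wandb.init`, supplying the id of an earlier run is what makes the logger resume it; leaving `--resume-id` unset yields `run_id=None`.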