Instead of giving fixed rewards, I think it would be better to allocate dynamic rewards. Based on the score, we can assign a positive or a negative reward. We can formulate the reward so that the negative reward keeps shrinking in magnitude and the positive reward keeps growing the longer the agent survives in the environment. This gives the agent an incentive to survive longer.
Something along these lines: code
In this code, c is any constant: [1, 2, 3, ...]
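As a rough illustration of the idea (a sketch under my own assumptions, not necessarily the exact formula meant above), the reward could scale with the score like this. The function name `dynamic_reward`, the parameters `base_reward` and `base_penalty`, and the `1 + score / c` scaling are all hypothetical choices made for this example; only the constant c comes from the proposal.

```python
def dynamic_reward(score: int, died: bool, c: float = 1.0,
                   base_reward: float = 1.0,
                   base_penalty: float = -10.0) -> float:
    """Hypothetical score-dependent reward (illustrative only).

    score        -- how long/well the agent has survived so far (>= 0)
    died         -- True if the episode just ended in failure
    c            -- positive scaling constant, e.g. 1, 2, 3, ...
    base_reward  -- assumed positive reward at score 0
    base_penalty -- assumed negative reward at score 0
    """
    if c <= 0:
        raise ValueError("c must be positive")
    if score < 0:
        raise ValueError("score must be non-negative")

    # Grows linearly with the score; larger c means slower growth.
    scale = 1.0 + score / c

    if died:
        # The penalty moves toward 0 as the score grows,
        # so dying late costs less than dying early.
        return base_penalty / scale

    # The positive reward grows the longer the agent survives.
    return base_reward * scale


print(dynamic_reward(score=10, died=False, c=2))  # 6.0
print(dynamic_reward(score=9, died=True, c=1))    # -1.0
```

With this shape, a larger c makes both the growth of the positive reward and the shrinking of the penalty more gradual, so c would act as a tuning knob for how strongly survival is rewarded.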
What are your thoughts on this? If you accept, I can create a pull request.