Ozdaglar et al., 2021 - Google Patents
Independent learning in stochastic games
Ozdaglar et al., 2021
- Document ID
- 11926321031325377223
- Author
- Ozdaglar A
- Sayin M
- Zhang K
- Publication year
- 2021
- Publication venue
- International Congress of Mathematicians
Snippet
Reinforcement learning (RL) has recently achieved tremendous successes in many artificial intelligence applications. Many of the forefront applications of RL involve multiple agents, e.g., playing chess and Go games, autonomous driving, and robotics. Unfortunately, the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/04—Architectures, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G06N5/043—Distributed expert systems, blackboards
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0265—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion
- G05B13/027—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric the criterion being a learning criterion using neural networks only
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/004—Artificial life, i.e. computers simulating life
- G06N3/006—Artificial life, i.e. computers simulating life based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/02—Computer systems based on specific mathematical models using fuzzy logic
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/04—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ozdaglar et al. | | Independent learning in stochastic games |
Bloembergen et al. | | Evolutionary dynamics of multi-agent learning: A survey |
Er et al. | | Online tuning of fuzzy inference systems using dynamic fuzzy Q-learning |
US8112369B2 (en) | | Methods and systems of adaptive coalition of cognitive agents |
Tizhoosh | | Opposition-based learning: a new scheme for machine intelligence |
García et al. | | No strategy can win in the repeated prisoner's dilemma: linking game theory and computer simulations |
Carmel et al. | | Model-based learning of interaction strategies in multi-agent systems |
van Eck et al. | | Application of reinforcement learning to the game of Othello |
Wang et al. | | On the convergence of the Monte Carlo exploring starts algorithm for reinforcement learning |
Sutton et al. | | The Alberta plan for AI research |
Gummadi et al. | | Mean field analysis of multi-armed bandit games |
Hafez et al. | | Topological Q-learning with internally guided exploration for mobile robot navigation |
Subramanian et al. | | Multi-agent advisor Q-learning |
Shah et al. | | On reinforcement learning for turn-based zero-sum Markov games |
Mishra et al. | | Model-free reinforcement learning for stochastic Stackelberg security games |
Shi et al. | | Efficient hierarchical policy network with fuzzy rules |
Amhraoui et al. | | Expected Lenient Q-learning: a fast variant of the Lenient Q-learning algorithm for cooperative stochastic Markov games |
Zhang et al. | | Opinion dynamics in gossiper-media networks based on multiagent reinforcement learning |
Dockhorn | | Prediction-based search for autonomous game-playing |
Tuyls et al. | | Multiagent learning |
Ammar et al. | | Multi-agent architecture for Multi-objective optimization of Flexible Neural Tree |
Dahl | | The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information |
Papini | | Safe policy optimization |
Fan et al. | | Optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning |
Shinkawa et al. | | Bandit approach to conflict-free multi-agent Q-learning in view of photonic implementation |