<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet href="client.xsl" type="text/xsl"?>
<article article-type="other">
<front>
<journal-meta>
<journal-id/>
<issn/>
<banner>
<!--<href>banner.jpg</href>-->
<size width="100%"/>
</banner>
</journal-meta>
<article-meta>
<title-group>
<article-title>Theoretical Advantages of Lenient Q-learners: An Evolutionary Game Theoretic Perspective</article-title></title-group>

<author><a href="mailto:liviu@google.com"><name>Liviu Panait</name></a></author>
<aff>Google Inc., 604 Arizona Ave, Santa Monica, CA, USA</aff>

<author><a href="mailto:k.tuyls@micc.unimaas.nl"><name>Karl Tuyls</name></a></author>
<aff>Maastricht University, MiCC-IKAT, The Netherlands</aff>
</article-meta></front>
<body>
<abstract>
<title>ABSTRACT</title>
<p>This paper presents the dynamics of multiple reinforcement learning
agents from an Evolutionary Game Theoretic (EGT) perspective.
We provide a Replicator Dynamics model for traditional multiagent
Q-learning, and we extend these differential equations to
account for lenient learners: agents that forgive possible mistakes
of their teammates that resulted in lower rewards. We use this extended
formal model to visualize the basins of attraction of both
traditional and lenient multiagent Q-learners in two benchmark coordination
problems. The results indicate that lenience provides
learners with more accurate estimates of the utility of their actions,
resulting in a higher likelihood of convergence to the globally
optimal solution. In addition, our research supports the strength of
EGT as a backbone for multiagent reinforcement learning.</p>
</abstract>
<fpdf>
<href>pdflogo.jpg</href>
<hpdf>AAMAS07_0468_d258826501a05d66a654b6fefe92f88a</hpdf>
</fpdf>
</body>
</article>
