Publications• Sorted by Date • Classified by Publication Type • Classified by Research Category • Coevolutionary Nash in Poker GamesFrans A. Oliehoek, Nikos Vlassis, and Edwin de Jong. Coevolutionary Nash in Poker Games. In Proceedings of the 17th Belgian-Dutch Conference on Artificial Intelligence (BNAIC), pp. 188–193, October 2005. DownloadAbstractWe address the problem of learning good policies in poker games. Theclassical game theoretic approach to this problem specifies a Nashequilibrium solution, i.e., a pair of secure mixed (randomized)policies. We describe a new approach for calculating such securepolicies based on coevolution. Here, populations of pure policies forboth players are simultaneously evolved by repeated comparisonsagainst each other, and secure mixed policies are computed from bothpopulations by linear programming. The search heuristic for adding newcandidate pure policies involves computing a best-response pure policy(by solving a POMDP) that provides a worst-case payoff for each mixedpolicy. We provide experimental results suggesting that a Nashequilibrium policy can be approximated in relatively few iterations,thereby producing mixed policies with relatively small support. Weconclude that this is a promising direction of research and providedirections for future work. BibTeX Entry@InProceedings{Oliehoek05BNAIC, author = {Frans A. Oliehoek and Nikos Vlassis and Edwin de Jong}, title = {Coevolutionary {N}ash in Poker Games}, booktitle = BNAIC05, month = oct, year = 2005, pages = {188--193}, abstract = { We address the problem of learning good policies in poker games. The classical game theoretic approach to this problem specifies a Nash equilibrium solution, i.e., a pair of secure mixed (randomized) policies. We describe a new approach for calculating such secure policies based on coevolution. Here, populations of pure policies for both players are simultaneously evolved by repeated comparisons against each other, and secure mixed policies are computed from both populations by linear programming. The search heuristic for adding new candidate pure policies involves computing a best-response pure policy (by solving a POMDP) that provides a worst-case payoff for each mixed policy. We provide experimental results suggesting that a Nash equilibrium policy can be approximated in relatively few iterations, thereby producing mixed policies with relatively small support. We conclude that this is a promising direction of research and provide directions for future work.} }
Generated by
bib2html.pl
(written by Patrick Riley) on
Mon Oct 07, 2024 14:17:04 UTC |