Publications

A Cross-Entropy Approach to Solving Dec-POMDPs

Frans A. Oliehoek, Julian F. P. Kooij, and Nikos Vlassis. A Cross-Entropy Approach to Solving Dec-POMDPs. In Proceedings of the 1st International Symposium on Intelligent and Distributed Computing (IDC), pp. 145–154, October 2007.

Abstract

Decentralized POMDPs (Dec-POMDPs) are becoming increasingly popular as models for multiagent planning under uncertainty, but solving a Dec-POMDP exactly is known to be intractable. In this paper we examine the use of the Cross-Entropy (CE) method as a randomized (sampling-based) algorithm for approximately solving Dec-POMDPs. In our setup, CE operates by sampling pure policies from an appropriately parametrized stochastic policy, and then evaluates these policies either exactly or approximately in order to define the next stochastic policy to sample from, and so on until convergence. Experimental results demonstrate the potential of the CE method as an alternative to state-of-the-art approximate techniques for solving Dec-POMDPs.
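
The sample, evaluate, update loop described in the abstract can be sketched as follows. This is a minimal illustration of the generic Cross-Entropy method, not the paper's implementation. It assumes a joint policy can be flattened into a fixed number of decision points (for example, one per agent and observation history), each choosing among the same set of actions. The function name `cross_entropy_search`, its parameters, and the toy `evaluate` function are all hypothetical. For a real Dec-POMDP, `evaluate` would return the exact or estimated expected cumulative reward of the joint policy.

```python
import random


def cross_entropy_search(n_points, n_actions, evaluate, n_samples=50,
                         n_elite=10, alpha=0.7, n_iters=30, seed=0):
    """Generic CE search over pure policies (illustrative sketch only)."""
    rng = random.Random(seed)
    # Stochastic policy: one categorical distribution per decision point,
    # initialized to uniform.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_points)]
    best_policy, best_value = None, float("-inf")

    for _ in range(n_iters):
        # 1. Sample pure policies from the current stochastic policy.
        samples = []
        for _ in range(n_samples):
            policy = tuple(
                rng.choices(range(n_actions), weights=p)[0] for p in probs
            )
            # 2. Evaluate each sampled policy.
            samples.append((evaluate(policy), policy))

        # 3. Keep the best-performing (elite) samples.
        samples.sort(key=lambda s: s[0], reverse=True)
        elite = samples[:n_elite]
        if elite[0][0] > best_value:
            best_value, best_policy = elite[0]

        # 4. Move each distribution toward the elite action frequencies,
        #    smoothed with learning rate alpha.
        for i in range(n_points):
            counts = [0] * n_actions
            for _, policy in elite:
                counts[policy[i]] += 1
            for a in range(n_actions):
                freq = counts[a] / n_elite
                probs[i][a] = alpha * freq + (1 - alpha) * probs[i][a]

    return best_policy, best_value, probs


# Toy stand-in for policy evaluation (an assumption, not a Dec-POMDP):
# the value is the number of decision points that match a hidden target.
target = (0, 1, 2, 1, 0, 2)


def evaluate(policy):
    return sum(1 for chosen, wanted in zip(policy, target) if chosen == wanted)


best_policy, best_value, probs = cross_entropy_search(
    n_points=len(target), n_actions=3, evaluate=evaluate
)
```

The smoothing parameter `alpha` keeps the distribution from collapsing onto the first elite set. A fixed iteration count stands in here for the convergence test mentioned in the abstract.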

BibTeX Entry

@InProceedings{Oliehoek07IDC,
    author =       {Frans A. Oliehoek and Julian F. P. Kooij and 
                    Nikos Vlassis},
    title =        {A Cross-Entropy Approach to Solving {Dec-POMDPs}},
    booktitle =    {Proceedings of the 1st International Symposium on 
                    Intelligent and Distributed Computing (IDC)},
    month =        oct,
    year =         2007,
    pages =        {145--154},
    abstract =     {
    Decentralized POMDPs (Dec-POMDPs) are becoming increasingly popular
    as models for multiagent planning under uncertainty, but solving
    a Dec-POMDP exactly is known to be intractable. In this paper we
    examine the use of the Cross-Entropy (CE) method as a randomized
    (sampling-based) algorithm for approximately solving Dec-POMDPs.
    In our setup, CE operates by sampling pure policies from an
    appropriately parametrized stochastic policy, and then evaluates
    these policies either exactly or approximately in order to define
    the next stochastic policy to sample from, and so on until
    convergence. Experimental results demonstrate the potential of
    the CE method as an alternative to state-of-the-art approximate
    techniques for solving Dec-POMDPs.}
}
