Publications

• Sorted by Date • Classified by Publication Type • Classified by Research Category •

Lossless Clustering of Histories in Decentralized POMDPs

Frans A. Oliehoek, Shimon Whiteson, and Matthijs T. J. Spaan. Lossless Clustering of Histories in Decentralized POMDPs. In Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 577–584, May 2009.

Download

pdf [209.6kB]

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the criterion, they have the same optimal value and thus can be treated as one. We show how this result can be exploited in optimal policy search and demonstrate empirically that it can provide a speed-up of multiple orders of magnitude, allowing the optimal solution of significantly larger problems. We also perform an empirical analysis of the generality of our clustering method, which suggests that it may also be useful in other (approximate) Dec-POMDP solution methods.

BibTeX Entry

@InProceedings{Oliehoek09AAMAS,
  author =       {Frans A. Oliehoek and 
                  Shimon Whiteson and
                  Matthijs T. J. Spaan},
  title =        {Lossless Clustering of Histories in Decentralized {POMDP}s},
  booktitle =    AAMAS09,
  month =        may,
  year =         2009,
  pages =        {577--584},
  url =         {http://www.ifaamas.org/Proceedings/aamas09/pdf/01_Full%20Papers/09_49_FP_0366.pdf},
  abstract = 	 {
  Decentralized partially observable Markov decision processes (Dec-POMDPs)
  constitute a generic and expressive framework for multiagent planning
  under uncertainty. However, planning optimally is difficult because
  solutions map local observation histories to actions, and the number
  of such histories grows exponentially in the planning horizon. In
  this work, we identify a criterion that allows for lossless clustering
  of observation histories: i.e., we prove that when two histories satisfy
  the criterion, they have the same optimal value and thus can be treated
  as one. We show how this result can be exploited in optimal policy
  search and demonstrate empirically that it can provide a speed-up
  of multiple orders of magnitude, allowing the optimal solution of
  significantly larger problems.  We also perform an empirical analysis
  of the generality of our clustering method, which suggests that it
  may also be useful in other (approximate) Dec-POMDP solution methods. 
  }
}

Generated by bib2html.pl (written by Patrick Riley) on Tue Jun 25, 2024 12:39:45 UTC