Publications

Q-value Functions for Decentralized POMDPs

Frans A. Oliehoek and Nikos Vlassis. Q-value Functions for Decentralized POMDPs. In Proceedings of the Sixth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 833–840, May 2007.

Abstract

Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-) optimal Q-value function is computed in a recursive manner by dynamic programming, and then a policy is extracted from this value function. In this paper we study whether similar Q-value functions can be defined in decentralized POMDP models (Dec-POMDPs), what the cost of computing such value functions is, and how policies can be extracted from such value functions. Using the framework of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy search. Then we analyze various approximate Q-value functions that allow efficient computation. Finally, we describe a family of algorithms for extracting policies from such Q-value functions.
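The single-agent case mentioned in the abstract (compute a Q-value function recursively by dynamic programming, then extract a policy from it) can be illustrated with a minimal sketch. The two-state MDP below, the names `P`, `R`, `finite_horizon_q`, and `greedy_policy`, and the horizon of 3 are all assumptions made for illustration; none of them come from the paper, and the sketch does not cover the Dec-POMDP setting the paper studies.

```python
# Toy single-agent MDP (hypothetical, not taken from the paper).
# P[s][a] is a list of (next_state, probability) pairs.
# R[s][a] is the immediate reward for taking action a in state s.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
}


def finite_horizon_q(P, R, horizon):
    """Return Q[t][s][a] for t = 0..horizon-1 via backward dynamic programming."""
    states = list(P)
    Q = [None] * horizon
    # Value after the final stage is zero for every state.
    next_value = {s: 0.0 for s in states}
    for t in reversed(range(horizon)):
        Q[t] = {
            s: {
                a: R[s][a] + sum(p * next_value[s2] for s2, p in P[s][a])
                for a in P[s]
            }
            for s in states
        }
        # The optimal stage-t value is the best Q-value in each state.
        next_value = {s: max(Q[t][s].values()) for s in states}
    return Q


def greedy_policy(Q):
    """Extract policy[t][s] = argmax over a of Q[t][s][a]."""
    return [{s: max(q_s, key=q_s.get) for s, q_s in q_t.items()} for q_t in Q]


Q = finite_horizon_q(P, R, horizon=3)
policy = greedy_policy(Q)
```

In this toy model the greedy policy picks action 1 in every state at every stage. The decentralized case differs in that each agent acts on its own observation history, which is why the paper turns to Bayesian games for policy extraction.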

BibTeX Entry

@InProceedings{Oliehoek07AAMAS,
    author =    {Frans A. Oliehoek and Nikos Vlassis},
    title =     {Q-value Functions for Decentralized {POMDP}s},
    booktitle = AAMAS07,
    month =     may,
    year =      2007,
    pages =     {833--840},
    url =       {http://www.ifaamas.org/Proceedings/aamas07/html/AAMAS07_0148_b86e432dcdf71f75f5c8edada8a5ae70.xml},
    abstract =  {
    Planning in single-agent models like MDPs and POMDPs can be carried
    out by resorting to Q-value functions: a (near-) optimal Q-value 
    function is computed in a recursive manner by dynamic programming, 
    and then a policy is extracted from this value function. In this 
    paper we study whether similar Q-value functions can be defined 
    in decentralized POMDP models (Dec-POMDPs), what the cost of 
    computing such value functions is, and how policies can be 
    extracted from such value functions. Using the framework of 
    Bayesian games, we argue that searching for the optimal Q-value 
    function may be as costly as exhaustive policy search. Then we 
    analyze various approximate Q-value functions that allow efficient
    computation. Finally, we describe a family of algorithms for 
    extracting policies from such Q-value functions. 
    }
}
