Point-Based Planning for Multi-Objective POMDPs

Diederik M. Roijers, Shimon Whiteson, and Frans A. Oliehoek. Point-Based Planning for Multi-Objective POMDPs. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), pp. 1666–1672, July 2015.

Abstract

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.

BibTeX Entry

@inproceedings{Roijers15IJCAI,
  author = {Diederik M. Roijers and Shimon Whiteson and Frans A. Oliehoek},
  title = {Point-Based Planning for Multi-Objective {POMDPs}},
  booktitle = IJCAI15,
  year = 2015,
  month = jul,
  pages = {1666--1672},
  url = {https://www.aaai.org/ocs/index.php/IJCAI/IJCAI15/paper/view/10939/10894},
  abstract = {Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment's state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.}
}
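
The abstract's main idea, solving one scalarized single-objective problem per weighting of the objectives, can be illustrated with a small sketch. This is not OLSAR and not the authors' code: the candidate value vectors, the fixed weight grid, and the function names (`scalarize`, `best_for_weight`, `solve_over_weights`) are all hypothetical, and each "POMDP solve" is replaced by a lookup over a handful of precomputed two-objective value vectors. The reuse of value functions across iterations is not modelled here.

```python
# Illustrative sketch only: NOT the OLSAR algorithm from the paper.
# Assumption: each candidate policy is summarized by a fixed
# two-objective value vector, so "solving" a scalarized problem
# reduces to picking the vector with the highest weighted sum.
from typing import List, Tuple

Vector = Tuple[float, float]


def scalarize(value: Vector, w: Vector) -> float:
    """Linear scalarization: the weighted sum of the objective values."""
    return sum(wi * vi for wi, vi in zip(w, value))


def best_for_weight(candidates: List[Vector], w: Vector) -> Vector:
    """Stand-in for a single-objective solver applied to weighting w."""
    return max(candidates, key=lambda v: scalarize(v, w))


def solve_over_weights(candidates: List[Vector],
                       weights: List[Vector]) -> List[Vector]:
    """Collect the distinct value vectors that are best for some weighting."""
    found: List[Vector] = []
    for w in weights:
        v = best_for_weight(candidates, w)
        if v not in found:
            found.append(v)
    return found


# Hypothetical data: four policies, two objectives.
candidates = [(1.0, 9.0), (5.0, 6.0), (8.0, 2.0), (3.0, 3.0)]
# A fixed grid of weightings (w1, 1 - w1); the paper's method chooses
# weightings adaptively instead of using a grid.
weights = [(k / 4, 1 - k / 4) for k in range(5)]

solution_set = solve_over_weights(candidates, weights)
print(solution_set)  # [(1.0, 9.0), (5.0, 6.0), (8.0, 2.0)]
```

The vector `(3.0, 3.0)` is never selected because another candidate scores higher under every weighting in the grid, which is why it does not appear in the resulting set.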