Publications

Point-Based Planning for Multi-Objective POMDPs

Diederik Roijers, Shimon Whiteson, and Frans A. Oliehoek. Point-Based Planning for Multi-Objective POMDPs. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, pp. 1666–1672, July 2015.

Abstract

Many sequential decision-making problems require an agent to reason about both multiple objectives and uncertainty regarding the environment’s state. Such problems can be naturally modelled as multi-objective partially observable Markov decision processes (MOPOMDPs). We propose optimistic linear support with alpha reuse (OLSAR), which computes a bounded approximation of the optimal solution set for all possible weightings of the objectives. The main idea is to solve a series of scalarized single-objective POMDPs, each corresponding to a different weighting of the objectives. A key insight underlying OLSAR is that the policies and value functions produced when solving scalarized POMDPs in earlier iterations can be reused to more quickly solve scalarized POMDPs in later iterations. We show experimentally that OLSAR outperforms, both in terms of runtime and approximation quality, alternative methods and a variant of OLSAR that does not leverage reuse.
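The scalarization idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' OLSAR algorithm and solves no POMDP: it assumes a hypothetical, already-computed list of two-objective value vectors (one per candidate policy) and sweeps a coarse grid of weightings, keeping whichever vector maximizes the weighted sum for at least one weighting. The names `scalarize`, `best_for_weight`, and `sweep_two_objectives` are illustrative only.

```python
from typing import List, Tuple

Vector = Tuple[float, float]


def scalarize(value: Vector, weights: Vector) -> float:
    """Linear scalarization: weighted sum of the per-objective values."""
    return sum(v * w for v, w in zip(value, weights))


def best_for_weight(values: List[Vector], weights: Vector) -> Vector:
    """Return the value vector with the highest scalarized value."""
    return max(values, key=lambda v: scalarize(v, weights))


def sweep_two_objectives(values: List[Vector], steps: int = 10) -> List[Vector]:
    """Keep every vector that is best for at least one weighting on a grid.

    Assumption: two objectives, weights (w1, 1 - w1) on a uniform grid.
    A grid sweep can miss vectors that are optimal only between grid
    points; it is a stand-in for a proper weight-selection scheme.
    """
    kept: List[Vector] = []
    for i in range(steps + 1):
        w1 = i / steps
        best = best_for_weight(values, (w1, 1.0 - w1))
        if best not in kept:
            kept.append(best)
    return kept


# Hypothetical value vectors for four candidate policies.
candidates = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.3, 0.3)]
coverage = sweep_two_objectives(candidates)
```

In this toy input, `(0.3, 0.3)` is never the maximizer for any weighting, so it is dropped, while the other three vectors are each best for some weighting and are kept.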

BibTeX Entry

@inproceedings{Roijers15IJCAI,
    author =    {Diederik Roijers and Shimon Whiteson and Frans A. Oliehoek},
    title =     {Point-Based Planning for Multi-Objective {POMDPs}},
    booktitle = IJCAI15,
    year =      2015,
    month =     jul,
    pages =     {1666--1672},
    note =      {},
    abstract = {
    Many sequential decision-making problems require
    an agent to reason about both multiple objectives
    and uncertainty regarding the environment’s
    state. Such problems can be naturally modelled as
    multi-objective partially observable Markov decision
    processes (MOPOMDPs). We propose optimistic
    linear support with alpha reuse (OLSAR),
    which computes a bounded approximation of the
    optimal solution set for all possible weightings of
    the objectives. The main idea is to solve a series
    of scalarized single-objective POMDPs, each
    corresponding to a different weighting of the
    objectives. A key insight underlying OLSAR is that the
    policies and value functions produced when solving
    scalarized POMDPs in earlier iterations can be
    reused to more quickly solve scalarized POMDPs
    in later iterations. We show experimentally that
    OLSAR outperforms, both in terms of runtime and
    approximation quality, alternative methods and a
    variant of OLSAR that does not leverage reuse.
    }
}