Quality Assessment of MORL Algorithms: A Utility-Based Approach

Luisa M. Zintgraf, Timon V. Kanters, Diederik M. Roijers, Frans A. Oliehoek, and Philipp Beau. Quality Assessment of MORL Algorithms: A Utility-Based Approach. In Proceedings of the 24th Annual Machine Learning Conference of Belgium and the Netherlands (Benelearn), 2015.

Abstract

Sequential decision-making problems with multiple objectives occur often in practice. In such settings, the utility of a policy depends on how the user values different trade-offs between the objectives. Such valuations can be expressed by a so-called scalarisation function. However, the exact scalarisation function can be unknown when the agents should learn or plan. Therefore, instead of a single solution, the agents aim to produce a solution set that contains an optimal solution for all possible scalarisations. Because it is often not possible to produce an exact solution set, many algorithms have been proposed that produce approximate solution sets instead. We argue that when comparing these algorithms we should do so on the basis of user utility, and on a wide range of problems. In practice, however, comparisons of the quality of these algorithms have typically been done with only a few limited benchmarks and metrics that do not directly express the utility for the user. In this paper, we propose two metrics that express either the expected utility, or the maximal utility loss with respect to the optimal solution set. Furthermore, we propose a generalised benchmark in order to compare algorithms more reliably.

BibTeX Entry

@inproceedings{Zintgraf15Benelearn,
author = {Luisa M. Zintgraf and
Timon V. Kanters and
Diederik M. Roijers and
Frans A. Oliehoek and
Philipp Beau},
title = {Quality Assessment of {MORL} Algorithms: A Utility-Based Approach},
booktitle = Benelearn15,
year = 2015,
abstract = {
Sequential decision-making problems with multiple objectives occur
often in practice. In such settings, the utility of a policy depends on
how the user values different trade-offs between the objectives. Such
valuations can be expressed by a so-called scalarisation function.
However, the exact scalarisation function can be unknown when the
agents should learn or plan. Therefore, instead of a single solution,
the agents aim to produce a solution set that contains an optimal
solution for all possible scalarisations. Because it is often not
possible to produce an exact solution set, many algorithms have been
proposed that produce approximate solution sets instead. We argue that
when comparing these algorithms we should do so on the basis of user
utility, and on a wide range of problems. In practice, however,
comparisons of the quality of these algorithms have typically been done
with only a few limited benchmarks and metrics that do not directly
express the utility for the user. In this paper, we propose two
metrics that express either the expected utility, or the maximal
utility loss with respect to the optimal solution set. Furthermore, we
propose a generalised benchmark in order to compare algorithms more
reliably.
}
}