Publications

Abstraction for Bayesian Reinforcement Learning in Factored POMDPs

Rolf A. N. Starre, Sammie Katt, Mustafa Mert Çelikok, Marco Loog, and Frans A. Oliehoek. Abstraction for Bayesian Reinforcement Learning in Factored POMDPs. Transactions on Machine Learning Research, 2025.

Download

pdf [2.7MB]  

Abstract

Bayesian reinforcement learning provides an elegant solution to the exploration–exploitation trade-off in Partially Observable Markov Decision Processes (POMDPs) when the environment's dynamics and reward function are initially unknown. By maintaining a belief over these unknown components and the state, the agent can effectively learn the environment's dynamics and optimize its policy. However, scaling Bayesian reinforcement learning methods to large problems remains a significant challenge. While prior work has leveraged factored models and online sample-based planning to address this issue, these approaches often retain unnecessarily complex models and factors within the belief space that have minimal impact on the optimal policy. While this complexity might be necessary for accurate model learning, in reinforcement learning the primary objective is not to recover the ground-truth model but to optimize the policy for maximizing the expected sum of rewards. Abstraction offers a way to reduce model complexity by removing factors that are less relevant to achieving high rewards. In this work, we propose and analyze the integration of abstraction with online planning in factored POMDPs. Our empirical results demonstrate two key benefits. First, abstraction reduces model size, enabling faster simulations and thus more planning simulations within a fixed runtime. Second, abstraction enhances performance even with a fixed number of simulations, due to greater statistical strength. These results underscore the potential of abstraction to improve both the scalability and effectiveness of Bayesian reinforcement learning in factored POMDPs.
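
The core idea lends itself to a short illustration: project each factored state onto the reward-relevant factors and maintain count-based (Dirichlet-style) transition statistics over that smaller abstract space, so each count pools more experience. The Python sketch below is a hypothetical toy, not the paper's implementation; the factor names, the choice of relevant factors, and helpers such as abstract_state and sample_next are assumptions made purely for illustration.

# Hypothetical sketch: abstraction over a factored state plus a
# count-based belief over the abstract dynamics. Illustrative only.

from collections import defaultdict
import random

FACTORS = ("position", "battery", "wind", "decor")  # full factored state
RELEVANT = ("position", "battery")                  # factors the abstraction keeps

def abstract_state(state):
    """Project a full factored state (a dict keyed by factor) onto RELEVANT."""
    return tuple(state[f] for f in RELEVANT)

# counts[(abstract state, action)][next abstract state] plays the role of
# the belief over the unknown dynamics; full states that share an abstract
# state pool their data, giving the "greater statistical strength" above.
counts = defaultdict(lambda: defaultdict(float))

def update_belief(state, action, next_state):
    counts[(abstract_state(state), action)][abstract_state(next_state)] += 1.0

def sample_next(abs_state, action, prior=1.0):
    """Sample a next abstract state as a sample-based planner's simulator
    would; +prior acts as a symmetric Dirichlet prior over observed states."""
    table = counts[(abs_state, action)]
    if not table:
        return abs_state  # unvisited (state, action) pair: self-loop fallback
    states = list(table)
    weights = [table[s] + prior for s in states]
    return random.choices(states, weights=weights)[0]

An online sample-based planner would call sample_next inside each simulated step; because the abstract space is smaller, those steps are cheaper and the count tables fill in faster, matching the two benefits reported in the abstract.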

BibTeX Entry

@article{Starre25TMLR,
    title=      {Abstraction for Bayesian Reinforcement Learning in Factored POMDPs},
    author=     {Rolf A. N. Starre and Sammie Katt and Mustafa Mert Çelikok
                 and Marco Loog and Frans A. Oliehoek},
    journal=    {Transactions on Machine Learning Research},
    year=       {2025},
    url=        {https://openreview.net/forum?id=HHgdT6m9L9},
    keywords =  {refereed},
    abstract = {
        Bayesian reinforcement learning provides an elegant solution to the
        exploration–exploitation trade-off in Partially Observable Markov
        Decision Processes (POMDPs) when the environment's dynamics and reward
        function are initially unknown. By maintaining a belief over these
        unknown components and the state, the agent can effectively learn the
        environment's dynamics and optimize its policy. However, scaling
        Bayesian reinforcement learning methods to large problems remains a
        significant challenge. While prior work has leveraged factored models
        and online sample-based planning to address this issue, these
        approaches often retain unnecessarily complex models and factors
        within the belief space that have minimal impact on the optimal
        policy. While this complexity might be necessary for accurate model
        learning, in reinforcement learning the primary objective is not to
        recover the ground-truth model but to optimize the policy for
        maximizing the expected sum of rewards. Abstraction offers a way to
        reduce model complexity by removing factors that are less relevant to
        achieving high rewards. In this work, we propose and analyze the
        integration of abstraction with online planning in factored POMDPs.
        Our empirical results demonstrate two key benefits. First, abstraction
        reduces model size, enabling faster simulations and thus more planning
        simulations within a fixed runtime. Second, abstraction enhances
        performance even with a fixed number of simulations, due to greater
        statistical strength. These results underscore the potential of
        abstraction to improve both the scalability and effectiveness of
        Bayesian reinforcement learning in factored POMDPs.
    }
}

Generated by bib2html.pl (written by Patrick Riley) on Mon Jun 30, 2025 20:06:05 UTC