Publications

Abstraction for Bayesian Reinforcement Learning in Factored POMDPs
Rolf A. N. Starre, Sammie Katt, Mustafa Mert Çelikok, Marco Loog, and Frans A. Oliehoek. Abstraction for Bayesian Reinforcement Learning in Factored POMDPs. Transactions on Machine Learning Research, 2025.

Download: https://openreview.net/forum?id=HHgdT6m9L9

Abstract
Bayesian reinforcement learning provides an elegant solution to the exploration–exploitation trade-off in Partially Observable Markov Decision Processes (POMDPs) when the environment's dynamics and reward function are initially unknown. By maintaining a belief over these unknown components and the state, the agent can effectively learn the environment's dynamics and optimize its policy. However, scaling Bayesian reinforcement learning methods to large problems remains a significant challenge. While prior work has leveraged factored models and online sample-based planning to address this issue, these approaches often retain unnecessarily complex models and factors within the belief space that have minimal impact on the optimal policy. While this complexity might be necessary for accurate model learning, in reinforcement learning the primary objective is not to recover the ground-truth model but to optimize the policy for maximizing the expected sum of rewards. Abstraction offers a way to reduce model complexity by removing factors that are less relevant to achieving high rewards. In this work, we propose and analyze the integration of abstraction with online planning in factored POMDPs. Our empirical results demonstrate two key benefits. First, abstraction reduces model size, enabling faster simulations and thus more planning simulations within a fixed runtime. Second, abstraction enhances performance even with a fixed number of simulations due to greater statistical strength. These results underscore the potential of abstraction to improve both the scalability and effectiveness of Bayesian reinforcement learning in factored POMDPs.

BibTeX Entry

@article{Starre25TMLR,
  title    = {Abstraction for Bayesian Reinforcement Learning in Factored POMDPs},
  author   = {Rolf A. N. Starre and Sammie Katt and Mustafa Mert Çelikok and Marco Loog and Frans A. Oliehoek},
  journal  = {Transactions on Machine Learning Research},
  year     = {2025},
  url      = {https://openreview.net/forum?id=HHgdT6m9L9},
  keywords = {refereed}
}
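The abstract's core idea, pooling planning statistics over an abstracted, reward-relevant subset of state factors, can be illustrated with a minimal Python sketch. This is purely illustrative and not the paper's implementation: the factor names ("pos", "noise"), the toy dynamics, and the simple rollout planner are all invented for the example.

import random

def abstract_state(state, relevant_factors):
    # Project a factored state (a dict of factor -> value) onto the
    # factors retained by the abstraction; all others are dropped.
    return tuple(state[f] for f in relevant_factors)

def simulate_step(state, action):
    # Toy factored dynamics: 'pos' drives the reward, while 'noise' is a
    # reward-irrelevant factor that the abstraction can safely remove.
    nxt = dict(state)
    nxt["pos"] = max(-5, min(5, state["pos"] + action))
    nxt["noise"] = random.randint(0, 9)
    reward = 1.0 if nxt["pos"] == 3 else 0.0
    return nxt, reward

def plan(state, actions, depth, relevant_factors, n_sims=500):
    # One-step lookahead with random rollouts. Returns are pooled per
    # (abstract state, first action), so simulations that differ only in
    # dropped factors strengthen the same estimate -- the "statistical
    # strength" benefit the abstract refers to.
    totals, counts = {}, {}
    for _ in range(n_sims):
        first = random.choice(actions)
        s, ret, a = dict(state), 0.0, first
        for _ in range(depth):
            s, r = simulate_step(s, a)
            ret += r
            a = random.choice(actions)
        key = (abstract_state(state, relevant_factors), first)
        totals[key] = totals.get(key, 0.0) + ret
        counts[key] = counts.get(key, 0) + 1

    def value(a):
        key = (abstract_state(state, relevant_factors), a)
        return totals.get(key, 0.0) / counts.get(key, 1)

    return max(actions, key=value)

print("chosen action:", plan({"pos": 0, "noise": 4},
                             actions=[-1, 0, 1], depth=4,
                             relevant_factors=["pos"]))

With relevant_factors=["pos"], the planner never conditions its estimates on the ten possible "noise" values, so every simulation contributes to a single, smaller set of statistics; this mirrors the paper's claim that abstraction helps even at a fixed number of simulations.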
Generated by bib2html.pl (written by Patrick Riley) on Mon Jun 30, 2025 20:06:05 UTC