In this work we show how symmetries that occur in MDPs can be exploited for more efficient deep reinforcement learning.
This paper shows that “prediction rewards” can also be employed for active perception in decentralized multiagent settings. Intuitively, this leads to a type of voting that we try to optimize.
The camera-ready version of Influence-Augmented Online Planning for Complex Environments is now available.
In this work, we show that by learning approximate representations of influence, we can speed up online planning (POMCP) enough to obtain better performance when the time for online decision making is constrained.
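To give a flavor of the idea in code (a minimal sketch with illustrative names, not the paper's implementation): rather than stepping an expensive global simulator inside the planner, we step only a local region of the environment, with a learned model predicting the "influence source" through which the rest of the world affects that region.

```python
def influence_augmented_step(local_state, action, history,
                             influence_model, local_dynamics):
    # Instead of simulating the full global environment, predict the
    # influence source (how the rest of the world affects the local
    # region) from the local history with a learned model...
    u = influence_model(history)
    # ...and step only the cheap local dynamics, conditioned on it.
    next_state, reward = local_dynamics(local_state, action, u)
    return next_state, reward
```

Inside POMCP, a step like this replaces the global-environment step, which is what lets many more simulations fit into a fixed planning budget.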
Three of our papers were accepted at NeurIPS. For short descriptions, see my tweet.
(Updated) arXiv links will follow…
I’m looking for a postdoc to work on learning in interactive settings. Please see https://www.fransoliehoek.net/wp/vacancies/.
This paper tackles the problem of active perception: taking actions to minimize one’s uncertainty. It further formalizes the link between information gain and prediction rewards, and uses this to propose a deep-learning approach to optimize active perception from a data set, thus obviating the need for a complex POMDP model.
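As a hedged illustration of the prediction-reward idea (the names are my own, not the paper's code): a prediction reward scores how well a learned classifier predicts the hidden variable of interest from the gathered observations, so maximizing it steers the agent toward uncertainty-reducing actions.

```python
import math

def prediction_reward(belief, true_label):
    # Log-likelihood the (learned) classifier assigns to the true hidden
    # label; this equals the negative cross-entropy for a one-hot target
    # and is higher the less uncertain the agent is about the truth.
    return math.log(belief[true_label] + 1e-12)

# Informative sensing actions sharpen the belief and so earn more reward.
vague = [0.4, 0.3, 0.3]    # before gathering information
sharp = [0.9, 0.05, 0.05]  # after informative actions
assert prediction_reward(sharp, 0) > prediction_reward(vague, 0)
```

This is what lets the approach be trained from a data set of (observations, label) pairs, rather than requiring a full POMDP model.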
Aleksander Czechowski got his paper on Decentralized MCTS via Learned Teammate Models accepted at IJCAI 2020.
In this paper, each agent learns models of the other agents, which it then uses to predict their behavior when planning. Stay tuned for the camera-ready version.
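The core idea can be sketched as follows (a simplified Monte Carlo version rather than the paper's full MCTS; all names are illustrative): each agent only controls its own action, and a learned teammate model stands in for the other agent's policy during rollouts, so no communication is needed at plan time.

```python
import random

def plan_with_teammate_model(state, my_actions, teammate_model, simulate,
                             n_rollouts=200, depth=5):
    # Evaluate each of our own actions by Monte Carlo rollouts in which a
    # learned model predicts the teammate's action, then pick the action
    # with the highest average return.
    best_action, best_value = None, float("-inf")
    for first in my_actions:
        total = 0.0
        for _ in range(n_rollouts):
            s, ret, a = state, 0.0, first
            for _ in range(depth):
                mate = teammate_model(s)       # predicted teammate action
                s, r = simulate(s, a, mate)    # joint-action transition
                ret += r
                a = random.choice(my_actions)  # random rollout policy
            total += ret
        value = total / n_rollouts
        if value > best_value:
            best_action, best_value = first, value
    return best_action
```

Because each agent plans over joint outcomes using its prediction of the teammate, coordination can emerge without the agents exchanging plans.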
Together with Thomas Kipf, Max Welling and myself, Elise van der Pol did some excellent work on model-based RL.
- This post on plannable approximations with MDP homomorphisms
- The paper
I will be co-organizing an AAAI spring symposium on “Challenges and Opportunities for Multi-Agent Reinforcement Learning”. We want to make it a workshop with some actual ‘work’. Please read here for more info.
This is the question that De Volkskrant asked me to comment on. Find the piece here (in Dutch).