My main research interests lie in what I call interactive learning and decision making: the intersection of AI, machine learning and game theory. I try to generate fundamental knowledge about algorithms and models for complex tasks. In addition, I think about how such abstract models might be applied to challenging real-world tasks such as collaboration in multi-robot systems, optimization of traffic control systems, intelligent e-commerce agents, etc.
For more information about my research, look here.
For research opportunities, look here.
For more information about possible student projects (current TU Delft students), look here.
April 24th, 2020: AAMAS: Maximizing Information Gain via Prediction Rewards
This paper tackles the problem of active perception: taking actions to minimize one’s uncertainty. It further formalizes the link between information gain and prediction rewards, and uses this to propose a deep-learning approach to optimize active perception from a data set, thus obviating the need for a complex POMDP model.
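To give a feel for the link between information gain and prediction rewards, here is a minimal sketch. It assumes a logarithmic score as the prediction reward; that choice is for illustration only and is not necessarily the reward used in the paper. The function names and the example belief are hypothetical. Under this assumption, the best expected reward for predicting the hidden state equals the negative entropy of the belief, so reducing uncertainty and earning prediction reward go hand in hand.

```python
import math

def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief over states."""
    return -sum(p * math.log(p) for p in belief if p > 0)

def expected_log_prediction_reward(belief, prediction):
    """Expected reward for announcing `prediction` when the true state is
    drawn from `belief`, using the (assumed) log score log(prediction[s]).
    Assumes prediction[s] > 0 wherever belief[s] > 0."""
    return sum(b * math.log(q) for b, q in zip(belief, prediction) if b > 0)

# Hypothetical belief over three states.
belief = [0.7, 0.2, 0.1]

# Predicting the belief itself is optimal and yields exactly -entropy(belief).
honest = expected_log_prediction_reward(belief, belief)

# Any other prediction, e.g. a uniform guess, earns less in expectation.
uniform_guess = expected_log_prediction_reward(belief, [1 / 3, 1 / 3, 1 / 3])

print(honest, -entropy(belief), uniform_guess)
```

A sharper belief therefore allows a higher prediction reward, which is what makes it possible to train on prediction quality instead of computing entropies from a full model.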
April 24th, 2020: IJCAI paper: Decentralized MCTS via Learned Teammate Models
Aleksander Czechowski got his paper on Decentralized MCTS via Learned Teammate Models accepted at IJCAI 2020.
In this paper, we learn models of the other agents, which each agent then uses to predict the future. Stay tuned for the camera-ready version.
March 2nd, 2020: AAMAS camready: Model-based RL
Elise van der Pol, together with Thomas Kipf, Max Welling and me, did some excellent work on model-based RL. See:
- This post on plannable approximations with MDP homomorphisms
- The paper
April 25th, 2019: Influence-Based Abstraction in Deep Reinforcement Learning
April 12th, 2019: Scaling Bayesian RL for Factored POMDPs
Reinforcement learning is tough. POMDPs are hard. And doing RL in partially observable problems is a huge challenge. With Sammie and Chris Amato, I have been making some progress on getting a principled method (based on Monte Carlo tree search) to scale for structured problems. We can learn both how to act and the structure of the problem at the same time. See the paper and bib.
February 27th, 2019: At AAMAS: Deep learning of Coordination…?
Can deep Q-networks etc. brute force their way through tough coordination problems…? Perhaps not. Jacopo’s work, accepted as an extended abstract at AAMAS’19, takes a first step in exploring this in the one-shot setting.
Not so surprising: a “joint Q-learner” can be too large/slow, and “individual Q-learners” can fail to find good representations.
But good to know: “factored Q-value functions”, which represent the Q-function as a random mixture of components involving 2 or 3 agents, can do quite well, even for hard coordination tasks!
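As a rough illustration of the factored idea, the sketch below represents a one-shot Q-function as a sum of small components, each covering a randomly chosen pair of agents. This is an assumption-laden toy, not the method from the paper: the components here are lookup tables filled with random numbers standing in for learned networks, and all names (`make_factored_q`, `q_value`, `best_joint_action`) are hypothetical.

```python
import itertools
import random

def make_factored_q(n_agents, n_actions, n_factors, factor_size=2, seed=0):
    """Build a list of (scope, table) components. Each scope is a random
    subset of agents; each table maps local joint actions to a value.
    Random values are placeholders for what would be learned."""
    rng = random.Random(seed)
    factors = []
    for _ in range(n_factors):
        scope = tuple(sorted(rng.sample(range(n_agents), factor_size)))
        table = {
            local: rng.random()
            for local in itertools.product(range(n_actions), repeat=factor_size)
        }
        factors.append((scope, table))
    return factors

def q_value(factors, joint_action):
    """Q(a) as the sum of the component values on their local actions."""
    return sum(
        table[tuple(joint_action[i] for i in scope)] for scope, table in factors
    )

def best_joint_action(factors, n_agents, n_actions):
    """Brute-force maximisation over joint actions (fine for tiny problems)."""
    joint_actions = itertools.product(range(n_actions), repeat=n_agents)
    return max(joint_actions, key=lambda a: q_value(factors, a))

n_agents, n_actions, n_factors = 6, 3, 5
factors = make_factored_q(n_agents, n_actions, n_factors)
best = best_joint_action(factors, n_agents, n_actions)

# Number of values to represent: full joint table vs. factored components.
joint_size = n_actions ** n_agents
factored_size = sum(len(table) for _, table in factors)
print(best, joint_size, factored_size)
```

With 6 agents and 3 actions each, the joint table has 729 entries while five pairwise components need only 45, which hints at why a factored representation can be easier to learn than a single joint Q-learner.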
- August 2017 Great news: I have won an ERC Starting Grant!
- August 2017 Our article on active perception is published in Autonomous Robots.
- May 2017 ICML accepted our paper ‘Learning in POMDPs with Monte Carlo Tree Search’
- May 2017 EPSRC will fund my ‘first grant’ proposal. I will be looking for a postdoc with interests in multiagent systems, deep/reinforcement learning.
- April 2017 1000 citations – yay! 😉
- April 2017 Our AAMAS paper is nominated for best paper!
- Feb 2017 Daniel Claes has been doing awesome work getting our Warehouse Commissioning planning methods to work on real robots:
- Feb 2017 MADP Toolbox 0.4.1 released!
- Feb 2017 AAMAS accepted our paper on Multi-Robot Warehouse Commissioning, yay!
- Oct 2016 My student Elise’s thesis on traffic light control via deep reinforcement learning will be presented at BNAIC. We also have a video that nicely illustrates the behavior of our novel deep Q-learning + transfer planning approach (also described in this demo paper). A NIPS workshop paper is under submission; let me know if you are interested.
- Aug 2016 I am co-organising the NIPS workshop on Learning, Inference and Control of Multi-Agent Systems taking place Friday 9th December 2016, Barcelona, Spain. Join us!
- Jul 2016 The book on Dec-POMDPs I wrote with Chris Amato got published! Check it out here or at Springer.
- Apr 2016 Our paper on PAC Greedy Maximisation with Efficient Bounds on Information Gain for Sensor Selection is accepted at IJCAI. I will also present the poster at AAMAS in Singapore.
- Dec 2015 AAAI ’16 has accepted 4 of my papers and they are now on my publication page.
- Sept 2015 I will give an invited talk at the NIPS 2015 Workshop on Learning, Inference and Control of Multi-Agent Systems, Montreal
- July 2015 I am co-organizing the AAAI spring symposium on Challenges and Opportunities in Multiagent Learning for the Real World. While multiagent learning has been an active area of research for the last two decades, many challenges (such as uncertainty, partial observability, communication limitations, etc.) remain to be solved…!
- June 2015 I am co-organizing the Sequential Decision Making for Intelligent Agents AAAI fall symposium. We aim to bring together researchers who work on formal decision making methods (MDPs, POMDPs, Dec-POMDPs, I-POMDPs, etc.)
- April 2015 MADP v0.3.1 released!
- April 2015 Yay – IJCAI has accepted three of my papers!
- February 2015 Our article Computing Convex Coverage Sets for Faster Multi-Objective Coordination was accepted for publication in the Journal of AI Research.
- January 2015 Our work on computing upper bounds for factored Dec-POMDPs will appear at AAMAS as an extended abstract. Check out the extended version.
- January 2015 AAMAS has accepted our work on spatial task allocation problems (SPATAPS), see here.
- November 2014 AAAI has accepted two of my papers! One is about exploiting factored value functions in POMCP. The other investigates the use of sub-modularity in the full sequential POMDP setting.
- October 2014 A recent development in Dec-POMDP town is that these beasts can be reduced to a special case of (centralized) POMDPs. Chris and I decided it was time to try and give a quick overview of this approach in this technical report.
- July 2014 I started as Lecturer at the University of Liverpool.
- May 2014 It’s been a while, but we are releasing a new version of the MADP Toolbox! Download it now: here. We are interested in your feedback, so let us know if you have problems or suggestions.
- April 2014 Chris Amato and I are doing interesting things on Bayesian reinforcement learning for MASs under state uncertainty. See our MSDM paper and technical report for more information.
- December 2013 Our paper Bounded Approximations for Linear Multi-Objective Planning under Uncertainty was accepted for publication at ICAPS.
- December 2013 Good news from AAMAS: A POMDP Based Approach to Optimally Select Sellers in Electronic Marketplaces and Linear Support for Multi-Objective Coordination Graphs both got accepted as full papers!
- October 2013 Our paper Effective Approximations for Spatial Task Allocation Problems was runner up for best paper at BNAIC.
- July 2013 I have been awarded a Veni grant for three years of research!
- June 2013 Presentations from AAMAS and an invited talk at CWI can now be found under Research highlights.
- April 2013 My paper Sufficient Plan-Time Statistics for Decentralized POMDPs is accepted at IJCAI 2013!
- April 2013 Christopher Amato and I are working on Bayesian RL for multiagent systems under uncertainty. See our paper here.
- February 2013 JAIR has accepted our article Incremental Clustering and Expansion for Faster Optimal Planning in Decentralized POMDPs for publication!
- December 2012 Our paper Approximate Solutions for Factored Dec-POMDPs with Many Agents is accepted as a full paper at AAMAS 2013!
- December 2012 Our paper Multi-Objective Variable Elimination for Collaborative Graphical Games is accepted as an extended abstract at AAMAS 2013.
- June 2012 Our paper Exploiting Structure in Cooperative Bayesian Games was accepted at UAI 2012!
- June 2012 Yay! I won the best PC member award at AAMAS 2012!
- June 2012 The 7th MSDM workshop was very interesting! Check the proceedings at the MSDM website.
- April 2012 Two of my papers will appear at AAAI!
- January 2012 My book chapter on decentralized POMDPs is finally published! It can be used as an introduction to Dec-POMDPs and also provides insight into how the current state-of-the-art algorithms (forward heuristic search and backward dynamic programming) for finite-horizon Dec-POMDPs relate to each other.
- December 2011 Two of my papers will appear at AAMAS: a full paper about heuristic search of multiagent influence and an extended abstract about an effective method for performing the vector-based backup in settings with delayed communication.
- July 2010 I’ve started at MIT on July 15th!
- Look, my thesis is on Google Books!
- February 2010 On February 12th 2010 I successfully defended my thesis, titled “Value-Based Planning for Teams of Agents in Stochastic Partially Observable Environments”.
- Prashant Doshi and his students Christopher Jackson and Kenneth Bogert used the multiagent decision process (MADP) toolbox to compute policies in the game of StarCraft. Watch the (pretty cool) video.