ILDM@AAMAS

Five ILDM papers will be presented at the main AAMAS conference. Here is the schedule in CEST:

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning
Markus Peschl, Arkady Zgonnikov, Frans Oliehoek and Luciano Siebert
1A2-2 – CEST (UTC +2) Wed, 11 May 2022 18:00
2C5-1 – CEST (UTC +2) Thu, 12 May 2022 09:00

Best-Response Bayesian Reinforcement Learning with BA-POMDPs for Centaurs
Mustafa Mert Çelikok, Frans A. Oliehoek and Samuel Kaski
2C2-2 – CEST (UTC +2) Thu, 12 May 2022 10:00
2A4-3 – CEST (UTC +2) Thu, 12 May 2022 20:00

BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs
Sammie Katt, Hai Nguyen, Frans Oliehoek and Christopher Amato
1A2-2 – CEST (UTC +2) Wed, 11 May 2022 18:00
3B3-2 – CEST (UTC +2) Fri, 13 May 2022 03:00

Speeding up Deep Reinforcement Learning through Influence-Augmented Local Simulators
Miguel Suau, Jinke He, Matthijs Spaan and Frans Oliehoek
Poster session PDC2 – CEST (UTC +2) Thu, 12 May 2022 12:00

Poincaré-Bendixson Limit Sets in Multi-Agent Learning (Best paper runner-up)
Aleksander Czechowski and Georgios Piliouras
1A4-1 – CEST (UTC +2) Wed, 11 May 2022 17:00
3C1-1 – CEST (UTC +2) Fri, 13 May 2022 09:00

First place for ILDM team in the RangL Pathways to Net Zero challenge

Aleksander Czechowski and Jinke He, as team ‘Epsilon-greedy’, are among the winners of the RangL Pathways to Net Zero challenge!

The challenge was to find the optimal pathway to a carbon-neutral 2050. ‘RangL’ is a competition platform created at The Alan Turing Institute as a new model of collaboration between academia and industry. It offers an AI competition environment in which practitioners can apply classical and machine-learning techniques, as well as expert knowledge, to data-driven control problems.

More information: https://rangl.org/blog/ and https://github.com/rangl-labs/netzerotc.
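Entries interact with the challenge through a reinforcement-learning environment. As a rough sketch of what that loop looks like, the snippet below assumes the environment follows the standard OpenAI Gym interface; the environment ID used here is a placeholder, so check the netzerotc repository above for the actual registration name and installation steps.

```python
import gym  # assuming the RangL environment follows the OpenAI Gym interface

# "rangl:nztc-v0" is a hypothetical ID used for illustration; see the
# rangl-labs/netzerotc repository for the real registration name.
env = gym.make("rangl:nztc-v0")

obs = env.reset()
done = False
total_reward = 0.0
while not done:
    # Trivial baseline: sample a random action each step. A competition
    # entry would replace this with a learned or expert policy.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward

print(f"Episode return: {total_reward:.2f}")
```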

AAMAS’22 paper: Bayesian RL to cooperate with humans

In our new paper, Best-Response Bayesian Reinforcement Learning with BA-POMDPs for Centaurs, we investigate a machine whose actions can be overridden by a human. We show how Bayesian RL might lead to quick adaptation to unknown human preferences, and can help the human pursue their true goals in the case of temporally inconsistent behavior. All credit to Mert for all the hard work!
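The paper is the place to look for the actual BA-POMDP machinery; purely as an illustration of the underlying idea, the sketch below has a machine maintain a discrete Bayesian belief over a handful of hypothetical human preference types, update it from override events, and act as a best response to its current belief. The preference models and the override likelihood are invented for the example, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_types, n_actions = 3, 4

# Hypothetical reward models: how much each human "type" likes each
# machine action (values in [0, 1]); invented for illustration.
prefs = rng.random((n_types, n_actions))
belief = np.ones(n_types) / n_types  # uniform prior over human types

def machine_action(belief):
    # Best response to the expected preferences under the current belief.
    return int(np.argmax(belief @ prefs))

def human_overrides(true_type, action):
    # Assumed override model: the human overrides with probability equal
    # to how much they dislike the proposed action.
    return rng.random() < 1.0 - prefs[true_type, action]

true_type = 1  # the human's actual type, hidden from the machine
for _ in range(20):
    a = machine_action(belief)
    overridden = human_overrides(true_type, a)
    # Bayes update from the (non-)override, per candidate type.
    likelihood = (1.0 - prefs[:, a]) if overridden else prefs[:, a]
    belief = belief * likelihood
    belief /= belief.sum()

print("posterior over human types:", np.round(belief, 3))
```

Under this toy model the belief concentrates on the types whose dislikes are consistent with the observed overrides, which is the sense in which Bayesian adaptation can quickly home in on unknown preferences.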