Can deep Q-networks and their kin brute-force their way through tough coordination problems? Perhaps not. Jacopo’s work, accepted as an extended abstract at AAMAS’19, takes a first step in exploring this in the one-shot setting.
Not so surprising: a “joint Q-learner” can be too large and slow to train, and “individual Q-learners” can fail to find good representations.
But good to know: “factored Q-value functions”, which represent the Q-function as a random mixture of components each involving 2 or 3 agents, can do quite well, even for hard coordination tasks!
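As a rough illustration of the idea (a minimal sketch, not the paper’s actual implementation; all names, sizes, and the random initialization below are made up), a factored Q-function sums small component Q-tables, each defined over a random subset of 2 agents:

```python
import itertools
import random

# Illustrative sizes only -- not from the paper.
n_agents = 4
n_actions = 3
n_components = 5

random.seed(0)
# Each component covers a random pair of agents.
scopes = [tuple(random.sample(range(n_agents), 2)) for _ in range(n_components)]
# Component Q-tables: one value per joint action of the agents in scope.
tables = [
    {a: random.gauss(0.0, 1.0) for a in itertools.product(range(n_actions), repeat=2)}
    for _ in scopes
]

def factored_q(joint_action):
    """Q(a) = sum of component values, each looking only at its own agents."""
    return sum(
        table[tuple(joint_action[i] for i in scope)]
        for scope, table in zip(scopes, tables)
    )

# Brute-force maximization over the (small) joint action space; for many
# agents one would use a message-passing scheme such as max-plus instead.
best = max(itertools.product(range(n_actions), repeat=n_agents), key=factored_q)
```

The appeal is that each component only ever sees 2 (or 3) agents’ actions, so the representation stays small even as the number of agents grows.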
“Learning from Demonstration in the Wild” is work that I did with the folks at LatentLogic. It’s pretty cool; check out the video on YouTube and the paper on arXiv:
Join the INFLUENCE research team!
I am looking for a 3-year postdoc in Influence-based Abstraction, Learning and Coordination. Find me at #IJCAI2018 for an informal chat.
More info: vacancies
My invited IJCAI paper giving an overview of my research (some of it, at least; apologies to the coauthors whose work I could not fit in) is now available from my publications page.
I’m going to IJCAI: I have been invited to give a talk in the IJCAI-ECAI ’18 Early Career Spotlight track. I feel very honored.
See you in Stockholm!
Two PhD and one postdoc vacancy are now live. Apply by the 23rd of April!
I am hiring a total of 3 PhD students and 2 postdocs.
These are fully funded positions. For more information, please see vacancies.