Publications


Safety Guarantees in Multi-agent Learning via Trapping Regions

Aleksander Czechowski and Frans A. Oliehoek. Safety Guarantees in Multi-agent Learning via Trapping Regions. In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI), August 2023.


Abstract

One of the main challenges of multi-agent learning lies in establishing convergence of the algorithms, as, in general, a collection of individual, self-serving agents is not guaranteed to converge with their joint policy, when learning concurrently. This is in stark contrast to most single-agent environments, and sets a prohibitive barrier for deployment in practical applications, as it induces uncertainty in long term behavior of the system. In this work, we apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning. We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a heuristic sampling algorithm for scenarios where learning dynamics are not known. We demonstrate the applications to a regularized version of Dirac Generative Adversarial Network, a four-intersection traffic control scenario run in a state of the art open-source microscopic traffic simulator SUMO, and a mathematical model of economic competition.
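To make the idea of a trapping region concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes continuous-time learning dynamics x' = f(x) and a box-shaped candidate set, and it checks by random sampling that the vector field points strictly into the box on every face. The function name `is_trapping_box_sampled` and both example vector fields are hypothetical, chosen only for illustration.

```python
import random


def is_trapping_box_sampled(field, lo, hi, samples_per_face=200, seed=0):
    """Heuristically test whether the box [lo, hi] is a trapping region.

    Samples random points on every face of the box and checks that the
    vector field points strictly inward there. A pass is evidence only,
    not a proof, because just finitely many points are inspected.
    """
    rng = random.Random(seed)
    dim = len(lo)
    for axis in range(dim):
        # On the lower face the field component must be positive,
        # on the upper face it must be negative.
        for bound, inward_sign in ((lo[axis], 1.0), (hi[axis], -1.0)):
            for _ in range(samples_per_face):
                point = [rng.uniform(lo[k], hi[k]) for k in range(dim)]
                point[axis] = bound  # project the sample onto the face
                if inward_sign * field(point)[axis] <= 0.0:
                    return False
    return True


def damped_rotation(p):
    # Hypothetical dynamics: rotation with strong damping toward the origin.
    x, y = p
    return (-2.0 * x - y, x - 2.0 * y)


def pure_rotation(p):
    # Hypothetical dynamics: undamped rotation, which cycles forever.
    x, y = p
    return (-y, x)


box_lo, box_hi = (-1.0, -1.0), (1.0, 1.0)
damped_ok = is_trapping_box_sampled(damped_rotation, box_lo, box_hi)
rotation_ok = is_trapping_box_sampled(pure_rotation, box_lo, box_hi)
print(damped_ok, rotation_ok)
```

For the damped field every face of the unit box has a strictly inward-pointing component, so the check passes. For the undamped rotation the field leaves the box on part of each face, so the check fails. A sampling check of this kind can only suggest that a set is trapping; a guarantee needs a rigorous bound over the whole boundary.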

BibTeX Entry

@inproceedings{Czechowski23IJCAI,
    author =    {Czechowski, Aleksander and Oliehoek, Frans A.},
    title =     {Safety Guarantees in Multi-agent Learning via Trapping Regions},
    booktitle = IJCAI23,
    year =      2023,
    month =     aug,
    keywords =  {refereed},
    abstract =  {
        One of the main challenges of multi-agent learning
        lies in establishing convergence of the algorithms,
        as, in general, a collection of individual,
        self-serving agents is not guaranteed to converge
        with their joint policy, when learning concurrently.
        This is in stark contrast to most single-agent
        environments, and sets a prohibitive barrier for
        deployment in practical applications, as it induces
        uncertainty in long term behavior of the system. In this
        work, we apply the concept of trapping regions,
        known from qualitative theory of dynamical systems,
        to create safety sets in the joint strategy space
        for decentralized learning. We propose a binary
        partitioning algorithm for verification that candidate
        sets form trapping regions in systems with
        known learning dynamics, and a heuristic sampling
        algorithm for scenarios where learning dynamics
        are not known. We demonstrate the applications
        to a regularized version of Dirac Generative
        Adversarial Network, a four-intersection traffic
        control scenario run in a state of the art open-source
        microscopic traffic simulator SUMO, and a
        mathematical model of economic competition.
    }
}
