Publications

Influence-aware memory architectures for deep reinforcement learning in POMDPs

Miguel Suau, Jinke He, Elena Congeduti, Rolf A.N. Starre, Aleksander Czechowski, and Frans A. Oliehoek. Influence-aware memory architectures for deep reinforcement learning in POMDPs. Neural Computing and Applications, September 2022.

Abstract

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods use recurrent neural networks (RNN) to memorize past observations. However, these models are expensive to train and have convergence difficulties, especially when dealing with high dimensional data. In this paper, we propose influence-aware memory, a theoretically inspired memory architecture that alleviates the training difficulties by restricting the input of the recurrent layers to those variables that influence the hidden state information. Moreover, as opposed to standard RNNs, in which every piece of information used for estimating Q values is inevitably fed back into the network for the next prediction, our model allows information to flow without being necessarily stored in the RNN’s internal memory. Results indicate that, by letting the recurrent layers focus on a small fraction of the observation variables while processing the rest of the information with a feedforward neural network, we can outperform standard recurrent architectures both in training speed and policy performance. This approach also reduces runtime and obtains better scores than methods that stack multiple observations to remove partial observability.
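The two-branch idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TwoBranchMemorySketch`, the hand-picked `memory_idx` indices, the plain tanh recurrent cell, and all layer sizes are assumptions made for illustration only. The sketch shows a feedforward branch that sees the whole observation and a recurrent branch that sees only a chosen subset of variables, with Q-values read from both.

```python
import math
import random


def init_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def affine(weights, bias, x):
    """Compute weights @ x + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class TwoBranchMemorySketch:
    """Hypothetical Q-network with a feedforward path and a small recurrent path."""

    def __init__(self, obs_dim, memory_idx, ff_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        # Assumption: the variables fed to memory are chosen by hand here.
        self.memory_idx = list(memory_idx)
        self.hidden_dim = hidden_dim
        self.w_ff = init_matrix(ff_dim, obs_dim, rng)
        self.b_ff = [0.0] * ff_dim
        self.w_in = init_matrix(hidden_dim, len(self.memory_idx), rng)
        self.w_rec = init_matrix(hidden_dim, hidden_dim, rng)
        self.b_rec = [0.0] * hidden_dim
        self.w_q = init_matrix(num_actions, ff_dim + hidden_dim, rng)
        self.b_q = [0.0] * num_actions

    def initial_state(self):
        return [0.0] * self.hidden_dim

    def step(self, obs, hidden):
        # Feedforward branch: sees the full observation and stores nothing.
        ff_out = [math.tanh(v) for v in affine(self.w_ff, self.b_ff, obs)]
        # Recurrent branch: sees only the selected variables plus its own state.
        selected = [obs[i] for i in self.memory_idx]
        from_input = affine(self.w_in, self.b_rec, selected)
        from_state = affine(self.w_rec, [0.0] * self.hidden_dim, hidden)
        new_hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        # Q-values are read from the concatenation of both branches.
        q_values = affine(self.w_q, self.b_q, ff_out + new_hidden)
        return q_values, new_hidden


# Usage: roll the network over a short observation sequence.
net = TwoBranchMemorySketch(obs_dim=6, memory_idx=[0, 1], ff_dim=4,
                            hidden_dim=3, num_actions=2)
hidden = net.initial_state()
for obs in ([0.1, 0.9, 0.0, 0.3, 0.5, 0.2], [0.2, 0.8, 1.0, 0.3, 0.5, 0.2]):
    q_values, hidden = net.step(obs, hidden)
```

In this sketch, variables outside `memory_idx` still affect the current Q-values through the feedforward branch, but they never enter the hidden state, which is one way to read the abstract's remark that information can flow without being stored in the RNN's internal memory.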

BibTeX Entry

@ARTICLE{Suau22NCA,
    title    = {Influence-aware memory architectures for deep reinforcement learning in {POMDPs}},
    author   = {Suau, Miguel and
                He, Jinke and
                Congeduti, Elena and
                Starre, Rolf A.N. and
                Czechowski, Aleksander and
                Oliehoek, Frans A.},
    journal  = {Neural Computing and Applications},
    year     = 2022,
    month    = sep,
    doi      = {10.1007/s00521-022-07691-7},
    keywords = {refereed},
    url      = {https://doi.org/10.1007/s00521-022-07691-7},
    abstract = {
        Due to its perceptual limitations, an agent may have too little information
        about the environment to act optimally. In such cases, it is important
        to keep track of the action-observation history to uncover hidden state
        information. Recent deep reinforcement learning methods use recurrent
        neural networks (RNN) to memorize past observations. However, these
        models are expensive to train and have convergence difficulties,
        especially when dealing with high dimensional data. In this paper, we
        propose influence-aware memory, a theoretically inspired memory
        architecture that alleviates the training difficulties by restricting
        the input of the recurrent layers to those variables that influence the
        hidden state information. Moreover, as opposed to standard RNNs, in
        which every piece of information used for estimating Q values is
        inevitably fed back into the network for the next prediction, our model
        allows information to flow without being necessarily stored in the
        RNN’s internal memory. Results indicate that, by letting the recurrent
        layers focus on a small fraction of the observation variables while
        processing the rest of the information with a feedforward neural
        network, we can outperform standard recurrent architectures both in
        training speed and policy performance. This approach also reduces
        runtime and obtains better scores than methods that stack multiple
        observations to remove partial observability.
    }
}
