Safe Reinforcement Learning through Meta-learned Instincts

Publication: Conference article in proceedings or book/report chapter; conference contribution in proceedings; research; peer-reviewed

Abstract

An important goal in reinforcement learning is to create agents that can quickly adapt to new goals while avoiding situations that might cause damage to themselves or their environments. One way agents learn is through exploration mechanisms, which are needed to discover new policies. However, in deep reinforcement learning, exploration is normally done by injecting noise in the action space. While performing well in many domains, this setup has the inherent risk that the noisy actions performed by the agent lead to unsafe states in the environment. Here we introduce a novel approach called Meta-Learned Instinctual Networks (MLIN) that allows agents to safely learn during their lifetime while avoiding potentially hazardous states. At the core of the approach is a plastic network trained through reinforcement learning and an evolved “instinctual” network, which does not change during the agent's lifetime but can modulate the noisy output of the plastic network. We test our idea on a simple 2D navigation task with no-go zones, in which the agent has to learn to approach new targets during deployment. MLIN outperforms standard meta-trained networks and allows agents, after an evolutionary training phase, to learn to navigate to new targets without colliding with any of the no-go zones. These results suggest that meta-learning augmented with an instinctual network is a promising new approach for RL in safety-critical domains.
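The interplay the abstract describes, a fixed instinct modulating the noisy output of a plastic policy, might be sketched as follows. This is only an illustrative toy, not the authors' implementation. Both networks are replaced by hand-written stand-in functions, and every name and constant here (`NO_GO`, `SAFETY_MARGIN`, `plastic_policy`, `instinct`, `modulated_action`) is hypothetical, as is the rule "gate times policy action plus instinct action".

```python
import math
import random

# Hypothetical no-go zone: a circle given as (centre x, centre y, radius).
NO_GO = (0.0, 0.0, 1.0)
SAFETY_MARGIN = 0.5  # assumed clearance below which the instinct starts to act


def plastic_policy(position, target, rng, noise_std=0.3):
    """Stand-in for the plastic network: unit step toward the target plus exploration noise."""
    dx, dy = target[0] - position[0], target[1] - position[1]
    norm = math.hypot(dx, dy) or 1.0
    return (dx / norm + rng.gauss(0.0, noise_std),
            dy / norm + rng.gauss(0.0, noise_std))


def instinct(position):
    """Stand-in for the fixed instinct network (hand-coded here, not evolved).

    Returns a gate in [0, 1] that scales the noisy policy action, and a
    corrective action pointing away from the centre of the no-go zone.
    """
    cx, cy, radius = NO_GO
    ox, oy = position[0] - cx, position[1] - cy
    dist = math.hypot(ox, oy)
    clearance = dist - radius
    gate = min(1.0, max(0.0, clearance / SAFETY_MARGIN))
    norm = dist or 1.0
    push = 1.0 - gate
    return gate, (push * ox / norm, push * oy / norm)


def modulated_action(position, target, rng):
    """Assumed combination rule: gate * noisy policy action + instinct action."""
    ax, ay = plastic_policy(position, target, rng)
    gate, (ix, iy) = instinct(position)
    return (gate * ax + ix, gate * ay + iy)
```

Far from the zone the gate is 1 and the exploratory action passes through unchanged. On the zone boundary the gate is 0, so the noise is fully suppressed and only the push away from the hazard remains.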
Original language: English
Title: ALIFE 2020: The 2020 Conference on Artificial Life
Number of pages: 8
Publisher: MIT Press
Publication date: 13 Jul 2020
Edition: 32
Pages: 183-291
DOI
Status: Published - 13 Jul 2020
Event: ALIFE 2020: The 2020 Conference on Artificial Life - online, Montreal, Canada
Duration: 13 Jul 2020 to 18 Dec 2020
http://2020.alife.org/
https://alife.org/conference/alife-2020/

Conference

Conference: ALIFE 2020
Location: online
Country/Territory: Canada
City: Montreal
Period: 13/07/2020 to 18/12/2020
Internet address
Name: Artificial Life Conference Proceedings
ISSN: 2693-1508

Keywords

  • Reinforcement learning
  • safe reinforcement learning
  • Evolutionary algorithms
  • Life-long learning
