Evolutionary Planning in Latent Space

Thor Valentin Aakjær Nielsen Olesen, Dennis Thinh Tan Nguyen, Rasmus Berg Palm, Sebastian Risi

Publication: Conference article in proceedings or book/report chapter › Conference contribution in proceedings › Research › peer review

Abstract

Planning is a powerful approach to reinforcement learning with several desirable properties. However, it requires a model of the world, which is not readily available in many real-life problems. In this paper, we propose to learn a world model that enables Evolutionary Planning in Latent Space (EPLS). We use a Variational Auto Encoder (VAE) to learn a compressed latent representation of individual observations and extend a Mixture Density Recurrent Neural Network (MDRNN) to learn a stochastic, multi-modal forward model of the world that can be used for planning. We use Random Mutation Hill Climbing (RMHC) to find a sequence of actions that maximizes expected reward in this learned model of the world. We demonstrate how to build a model of the world by bootstrapping it with rollouts from a random policy and iteratively refining it with rollouts from an increasingly accurate planning policy that uses the learned world model. After a few iterations of this refinement, our planning agents outperform standard model-free reinforcement learning approaches, demonstrating the viability of our approach.
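To illustrate the planning component described in the abstract, the following is a minimal sketch (not the authors' code) of Random Mutation Hill Climbing over a fixed-horizon action sequence. It assumes a generic learned forward model exposed through a hypothetical `model_step(latent, action)` function returning the next latent state and a predicted reward; in EPLS this role is played by the VAE + MDRNN, which are stubbed out here with toy dynamics so the example runs on its own.

```python
import random

# Hypothetical stand-in for the learned latent world model (VAE + MDRNN in EPLS).
# Toy dynamics: states and actions are integers; reward peaks when the latent is 5.
def model_step(latent, action):
    next_latent = (latent + action) % 10
    reward = -abs(next_latent - 5)
    return next_latent, reward

def evaluate(plan, start_latent):
    """Roll the action sequence out in the learned model and sum predicted rewards."""
    latent, total = start_latent, 0.0
    for action in plan:
        latent, reward = model_step(latent, action)
        total += reward
    return total

def rmhc_plan(start_latent, horizon=10, actions=(0, 1, 2, 3), generations=200, seed=0):
    """Random Mutation Hill Climbing: mutate one action at a time and
    keep the mutant only if its expected return does not decrease."""
    rng = random.Random(seed)
    best = [rng.choice(actions) for _ in range(horizon)]
    best_score = evaluate(best, start_latent)
    for _ in range(generations):
        candidate = list(best)
        candidate[rng.randrange(horizon)] = rng.choice(actions)
        score = evaluate(candidate, start_latent)
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score

if __name__ == "__main__":
    plan, score = rmhc_plan(start_latent=0)
    print("best plan:", plan, "expected return:", score)
```

In the iterative refinement loop described above, the first action of the best plan would be executed in the real environment, the resulting rollouts would be used to retrain the world model, and planning would resume from the new latent state.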
Original language: English
Title: International Conference on the Applications of Evolutionary Computation
Publisher: EvoStar
Publication date: 2021
Status: Published - 2021
Event: International Conference on the Applications of Evolutionary Computation
Duration: 7 Apr 2021 → …

Conference

Conference: International Conference on the Applications of Evolutionary Computation
Period: 07/04/2021 → …

Keywords

  • Reinforcement Learning
  • World Model
  • Latent Space
  • Variational Auto Encoder
  • Evolutionary Planning
