Algorithms for Adaptive Game-playing Agents
Research output: Ph.D. thesis
Over the years, researchers have promoted several games as key challenges in the field of Artificial Intelligence (AI), with the ultimate goal of defeating the best human players. Recent developments in deep learning have enabled computers to learn strong policies for many games where previous methods fell short. However, the most complex games, such as the Real-time Strategy (RTS) game StarCraft (Blizzard Entertainment, 1998), are still not mastered by AI. We identify three properties of adaptivity that we believe are required to fully master the most difficult games with AI: (1) intra-game adaptivity, the ability to adapt to opponent strategies within a game; (2) inter-game adaptivity, the ability to intelligently switch strategy between games; and (3) generality, the ability to generalize to many different, and most likely unseen, variations (such as different levels). We analyze the shortcomings of state-of-the-art game-playing algorithms with regard to adaptation and present novel algorithmic approaches to each property. Several of the presented approaches also attempt to overcome the difficulty of learning adaptive policies in games with sparse rewards.
The main contributions of this dissertation are: (a) a continual evolutionary planning algorithm that performs online adaptive build-order planning in StarCraft; (b) an imitation learning approach to intra-game adaptive build-order planning in StarCraft, resulting in the first (to the best of our knowledge) neural-network-based bot that plays the full game; (c) a novel imitation learning method for learning behavioral repertoires from demonstrations, which allows for inter-game adaptivity; (d) an automatic reward shaping technique for reinforcement learning that assigns feedback values based on the temporal rarity of pre-defined events, and that works as a form of curriculum learning and as a regularization technique to avoid overfitted behaviors in games with sparse rewards; (e) a new reinforcement learning framework that uses procedural content generation to generate new training levels each episode, which get progressively harder as the agent improves, and which is shown to overcome sparse rewards and increase the generality of the learned policy; (f) a pragmatic way of evaluating the fairness of game competitions between humans and AI that further highlights the importance of adaptation; and (g) a new challenge and competition for AI based on the board game Blood Bowl, which is orders of magnitude more complex than the game Go and requires a high level of generality. These contributions bring a new, adaptation-focused perspective to the AI challenge of playing complex games. We believe this perspective is crucial to achieving strong and robust game-playing AI. Our contributions may also have an impact on important real-world problems beyond games, such as robotic tasks in changing environments with complex interactions that require a high level of adaptivity.
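The idea behind contribution (d) can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes that an event's reward is the inverse of its mean number of occurrences per episode over a sliding window of recent episodes, with a floor on that mean to keep rewards bounded. The class name `RarityRewardShaper` and the parameters `window` and `min_mean` are hypothetical and chosen only for illustration.

```python
from collections import deque


class RarityRewardShaper:
    """Illustrative only: rewards events inversely to their recent frequency."""

    def __init__(self, event_names, window=100, min_mean=0.01):
        self.event_names = list(event_names)
        # Assumed floor on the mean count, so the reward never exceeds 1 / min_mean.
        self.min_mean = min_mean
        # One bounded history of per-episode occurrence counts for each event.
        self.history = {name: deque(maxlen=window) for name in self.event_names}

    def mean_count(self, name):
        """Mean occurrences per episode over the window (0.0 with no history)."""
        counts = self.history[name]
        return sum(counts) / len(counts) if counts else 0.0

    def reward(self, name):
        """Events that were rare in recent episodes yield a larger reward."""
        return 1.0 / max(self.mean_count(name), self.min_mean)

    def end_episode(self, episode_counts):
        """Record how often each event occurred in the episode that just ended."""
        for name in self.event_names:
            self.history[name].append(episode_counts.get(name, 0))


# Hypothetical usage with two made-up event names.
shaper = RarityRewardShaper(["kill", "pickup"], window=2, min_mean=0.5)
shaper.end_episode({"kill": 4, "pickup": 1})
kill_reward = shaper.reward("kill")      # frequent event, small reward
pickup_reward = shaper.reward("pickup")  # rarer event, larger reward
```

Under these assumptions, an event the agent already triggers often contributes little reward, while an event it rarely achieves contributes more, which is one way the shaping could act as an implicit curriculum as the agent's behavior changes.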
|Publisher|IT-Universitetet i København|
|Number of pages|258|
|Publication status|Published - 2019|