Catastrophic forgetting is a ubiquitous problem for the current generation of artificial neural networks: when a network is asked to learn multiple tasks in sequence, it fails dramatically because it tends to forget past knowledge. Little is known about the extent to which multimodal conversational agents suffer from this phenomenon. In this paper, we study the problem of catastrophic forgetting in Visual Question Answering (VQA) and propose experiments in which we analyze pairs of tasks based on CLEVR, a dataset that requires different skills involving visual or linguistic knowledge. Our results show that dramatic forgetting is at play in VQA, calling for studies on how multimodal models can be enhanced with continual learning methods.
|Title||Tenth International Workshop on Spoken Dialogue Systems Technology (IWSDS) 2019: Lecture Notes in Electrical Engineering book series (LNEE, volume 714)|
|Status||Published - 2019|
|Name||Lecture Notes in Electrical Engineering|