Abstract
Digital assistants are becoming an integral part of everyday life. However, commercial digital assistants are available for only a limited set of languages. As a result, a vast number of people cannot use these devices in their native tongue.
In this work, we focus on two core tasks within the digital assistant pipeline: intent classification and slot detection. Intent classification recovers the goal of the utterance, whereas slot detection identifies important properties regarding this goal. Besides introducing a novel cross-lingual dataset for these tasks, consisting of 11 languages, we evaluate a variety of models: 1) multilingually pretrained transformer-based models, 2) these models supplemented with auxiliary tasks, to evaluate whether multi-task learning can be beneficial, and 3) annotation transfer with neural machine translation.
Original language | English |
---|---|
Publication date | 25 Sep 2021 |
Status | Published - 25 Sep 2021 |
Event | RESOURCEFUL-2020 : RESOURCEs and representations For Under-resourced Languages and domains - Gothenburg, Gothenburg, Sweden Duration: 25 Nov 2020 → … https://gu-clasp.github.io/resourceful-2020/ |
Workshop
Workshop | RESOURCEFUL-2020 |
---|---|
Location | Gothenburg |
Country/Territory | Sweden |
City | Gothenburg |
Period | 25/11/2020 → … |
Internet address |