Abstract
Recent approaches to data-to-text generation have shown great promise thanks to the use of large-scale datasets and the application of neural network architectures which are trained end-to-end. These models rely on representation learning to select content appropriately, structure it coherently, and verbalize it grammatically, treating entities as nothing more than vocabulary tokens. In this work we propose an entity-centric neural architecture for data-to-text generation. Our model creates entity-specific representations which are dynamically updated. Text is generated conditioned on the data input and entity memory representations using hierarchical attention at each time step. We present experiments on the RotoWire benchmark and a (five times larger) new dataset on the baseball domain which we create. Our results show that the proposed model outperforms competitive baselines in automatic and human evaluation.
Original language | English |
---|---|
Title | Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics |
Publisher | Association for Computational Linguistics |
Publication date | 2019 |
Pages | 2023-2035 |
DOI | |
Status | Published - 2019 |
Published externally | Yes |