TY - JOUR
T1 - QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension
AU - Rogers, Anna
AU - Gardner, Matt
AU - Augenstein, Isabelle
PY - 2022/8/1
Y1 - 2022/8/1
N2 - Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on benchmark datasets needed to track modeling progress. Question answering and reading comprehension have been particularly prolific in this regard, with over 80 new datasets appearing in the past two years. This study is the largest survey of the field to date. We provide an overview of the various formats and domains of the current resources, highlighting the current lacunae for future work. We further discuss the current classifications of "reasoning types" in question answering and propose a new taxonomy. We also discuss the implications of over-focusing on English, and survey the current monolingual resources for other languages and multilingual resources. The study is aimed at both practitioners looking for pointers to the wealth of existing data, and at researchers working on new resources.
AB - Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on benchmark datasets needed to track modeling progress. Question answering and reading comprehension have been particularly prolific in this regard, with over 80 new datasets appearing in the past two years. This study is the largest survey of the field to date. We provide an overview of the various formats and domains of the current resources, highlighting the current lacunae for future work. We further discuss the current classifications of "reasoning types" in question answering and propose a new taxonomy. We also discuss the implications of over-focusing on English, and survey the current monolingual resources for other languages and multilingual resources. The study is aimed at both practitioners looking for pointers to the wealth of existing data, and at researchers working on new resources.
KW - reading comprehension
KW - natural language understanding
U2 - 10.1145/3560260
DO - 10.1145/3560260
M3 - Journal article
JO - ACM CSUR
JF - ACM CSUR
ER -