TY - GEN
T1 - Unpacking Ambiguous Structure: A Dataset for Ambiguous Implicit Discourse Relations for English and Egyptian Arabic
AU - Ruby, Ahmed
AU - Stymne, Sara
AU - Hardmeier, Christian
PY - 2023
Y1 - 2023
N2 - In this paper, we present principles for constructing and resolving ambiguity in implicit discourse relations. Following these principles, we created a dataset in both English and Egyptian Arabic that controls for semantic disambiguation, enabling the investigation of prosodic features in future work. In these datasets, examples are two-part sentences with an implicit discourse relation that can be ambiguously read as either causal or concessive, paired with two different preceding context sentences forcing either the causal or the concessive reading. We also validated both datasets with humans and language models (LMs) to study whether context can help humans or LMs resolve ambiguities of implicit relations and identify the intended relation. The task posed no difficulty for humans but proved challenging for BERT/CamelBERT and ELECTRA/AraELECTRA models.
KW - Implicit discourse relations
KW - Semantic disambiguation
KW - Prosodic features
KW - Contextual ambiguity
KW - Human vs. language model validation
U2 - 10.18653/v1/2023.codi-1.16
DO - 10.18653/v1/2023.codi-1.16
M3 - Article in proceedings
SP - 126
EP - 144
BT - Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
PB - Association for Computational Linguistics
CY - Canada
ER -