
EmbiText: Embracing Ambiguity by Annotation, Recognition and Generation of Pronominal Reference with Event-Entity Ambiguity

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Abstract

Consider the example “The bird sang the nursery rhyme beautifully. It made everyone in the room smile”. The pronoun ‘it’ here refers either to the bird or to the event of singing. This example is inherently ambiguous: it cannot be meaningfully disambiguated as an event or entity reference, as both readings result in the same text meaning. This study introduces a new dataset, EmbiText, that preserves ambiguity in language by navigating the ambiguity surrounding pronominal reference to an entity or an event. Oftentimes, ambiguity does not need to be resolved but rather modelled carefully. Furthermore, this study explores the capacity of LLMs (Llama, Mistral, Gemini, Claude AI) to embrace ambiguity by generating text that exhibits referential ambiguity via an in-context learning approach. To evaluate the dataset, RoBERTa was fine-tuned on it to model ambiguity while simultaneously distinguishing between entity and event references. Results demonstrate EmbiText’s capacity to advance ongoing NLP research by modelling linguistic ambiguity in computational environments instead of fully disambiguating it, thereby retaining diverse interpretations where resolution may alter meaning.
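As an illustration only, the in-context learning setup described in the abstract could be sketched as below. This is not the authors' code: the three-way label set, the `PronounExample` record, the `build_few_shot_prompt` helper, and the prompt wording are all hypothetical names and choices made for this sketch. Only the demonstration sentence is taken from the abstract.

```python
from dataclasses import dataclass

# Assumed label set: the abstract says the model separates entity and event
# references while also modelling ambiguity; the exact tag names are a guess.
LABELS = ("entity", "event", "ambiguous")


@dataclass(frozen=True)
class PronounExample:
    """One annotated text (hypothetical record layout)."""
    text: str
    pronoun: str
    label: str

    def __post_init__(self):
        # Reject labels outside the assumed three-way scheme.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def build_few_shot_prompt(demos, target_label="ambiguous"):
    """Assemble a plain-text few-shot prompt from labelled demonstrations."""
    if target_label not in LABELS:
        raise ValueError(f"unknown label: {target_label!r}")
    lines = [
        "Each text contains a pronoun whose reference is labelled "
        "entity, event, or ambiguous.",
        "",
    ]
    for demo in demos:
        lines.append(f"Text: {demo.text}")
        lines.append(f"Pronoun: {demo.pronoun}")
        lines.append(f"Reference: {demo.label}")
        lines.append("")
    # The final, unanswered slot is what the language model would complete.
    lines.append(f"Write one new text with reference type: {target_label}")
    lines.append("Text:")
    return "\n".join(lines)


# Demonstration taken from the abstract's own example sentence.
demo = PronounExample(
    text="The bird sang the nursery rhyme beautifully. "
         "It made everyone in the room smile.",
    pronoun="It",
    label="ambiguous",
)
prompt = build_few_shot_prompt([demo])
```

The resulting `prompt` string would then be sent to whichever LLM is being tested. How the generated texts are filtered or validated is not stated in the abstract and is left out here.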
Original language: English
Title of host publication: Proceedings of the 6th Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences
Number of pages: 9
Publisher: Association for Computational Linguistics
Publication date: Nov 2025
Pages: 157-165
ISBN (Electronic): 979-8-89176-343-2
Publication status: Published - Nov 2025
Event: Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences - Suzhou International Expo Centre, Suzhou, China
Duration: 8 Nov 2025 to 9 Nov 2025
Conference number: 6
https://2025.emnlp.org/program/workshops/?utm

Workshop

Workshop: Workshop on Computational Approaches to Discourse, Context and Document-Level Inferences
Number: 6
Location: Suzhou International Expo Centre
Country/Territory: China
City: Suzhou
Period: 08/11/2025 to 09/11/2025

Keywords

  • Referential ambiguity
  • Coreference resolution
  • In-context learning
  • Large language models
  • Dataset construction for NLP
