Resources and Evaluations for Danish Entity Resolution

  • Maria Jung Barrett
  • Hieu Lam
  • Martin Wu
  • Ophélie Lacroix
  • Barbara Plank
  • Anders Søgaard

  • Alexandra Instituttet A/S
  • Københavns Universitet

Publication: Conference article in proceedings or book/report chapter › Conference contribution in proceedings › Research › Peer-reviewed

Abstract

Automatic coreference resolution is understudied in Danish even though most of the Danish Dependency Treebank (Buch-Kromann, 2003) is annotated with coreference relations. This paper describes a conversion of its partial, yet well-documented, coreference relations into coreference clusters and the training and evaluation of coreference models on this data. To the best of our knowledge, these are the first publicly available, neural coreference models for Danish. We also present a new entity linking annotation on the dataset using WikiData identifiers, a named entity disambiguation (NED) dataset, and a larger automatically created NED dataset enabling wikily supervised NED models. The entity linking annotation is benchmarked using a state-of-the-art neural entity disambiguation model.
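
The conversion from pairwise coreference relations to coreference clusters mentioned above can be pictured as grouping all mentions that are connected, directly or indirectly, by a relation. The following is a minimal sketch of that idea using a union-find structure. It is not the authors' conversion procedure: the function name `links_to_clusters`, the `(anaphor, antecedent)` pair format, and the mention identifiers are hypothetical, and the treebank's relation types are ignored.

```python
from collections import defaultdict


def links_to_clusters(links):
    """Group mentions joined by pairwise coreference links into clusters.

    `links` is an iterable of (anaphor, antecedent) pairs; this pair format
    is an assumption made for illustration, not the treebank's own format.
    """
    parent = {}

    def find(mention):
        # Register unseen mentions as their own representative.
        parent.setdefault(mention, mention)
        # Walk up to the representative, shortening the path on the way.
        while parent[mention] != mention:
            parent[mention] = parent[parent[mention]]
            mention = parent[mention]
        return mention

    for anaphor, antecedent in links:
        root_a, root_b = find(anaphor), find(antecedent)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    clusters = defaultdict(list)
    for mention in parent:
        clusters[find(mention)].append(mention)
    # Sort for a deterministic result.
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical mention identifiers: m3 -> m2 -> m1 form one chain, m5 -> m4 another.
links = [("m2", "m1"), ("m3", "m2"), ("m5", "m4")]
clusters = links_to_clusters(links)
print(clusters)
```

Under these assumptions, mentions that are only linked indirectly (here `m3` and `m1`) still end up in the same cluster, which is the property a cluster-based evaluation relies on.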
Original language: English
Title: Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC)
Publisher: Association for Computational Linguistics
Publication date: 2021
Pages: 63–69
Status: Published - 2021
Event: Computational Models of Reference, Anaphora and Coreference - Punta Cana, Dominican Republic
Duration: 1 Nov 2021 → …
Conference number: 4

Conference

Conference: Computational Models of Reference, Anaphora and Coreference
Number: 4
Country/Territory: Dominican Republic
City: Punta Cana
Period: 01/11/2021 → …

Keywords

  • Coreference resolution
  • Danish language processing
  • Dependency Treebank
  • Neural models
  • Entity linking
  • WikiData
  • Named entity disambiguation (NED)
  • Wikily supervised models
  • Coreference clusters
  • Natural language processing (NLP)
