Cross-Lingual Cross-Domain Nested Named Entity Evaluation on English Web Texts

Publication: Conference article in proceedings or book/report chapter › Conference contribution in proceedings › Research › Peer-reviewed

Abstract

Named Entity Recognition (NER) is a key Natural Language Processing task. However, most existing work on NER targets flat named entities (NEs) and ignores the recognition of nested structures, where entities can be enclosed within other NEs. Moreover, evaluation of Nested Named Entity Recognition (NNER) across domains remains challenging, mainly due to the limited availability of datasets. To address these gaps, we present EWT-NNER, a dataset covering five web domains annotated for nested named entities on top of the English Web Treebank (EWT). We present the corpus and an empirical evaluation, including transfer results from German and Danish. EWT-NNER is annotated for four major entity types, including suffixes for derivational entity markers and partial named entities, spanning a total of 12 classes. We envision the public release of EWT-NNER to encourage further research on nested NER, particularly on cross-lingual cross-domain evaluation.

Original language: English
Title: Findings of ACL 2021
Number of pages: 1815
Publisher: Association for Computational Linguistics
Publication date: 2021
Pages: 1808
Status: Published - 2021

Keywords

  • Named Entity Recognition
  • Nested Named Entities
  • Cross-Domain Evaluation
  • EWT-NNER Dataset
  • Cross-Lingual Transfer
