Abstract
Named Entity Recognition (NER) is a key Natural Language Processing task. However, most existing work on NER targets flat named entities (NEs) and ignores the recognition of nested structures, where entities can be enclosed within other NEs. Moreover, evaluation of Nested Named Entity Recognition (NNER) across domains remains challenging, mainly due to the limited availability of datasets. To address these gaps, we present EWT-NNER, a dataset covering five web domains annotated for nested named entities on top of the English Web Treebank (EWT). We present the corpus and an empirical evaluation, including transfer results from German and Danish. EWT-NNER is annotated for four major entity types, including suffixes for derivational entity markers and partial named entities, spanning a total of 12 classes. We envision the public release of EWT-NNER to encourage further research on nested NER, particularly on cross-lingual cross-domain evaluation.
Original language | English |
---|---|
Title of host publication | Findings of ACL 2021 |
Number of pages | 1815 |
Publisher | Association for Computational Linguistics |
Publication date | 2021 |
Pages | 1808 |
DOIs | |
Publication status | Published - 2021 |
Keywords
- Named Entity Recognition
- Nested Named Entities
- Cross-Domain Evaluation
- EWT-NNER Dataset
- Cross-Lingual Transfer