What do You Mean by Relation Extraction? A Survey on Datasets and Study on Scientific Relation Classification

Research output: Article in proceedings › Research › peer-review

Abstract

Over the last five years, research on Relation Extraction (RE) has witnessed extensive progress, with many new dataset releases. At the same time, setup clarity has decreased, contributing to increased difficulty of reliable empirical evaluation (Taillé et al., 2020). In this paper, we provide a comprehensive survey of RE datasets and revisit the task definition and its adoption by the community. We find that cross-dataset and cross-domain setups are particularly lacking. We present an empirical study on scientific Relation Classification across two datasets. Despite large data overlap, our analysis reveals substantial discrepancies in annotation. Annotation discrepancies strongly impact Relation Classification performance, explaining large drops in cross-dataset evaluations. Variation within further sub-domains exists but impacts Relation Classification only to limited degrees. Overall, our study calls for more rigour in reporting setups in RE and evaluation across multiple test sets.
Original language: English
Title of host publication: The 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Number of pages: 17
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Place of publication: Dublin, Ireland
Publisher: Association for Computational Linguistics
Publication date: 2022
Pages: 67–83
Publication status: Published - 2022

Keywords

  • Relation Extraction
  • Survey
  • Cross-domain

