How to Encode Domain Information in Relation Classification

Elisa Bassignana, Viggo Unmack Gascou, Frida Nøhr Laustsen, Gustav Kristensen, Marie Haahr Petersen, Rob van der Goot, Barbara Plank

Research output: Conference article in proceedings · Research · Peer-reviewed

Abstract

Current language models require a lot of training data to obtain high performance. For Relation Classification (RC), many datasets are domain-specific, so combining them to obtain better performance is non-trivial. We explore a multi-domain training setup for RC and attempt to improve performance by encoding domain information. Our proposed models improve by more than 2 Macro-F1 points over the baseline setup, and our analysis reveals that not all labels benefit equally: classes which occupy a similar space across domains (i.e., whose interpretation is close across domains, for example "physical") benefit the least, while domain-dependent relations (e.g., "part-of") improve the most when domain information is encoded.
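
The abstract does not spell out how domain information is injected into the model; a common approach in multi-domain NLP is to prepend a domain-specific special token to the input before fine-tuning. The sketch below illustrates that idea, assuming a BERT-style relation classifier with inline entity markers; the domain inventory, checkpoint name, relation-label count, and marker tokens are illustrative placeholders, not details taken from the paper.

```python
# Minimal sketch: encoding domain information via a special domain token
# prepended to the input of a relation classifier.
# NOTE: an illustrative assumption, not necessarily the paper's exact method;
# domains, checkpoint, and label count below are hypothetical placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

DOMAINS = ["news", "bio", "finance"]                    # hypothetical domains
domain_tokens = [f"[DOM-{d}]" for d in DOMAINS]
entity_markers = ["<e1>", "</e1>", "<e2>", "</e2>"]     # typical RC markup

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
tokenizer.add_special_tokens(
    {"additional_special_tokens": domain_tokens + entity_markers}
)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=10  # e.g., 10 relation classes
)
# The new tokens enlarge the vocabulary, so the embedding matrix must grow too.
model.resize_token_embeddings(len(tokenizer))

# Prefix the sentence with its domain token and mark the entity pair inline.
sentence = "[DOM-news] <e1> Apple </e1> acquired <e2> Shazam </e2> in 2018."
inputs = tokenizer(sentence, return_tensors="pt")
logits = model(**inputs).logits          # one score per relation label
predicted_label = logits.argmax(-1)      # index of the top-scoring relation
```

Because the domain markers start as freshly initialized embeddings, the model only acquires useful domain-specific representations for them during fine-tuning on the combined multi-domain data.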
Original language: English
Title of host publication: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Number of pages: 6
Publisher: European Language Resources Association (ELRA)
Publication date: 2024
Pages: 8301-8306
Publication status: Published - 2024

Keywords

  • Relation Classification
  • Robustness
  • Domain
  • Multi-domain training
