Abstract
Current language models require large amounts of training data to achieve high performance. For Relation Classification (RC), many datasets are domain-specific, so combining them to obtain better performance is non-trivial. We explore a multi-domain training setup for RC and attempt to improve performance by encoding domain information. Our proposed models improve by more than 2 Macro-F1 points over the baseline setup, and our analysis reveals that not all labels benefit equally: classes that occupy a similar space across domains (i.e., whose interpretation is close across them, for example "physical") benefit the least, while domain-dependent relations (e.g., "part-of") improve the most when domain information is encoded.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) |
| Number of pages | 6 |
| Publisher | European Language Resources Association |
| Publication date | 2024 |
| Pages | 8301-8306 |
| Publication status | Published - 2024 |
| Event | Joint International Conference on Computational Linguistics, Language Resources and Evaluation - Torino, Italy. Duration: 20 May 2024 → 25 May 2024. https://aclanthology.org/2024.lrec-main.544/ https://aclanthology.org/2024.lrec-main.1054/ |
Conference
| Conference | Joint International Conference on Computational Linguistics, Language Resources and Evaluation |
|---|---|
| Country/Territory | Italy |
| City | Torino |
| Period | 20/05/2024 → 25/05/2024 |
Keywords
- Relation Classification
- Robustness
- Domain
- Multi-domain training