Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning

Mike Zhang, Kristian Nørgaard Jensen, Barbara Plank

Publication: Article in proceedings or book/report chapter; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

Skill Classification (SC) is the task of classifying job competences from job postings. This work is the first in SC applied to Danish job vacancy data. We release the first Danish job posting dataset: *Kompetencer* (en: competences), annotated for nested spans of competences. To improve upon coarse-grained annotations, we make use of the European Skills, Competences, Qualifications and Occupations (ESCO; le Vrang et al., 2014) taxonomy API to obtain fine-grained labels via distant supervision. We study two setups: the zero-shot and the few-shot classification setting. We fine-tune English-based models and RemBERT (Chung et al., 2020) and compare them to in-language Danish models. Our results show that RemBERT significantly outperforms all other models in both the zero-shot and the few-shot setting.
Original language: English
Title: 13th International Conference on Language Resources and Evaluation
Number of pages: 11
Publisher: European Language Resources Association (ELRA)
Publication date: 16 Jun 2022
Pages: 436-447
DOI
Status: Published - 16 Jun 2022

Keywords

  • Skill Classification
  • Job Postings
  • Danish Job Data
  • Distant Supervision
  • Machine Learning Models
