Discriminating Between Similar Nordic Languages

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Automatic language identification is a challenging problem, and discriminating between closely related languages is especially difficult. This paper presents a machine learning approach to automatic language identification for the Nordic languages, which existing state-of-the-art tools often miscategorise. Concretely, we focus on discriminating between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
Original language: English
Title of host publication: Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects
Publisher: Association for Computational Linguistics
Publication date: 20 Apr 2021
Pages: 67–75
Publication status: Published - 20 Apr 2021

ID: 84743872