Discriminating Between Similar Nordic Languages

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Abstract

Automatic language identification is a challenging problem, and discriminating between closely related languages is especially difficult. This paper presents a machine learning approach to automatic language identification for the Nordic languages, which existing state-of-the-art tools often miscategorise. Concretely, we focus on discriminating between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
Original language: English
Title of host publication: Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects
Publisher: Association for Computational Linguistics
Publication date: 20 Apr 2021
Pages: 67–75
Publication status: Published - 20 Apr 2021
Event: Workshop on NLP for Similar Languages, Varieties and Dialects - VIRTUAL
Duration: 20 Apr 2021 to 20 Apr 2021
Conference number: 8

Workshop

Workshop: Workshop on NLP for Similar Languages, Varieties and Dialects
Number: 8
City: VIRTUAL
Period: 20/04/2021 to 20/04/2021

Keywords

  • Automatic language identification
  • Nordic languages
  • Machine learning
  • Language discrimination
  • Natal differentiation
