Automatic language identification is a challenging problem, and discriminating between closely related languages is especially difficult. This paper presents a machine learning approach to automatic language identification for the Nordic languages, which existing state-of-the-art tools often miscategorise. Concretely, we focus on discriminating between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
Title of host publication
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects