
The Copenhagen Corpus of Eye Tracking Recordings from Natural Reading of Danish Texts

  • Nora Hollenstein
  • Maria Jung Barrett
  • Marina Björnsdóttir
  • Københavns Universitet

Publication: Conference article in proceedings or book/report chapter; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

Eye movement recordings from reading are one of the richest signals of human language processing. Corpora of eye movements during reading of contextualized running text are a way of making such records available for natural language processing purposes. Such corpora already exist in some languages. We present CopCo, the Copenhagen Corpus of eye tracking recordings from natural reading of Danish texts. It is the first eye tracking corpus of its kind for the Danish language. CopCo includes 1,832 sentences with 34,897 tokens of Danish text extracted from a collection of speech manuscripts. This first release of the corpus contains eye tracking data from 22 participants. It will be extended continuously with more participants and texts from other genres. We assess the data quality of the recorded eye movements and find that the extracted features are in line with related research. The dataset is available here: https://osf.io/ud8s5/
Original language: Danish
Title: Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022)
Publication date: 2022
Pages: 1712-1720
Status: Published - 2022
Event: LREC 2022 - Palais du Pharo, Marseille, France
Duration: 20 Jun 2022 to 25 Jun 2022
Conference number: 13
https://lrec2022.lrec-conf.org/en/

Conference

Conference: LREC 2022
Number: 13
Location: Palais du Pharo
Country/Territory: France
City: Marseille
Period: 20/06/2022 to 25/06/2022

Keywords

  • Eye tracking
  • Natural language processing
  • Corpora
  • Reading
  • Danish language
