Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging

Barbara Plank, Zeljko Agic

Research output: Conference Article in Proceeding or Book/Report chapter, Article in proceedings, Research, peer-review

Abstract

We introduce DSDS: a cross-lingual neural part-of-speech tagger that learns from disparate sources of distant supervision, and realistically scales to hundreds of low-resource languages. The model exploits annotation projection, instance selection, tag dictionaries, morphological lexicons, and distributed representations, all in a uniform framework. The approach is simple, yet surprisingly effective, resulting in a new state of the art without access to any gold annotated data.
Original language: English
Title of host publication: Proceedings of the Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Publication date: 2018
Publication status: Published - 2018

Keywords

  • Cross-lingual part-of-speech tagging
  • Distant supervision
  • Low-resource languages
  • Morphological lexicons
  • Annotation projection
