Using Linguistic Annotations in Statistical Machine Translation of Film Subtitles

Christian Hardmeier, Martin Volk

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Abstract

Statistical Machine Translation (SMT) has been successfully employed to support translation of film subtitles. We explore the integration of Constraint Grammar corpus annotations into a Swedish–Danish subtitle SMT system in the framework of factored SMT. While the usefulness of the annotations is limited with large amounts of parallel data, we show that linguistic annotations can increase the gains in translation quality when monolingual data in the target language is added to an SMT system based on a small parallel corpus.
Original language: English
Title of host publication: Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)
Number of pages: 8
Publication date: 13 May 2009
Publication status: Published - 13 May 2009
Externally published: Yes
