TY - GEN
T1 - Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation
AU - Lapshinova-Koltunski, Ekaterina
AU - Hardmeier, Christian
PY - 2017/9/11
Y1 - 2017/9/11
N2 - In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data -- sentence pairs in which discourse structures in the target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are defined in the form of linguistic patterns based on the information provided by automatic part-of-speech and dependency annotation. In addition to alignment errors (existing structures left unaligned), these alignment discrepancies can be caused by language contrasts or by the phenomena of explicitation and implicitation in the translation process. We propose a new approach, including a new type of resource, for corpus-based language contrast analysis and apply it to study and classify the contrasts found in our English-German parallel corpus. As unaligned discourse structures may also result in the loss of discourse information in machine translation (MT) training data, we hope to deliver information in support of discourse-aware MT.
AB - In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data -- sentence pairs in which discourse structures in the target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are defined in the form of linguistic patterns based on the information provided by automatic part-of-speech and dependency annotation. In addition to alignment errors (existing structures left unaligned), these alignment discrepancies can be caused by language contrasts or by the phenomena of explicitation and implicitation in the translation process. We propose a new approach, including a new type of resource, for corpus-based language contrast analysis and apply it to study and classify the contrasts found in our English-German parallel corpus. As unaligned discourse structures may also result in the loss of discourse information in machine translation (MT) training data, we hope to deliver information in support of discourse-aware MT.
KW - Discourse alignment
KW - Parallel corpora
KW - Explicitation and implicitation
KW - Language contrast analysis
KW - Discourse-aware machine translation
U2 - 10.18653/v1/W17-4810
DO - 10.18653/v1/W17-4810
M3 - Article in proceedings
SN - 978-1-945626-87-6
BT - Proceedings of the Third Workshop on Discourse in Machine Translation
ER -