
DecoMT: Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models.

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Abstract

This study investigates machine translation between related languages, i.e., languages within the same family that share linguistic characteristics such as word order and lexical similarity. Machine translation through few-shot prompting uses a small set of example translation pairs to generate translations for test sentences. This procedure requires the model to learn how to generate translations while simultaneously maintaining token order, so that the output is fluent and accurate. We propose that for related languages, the task of machine translation can be simplified by exploiting the monotonic alignment characteristic of such languages. We introduce DecoMT, a novel few-shot prompting approach that decomposes the translation process into a sequence of word chunk translations. Through automatic and human evaluation on multiple related language pairs across various language families, we demonstrate that decomposed prompting surpasses multiple established few-shot baseline approaches. For example, DecoMT outperforms the strong few-shot prompting BLOOM model by an average of 8 chrF++ points across the examined languages.
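The core idea in the abstract, translating a sentence as a sequence of word chunks and relying on monotonic alignment to keep the output in source order, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fixed chunk size, the `Source:`/`Target:` prompt template, and the function names `split_into_chunks`, `build_prompt`, `translate_decomposed`, and `toy_generate` are all assumptions made for illustration, and `toy_generate` is only a stand-in for a real language model call.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(sentence: str, chunk_size: int = 5) -> List[str]:
    """Split a sentence into consecutive chunks of at most chunk_size words."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    words = sentence.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def build_prompt(examples: Sequence[Tuple[str, str]], chunk: str) -> str:
    """Format few-shot chunk pairs, then the chunk to translate.

    The template is hypothetical; it is not taken from the paper.
    """
    blocks = [f"Source: {src}\nTarget: {tgt}" for src, tgt in examples]
    blocks.append(f"Source: {chunk}\nTarget:")
    return "\n\n".join(blocks)


def translate_decomposed(
    sentence: str,
    examples: Sequence[Tuple[str, str]],
    generate: Callable[[str], str],
    chunk_size: int = 5,
) -> str:
    """Translate each chunk independently and join the outputs in source order.

    Joining in source order is the simplification that monotonic
    alignment between related languages is assumed to allow.
    """
    chunks = split_into_chunks(sentence, chunk_size)
    outputs = [generate(build_prompt(examples, chunk)).strip() for chunk in chunks]
    return " ".join(outputs)


def toy_generate(prompt: str) -> str:
    """Stand-in for a model call: returns the last source chunk in upper case."""
    last_source = prompt.rsplit("Source: ", 1)[1]
    return last_source.split("\nTarget:")[0].upper()


if __name__ == "__main__":
    print(translate_decomposed("a b c d e f g", [("x", "y")], toy_generate, chunk_size=3))
```

In practice `generate` would wrap a call to a few-shot prompted model, and the examples would be chunk-level translation pairs for the language pair in question.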
Original language: English
Title of host publication: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Publication date: 2023
Pages: 4586-4602
DOIs
Publication status: Published - 2023
Externally published: Yes
Event: Conference on Empirical Methods in Natural Language Processing - Resorts World Convention Centre, Singapore
Duration: 6 Dec 2023 to 10 Dec 2023
https://2023.emnlp.org/

Conference

Conference: Conference on Empirical Methods in Natural Language Processing
Location: Resorts World Convention Centre
Country/Territory: Singapore
Period: 06/12/2023 to 10/12/2023
Internet address: https://2023.emnlp.org/

Keywords

  • machine translation
  • few-shot prompting
  • related languages
  • monotonic alignment
  • DecoMT
