Automatic reference-based evaluation of pronoun translation misses the point

Liane Guillou, Christian Hardmeier

Publication: Conference article in proceedings or book/report chapter; Conference contribution in proceedings; Research; Peer-reviewed

Abstract

We compare the performance of the APT and AutoPRF metrics for pronoun translation against a manually annotated dataset comprising human judgements as to the correctness of translations of the PROTEST test suite. Although there is some correlation with the human judgements, a range of issues limit the performance of the automated metrics. Instead, we recommend the use of semiautomatic metrics and test suites in place of fully automatic metrics.
Original language: English
Title: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Publication date: 2018
ISBN (Print): 9781948087841
Status: Published - 2018
Published externally: Yes