Maintaining quality in FEVER annotation

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Standard

Maintaining quality in FEVER annotation. / Schulte, Henri; Binau, Julie; Derczynski, Leon.

Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER): Association for Computational Linguistics. Association for Computational Linguistics, 2020. p. 42-46.

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Harvard

Schulte, H, Binau, J & Derczynski, L 2020, Maintaining quality in FEVER annotation. in Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER): Association for Computational Linguistics. Association for Computational Linguistics, pp. 42-46.

APA

Schulte, H., Binau, J., & Derczynski, L. (2020). Maintaining quality in FEVER annotation. In Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER): Association for Computational Linguistics (pp. 42-46). Association for Computational Linguistics.

Vancouver

Schulte H, Binau J, Derczynski L. Maintaining quality in FEVER annotation. In Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER): Association for Computational Linguistics. Association for Computational Linguistics. 2020. p. 42-46

Author

Schulte, Henri ; Binau, Julie ; Derczynski, Leon. / Maintaining quality in FEVER annotation. Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER): Association for Computational Linguistics. Association for Computational Linguistics, 2020. pp. 42-46

Bibtex

@inproceedings{24f3e9859622405ab47db4ee4e341717,
title = "Maintaining quality in FEVER annotation",
abstract = "We propose two measures for measuring the quality of constructed claims in the FEVER task. Annotating data for this task involves the creation of supporting and refuting claims over a set of evidence. Automatic annotation processes often leave superficial patterns in data, which learning systems can detect instead of performing the underlying task. Humans also can leave these superficial patterns, either voluntarily or involuntarily (due to e.g. fatigue). The two measures introduced attempt to detect the impact of these superficial patterns. One is a new information-theoretic and distributionality based measure, DCI; and the other an extension of neural probing work over the ARCT task, utility. We demonstrate these measures over a recent major dataset, that from the English FEVER task in 2019.",
author = "Henri Schulte and Julie Binau and Leon Derczynski",
year = "2020",
month = jul,
day = "9",
language = "English",
pages = "42--46",
booktitle = "Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)",
publisher = "Association for Computational Linguistics",
address = "United States",

}
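
With the comma restored after the month field, the entry above is syntactically valid BibTeX. As a quick sanity check, the record can be round-tripped through a parser; a minimal sketch using the third-party bibtexparser package (an assumption, not part of this record; any BibTeX parser would do):

# Sanity-check the BibTeX record above (pip install bibtexparser).
import bibtexparser
from bibtexparser.bparser import BibTexParser

entry = r"""
@inproceedings{24f3e9859622405ab47db4ee4e341717,
    title = "Maintaining quality in FEVER annotation",
    author = "Henri Schulte and Julie Binau and Leon Derczynski",
    year = "2020",
    month = jul,
    day = "9",
    pages = "42--46",
    booktitle = "Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)",
    publisher = "Association for Computational Linguistics",
}
"""

# common_strings=True predefines the standard month macros (jan..dec),
# so the unquoted `jul` in the month field is recognised.
parser = BibTexParser(common_strings=True)
db = bibtexparser.loads(entry, parser=parser)

record = db.entries[0]
print(record["title"])   # Maintaining quality in FEVER annotation
print(record["pages"])   # 42--46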

RIS

TY - GEN

T1 - Maintaining quality in FEVER annotation

AU - Schulte, Henri

AU - Binau, Julie

AU - Derczynski, Leon

PY - 2020/7/9

Y1 - 2020/7/9

N2 - We propose two measures for measuring the quality of constructed claims in the FEVER task. Annotating data for this task involves the creation of supporting and refuting claims over a set of evidence. Automatic annotation processes often leave superficial patterns in data, which learning systems can detect instead of performing the underlying task. Humans also can leave these superficial patterns, either voluntarily or involuntarily (due to e.g. fatigue). The two measures introduced attempt to detect the impact of these superficial patterns. One is a new information-theoretic and distributionality based measure, DCI; and the other an extension of neural probing work over the ARCT task, utility. We demonstrate these measures over a recent major dataset, that from the English FEVER task in 2019.

AB - We propose two measures for measuring the quality of constructed claims in the FEVER task. Annotating data for this task involves the creation of supporting and refuting claims over a set of evidence. Automatic annotation processes often leave superficial patterns in data, which learning systems can detect instead of performing the underlying task. Humans also can leave these superficial patterns, either voluntarily or involuntarily (due to e.g. fatigue). The two measures introduced attempt to detect the impact of these superficial patterns. One is a new information-theoretic and distributionality based measure, DCI; and the other an extension of neural probing work over the ARCT task, utility. We demonstrate these measures over a recent major dataset, that from the English FEVER task in 2019.

M3 - Article in proceedings

SP - 42

EP - 46

BT - Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)

PB - Association for Computational Linguistics

ER -

ID: 85287454