From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification

Shanshan Xu, Santosh T.Y.S.S, Oana Ichim, Isabella Risini, Barbara Plank, Matthias Grabmair

Research output: Conference Article in Proceeding or Book/Report chapter; Article in proceedings; Research; peer-review

Abstract

In legal NLP, Case Outcome Classification (COC) must not only be accurate but also trustworthy and explainable. Existing work in explainable COC has been limited to annotations by a single expert. However, it is well-known that lawyers may disagree in their assessment of case facts. We hence collect a novel dataset, RAVE: Rationale Variation in ECHR, which is obtained from two experts in the domain of international human rights law, for whom we observe weak agreement. We study their disagreements and build a two-level task-independent taxonomy, supplemented with COC-specific subcategories. We quantitatively assess different taxonomy categories and find that disagreements mainly stem from underspecification of the legal context, which poses challenges given the typically limited granularity and noise in COC metadata. To our knowledge, this is the first work in legal NLP that focuses on building a taxonomy over human label variation. We further assess the explainability of state-of-the-art COC models on RAVE and observe limited agreement between models and experts. Overall, our case study reveals hitherto underappreciated complexities in creating benchmark datasets in legal NLP that revolve around identifying aspects of a case’s facts supposedly relevant to its outcome.
Original language: English
Title of host publication: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Publication date: Dec 2023
Pages: 9558–9576
Publication status: Published - Dec 2023
