Abstract
The last decade in deep learning has brought about increasingly
capable systems that are deployed in a wide variety of applications.
The field of natural language processing has been transformed by
a number of breakthroughs, including large language models, which
are used in increasingly many user-facing applications. In order to
reap the benefits of this technology and reduce potential harms, it
is important to quantify the reliability of model predictions and
the uncertainties that shroud their development.

This thesis studies how uncertainty in natural language processing
can be characterized from a linguistic, statistical and neural
perspective, and how it can be reduced and quantified through
the design of the experimental pipeline. We further explore uncertainty
quantification in modeling by theoretically and empirically
investigating the effect of inductive model biases in text classification
tasks. The corresponding experiments cover data and tasks in three
different languages (Danish, English, and Finnish), as well as a large
set of different uncertainty quantification approaches.
Additionally, we propose a method for calibrated sampling in natural
language generation based on non-exchangeable conformal
prediction, which provides tighter token sets with better coverage
of the actual continuation. Lastly, we develop an approach to quantifying
confidence in large black-box language models using auxiliary
predictors, where confidence is predicted solely from the input to the
target model and its generated output text.
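To make the conformal sampling idea concrete, here is a minimal sketch, not taken from the thesis, of how a calibrated next-token prediction set can be formed with a weighted (non-exchangeable) conformal quantile. The function names, the choice of nonconformity score (one minus the model probability), and the toy calibration data and weights are all assumptions for illustration only.

```python
# Minimal sketch (not the thesis implementation) of conformal prediction
# sets over next tokens, using a weighted quantile in the style of
# non-exchangeable conformal prediction. All names are hypothetical.
import numpy as np

def weighted_quantile(scores, weights, alpha):
    """(1 - alpha) quantile of calibration scores, with the weights
    normalised together with a unit weight placed on the test point."""
    order = np.argsort(scores)
    scores, weights = scores[order], weights[order]
    norm = weights.sum() + 1.0              # +1 accounts for the test point
    cum = np.cumsum(weights) / norm
    idx = np.searchsorted(cum, 1 - alpha)
    return scores[idx] if idx < len(scores) else np.inf

def token_prediction_set(next_token_probs, cal_scores, cal_weights, alpha=0.1):
    """Keep every candidate token whose nonconformity score
    (1 - model probability) does not exceed the calibrated threshold."""
    q_hat = weighted_quantile(np.asarray(cal_scores, dtype=float),
                              np.asarray(cal_weights, dtype=float), alpha)
    scores = 1.0 - np.asarray(next_token_probs, dtype=float)
    return np.nonzero(scores <= q_hat)[0]   # indices into the vocabulary

# Toy usage: calibration scores would be 1 - p(true next token) on held-out
# text; weights could down-weight calibration contexts unlike the current one.
rng = np.random.default_rng(0)
cal_scores = rng.uniform(0.0, 1.0, size=500)
cal_weights = rng.uniform(0.5, 1.0, size=500)
vocab_probs = rng.dirichlet(np.ones(50))
token_set = token_prediction_set(vocab_probs, cal_scores, cal_weights, alpha=0.2)
print(len(token_set), token_set)
```

Similarly, the black-box confidence idea, predicting confidence from the target model's input and generated output text alone, can be illustrated with a hedged sketch. The thesis trains dedicated auxiliary predictors; the toy example below merely substitutes a simple TF-IDF and logistic regression pipeline and entirely hypothetical data to show the input/output-only setup.

```python
# Minimal sketch (assumptions, not the thesis setup): an auxiliary confidence
# predictor trained only on (input text, generated output text) pairs with
# correctness labels collected for a black-box target model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: prompts, the target model's generations,
# and whether each generation was judged correct (1) or not (0).
prompts = ["Who wrote Hamlet?", "What is 2 + 2?", "Capital of Finland?"]
generations = ["Hamlet was written by Charles Dickens.", "2 + 2 = 4.", "Helsinki."]
labels = [0, 1, 1]

texts = [p + " [SEP] " + g for p, g in zip(prompts, generations)]
predictor = make_pipeline(TfidfVectorizer(), LogisticRegression())
predictor.fit(texts, labels)

# Confidence for a new (input, output) pair from the same black-box model.
new_text = "Who painted the Mona Lisa? [SEP] Leonardo da Vinci."
confidence = predictor.predict_proba([new_text])[0, 1]
print(f"estimated confidence: {confidence:.2f}")
```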
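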
| Original language | English |
|---|---|
| Publisher | IT-Universitetet i København |
| Number of pages | 364 |
| ISBN (print) | 978-87-7949-524-1 |
| Status | Published - 2024 |

| Name | ITU-DS |
|---|---|
| Number | 227 |
| ISSN | 1602-3536 |