Abstract
As large language models (LLMs) are increasingly deployed in user-facing applications, building trust and maintaining safety by accurately quantifying a model's confidence in its prediction becomes even more important. However, finding effective ways to calibrate LLMs, especially when the only interface to the models is their generated text, remains a challenge. We propose APRICOT (Auxiliary prediction of confidence targets): a method to set confidence targets and train an additional model that predicts an LLM's confidence based on its textual input and output alone. This approach has several advantages: it is conceptually simple, does not require access to the target model beyond its output, does not interfere with the language generation, and has many potential uses, for instance verbalizing the predicted confidence or using it to re-prompt the LLM so that it accurately reflects its uncertainty. We show that our approach performs competitively in terms of calibration error for white-box and black-box LLMs on closed-book question answering, where it is used to detect incorrect LLM answers.
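For readers unfamiliar with the calibration error mentioned in the abstract, the following is a minimal sketch of one common variant, the binned expected calibration error (ECE). It is an illustration only: the function name, the equal-width binning, and the default of 10 bins are assumptions made here, not details taken from the paper, which may use a different formulation.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| over bins.

    confidences: predicted confidences in [0, 1] (e.g. from an auxiliary model)
    correct:     booleans, True where the LLM answer was right
    n_bins:      number of equal-width bins (an assumption of this sketch)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, is_correct))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical data: two confident correct answers, two unconfident wrong ones.
example_ece = expected_calibration_error([0.9, 0.9, 0.1, 0.1],
                                         [True, True, False, False])
```

A lower value means the predicted confidences track the observed accuracy more closely; a value of 0 would indicate perfect calibration under this binning.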
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics |
| Editors | Lun-Wei Ku, Andre Martins, Vivek Srikumar |
| Volume | Volume 1: Long Papers |
| Place of Publication | Bangkok |
| Publisher | Association for Computational Linguistics |
| Publication date | Aug 2024 |
| Pages | 15440–15459 |
| DOIs | |
| Publication status | Published - Aug 2024 |
| Event | Conference on Association for Computational Linguistics - Bangkok, Thailand; Duration: 11 Aug 2024 → 16 Aug 2024; Conference number: 62; https://dblp.org/db/series/findacl/index.html; https://dblp.org/rec/conf/acl/2024-1.html |
Conference
| Conference | Conference on Association for Computational Linguistics |
|---|---|
| Number | 62 |
| Country/Territory | Thailand |
| City | Bangkok |
| Period | 11/08/2024 → 16/08/2024 |
| Internet address | |
Keywords
- Language model calibration
- Confidence estimation
- Uncertainty quantification
- Prompting and re-prompting
- Text-based evaluation
Fingerprint
Dive into the research topics of 'Calibrating Large Language Models Using Their Generations Only'. Together they form a unique fingerprint.