Abstract
Medical notes contain a wealth of information related to diagnosis, prognosis, and overall patient care that can be used to help physicians make informed decisions. However, like any data set drawn from diverse demographic groups, they may be biased toward certain subgroups or subpopulations. Consequently, any bias in the data will be reflected in the output of the machine learning models trained on them. In this paper, we investigate the existence of such biases in Danish medical notes related to three types of blood cancer, with the goal of classifying whether the medical notes indicate severe infection. By employing a hierarchical architecture that combines a sequence model (Transformer and LSTM) with a BERT model to classify long notes, we uncover biases related to demographics and cancer types. Furthermore, we observe performance differences between hospitals. These findings underscore the importance of investigating bias in critical settings such as healthcare and the urgency of monitoring and mitigating it when developing AI-based systems.
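The kind of subgroup comparison the abstract describes (performance differences across demographics, cancer types, or hospitals) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `f1_binary`, `f1_by_group`, and `max_gap`, the choice of F1 as the metric, and the toy labels below are all assumptions made purely for illustration.

```python
from collections import defaultdict


def f1_binary(y_true, y_pred):
    """F1 for the positive class (1 = severe infection); 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_by_group(y_true, y_pred, groups):
    """Compute F1 separately for each subgroup label (e.g. hospital, sex)."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: f1_binary(ts, ps) for g, (ts, ps) in buckets.items()}


def max_gap(scores):
    """Largest difference in score between any two subgroups."""
    return max(scores.values()) - min(scores.values())


# Synthetic toy data (hypothetical, not taken from the paper).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
hospital = ["A", "A", "A", "A", "B", "B", "B", "B"]

scores = f1_by_group(y_true, y_pred, hospital)
gap = max_gap(scores)
```

A large `gap` between subgroups is only a signal worth investigating, since small subgroup sizes can produce unstable scores on their own.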
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health) |
| Editors | Sophia Ananiadou, Dina Demner-Fushman, Deepak Gupta, Paul Thompson |
| Number of pages | 5 |
| Place of Publication | Albuquerque, New Mexico |
| Publisher | Association for Computational Linguistics |
| Publication date | 1 May 2025 |
| Pages | 316-320 |
| ISBN (Print) | 979-8-89176-238-1 |
| Publication status | Published - 1 May 2025 |
| Event | Workshop on Patient-Oriented Language Processing, Albuquerque, United States. Duration: 3 May 2025 → 4 May 2025. Conference number: 2. http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=183234 |
Conference
| Conference | Workshop on Patient-Oriented Language Processing |
|---|---|
| Number | 2 |
| Country/Territory | United States |
| City | Albuquerque |
| Period | 3 May 2025 → 4 May 2025 |
Keywords
- bias in healthcare AI
- clinical natural language processing
- electronic health records
- Danish medical notes
- hospital-level performance variation
Title
Bias in Danish Medical Notes: Infection Classification of Long Texts Using Transformer and LSTM Architectures Coupled with BERT