
Bias in Danish Medical Notes: Infection Classification of Long Texts Using Transformer and LSTM Architectures Coupled with BERT

  • University of Copenhagen
  • Copenhagen University Hospital

Research output: Conference Article in Proceeding or Book/Report chapter › Article in proceedings › Research › peer-review

Abstract

Medical notes contain a wealth of information related to diagnosis, prognosis, and overall patient care that can help physicians make informed decisions. However, like any other data set drawn from diverse demographic groups, they may be biased toward certain subgroups or subpopulations. Consequently, any bias in the data will be reflected in the output of machine learning models trained on it. In this paper, we investigate the existence of such biases in Danish medical notes related to three types of blood cancer, with the goal of classifying whether a medical note indicates severe infection. By employing a hierarchical architecture that combines a sequence model (Transformer and LSTM) with a BERT model to classify long notes, we uncover biases related to demographics and cancer types. Furthermore, we observe performance differences between hospitals. These findings underscore the importance of investigating bias in critical settings such as healthcare and the urgency of monitoring and mitigating it when developing AI-based systems.
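
The kind of subgroup comparison the abstract describes (by demographics, cancer type, or hospital) can be illustrated with a short sketch. This is not the authors' evaluation code: the choice of precision, recall, and F1 as metrics, the function name `per_group_scores`, the group labels `hospital_a` and `hospital_b`, and the toy labels below are all assumptions made for illustration, since the abstract does not state which metrics were used.

```python
from collections import defaultdict


def per_group_scores(y_true, y_pred, groups):
    """Compute precision, recall and F1 per subgroup for binary labels.

    Hypothetical helper: label 1 is assumed to mean "note indicates severe
    infection"; `groups` holds one subgroup name per example.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "n": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        if pred == 1 and truth == 1:
            c["tp"] += 1
        elif pred == 1 and truth == 0:
            c["fp"] += 1
        elif pred == 0 and truth == 1:
            c["fn"] += 1

    scores = {}
    for group, c in counts.items():
        tp, fp, fn = c["tp"], c["fp"], c["fn"]
        # Guard against empty denominators in small subgroups.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[group] = {"n": c["n"], "precision": precision,
                         "recall": recall, "f1": f1}
    return scores


# Toy data (invented): four notes from each of two hypothetical hospitals.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["hospital_a"] * 4 + ["hospital_b"] * 4

scores = per_group_scores(y_true, y_pred, groups)
f1_gap = scores["hospital_a"]["f1"] - scores["hospital_b"]["f1"]
```

A gap such as `f1_gap` between subgroups is only a starting point; with small subgroups it would need confidence intervals or a significance test before being read as evidence of bias.
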
Original language: English
Title of host publication: Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Editors: Sophia Ananiadou, Dina Demner-Fushman, Deepak Gupta, Paul Thompson
Number of pages: 5
Place of Publication: Albuquerque, New Mexico
Publisher: Association for Computational Linguistics
Publication date: 1 May 2025
Pages: 316-320
ISBN (Print): 979-8-89176-238-1
DOIs
Publication status: Published - 1 May 2025
Event: Workshop on Patient-Oriented Language Processing - Albuquerque, United States
Duration: 3 May 2025 to 4 May 2025
Conference number: 2
http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=183234

Conference

Conference: Workshop on Patient-Oriented Language Processing
Number: 2
Country/Territory: United States
City: Albuquerque
Period: 03/05/2025 to 04/05/2025

Keywords

  • bias in healthcare AI
  • clinical natural language processing
  • electronic health records
  • Danish medical notes
  • hospital-level performance variation
