Abstract
Datasets play a critical role in medical imaging research, yet issues such as label quality, shortcuts, and metadata are often overlooked. This lack of attention may harm the generalizability of algorithms and, consequently, negatively impact patient outcomes. Existing medical imaging literature reviews mostly focus on machine learning (ML) methods, and only a few focus on datasets for specific applications. Moreover, these reviews are static: they are published once and not updated thereafter. As a result, they fail to account for emerging evidence, such as biases, shortcuts, and additional annotations that other researchers may contribute after the dataset is published. We refer to these newly discovered findings about datasets as research artifacts. To address this gap, we propose a living review that continuously tracks public datasets and their associated research artifacts across multiple medical imaging applications. Our approach includes a framework for the living review to monitor data documentation artifacts, and an SQL database to visualize the citation relationships between research artifacts and datasets. Lastly, we discuss key considerations for creating medical imaging datasets, review best practices for data annotation, discuss the significance of shortcuts and demographic diversity, and emphasize the importance of managing datasets throughout their entire lifecycle. Our demo is publicly available at http://inthepicture.itu.dk/.
| Original language | English |
|---|---|
| Title of host publication | FAccT '25: Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency |
| Number of pages | 20 |
| Place of Publication | New York |
| Publisher | Association for Computing Machinery |
| Publication date | 23 Jun 2025 |
| Pages | 511-531 |
| ISBN (Print) | 979-8-4007-1482-5 |
| Publication status | Published - 23 Jun 2025 |
| Event | Fairness, Accountability and Transparency. Location: Athens, Greece. Duration: 23 Jun 2025 → 26 Jun 2025. Conference number: 8. https://facctconference.org/2025/ |
Conference
| Conference | Fairness, Accountability and Transparency |
|---|---|
| Number | 8 |
| Location | Greece |
| Country/Territory | Greece |
| City | Athens |
| Period | 23/06/2025 → 26/06/2025 |
| Internet address | https://facctconference.org/2025/ |
Keywords
- open data
- data governance
- healthcare
- medical imaging
- shortcuts
- bias
- research artifacts
- living review
Fingerprint
Dive into the research topics of 'In the Picture: Medical Imaging Datasets, Artifacts, and their Living Review'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Workshop 'In the Picture: Medical Imaging Datasets'
  Sánchez, A. J. (PI) & Cheplygina, V. (CoI)
  Danish Data Science Academy (DDSA)
  14/03/2024 → 31/12/2024
  Project: Research
- MMC: Making Metadata Count
  Cheplygina, V. (PI), Sánchez, A. J. (CoI) & Sourget, T. (CoI)
  Independent Research Fund Denmark
  01/10/2022 → 30/09/2025
  Project: Research