Abstract
Transfer learning has become an increasingly popular approach in medical imaging, as it offers a solution to the challenge of training models with limited dataset sizes. The ability to leverage knowledge from pre-trained models has proven to be beneficial in various medical imaging applications, such as disease diagnosis, classification of pathological conditions, and early detection of abnormalities in imaging modalities like X-rays, MRIs, and CT scans.
Pre-training on ImageNet, a dataset originally designed for natural image classification, has become the standard in the field. However, unlike natural images, which typically feature distinct global objects, medical images often rely on subtle local texture variations to indicate pathology. These stark differences between natural and medical images have spurred exploration of alternative pre-training strategies, such as using existing medical datasets, modifying them, or developing new datasets specifically designed for medical imaging. Yet how to choose effectively among these alternatives, and what impact the choice of source dataset has on the model's final representations, remain unclear.
This thesis examines the mechanics of transfer learning in medical image classification and explores the broader impact of the source dataset, extending beyond transfer performance. The aim is to provide insights and tools to guide the selection of appropriate source datasets for medical image classification. First, we compare the learned intermediate representations of models pre-trained on natural and medical source datasets. Our results indicate that, while the models achieve comparable performance, they converge to distinct representations, which diverge further after fine-tuning. Next, we investigate the impact of these different representations on model generalization by fine-tuning the models on target datasets curated to include systematically controlled confounders. The results show substantial differences in robustness to shortcut learning between models pre-trained on natural images and those pre-trained on medical images, despite similar classification performance. Finally, we benchmark existing transferability metrics for source dataset selection and show that current metrics, designed and validated on natural image datasets, perform poorly in the context of medical image classification. This highlights the need for transferability metrics specifically tailored to medical imaging tasks. To address this, we propose a novel transferability metric that integrates feature quality with gradient information, overcoming the self-source bias inherent in previous methods that rely solely on feature quality. Our results show that this approach outperforms existing metrics in source dataset selection for medical image classification.
| Original language | English |
| --- | --- |
| Qualification | Ph.D. |
| Supervisor(s) | |
| Award date | 14 Feb 2025 |
| Publisher | |
| ISBNs, print | 978-87-7949-532-6 |
| Status | Published - 2025 |