Editorial
More slices, less truth: effects of different test-set design strategies for magnetic resonance image classification
Mila Glavaški
orcid.org/0000-0003-4709-9872
Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
Lazar Velicki
orcid.org/0000-0002-2907-819X
Institute of Cardiovascular Diseases Vojvodina, Clinic for Cardiovascular Surgery, Novi Sad, Serbia
Abstract
Aim To assess the effects of different test-set design strategies for magnetic resonance (MR) image classification using deep learning.
Methods Error rates were assessed in 10 experimental settings. The performance of pretrained models and data augmentation were examined as possible contributing factors.
Results Error rates in experimental settings using MR images of different patients for training and test sets were more than ten times higher than those in experimental settings using MR images of the same patients (four disease groups with whole-chest images, 46.80% vs 2.06%; four disease groups without whole-chest images, 49.09% vs 1.29%; sex classification with whole-chest images, 16.02% vs 0.96%; and sex classification without whole-chest images, 23.56% vs 0.30%). Error rates were also higher when data augmentation was applied in settings that used MR images of different patients for training and test sets.
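The size of the gap can be checked directly from the reported percentages. The short sketch below divides each different-patient error rate by its same-patient counterpart; the dictionary keys and variable names are illustrative labels, not identifiers from the study.

```python
# Error rates (%) as reported above: (different patients, same patients).
error_rates = {
    "four disease groups, with whole-chest images": (46.80, 2.06),
    "four disease groups, without whole-chest images": (49.09, 1.29),
    "sex classification, with whole-chest images": (16.02, 0.96),
    "sex classification, without whole-chest images": (23.56, 0.30),
}

# Ratio of the different-patient error rate to the same-patient error rate.
ratios = {
    name: different / same
    for name, (different, same) in error_rates.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} times higher")
```

In all four comparisons the ratio is above ten.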
Conclusion When deep learning is applied to MR image classification, training and test sets should consist of MR images of different patients. Models built on training and test sets consisting of images of the same patients yield overly optimistic error rates and lead to wrong conclusions. MR images of neighboring slices are so similar that they cause a data leakage effect.
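A patient-level split of the kind recommended here could look like the following minimal sketch. It is not the authors' pipeline: the function name `split_by_patient`, the `(patient_id, slice_index)` record layout, the 20% test fraction, and the toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split (patient_id, slice_index) records so that no patient
    contributes slices to both the training and the test set."""
    by_patient = defaultdict(list)
    for patient_id, slice_index in records:
        by_patient[patient_id].append((patient_id, slice_index))

    # Shuffle patients, not slices, so neighboring slices stay together.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(len(patients) * test_fraction))
    held_out = set(patients[:n_test])

    train = [r for p in patients if p not in held_out for r in by_patient[p]]
    test = [r for p in patients if p in held_out for r in by_patient[p]]
    return train, test


# Hypothetical data: 10 patients with 12 slices each.
records = [(f"patient_{p:02d}", s) for p in range(10) for s in range(12)]
train, test = split_by_patient(records)

train_patients = {p for p, _ in train}
test_patients = {p for p, _ in test}
print(len(train), len(test), train_patients.isdisjoint(test_patients))
```

Shuffling individual slices instead would put near-identical neighboring slices of one patient on both sides of the split, which is the leakage described above.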
Hrčak ID: 306690
Publication date: 25 August 2022