Evaluating and enhancing the robustness of vision transformers against adversarial attacks in medical imaging

KANCA GÜLSOY, ELİF; AYAS, SELEN; BAYKAL KABLAN, ELİF; EKİNCİ, MURAT

doi:10.1007/s11517-024-03226-5

MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, vol. 63, pp. 673-690, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume Number: 63
Publication Date: 2025
DOI Number: 10.1007/s11517-024-03226-5
Journal Name: MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Applied Science & Technology Source, BIOSIS, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, CINAHL, Compendex, Computer & Applied Sciences, INSPEC
Pages: pp. 673-690
Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

Deep neural networks (DNNs) have demonstrated exceptional performance in medical image analysis. However, recent studies have uncovered significant vulnerabilities in DNN models, particularly their susceptibility to adversarial attacks that manipulate these models into making inaccurate predictions. Vision Transformers (ViTs), despite their advanced capabilities in medical imaging tasks, have not been thoroughly evaluated for their robustness against such attacks in this domain. This study addresses this research gap by conducting an extensive analysis of various adversarial attacks on ViTs specifically within medical imaging contexts. We explore adversarial training as a potential defense mechanism and assess the resilience of ViT models against state-of-the-art adversarial attacks and defense strategies using publicly available benchmark medical image datasets. Our findings reveal that ViTs are vulnerable to adversarial attacks even with minimal perturbations, although adversarial training significantly enhances their robustness, achieving over 80% classification accuracy. Additionally, we perform a comparative analysis with state-of-the-art convolutional neural network models, highlighting the unique strengths and weaknesses of ViTs in handling adversarial threats. This research advances the understanding of ViT robustness in medical imaging and provides insights into their practical deployment in real-world scenarios.