Multi-input hybrid face presentation attack detection method based on simplified Xception and channel attention mechanism


GÜNAY YILMAZ A., Turhal U., NABİYEV V.

EXPERT SYSTEMS WITH APPLICATIONS, vol. 283, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 283
  • Publication Date: 2025
  • DOI: 10.1016/j.eswa.2025.127610
  • Journal Name: EXPERT SYSTEMS WITH APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Karadeniz Technical University Affiliated: Yes

Abstract

Biometric recognition systems, especially facial recognition systems, are now frequently used for person authentication. The growing volume of facial images and videos shared on social media makes these systems vulnerable to attacks, so there is an increasing need for sensitive face presentation attack (FPA) detection systems. In this paper, a lightweight multi-input hybrid deep convolutional neural network model is proposed for FPA detection. For this purpose, a simplified version of the widely used Xception network was developed. This model was extended with squeeze-and-excitation blocks to weight the data in the feature encoding channels. In addition, a simple residual network architecture with attention blocks was designed. The FPA detection performance of these models was then examined with either raw images or cropped facial images as input. Based on the results, a deep learning architecture with three parallel connections was proposed. Raw images, cropped facial images, and face-weighted multi-color multi-level local binary pattern features are given as inputs to the proposed model. The result is a multi-input hybrid FPA detection model that uses both hand-crafted and deep features. The proposed architecture has approximately 82% fewer parameters than the original Xception network. Experimental results on the benchmark CASIA and REPLAY-ATTACK datasets demonstrate the model's effectiveness: it achieves a 1.53% equal error rate (EER) on the CASIA dataset, and a 0.00% EER with a 0.07% half total error rate (HTER) on the REPLAY-ATTACK dataset. These results outperform many state-of-the-art methods while requiring significantly fewer computational resources, making the approach suitable for deployment in resource-constrained environments.
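
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how EER and HTER are commonly computed from classifier scores. It is not the authors' evaluation code. The function names, the toy score lists, and the convention that higher scores mean "genuine" are all assumptions made for illustration. It also assumes the common practice of fixing the threshold at the EER point on a development set and then reporting HTER on a separate test set.

```python
def error_rates(genuine_scores, attack_scores, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted as genuine."""
    # FAR: fraction of attack samples wrongly accepted.
    far = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    # FRR: fraction of genuine samples wrongly rejected.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def equal_error_rate(genuine_scores, attack_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    Returns (eer, threshold) at the first threshold where |FAR - FRR| is smallest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(attack_scores)):
        far, frr = error_rates(genuine_scores, attack_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def half_total_error_rate(genuine_scores, attack_scores, threshold):
    """HTER is the mean of FAR and FRR at a fixed threshold."""
    far, frr = error_rates(genuine_scores, attack_scores, threshold)
    return (far + frr) / 2


# Hypothetical toy scores (not data from the paper).
dev_genuine = [0.9, 0.8, 0.7, 0.4]
dev_attack = [0.1, 0.2, 0.3, 0.6]
test_genuine = [0.95, 0.85, 0.65, 0.75]
test_attack = [0.05, 0.15, 0.25, 0.65]

# Pick the threshold on the development set, then apply it to the test set.
eer, threshold = equal_error_rate(dev_genuine, dev_attack)
hter = half_total_error_rate(test_genuine, test_attack, threshold)
print(f"dev EER = {eer:.2%}, threshold = {threshold}, test HTER = {hter:.2%}")
```

Because the threshold is chosen on one data split and applied to another, EER and HTER need not coincide, which is consistent with a record that reports both values for the same dataset.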