Multi Pattern Features-Based Spoofing Detection Mechanism Using One Class Learning

Üstübioğlu, Beste; Tahaoğlu, Gül; Üstübioğlu, Arda; Ulutaş, Güzin; Amerini, İrene; Kılıç, Muhammed

doi:10.1109/access.2024.3447572

Üstübioğlu B., Tahaoğlu G., Üstübioğlu A., Ulutaş G., Amerini İ., Kılıç M.

IEEE ACCESS, vol. 12, pp. 117523-117540, 2024 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 12
Publication Date: 2024
DOI: 10.1109/access.2024.3447572
Journal Name: IEEE ACCESS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 117523-117540
Affiliated with Karadeniz Technical University: Yes

Abstract

Automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks such as replay, voice conversion (VC), and speech synthesis. Using advanced audio manipulation techniques, malicious users can carry out actions such as controlling someone's bank account or taking control of a smart home. This study presents a multi-pattern features-based spoofing detection mechanism that uses a modified ResNet architecture and an OC-Softmax layer to detect various logical access (LA) and physical access (PA) spoofing attacks. We propose a novel pattern features-based audio spoof detection scheme with three branches, each evaluating a different pattern on the Mel spectrogram of the audio file. This is the first work on the audio spoofing detection task to use three different pattern representations of the Mel spectrogram with a modified ResNet architecture and an OC-Softmax layer. The proposed network extracts pattern images from the Mel spectrogram and feeds each of them into a modified ResNet. At the last step of each network, OC-Softmax produces a score for the current pattern image, and the method then fuses the three scores to label the input audio. Experimental results on the ASVspoof 2019 and ASVspoof 2021 corpora show that the proposed method achieves better results on the ASVspoof 2019 challenges than state-of-the-art methods. For example, in the logical access scenario, our model improves the tandem decision cost function (t-DCF) and equal error rate (EER) scores by 0.06% and 2.14%, respectively, compared with state-of-the-art methods. Additionally, the experiments show that the proposed fused decision improves the performance of the system.
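
The final stage described in the abstract (one OC-Softmax score per branch, then fusion of the three scores) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not state the fusion rule, the margins, the scale factor, or the decision threshold, so the following are assumptions: fusion is a plain (optionally weighted) average, the margins `m_real=0.9`, `m_fake=0.2` and scale `alpha=20.0` are the defaults commonly cited for OC-Softmax, and the threshold of `0.5` is purely illustrative. All function names are hypothetical, and the embeddings stand in for the outputs of the three modified ResNet branches.

```python
import math


def cosine_score(embedding, center):
    """OC-Softmax-style score: cosine similarity between a branch
    embedding and the learned bona fide direction (higher = more genuine)."""
    dot = sum(e * c for e, c in zip(embedding, center))
    norm_e = math.sqrt(sum(e * e for e in embedding))
    norm_c = math.sqrt(sum(c * c for c in center))
    return dot / (norm_e * norm_c)


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def oc_softmax_loss(scores, labels, m_real=0.9, m_fake=0.2, alpha=20.0):
    """Mean OC-Softmax loss over a batch.

    labels: 0 = bona fide, 1 = spoof.
    Bona fide scores are pushed above m_real, spoof scores below m_fake.
    Margin and scale values are assumed defaults, not taken from the abstract.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        margin = m_real if y == 0 else m_fake
        sign = 1.0 if y == 0 else -1.0
        total += _softplus(alpha * (margin - s) * sign)
    return total / len(scores)


def fuse_scores(branch_scores, weights=None):
    """Fuse per-branch scores by (weighted) averaging.
    The averaging rule is an assumption; the abstract only says scores are fused."""
    if weights is None:
        weights = [1.0] * len(branch_scores)
    return sum(w * s for w, s in zip(weights, branch_scores)) / sum(weights)


def label_audio(fused_score, threshold=0.5):
    """Hypothetical decision rule on the fused score."""
    return "bona fide" if fused_score >= threshold else "spoof"
```

For instance, three branch scores of 0.9, 0.8, and 0.7 fuse to an average of about 0.8, which this illustrative threshold would label as bona fide. In practice the threshold would be tuned on a development set rather than fixed in advance.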