Robust Audio Forgery Detection Method Based on Capsule Network


DİNÇER S., ÜSTÜBİOĞLU B., ULUTAŞ G., TAHAOĞLU G., Ustubioglu A.

2023 International Conference on Electrical and Information Technology, IEIT 2023, Malang, Indonesia, 14 - 15 September 2023, pp.243-247

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/ieit59852.2023.10335590
  • City of Publication: Malang
  • Country of Publication: Indonesia
  • Pages: pp.243-247
  • Keywords: Audio forgery, Capsule Network, Mel spectrogram
  • Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

Audio copy-move forgery is an audio forgery method that generates forged audio by hiding some words or duplicating words within the same recording. Audio forgery detection plays a pivotal role in maintaining the integrity of audio content, which is crucial for sectors such as law enforcement, journalism, and entertainment. Detecting manipulated audio, such as copy-move forgery, supports trustworthiness and credibility in real-world applications, including the authentication of evidence, the preservation of authentic audio records, and the protection of media content. Post-processing operations such as compression, noise addition, and median filtering, applied to forged audio to hide traces of the forgery, make audio copy-move forgery detection very difficult. This study proposes an audio copy-move forgery detection method based on Capsule networks that is resistant to post-processing operations. This is the first method to employ a Capsule Network with the Mel spectrogram of audio for the detection of audio copy-move forgery. The suspicious audio file taken as input is first converted into Mel spectrogram images. Features are then extracted from the Mel spectrogram images using EfficientNet. The capsule network is trained with these features, and an audio file given as a test is labeled as forged or original. The proposed method and other studies were tested on a database created by the authors. The results show that the proposed method is highly resistant to post-processing operations and outperforms the other studies by more than 34% in accuracy.
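
To make the attack described in the abstract concrete, the following minimal sketch shows how a copy-move forgery and one of the named post-processing operations (median filtering) could be simulated on a toy sample sequence. It is an illustration only, not the authors' data-generation procedure: the function names `copy_move` and `median_filter`, the overwrite-in-place pasting strategy, the edge handling, and the window size are all assumptions introduced here. The Mel spectrogram, EfficientNet, and Capsule network stages are not shown.

```python
from statistics import median


def copy_move(samples, src_start, length, dst_start):
    """Return a forged copy of `samples` in which the segment
    samples[src_start:src_start + length] overwrites the region that
    starts at dst_start. The total length is unchanged (assumption:
    the pasted segment replaces existing audio rather than being inserted)."""
    if length <= 0:
        raise ValueError("length must be positive")
    if src_start < 0 or dst_start < 0:
        raise ValueError("start indices must be non-negative")
    if src_start + length > len(samples) or dst_start + length > len(samples):
        raise ValueError("segment falls outside the signal")
    forged = list(samples)  # work on a copy; the original stays intact
    forged[dst_start:dst_start + length] = samples[src_start:src_start + length]
    return forged


def median_filter(samples, window=3):
    """Sliding median with an odd window. Edges are handled by repeating
    the first and last samples (an assumption made for this sketch)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not samples:
        return []
    half = window // 2
    padded = [samples[0]] * half + list(samples) + [samples[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(samples))]


# Toy "audio": a ramp of ten samples stands in for a real recording.
original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Duplicate the segment at indices 1..3 over indices 6..8.
forged = copy_move(original, src_start=1, length=3, dst_start=6)

# Post-process the forged signal to blur the duplicated region.
smoothed = median_filter(forged, window=3)
```

After filtering, the pasted region no longer matches its source sample for sample, which illustrates why detectors that rely on exact duplicate matching can fail once post-processing is applied.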