Mel spectrogram-based audio forgery detection using CNN


Ustubioglu A., ÜSTÜBİOĞLU B., ULUTAŞ G.

SIGNAL IMAGE AND VIDEO PROCESSING, cilt.17, sa.5, ss.2211-2219, 2023 (SCI-Expanded) identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 17 Sayı: 5
  • Basım Tarihi: 2023
  • Doi Numarası: 10.1007/s11760-022-02436-4
  • Dergi Adı: SIGNAL IMAGE AND VIDEO PROCESSING
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, zbMATH
  • Sayfa Sayıları: ss.2211-2219
  • Anahtar Kelimeler: Copy-move forgery detection, Audio forgery, Audio forensic, Spectrogram-CNN, COPY-MOVE DETECTION, PITCH
  • Karadeniz Teknik Üniversitesi Adresli: Evet

Özet

In this time of technology, digital speech can be created and falsified by a very diverse of hardware and software technologies. Audio copy-move forgery is an audio forgery technique that goals to create forged audio by hiding undesirable words or repeating wanted words in identical speech. Therefore, audio authentication has been a necessary requisition. In this study, an effective approach to spectral images based on audio copy-move forgery detection using convolutional neural networks (CNN) with data augmentation is proposed. There are only a few handcrafted methods conducted for the detection of audio copy-move forgery. None of the existing works on audio copy-move forgery detection has proposed deep feature learning from speech recording with Mel spectrogram. This is the first method to employ deep learning with Mel spectrogram of audio for the detection of audio copy-move forgery. The proposed CNN architecture classifies the suspicious Mel spectrogram images into two classes: original and forged. The proposed CNN system is successfully trained on these Mel spectrogram image feature extraction. The proposed algorithm has been tested on our datasets generated from Arabic Speech Corpus and TIMIT speech database. The results show the effectiveness, robustness of post-processing operations, and high accuracy of the proposed approach compared to other studies.