An Attack-Independent Audio Forgery Detection Technique Based On Cochleagram Images Of Segments with Dynamic Threshold

ÜSTÜBİOĞLU, BESTE

doi:10.1109/access.2024.3409543

ÜSTÜBİOĞLU B.

IEEE Access, vol. 12, pp. 82660-82675, 2024 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 12
Publication Date: 2024
DOI: 10.1109/access.2024.3409543
Journal Name: IEEE Access
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 82660-82675
Keywords: Cochleagram, copy-move forgery, forgery detection, SSIM
Affiliated with Karadeniz Technical University: Yes

Abstract

Thanks to advanced audio editing software, speech recordings can be tampered with very easily. When speech recordings are used as forensic evidence, splicing recordings together, cutting them, or altering their content is legally unacceptable and constitutes a crime. Audio copy-move forgery, the most common forgery used to change the content of speech, is performed by copying a segment of the audio and pasting it elsewhere in the same recording. In this study, a new and robust method based on cochleagram images is proposed to detect audio copy-move forgery. The proposed method uses cochleagram images of the voiced parts of the audio to find traces of forgery in the input audio file. To this end, the audio file is first split into voiced parts using a pitch-based Voice Activity Detection (VAD) method. Each voiced part is then converted into a cochleagram image. The structural similarity index measure (SSIM) is used to calculate the similarity between cochleagram images. Once the SSIM values between the cochleagram images have been calculated, the proposed forgery localization algorithm is run. In this algorithm, the SSIM values between the cochleagram images are first sorted in descending order. The length ratio of each segment pair is then calculated to determine which entries in this ordering correspond to duplicated segment pairs. If this ratio exceeds the specified percentage, the segment pair is marked as forged. Finally, the proposed audio copy-move forgery detection method is evaluated against state-of-the-art approaches on two Copy-Move Forgery Detection (CMFD) databases and on forged databases created from TIMIT and the Arabic Speech Corpus. The experimental results show that the proposed method is significantly more robust to post-processing operations than other studies.
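
The localization step described in the abstract (pairwise SSIM, descending sort, length-ratio check) can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: the cochleagram images are replaced by equally sized random 2-D arrays, SSIM is computed over a single global window rather than local windows, the length ratio is taken as shorter length divided by longer length, and the names `global_ssim`, `rank_candidate_pairs`, `min_ratio`, and `min_ssim` are hypothetical. In particular, `min_ssim` is only a fixed stand-in for the dynamic threshold named in the title, which the abstract does not specify.

```python
import itertools

import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM of two equally sized 2-D arrays (a simplification)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must share a shape")
    # Standard SSIM stabilising constants for the given dynamic range.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    dx, dy = x - mx, y - my
    var_x, var_y = (dx * dx).mean(), (dy * dy).mean()
    cov = (dx * dy).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx * mx + my * my + c1) * (var_x + var_y + c2)
    )


def rank_candidate_pairs(images, lengths, min_ratio=0.9, min_ssim=0.0):
    """Return (ssim, i, j, ratio) tuples, most similar first, that pass both filters.

    min_ratio and min_ssim are hypothetical parameters, not values from the paper.
    """
    scored = []
    for i, j in itertools.combinations(range(len(images)), 2):
        score = global_ssim(images[i], images[j])
        # Assumed definition: shorter segment length over longer segment length.
        ratio = min(lengths[i], lengths[j]) / max(lengths[i], lengths[j])
        scored.append((score, i, j, ratio))
    scored.sort(key=lambda t: t[0], reverse=True)  # descending SSIM
    return [t for t in scored if t[3] >= min_ratio and t[0] >= min_ssim]


# Toy data: four stand-in "cochleagram images"; segment 3 duplicates segment 1.
rng = np.random.default_rng(0)
images = [rng.random((32, 48)) for _ in range(4)]
images[3] = images[1].copy()
lengths = [4000, 5200, 9100, 5200]  # segment lengths in samples (made up)

pairs = rank_candidate_pairs(images, lengths, min_ratio=0.9, min_ssim=0.5)
print(pairs)
```

In this toy setup only the duplicated pair survives both filters. Real cochleagram images would come from a gammatone filterbank applied to the VAD segments and would need resizing to a common shape before comparison; both steps are left out here.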