Audio forgery detection and localization with super-resolution spectrogram and keypoint-based clustering approach

ÜSTÜBİOĞLU, BESTE; TAHAOĞLU, GÜL; ULUTAŞ, GÜZİN; Ustubioglu, ARDA; KILIÇ, MUHAMMED

doi:10.1007/s11227-023-05504-9

ÜSTÜBİOĞLU B., TAHAOĞLU G., ULUTAŞ G., Ustubioglu A., KILIÇ M.

JOURNAL OF SUPERCOMPUTING, vol.80, no.1, pp.486-518, 2024 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 80 Issue: 1
Publication Date: 2024
DOI: 10.1007/s11227-023-05504-9
Journal Name: JOURNAL OF SUPERCOMPUTING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
Pages: pp.486-518
Keywords: Audio copy-move-forgery detection, Audio forensic, Audio forgery, BRIEF feature, Clustering-based matching, High-frequency spectrogram
Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

Malicious individuals can use advanced audio editing software to modify speech recordings and create forged audio. The most common forgery method, known as audio copy-move forgery, copies part of the audio within the same recording to duplicate or delete a segment. Because speech recordings are used as evidence in court, it is important to determine whether a recording has been forged. To this end, we present an effective and robust method based on BRIEF and OPTICS to detect and locate audio copy-move forgeries. The proposed method uses super-resolution spectrogram images of the input audio to detect forged parts in suspicious recordings. First, keypoints and their feature descriptors are extracted from the spectrogram image using the BRIEF method. The approach then uses OPTICS (Ordering Points To Identify the Clustering Structure) to match the corresponding descriptors. To eliminate false matches, the approach evaluates the correctness of the matches. The method also marks the corresponding forged segments in the audio file based on the locations of the keypoints in these clusters. The performance results show that the proposed method is highly robust to the post-processing attacks reported in the literature, such as noise addition, filtering, and especially compression.
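
The pipeline described in the abstract (binary descriptors on a spectrogram, descriptor matching, mapping matched keypoints back to time) can be illustrated with a small toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the "spectrogram" is a random grid rather than a super-resolution spectrogram image, one keypoint is placed at every time frame, the BRIEF-style descriptor uses 128 random pixel-pair comparisons, and exact brute-force Hamming matching stands in for the OPTICS clustering step. The names `brief_descriptor`, `find_copy_move`, and `frames_to_seconds`, as well as the hop size and sample rate, are hypothetical.

```python
import random


def brief_descriptor(spec, col, pairs):
    """BRIEF-style descriptor: one bit per pixel-pair intensity comparison."""
    bits = 0
    for (r1, d1), (r2, d2) in pairs:
        bits = (bits << 1) | int(spec[r1][col + d1] < spec[r2][col + d2])
    return bits


def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")


def find_copy_move(spec, pairs, max_dist=0, min_gap=3):
    """Return (source_frame, target_frame) pairs with near-identical descriptors.

    Brute-force matching is used here in place of OPTICS (assumption).
    min_gap keeps a frame from matching its own overlapping neighbours.
    """
    keypoints = range(1, len(spec[0]) - 1)  # skip borders: patch is col-1..col+1
    desc = {c: brief_descriptor(spec, c, pairs) for c in keypoints}
    return [
        (a, b)
        for a in keypoints
        for b in keypoints
        if b - a >= min_gap and hamming(desc[a], desc[b]) <= max_dist
    ]


def frames_to_seconds(frame, hop=256, sr=16000):
    """Map a spectrogram frame index to seconds (hop and sr are assumed values)."""
    return frame * hop / sr


# Toy "spectrogram": 16 frequency bins x 48 time frames of random magnitudes.
rng = random.Random(0)
N_ROWS, N_COLS = 16, 48
spec = [[rng.random() for _ in range(N_COLS)] for _ in range(N_ROWS)]

# Simulate a copy-move forgery: frames 5..12 are pasted over frames 30..37.
for row in spec:
    row[30:38] = row[5:13]

# Fixed random test pairs inside a 16 x 3 patch (row, column offset).
pixels = [(r, d) for r in range(N_ROWS) for d in (-1, 0, 1)]
pairs = [tuple(rng.sample(pixels, 2)) for _ in range(128)]

matches = find_copy_move(spec, pairs)
for src, dst in matches:
    print(f"frame {src} ({frames_to_seconds(src):.3f} s) "
          f"matches frame {dst} ({frames_to_seconds(dst):.3f} s)")
```

Only keypoints whose whole patch lies inside the copied block produce identical descriptors, so the edge frames of the pasted region are not matched in this sketch. A tolerance (`max_dist` above zero) would be needed once post-processing such as noise or compression perturbs the copied segment.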