Robust copy-move detection in digital audio forensics based on pitch and modified discrete cosine transform

ÜSTÜBİOĞLU, BESTE; Kucukugurlu, Busranur; ULUTAŞ, GÜZİN

doi:10.1007/s11042-022-13035-3

ÜSTÜBİOĞLU B., Kucukugurlu B., ULUTAŞ G.

MULTIMEDIA TOOLS AND APPLICATIONS, vol.81, no.19, pp.27149-27185, 2022 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 81 Issue: 19
Publication Date: 2022
DOI: 10.1007/s11042-022-13035-3
Journal Name: MULTIMEDIA TOOLS AND APPLICATIONS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, FRANCIS, ABI/INFORM, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
Pages: pp.27149-27185
Keywords: Copy-move forgery detection, Audio forgery, Audio forensic, MDCT, Pitch
Karadeniz Technical University Affiliated: Yes

Abstract

Copy-move forgery is one of the most widely used forgery methods in audio forensics because it is difficult to detect and convenient to apply. Moreover, post-processing operations applied to forged speech to hide traces of the forgery make detection more difficult. In this study, we propose a robust method for detection and localization of audio copy-move forgery using the modified discrete cosine transform (MDCT). For this purpose, we first divide the speech recording into voiced and unvoiced parts by extracting a pitch sequence from it. After determining the voiced parts, we extract MDCT coefficients from them and take the mean of the transpose of the coefficient matrix as the feature. These MDCT features are very robust against commonly used post-processing operations (especially audio compression). Euclidean distance (ED) is used to compute the similarity between the features. The voiced parts that give the minimum ED are identified as the copy-move audio parts. In addition, two separate databases (pitch-based and Voice Activity Detection (VAD)-based) were created during the work to evaluate the performance of the proposed method, because there is no common database for audio forensics in the literature. Experimental results show that the proposed method gives better results for detection and localization of audio copy-move forgery on different databases than other studies in the literature. The proposed method is also robust against common post-processing operations such as noise addition, filtering, and especially compression.
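
The pipeline described in the abstract (voiced parts from a pitch sequence, MDCT coefficients per voiced part, a mean-based feature, and a minimum Euclidean distance match) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the frame size n=256, the sine window, the convention that a pitch value of 0 marks an unvoiced frame, and the reading of "mean of the transpose of the coefficient matrix" as an average over frames are all assumptions made here for illustration.

```python
import numpy as np

def voiced_runs(pitch):
    """Split a per-frame pitch sequence into voiced runs.
    Assumption: a pitch value of 0 marks an unvoiced frame.
    Returns (start, end) frame indices with end exclusive."""
    voiced = (np.asarray(pitch) > 0).astype(int)
    edges = np.diff(np.concatenate(([0], voiced, [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))

def mdct_frames(x, n=256):
    """Return a (num_frames, n) matrix of MDCT coefficients.
    Frames are 2n samples long with hop n (assumed sizes)."""
    x = np.asarray(x, dtype=float)
    num_frames = (len(x) - 2 * n) // n + 1
    if num_frames < 1:
        raise ValueError("segment shorter than one MDCT frame")
    t = np.arange(2 * n)
    k = np.arange(n)
    window = np.sin(np.pi * (t + 0.5) / (2 * n))  # sine window (assumed)
    # MDCT basis: cos[pi/n * (t + 0.5 + n/2) * (k + 0.5)]
    basis = np.cos(np.pi / n * np.outer(t + 0.5 + n / 2, k + 0.5))
    frames = np.stack([x[i * n : i * n + 2 * n] for i in range(num_frames)])
    return (frames * window) @ basis

def segment_feature(segment, n=256):
    """One fixed-length feature per voiced segment: the mean of the
    MDCT coefficients over frames (one reading of the abstract)."""
    return mdct_frames(segment, n).mean(axis=0)

def most_similar_pair(segments, n=256):
    """Return (i, j, distance) for the pair of segments whose
    features have the smallest Euclidean distance."""
    feats = [segment_feature(s, n) for s in segments]
    best = (None, None, np.inf)
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            d = float(np.linalg.norm(feats[i] - feats[j]))
            if d < best[2]:
                best = (i, j, d)
    return best
```

In this sketch, the pair returned by most_similar_pair plays the role of the candidate copy-move parts. A practical detector would also need a decision threshold on the distance, which the abstract does not specify and which is therefore left out.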