Determination of a speaker's age and gender with an SVM classifier based on GMM supervectors

Yucesoy, Ergun; Nabıyev, VASİF

doi:10.17341/gummfd.71595

JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY, vol.31, no.3, pp.501-510, 2016 (SCI-Expanded, Scopus, TRDizin)

Publication Type: Article / Full Article
Volume: 31 Issue: 3
Publication Date: 2016
DOI: 10.17341/gummfd.71595
Journal Name: JOURNAL OF THE FACULTY OF ENGINEERING AND ARCHITECTURE OF GAZI UNIVERSITY
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
Pages: pp.501-510
Keywords: Age and gender recognition, Gaussian mixture model, Gaussian mixture model supervectors, support vector machine, RECOGNITION
Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

This study proposes a system that classifies speakers by age, gender, or both. The inputs are telephone conversations, including mobile calls made indoors or outdoors. Speakers are classified by gender into three classes (male, female and child), by age into four classes (child, youth, adult and senior), and by age and gender jointly into seven classes. To this end, Gaussian mixture models (GMMs) built from the mel-frequency cepstral coefficients (MFCCs) of the voiced parts of the conversations are transformed into supervectors, which are then fed to a support vector machine (SVM) classifier. Signal energy is used to detect the voiced parts of the conversations, and the GMMs are trained by adapting a universal background model (UBM). In addition, GMMs with different numbers of components are tested on conversations of different lengths to investigate the effect of the number of GMM components and of speech duration on age and gender identification. The highest classification rates are obtained by modeling 16-second utterances with 64-component GMMs: 92.42% for the gender category, 60.10% for the age category and 60.02% for the age&gender category.
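
The supervector step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which UBM parameters are adapted or how, so the code below assumes the commonly used mean-only MAP adaptation with diagonal covariances and a relevance factor of 16. These choices, and the function names log_gauss_diag, posteriors and map_adapt_supervector, are illustrative assumptions rather than the authors' implementation.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """Adapt the UBM means to one utterance and stack them into a supervector.

    Assumption: mean-only MAP adaptation; weights and variances stay fixed.
    frames is a list of feature vectors (for example MFCC frames).
    """
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp                           # soft frame counts
    sums = [[0.0] * dim for _ in range(n_comp)]       # posterior-weighted sums

    for x in frames:
        for k, p in enumerate(posteriors(x, weights, means, variances)):
            counts[k] += p
            for d in range(dim):
                sums[k][d] += p * x[d]

    supervector = []
    for k in range(n_comp):
        alpha = counts[k] / (counts[k] + relevance)   # data-dependent mixing
        for d in range(dim):
            # Fall back to the UBM mean if a component saw no data at all.
            data_mean = sums[k][d] / counts[k] if counts[k] > 0 else means[k][d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[k][d])
    return supervector


if __name__ == "__main__":
    # Toy 2-component, 2-dimensional UBM (hypothetical values).
    ubm_weights = [0.5, 0.5]
    ubm_means = [[0.0, 0.0], [10.0, 10.0]]
    ubm_vars = [[1.0, 1.0], [1.0, 1.0]]
    utterance = [[1.0, 1.0]] * 16
    print(map_adapt_supervector(utterance, ubm_weights, ubm_means, ubm_vars))
```

Under these assumptions, a 64-component UBM over D-dimensional MFCC frames gives one supervector of length 64 x D per utterance. Such fixed-length vectors could then be passed to any SVM implementation, for example scikit-learn's sklearn.svm.SVC, with one label per age, gender or age&gender class.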