Evaluating MFCC-based speaker identification systems with data envelopment analysis


ÖZCAN Z., KAYIKÇIOĞLU T.

Expert Systems with Applications, vol.168, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 168
  • Publication Date: 2021
  • Doi Number: 10.1016/j.eswa.2020.114448
  • Journal Name: Expert Systems with Applications
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Public Affairs Index, Civil Engineering Abstracts
  • Keywords: Speaker recognition evaluation, Multi-criteria decision making, Data envelopment analysis, Speaker identification, MFCC features, MCDM, EFFICIENCY, CLASSIFICATION, RECOGNITION, SELECTION, MACHINES, RANKING
  • Karadeniz Technical University Affiliated: Yes

Abstract

© 2020 Elsevier Ltd. The concept of the efficiency of speaker recognition systems varies in the literature. Although many authors have defined efficiency as recognition accuracy, others have defined it in terms of low energy consumption, memory storage, or computational burden. In this study, speaker recognition was evaluated with a novel two-stage, multi-criteria decision-making approach. First, speaker identification based on Mel-frequency cepstral coefficients (MFCC) was conducted for various parameters and methods, including the number of speakers, the number of MFCCs, the test speech duration, the training utterance length, and the classifier. Classification metrics, memory storage, testing time, and training time were measured for each trial, and the performance of the trials was examined for each criterion. Consistent with the literature, the study revealed that no parameter or method achieved the best performance on all criteria. In the second stage, a multi-criteria efficiency analysis, as suggested in the literature, was conducted for various application scenarios. Data envelopment analysis was used to determine the efficiency of each trial under each scenario. Ranking the efficiency scores revealed that the best solution was task-dependent. Among the classifiers, artificial neural networks outperformed the others in terms of benefit relative to cost; however, some of their costs were high, whereas the other classifiers provided the best solutions on the cost criteria. Finally, the number of MFCCs was the parameter with the least effect on efficiency. Altogether, the findings indicate that the efficiency of a speaker identification system cannot be defined as recognition accuracy, memory storage, testing time, or training time alone, but as a function of those criteria.
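
To illustrate the idea of scoring trials by benefit relative to cost, the following is a minimal sketch of data envelopment analysis in its simplest special case: one cost (input) and one benefit (output) under constant returns to scale, where each trial's efficiency reduces to its benefit-to-cost ratio divided by the best ratio observed. The abstract does not state which DEA model the authors used, so this choice is an assumption. The function name `ccr_efficiency_1x1`, the trial names, and all numbers are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def ccr_efficiency_1x1(trials: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Relative efficiency for the single-input, single-output case.

    `trials` maps a trial name to (cost, benefit). Both values must be
    positive. With one input and one output under constant returns to
    scale, the DEA score of a trial is its benefit/cost ratio divided by
    the largest benefit/cost ratio among all trials.
    """
    ratios = {}
    for name, (cost, benefit) in trials.items():
        if cost <= 0 or benefit <= 0:
            raise ValueError(f"cost and benefit must be positive for {name!r}")
        ratios[name] = benefit / cost

    best = max(ratios.values())
    # Scores lie in (0, 1]; a score of 1.0 marks an efficient trial.
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical trials: (testing time in seconds, recognition accuracy).
trials = {
    "classifier_A": (2.0, 0.90),
    "classifier_B": (1.0, 0.80),
    "classifier_C": (4.0, 0.96),
}

scores = ccr_efficiency_1x1(trials)
# Rank trials from most to least efficient.
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this made-up example the most accurate trial is not the most efficient one, which mirrors the abstract's point that the best solution depends on which criteria are weighed. With several costs and benefits at once, as in the study, the score of each trial is instead obtained by solving a linear program per trial, which requires an LP solver and is not shown here.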