A novel and powerful hybrid classifier method: Development and testing of heuristic k-nn algorithm with fuzzy distance metric


KAHRAMAN H. T.

DATA & KNOWLEDGE ENGINEERING, vol.103, pp.44-59, 2016 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 103
  • Publication Date: 2016
  • DOI: 10.1016/j.datak.2016.02.002
  • Journal Name: DATA & KNOWLEDGE ENGINEERING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp.44-59
  • Keywords: Classification, Data mining methods and algorithms, Artificial bee colony optimization, Fuzzy distance metric, Heuristic weight-tuning method, Hybrid k-nearest neighbor classifier, NEAREST-NEIGHBOR, EXCITATION-CURRENT, OPTIMIZATION, SIMILARITY, FILTER, QUERY, RULE
  • Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

Weight-tuning methods and distance metrics have a significant impact on k-nearest neighbor (k-nn) classification. A major challenge is how to find the optimal feature weights and how to measure the distances between neighbors, both of which affect the classification accuracy of k-nn. In this paper, a powerful similarity measurement method, called the fuzzy distance metric, is explained and extended to measure the distances between test and training observations. With the fuzzy metric, similarity arrays can be produced more efficiently than with the classic and other weighted distance measurements. Finally, the weighting methods are combined with the fuzzy metric-based similarity measurement and the k-nearest neighbor algorithm to increase the classification accuracy of the proposed algorithm. The effectiveness of the proposed approaches is demonstrated by comparing their performance with that of the classic and the population-based heuristic methods on well-known, real-world classification problems obtained from the UCI machine-learning benchmark repository. The experimental results show that the proposed hybrid algorithms find significantly better weight vectors and provide more accurate classification results than the powerful and well-known instance-based intuitive and heuristic classification algorithms and classic approaches over real datasets. (C) 2016 Elsevier B.V. All rights reserved.
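
The abstract does not give the definition of the fuzzy distance metric or the weight-tuning procedure, so the following is only a minimal sketch of the general idea of a feature-weighted k-nn classifier with a similarity-based neighbor ranking. The similarity used here (a weighted mean of per-feature triangular memberships) is an assumed stand-in, not the paper's metric, and the names `fuzzy_similarity` and `knn_predict` as well as the toy data are hypothetical. The weights are fixed by hand; in the paper they are tuned heuristically.

```python
from collections import Counter


def fuzzy_similarity(a, b, weights, ranges):
    """Weighted mean of per-feature triangular memberships.

    ASSUMPTION: an illustrative stand-in, not the paper's fuzzy distance metric.
    Each feature contributes 1.0 when the values coincide and 0.0 when they
    differ by the full feature range or more.
    """
    total = 0.0
    for x, y, w, r in zip(a, b, weights, ranges):
        mu = max(0.0, 1.0 - abs(x - y) / r) if r > 0 else 1.0
        total += w * mu
    return total / sum(weights)


def knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k most similar training observations."""
    # Per-feature ranges scale every absolute difference into [0, 1].
    ranges = [max(col) - min(col) for col in zip(*train_X)]
    scored = sorted(
        ((fuzzy_similarity(x, query, weights, ranges), label)
         for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
train_X = [[1.0, 10.0], [1.2, 90.0], [5.0, 12.0], [5.1, 88.0]]
train_y = ["a", "a", "b", "b"]
query = [1.1, 85.0]

print(knn_predict(train_X, train_y, query, weights=[1.0, 0.0]))  # a
print(knn_predict(train_X, train_y, query, weights=[0.0, 1.0]))  # b
```

The two calls differ only in the weight vector, yet they return different classes. This illustrates why the choice of feature weights matters for k-nn accuracy and why a search over weight vectors can pay off.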