Automatic fuzzy-DBSCAN algorithm for morphological and overlapping datasets



Yelghi A., KÖSE C., Yelghi A., Shahkar A.

JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, vol.31, no.6, pp.1245-1253, 2020 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 31 Issue: 6
  • Publication Date: 2020
  • DOI Number: 10.23919/jsee.2020.000095
  • Journal Name: JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Communication Abstracts, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Pages: pp.1245-1253
  • Keywords: clustering, density-based spatial clustering of applications with noise (DBSCAN), fuzzy, overlapping, data mining, CLUSTERING-ALGORITHM
  • Karadeniz Technical University Affiliated: Yes

Abstract

Clustering is an unsupervised learning problem: a procedure that partitions data objects into groups. Many algorithms cannot handle cluster morphology, overlapping clusters, and a large number of clusters at the same time. Density-based clustering, regarded as one of the best approaches, has been used across many scientific communities. This study proposes automatic fuzzy-DBSCAN (AFD), a density-based spatial clustering of applications with noise (DBSCAN) algorithm built on selected high-density areas, which works with the initialization of two parameters. By combining fuzzy and DBSCAN features, AFD is modeled through the selection of high-density areas and automatically generates two parameters for merging and separating. These two generated parameters provide a state of sub-cluster rules in the Cartesian coordinate system for the dataset. The model simultaneously overcomes the clustering problems of morphology, overlapping, and the number of clusters in a dataset. In the experiments, every algorithm is run 30 times on eight datasets: three are overlapping real datasets, and the rest are morphological and synthetic datasets. The results demonstrate that AFD outperforms other recently developed clustering algorithms.
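
For readers unfamiliar with the density-based baseline the abstract builds on, the following is a minimal sketch of classical DBSCAN, not of the AFD algorithm itself. The abstract does not name AFD's two parameters or describe its fuzzy merging and separating rules, so none of that is reproduced here. The parameter names `eps` and `min_pts`, the helper `region_query`, and the sample points are assumptions chosen for illustration.

```python
import math

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Classical DBSCAN (illustrative only, not AFD).

    Returns one label per point; -1 marks noise.
    """
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to be a core point; may later become a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # reachable from a core point: border point
            if labels[j] is not UNVISITED:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself a core point, so the cluster keeps growing from it.
                queue.extend(j_neighbors)
    return labels


# Hypothetical toy data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

In this toy run the two squares form separate clusters and the isolated point is labeled noise. Classical DBSCAN gives each point a single hard label, which is one reason overlapping clusters are difficult for it and why a fuzzy extension such as the one the abstract describes is of interest.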