10th International Conference on Advances in Statistics, Budapest, Hungary, 19-21 April 2024, p. 33
Feature selection, which is widely used in data preprocessing, is a challenging problem because it involves hard combinatorial optimization. High-dimensional data causes problems in many respects, such as computational cost. Eliminating unnecessary features and retaining the appropriate ones therefore makes it possible to evaluate the information correctly. Feature selection has been shown to remove irrelevant and redundant features effectively. In addition, feature selection techniques identify the most discriminative features to improve the performance of classification methods. Recently, owing to their simplicity and ease of implementation, many meta-heuristic algorithms have proven effective in solving hard combinatorial optimization problems. The tuna swarm optimization (TSO) algorithm, inspired by the cooperative foraging behavior of tuna swarms, is one such meta-heuristic, but it has not yet been systematically applied to feature selection problems. Therefore, in this study, a binary version of TSO, called BTSO, is developed to select the optimal feature subset that maximizes classification accuracy and minimizes the feature subset length. To demonstrate the efficiency and superiority of the proposed BTSO algorithm, benchmark datasets from the UCI repository are employed. Classification accuracy, fitness values, the number of selected features, sensitivity, specificity, and convergence curves are reported for BTSO and its competing algorithms. The results show the ability of the BTSO algorithm to search the feature space and select the most informative features for classification tasks.
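
The abstract does not state the fitness function or the binarization rule used by BTSO. The Python sketch below illustrates one common formulation in wrapper-based binary meta-heuristics, not necessarily the authors' own: a weighted sum of classification error and relative subset size, and an S-shaped (sigmoid) transfer function that maps a continuous position to a 0/1 feature mask. The names `fitness`, `sigmoid_transfer`, `binarize`, and the weight `alpha = 0.99` are assumptions made for illustration only.

```python
import math
import random


def fitness(error_rate, mask, alpha=0.99):
    """Assumed objective: weighted sum of error and subset ratio (lower is better)."""
    n_total = len(mask)
    n_selected = sum(mask)
    if n_selected == 0:
        # An empty subset cannot train a classifier, so treat it as infeasible.
        return float("inf")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total


def sigmoid_transfer(x):
    """Assumed S-shaped transfer function: real coordinate -> probability."""
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Convert a continuous position vector into a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


# Hypothetical usage: a 4-feature problem with a seeded generator.
rng = random.Random(0)
mask = binarize([2.0, -3.0, 0.5, -0.1], rng)

# Fitness of a hand-picked mask with a hypothetical 10% error rate.
score = fitness(error_rate=0.10, mask=[1, 0, 1, 0])
```

In a full wrapper, `error_rate` would come from training and validating a classifier on the features where the mask is 1. With `alpha` close to 1, accuracy dominates and subset size mainly breaks ties between equally accurate subsets.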