Comparison of tree-based ensemble learning algorithms for landslide susceptibility mapping in Murgul (Artvin), Turkey


Creative Commons License

Usta Z., Akıncı H., AKIN A. T.

Earth Science Informatics, cilt.17, sa.2, ss.1459-1481, 2024 (SCI-Expanded) identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 17 Sayı: 2
  • Basım Tarihi: 2024
  • Doi Numarası: 10.1007/s12145-024-01259-w
  • Dergi Adı: Earth Science Informatics
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, CAB Abstracts, Geobase, INSPEC
  • Sayfa Sayıları: ss.1459-1481
  • Anahtar Kelimeler: Ensemble learning algorithms, Landslide, Landslide susceptibility, Machine learning, Murgul
  • Karadeniz Teknik Üniversitesi Adresli: Evet

Özet

Turkey’s Artvin province is prone to landslides due to its geological structure, rugged topography, and climatic characteristics with intense rainfall. In this study, landslide susceptibility maps (LSMs) of Murgul district in Artvin province were produced. The study employed tree-based ensemble learning algorithms, namely Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), and eXtreme Gradient Boosting (XGBoost). LSM was performed using 13 factors, including altitude, aspect, distance to drainage, distance to faults, distance to roads, land cover, lithology, plan curvature, profile curvature, slope, slope length, topographic position index (TPI), and topographic wetness index (TWI). The study utilized a landslide inventory consisting of 54 landslide polygons. Landslide inventory dataset contained 92,446 pixels with a spatial resolution of 10 m. Consistent with the literature, the majority of landslide pixels (70% – 64,712 pixels) were used for model training, and the remaining portion (30% – 27,734 pixels) was used for model validation. Overall accuracy, precision, recall, F1-score, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC-ROC) were considered as validation metrics. LightGBM and XGBoost were found to have better performance in all validation metrics compared to other algorithms. Additionally, SHapley Additive exPlanations (SHAP) were utilized to explain and interpret the model outputs. As per the LightGBM algorithm, the most influential factors in the occurrence of landslide in the study area were determined to be altitude, lithology, distance to faults, and aspect, whereas TWI, plan and profile curvature were identified as the least influential factors. Finally, it was concluded that the produced LSMs would provide significant contributions to decision makers in reducing the damages caused by landslides in the study area.