A hybrid stacked ensemble model for rapid seismic damage assessment with imbalanced training data: A case study on the 2023 Kahramanmaraş earthquakes


Mostofi S., Yilmaz Z., BAŞAĞA H. B., OKUR F. Y., ALTUNIŞIK A. C., Taciroglu E.

Engineering Structures, vol. 340, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume Number: 340
  • Publication Date: 2025
  • DOI Number: 10.1016/j.engstruct.2025.120754
  • Journal Name: Engineering Structures
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Aquatic Science & Fisheries Abstracts (ASFA), Communication Abstracts, Compendex, Geobase, ICONDA Bibliographic, INSPEC, Metadex, DIALNET, Civil Engineering Abstracts
  • Keywords: Ensemble methods, Imbalanced dataset, Machine learning, Seismic damage assessment, Stacked ensemble
  • Karadeniz Technical University Affiliated: Yes

Abstract

Accurate and rapid classification of earthquake-damaged buildings is crucial for effective first response and recovery. Conventional inspection methods are labor-intensive, so automated machine learning offers a promising alternative. However, the real-world seismic datasets available for training machine learning models are often highly imbalanced, with severely damaged or collapsed structures representing only a small minority. Previous studies of class-imbalance methods have focused on basic data-level or conventional ensemble techniques, whereas advanced strategies, such as SMOTEENN, CTGAN, and stacked ensembles with custom weighting, remain underexplored. In this study, building survey data from the 2023 Kahramanmaraş earthquakes were used to comprehensively evaluate data-level, algorithm-level, and hybrid methods. Guided by the results of these evaluations, a novel class-weighted stacked ensemble model featuring Balanced Random Forest and XGBoost as base learners was devised. This new model leverages customized misclassification penalties to improve minority class detection and achieves a balanced accuracy of 0.62 and a G-Mean of 0.75, markedly outperforming models employing data-level balancing alone.
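The two reported metrics can be illustrated with a minimal sketch in plain Python. It is not the authors' evaluation code. The abstract does not state which G-Mean definition was used, so the sketch shows two common conventions: the geometric mean of per-class recalls, and the one-vs-rest square root of sensitivity times specificity averaged over classes. The three damage labels and the toy predictions are hypothetical and are not taken from the Kahramanmaraş survey data.

```python
from math import prod, sqrt


def per_class_rates(y_true, y_pred):
    """Return {label: (recall, specificity)}, treating each class one-vs-rest."""
    n = len(y_true)
    rates = {}
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        tn = n - tp - fn - fp
        recall = tp / (tp + fn)  # tp + fn > 0 because c occurs in y_true
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        rates[c] = (recall, specificity)
    return rates


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall."""
    rates = per_class_rates(y_true, y_pred)
    return sum(r for r, _ in rates.values()) / len(rates)


def g_mean_recalls(y_true, y_pred):
    """Convention A (assumed): geometric mean of per-class recalls."""
    rates = per_class_rates(y_true, y_pred)
    return prod(r for r, _ in rates.values()) ** (1 / len(rates))


def g_mean_ovr(y_true, y_pred):
    """Convention B (assumed): mean over classes of sqrt(recall * specificity)."""
    rates = per_class_rates(y_true, y_pred)
    return sum(sqrt(r * s) for r, s in rates.values()) / len(rates)


# Hypothetical imbalanced toy labels: 0 = slight, 1 = moderate, 2 = heavy/collapsed.
y_true = [0] * 8 + [1] * 4 + [2] * 2
y_pred = [0] * 6 + [1] * 2 + [1] * 2 + [0] * 2 + [2] + [1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy          = {plain_accuracy:.3f}")
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.3f}")
print(f"G-Mean (recalls)  = {g_mean_recalls(y_true, y_pred):.3f}")
print(f"G-Mean (OvR)      = {g_mean_ovr(y_true, y_pred):.3f}")
```

On imbalanced data, plain accuracy is dominated by the majority class, whereas both metrics above give each damage class equal weight. This is presumably why the abstract reports them rather than accuracy.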