Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes from Construction Accident Reports


TOĞAN V., Mostofi F., Tokdemir O. B., Kadıoğlu F.

IEEE Access, vol.13, pp.99758-99777, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 13
  • Publication Date: 2025
  • DOI: 10.1109/access.2025.3576442
  • Journal Name: IEEE Access
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.99758-99777
  • Keywords: Construction industry, decision making, machine learning, natural language processing, project management, safety management, transfer learning, transformers
  • Affiliated with Karadeniz Teknik Üniversitesi: Yes

Abstract

The time-intensive extraction of insights from textual safety documents using conventional methods causes delays and inaccuracies, hindering proactive incident prevention in construction projects. While the architectures of large language models (LLMs) have been well studied, their deployment efficiency has often been overlooked. This study proposes DistilBERT as a more efficient text management method for extracting safety text from construction safety documents. To maintain the relevance of the extracted safety text, a dataset of 5,224 construction accident cases from 73 projects across the Euro-Asia region was compiled. Incidents were analyzed through detailed questionnaires to identify safety attributes, and term frequency-inverse document frequency (TF-IDF) analysis was applied for validation. When benchmarked against conventional NLP methods and state-of-the-art LLMs such as BERT, RoBERTa, and XLNet, DistilBERT demonstrated comparable accuracy with significantly reduced computational time. Specifically, DistilBERT achieved an accuracy of 79% in severity scale classification with an F1 score of 0.72, while reducing processing time by approximately 50% compared to BERT (from 2,918.28 seconds to 1,492.08 seconds). By offering rapid inference speeds with negligible accuracy trade-offs, DistilBERT emerges as a practical tool for automating safety text extraction, making it well suited to settings with limited computational capabilities and urgent decision-making requirements. This study examines how DistilBERT can be integrated into construction safety management systems without modifying the underlying platforms. Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts.
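
The "approximately 50%" figure can be checked directly from the two processing times quoted in the abstract. The following is a minimal sketch of that arithmetic only; the function names `relative_reduction` and `speedup` are illustrative and do not come from the paper, and nothing here reproduces the authors' benchmarking procedure.

```python
# Processing times quoted in the abstract, in seconds.
BERT_SECONDS = 2918.28
DISTILBERT_SECONDS = 1492.08


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fraction of the baseline time saved by the candidate (hypothetical helper)."""
    return (baseline - candidate) / baseline


def speedup(baseline: float, candidate: float) -> float:
    """How many times faster the candidate is than the baseline (hypothetical helper)."""
    return baseline / candidate


if __name__ == "__main__":
    reduction = relative_reduction(BERT_SECONDS, DISTILBERT_SECONDS)
    factor = speedup(BERT_SECONDS, DISTILBERT_SECONDS)
    # Roughly a 48.9% reduction, i.e. about a 1.96x speedup.
    print(f"reduction: {reduction:.1%}, speedup: {factor:.2f}x")
```

On these numbers the saving is slightly under one half, which is consistent with the abstract's rounded statement.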