Uncovering the Educational Data Mining Landscape and Future Perspective: A Comprehensive Analysis

Creative Commons License

Özyurt Ö., Özyurt H., Mishra D.

IEEE ACCESS, vol.11, pp.120192-120208, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 11
  • Publication Date: 2023
  • Doi Number: 10.1109/access.2023.3327624
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.120192-120208
  • Keywords: Educational data mining, machine learning, research trends, topic modeling
  • Karadeniz Technical University Affiliated: Yes


Educational data mining (EDM) applies data mining techniques to educational data in order to analyze students' learning processes and extract information that helps optimize teaching strategies and improve student achievement. EDM has been an important area of research and application in recent years. This study aims to describe the current state of the EDM field and to reveal its future perspective. It employs descriptive analysis and topic modeling on a corpus of 2792 studies indexed in the Scopus database since 2007. First, the study determines the document types, distribution by year, prominent authors, countries, subject areas, and journals of the studies in the field of EDM. Then, using topic modeling, an unsupervised machine learning technique, it identifies hidden patterns, research interests, and trends within the field. The study is innovative in that it is the first to reveal latent research interests and trends in the field of EDM through machine learning-based topic modeling. The descriptive findings highlight the continuous development of the field and its multidisciplinary nature. The topic modeling analysis shows that the studies can be grouped into twelve topics. The most frequently studied topic is "Learning pattern and behavior", and the topic whose frequency of study increases the most over time is "Dropout risk prediction". When each topic's frequency of study over time is compared with that of the other topics, the first topic that stands out is "Performance prediction". By revealing the big picture of the current EDM literature and providing a future perspective, the results are expected to contribute significantly to the field. They are therefore also expected to give direction to the field and to provide important insights and guidance to decision makers and education policy makers.
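The abstract reports two summary measures over the topic model output: the most frequently studied topic, and the topic whose frequency of study increases the most over time. The abstract does not say how these were computed, so the following is only a minimal sketch of one plausible approach. It assumes each study has already been assigned a single dominant topic, and it measures growth as the least-squares slope of a topic's yearly share of studies. The function names, the topic labels, and the toy records are all hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def most_frequent_topic(assignments):
    """Return the topic assigned to the largest number of documents."""
    counts = Counter(topic for _, topic in assignments)
    return counts.most_common(1)[0][0]


def yearly_share(assignments):
    """Map topic -> {year: fraction of that year's documents on the topic}."""
    docs_per_year = Counter(year for year, _ in assignments)
    docs_per_year_topic = Counter(assignments)
    topics = {topic for _, topic in assignments}
    shares = defaultdict(dict)
    for topic in topics:
        for year, total in docs_per_year.items():
            shares[topic][year] = docs_per_year_topic[(year, topic)] / total
    return shares


def slope(points):
    """Ordinary least-squares slope of y on x (needs at least two distinct x)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


def fastest_growing_topic(assignments):
    """Return the topic whose yearly share has the largest positive trend."""
    shares = yearly_share(assignments)
    slopes = {t: slope(sorted(by_year.items())) for t, by_year in shares.items()}
    return max(slopes, key=slopes.get)


# Toy, made-up (year, dominant topic) records; NOT data from the study.
toy = (
    [(2020, "topic_a")] * 4 + [(2020, "topic_b")] * 1
    + [(2021, "topic_a")] * 3 + [(2021, "topic_b")] * 2
    + [(2022, "topic_a")] * 2 + [(2022, "topic_b")] * 3
)

print(most_frequent_topic(toy))
print(fastest_growing_topic(toy))
```

The toy records show that the two measures can differ: a topic can hold the most studies overall while another topic's share grows faster, which is the pattern the abstract describes for "Learning pattern and behavior" and "Dropout risk prediction".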