APPLIED SCIENCES-BASEL, vol.15, no.11, 2025 (SCI-Expanded)
This study analyzes online job postings using machine learning-based semantic approaches to identify the expertise roles and competencies required for big data professions. The methodology employs latent Dirichlet allocation (LDA), a probabilistic topic modeling technique, to reveal hidden semantic structures within a corpus of big data job postings. The analysis identified seven expertise roles, six proficiency areas, and 32 competencies (knowledge, skills, and abilities) necessary for big data professions. The seven roles are "developer", "engineer", "architect", "analyst", "manager", "administrator", and "consultant". The six essential proficiency areas are "big data knowledge", "developer skills", "big data analytics", "cloud services", "soft skills", and "technical background". The top five skills emerged as "big data processing", "big data tools", "communication skills", "remote development", and "big data architecture". The findings indicate that the competencies required for big data careers cover a broad spectrum, including technical, analytical, developer, and soft skills. Together, they provide a competency map for big data professions that details the roles and skills required. The findings are expected to assist big data professionals in assessing and enhancing their competencies, businesses in meeting their big data labor force needs, and academic institutions in tailoring their big data training programs to industry requirements.
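To illustrate how LDA can surface topics from job postings, the following is a minimal sketch using collapsed Gibbs sampling in plain Python. It is not the study's implementation: the abstract does not describe the tooling, inference method, or hyperparameters used. The function name `lda_gibbs`, the values of `alpha`, `beta`, and `iters`, and the toy `postings` corpus are all assumptions made up for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, n_top=3, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (top words per topic, topic mixture per document).
    alpha and beta are assumed symmetric Dirichlet priors.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Highest-count words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[t][v])[:n_top]]
        for t in range(num_topics)
    ]
    # Smoothed topic proportions per document.
    doc_mix = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return top_words, doc_mix


# Hypothetical, hand-written postings (already tokenized); not data from the study.
postings = [
    "spark hadoop pipeline engineer spark kafka".split(),
    "hadoop spark cluster engineer pipeline".split(),
    "communication teamwork consultant client communication".split(),
    "client consultant teamwork presentation communication".split(),
]

topics, doc_mix = lda_gibbs(postings, num_topics=2)
for t, words in enumerate(topics):
    print(f"topic {t}: {words}")
```

In a setup like this, each topic's top words would be inspected and labeled by hand (for example as a role or proficiency area), and `doc_mix` shows how strongly each posting loads on each topic. A real corpus would also need preprocessing such as stop-word removal and a principled choice of the number of topics.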