Analysis of Findings in Radiology Reports Using Non-Negative Matrix Factorization


Karayağız E., Berber T.

15th Turkish Congress of Medical Informatics Association, Trabzon, Türkiye, 30-31 May 2024, pp. 235-243

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: Trabzon
  • Country of Publication: Türkiye
  • Pages: pp. 235-243
  • Affiliated with Karadeniz Technical University: Yes

Abstract

Advances in medicine have made it possible to collect large quantities of
data, which can be used to uncover connections between individual patient
results and to offer useful insights for analyzing and interpreting future
cases. Topic modelling methods can be used to classify text data based on the
words it contains. One difficulty with text data is its sheer volume, which
complicates computation and the extraction of useful insights. This study
applies text vectorization techniques and Non-Negative Matrix Factorization
(NMF), a matrix decomposition algorithm, to analyze the findings section of
radiology reports. The goal is to group words that frequently appear together
in reports into clusters and to assign each finding to the appropriate cluster.
The primary problem in topic modelling is determining the optimal number of
clusters to extract. Word clusters are compared by assessing the coherence and
relevance of the terms within each cluster. A single dataset is used in this
research. It is examined with various numbers of clusters, and a coherence
score is computed for each. Choosing the number of clusters well yields a
clearer distinction between the words under analysis.