Machine Learning-Based Screening for Potential Singlet Fission Chromophores: The Challenge of Imbalanced Data Sets


Borislavov L., Nedyalkova M., Tadjer A., AYDEMİR Ö., Romanova J.

JOURNAL OF PHYSICAL CHEMISTRY LETTERS, vol.14, no.45, pp.10103-10112, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 14 Issue: 45
  • Publication Date: 2023
  • DOI: 10.1021/acs.jpclett.3c02365
  • Journal Name: JOURNAL OF PHYSICAL CHEMISTRY LETTERS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Chemical Abstracts Core, Compendex, MEDLINE
  • Pages: pp.10103-10112
  • Karadeniz Technical University Affiliated: Yes

Abstract

Excitation with one photon of a singlet fission (SF) material generates two triplet excitons, thus doubling the solar cell efficiency. Therefore, the SF molecules are regarded as new generation organic photovoltaics, but it is hard to identify them. Recently, it was demonstrated that molecules of low-to-intermediate diradical character (DRC) are potential SF chromophores. This prompts a low-cost strategy for finding new SF candidates by computational high-throughput workflows. We propose a machine-learning-aided screening for SF entrants based on their DRC. Our data set comprises 469 784 compounds extracted from the PubChem database, structurally rich but inherently imbalanced regarding DRC values. We developed well-performing classification models that can retrieve potential SF chromophores. The latter (~4%) were analyzed by K-means clustering to reveal qualitative structure-property relationships and to extract strategies for molecular design. The developed screening procedure and data set can be easily adapted for applications of diradicaloids in photonics and spintronics.
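
To illustrate why the imbalance mentioned in the abstract is a challenge, the following minimal sketch compares plain accuracy with recall and balanced accuracy on toy labels. It is not the authors' method: the abstract does not state which metrics or rebalancing techniques were used, and the 4-in-100 class ratio, the label lists, and all function names here are assumptions chosen only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels; 1 marks the rare class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all samples labeled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of rare-class samples that were retrieved."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so both classes count equally."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return (tpr + tnr) / 2


# Hypothetical toy data: 4 rare-class samples out of 100 (assumed ratio).
y_true = [1] * 4 + [0] * 96

# A trivial "classifier" that never predicts the rare class.
y_all_negative = [0] * 100

print("accuracy:", accuracy(y_true, y_all_negative))
print("recall:", recall(y_true, y_all_negative))
print("balanced accuracy:", balanced_accuracy(y_true, y_all_negative))
```

In this toy setting the trivial classifier scores high on accuracy while retrieving none of the rare-class samples, which is why class-aware metrics are usually preferred when the class of interest is a small minority.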