Abstract
In this study, we trained and compared explainable machine learning algorithms for predicting the band gaps of ABX₃ perovskite materials, including compounds with both zero and non-zero band gaps. Six supervised learning models, five ensemble learning methods and one neural network (CompoundNet), were employed to study the non-linear relationship between the band gap of a compound and the characteristics of its constituent elements, such as electronegativity, covalent radius, first ionization energy, and row in the periodic table. The machine learning (ML) models were trained on datasets obtained from density functional theory (DFT) calculations. The results show that the CatBoost and XGBoost models yielded the lowest prediction errors and the highest coefficients of determination (R² ≥ 88%) of all the approaches in the testing phase. Furthermore, Shapley Additive Explanations (SHAP) were used to explain the models from a physics standpoint, based on the elemental composition of each perovskite compound, and a novel holistic feature ranking of the explained models was proposed. One key insight gained from the SHAP analysis is that the Pauling electronegativity of the B-site cation in the cubic perovskites, which characteristically plays an important role in the electronic properties of this class of materials, is the feature that contributes most to the prediction of the band gaps. These results reveal the potential of ML to predict materials properties quickly and accurately, using datasets that are useful in the engineering of efficient solar cell devices.
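To make the attribution idea concrete, the following is a minimal sketch of how Shapley values assign a prediction to individual features and how a mean absolute attribution can rank them. It is an illustration only: the toy model `toy_band_gap`, the three feature names, and all numeric values are hypothetical and are not the models, descriptors, or DFT data used in the study. The sketch enumerates feature coalitions exactly rather than calling the SHAP library, and ranking by mean absolute attribution is a common convention, not necessarily the holistic ranking proposed in the article.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["B_electronegativity", "A_covalent_radius", "X_ionization_energy"]


def toy_band_gap(x):
    """Hypothetical non-linear stand-in model (NOT a model from the study)."""
    en_b, r_a, ie_x = x
    # Clipping at zero mimics compounds with a zero band gap.
    return max(0.0, 2.0 * en_b - 0.5 * r_a + 0.1 * en_b * ie_x - 1.0)


def coalition_value(model, x, background, subset):
    """Mean prediction when features in `subset` come from x, the rest from background rows."""
    total = 0.0
    for row in background:
        mixed = [x[j] if j in subset else row[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                gain = (coalition_value(model, x, background, members | {i})
                        - coalition_value(model, x, background, members))
                phi[i] += weight * gain
    return phi


def rank_features(model, samples, background):
    """Rank features by mean absolute Shapley value over the given samples."""
    mean_abs = [0.0] * len(FEATURES)
    for x in samples:
        for i, value in enumerate(shapley_values(model, x, background)):
            mean_abs[i] += abs(value) / len(samples)
    return sorted(zip(FEATURES, mean_abs), key=lambda pair: pair[1], reverse=True)


# Made-up descriptor values (electronegativity, radius, ionization energy).
background = [[1.5, 1.2, 8.0], [2.0, 1.6, 11.0], [1.0, 2.0, 13.0]]
x = [1.8, 1.4, 10.0]

phi = shapley_values(toy_band_gap, x, background)
baseline = sum(toy_band_gap(row) for row in background) / len(background)

# The attributions sum to the prediction minus the baseline (efficiency property).
print(sum(phi), toy_band_gap(x) - baseline)
print(rank_features(toy_band_gap, background, background))
```

Exact enumeration scales exponentially with the number of features, so it is practical only for small toy cases like this one; tree-based models such as those named in the abstract are usually explained with faster, model-specific approximations.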
Original language | English |
---|---|
Article number | 107427 |
Journal | Materials Science in Semiconductor Processing |
Volume | 161 |
Publication status | Published - Jul 2023 |
Keywords
- Band gaps
- Ensemble learning
- Explainable artificial intelligence
- Neural networks