Predicting Corporate Loan Defaults Using Deep Learning Algorithms and a Comparative Analysis with Linear Models: A Case Study of a Major Commercial Bank
Pages 1-42
https://doi.org/10.30699/ijf.2025.444059.1460
Mohammad Ahmadi Azar, Reza Tehrani, Seyed Mojtabi Mirlohi
Abstract
In today's complex economic landscape, accurately predicting events such as customer loan defaults presents a significant challenge for financial institutions. Traditional methods have shown limitations in accuracy, prompting the adoption of data-driven machine learning techniques for enhanced predictive capabilities. This study investigates the efficacy of novel machine learning algorithms compared with linear models for predicting loan defaults at a major commercial bank. Data from over six thousand customer loan files spanning 2019 to 2022 were collected, cleaned, and clustered based on key loan indicators. The accuracy of predicting loan defaults was first evaluated using popular machine learning classification models, namely LightGBM, XGBoost, Multilayer Perceptron, and Logistic Regression; of these, XGBoost performed best. Prediction accuracy was then evaluated using various time-series machine learning algorithms, with a particular focus on a combined Gradient Boosting and Long Short-Term Memory (LSTM) approach. Results indicate that the combined algorithm outperforms traditional linear models, showing a substantial 40% improvement over the ARIMA algorithm in predicting loan default behavior. This study underscores the potential of advanced machine learning techniques to enhance predictive accuracy in the banking sector, offering valuable insights for risk assessment and financial decision-making.
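The abstract does not state which metric underlies the 40% figure. As a minimal sketch, assuming it refers to a relative reduction in a forecast error measure (for example RMSE or MAE), such a percentage could be computed as below. The function name and the error values are hypothetical, chosen only for illustration, and are not results from the study.

```python
def relative_improvement(baseline_error: float, model_error: float) -> float:
    """Return the fractional reduction in error relative to a baseline.

    Assumes a lower-is-better error metric (e.g. RMSE or MAE).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error


# Hypothetical error values for illustration only (NOT taken from the study).
arima_error = 0.50      # assumed error of the ARIMA baseline
combined_error = 0.30   # assumed error of the combined model

improvement = relative_improvement(arima_error, combined_error)
print(f"Relative improvement over baseline: {improvement:.0%}")
```

Under this reading, a 40% improvement means the combined model's error is 60% of the baseline's error. If the figure instead refers to a higher-is-better score such as accuracy, the calculation would differ.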


