Keywords = Machine Learning

Predicting Corporate Loan Defaults Using Deep Learning Algorithms and a Comparative Analysis with Linear Models: A Case Study of a Major Commercial Bank

Volume 10, Issue 1, 2026, Pages 1-42

https://doi.org/10.30699/ijf.2025.444059.1460

Mohammad Ahmadi Azar, Reza Tehrani, Seyed Mojtabi Mirlohi

Abstract In today's complex economic landscape, accurately predicting events such as customer loan defaults presents a significant challenge for financial institutions. Traditional methods have shown limitations in accuracy, prompting the adoption of data-driven machine learning techniques for enhanced predictive capabilities. This study investigates the efficacy of novel machine-learning algorithms compared with linear models for predicting loan defaults at a major commercial bank. Data from over six thousand customer loan files spanning 2019 to 2022 were collected, cleaned, and clustered based on key loan indicators. The accuracy of predicting loan defaults was first evaluated using popular machine learning classification models, including LightGBM, XGBoost, Multilayer Perceptron, and Logistic Regression, and XGBoost performed best. After that, prediction accuracy was evaluated using various time-series machine learning algorithms, with a particular focus on a combined Gradient Boosting and Long Short-Term Memory (LSTM) approach. Results indicate that the combined algorithm outperforms traditional linear models, showing a substantial 40% improvement over the ARIMA algorithm in predicting loan default behavior. This study underscores the potential of advanced machine learning techniques to enhance predictive accuracy in the banking sector, offering valuable insights for risk assessment and financial decision-making.

Analysis of the characteristics affecting the trading risk of listed companies' stocks: A hybrid spatial artificial intelligence approach

Volume 10, Issue 1, 2026, Pages 173-237

https://doi.org/10.30699/ijf.2026.569343.1569

Javad Zolfaghary Tabesh, Babak Jamshidinavid, Mehrdad Ghanbary, Afshin Baghfalaki

Abstract This research identifies the determinants of trading risk (conditional variance) in the Iranian stock market over 15 years (2008-2023) using a novel hybrid approach combining spatial econometrics and machine learning algorithms. The main objective is to evaluate the superiority of hybrid models over traditional methods and to identify the roles of macroeconomic, geopolitical, behavioral factors, and firm characteristics in shaping systematic risk. The sample includes 172 companies listed on the Tehran Stock Exchange with 30,960 monthly observations and 33 explanatory variables. The methodology was implemented in three stages: First, GARCH and EGARCH models were employed to extract conditional variance and confirm the leverage effect. Second, the Spatial Durbin Error Model (SDEM) was used to decompose direct, spatial spillover, and total effects of variables while controlling for cross-sectional dependence (Pesaran CD statistic = 87.45***) and spatial autocorrelation (Moran's I = 0.4567***). Third, machine learning algorithms, including Linear Regression, SVM, Random Forest, XGBoost, LSTM, and Transformer, were applied independently and in combination with SDEM outputs. The results demonstrated a clear performance hierarchy: Linear Regression (R² = 0.4123, RMSE = 0.0987), SVM (R² = 0.5987), Random Forest (R² = 0.6789), XGBoost as the best standalone model (R² = 0.7456, RMSE = 0.0534), and Ensemble (R² = 0.7523). Hybrid models showed significant superiority: SDEM + XGBoost (R² = 0.7823, RMSE = 0.0471; 11.80% error reduction compared to standalone XGBoost and 52.3% improvement over Linear Regression), and SDEM + Ensemble (R² = 0.7867, RMSE = 0.0467) achieved optimal performance. Time-series cross-validation (average test RMSE = 0.0492) and the Diebold-Mariano test (DM = 3.456*** against XGBoost) confirmed statistical superiority. 
From a substantive perspective, the exchange rate with a total effect of 0.2443*** and SHAP contribution of 18.34% was identified as the most important systematic risk factor, followed by sanction intensity (total effect = 0.1274***, SHAP = 14.23%), Altman Z-score (SHAP = 15.67%), total stock index (total effect = -0.1801***, SHAP = 12.89%), and investor sentiment (total effect = 0.1001***, SHAP = 11.45%). The findings demonstrate that hybrid spatial econometrics and machine learning models improve prediction accuracy by 12-15% through extracting complementary information. Geopolitical and behavioral factors, in addition to traditional macroeconomic variables, are systematically important. Spatial spillovers constitute 15-25% of total effects, which are ignored in traditional models. This research shows that the frontier of financial risk modeling lies in the synergistic integration of economic theory and machine learning.
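The percentage improvements quoted for the SDEM + XGBoost hybrid can be checked against the RMSE values reported in the abstract. The sketch below assumes that "error reduction" means the relative decrease in RMSE; the function and variable names are illustrative and not taken from the paper.

```python
# Relative RMSE reduction, using only the RMSE values quoted in the abstract.
# Assumption: "error reduction" is (baseline - improved) / baseline.

def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage decrease of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline

# RMSE values as reported in the abstract.
RMSE = {
    "linear_regression": 0.0987,
    "xgboost": 0.0534,
    "sdem_xgboost": 0.0471,
}

vs_xgboost = relative_reduction(RMSE["xgboost"], RMSE["sdem_xgboost"])
vs_linear = relative_reduction(RMSE["linear_regression"], RMSE["sdem_xgboost"])

print(f"vs. standalone XGBoost: {vs_xgboost:.2f}%")   # 11.80%
print(f"vs. Linear Regression:  {vs_linear:.1f}%")    # 52.3%
```

Both results agree with the 11.80% and 52.3% figures stated in the abstract.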

Smart-Beta Portfolio Optimization Using Machine Learning Techniques

Volume 9, Issue 4, 2025, Pages 91-116

https://doi.org/10.30699/ijf.2025.524928.1518

Fatemeh Salehirad, Farid Tondnevis

Abstract This study examines the integration of machine learning techniques with smart beta investment strategies to enhance portfolio performance. Traditional market indices often fail to meet investors' expectations, especially during volatile market periods, leading to a growing interest in alternative strategies such as smart beta methodologies. These strategies combine the cost and risk efficiency of passive investing with the performance advantages of active strategies by employing alternative weighting schemes based on financial factors such as value, quality, and momentum. In this research, Return on Invested Capital (ROIC) is selected as a value-based factor due to its strong reflection of a company's operational efficiency and value creation driver. We employ three machine learning models—Support Vector Regression (SVR), Random Forest, and XGBoost—to forecast ROIC based on various financial ratios. Each model is fine-tuned using Bayesian optimization techniques to achieve the highest forecasting accuracy. The dataset includes financial data from 85 manufacturing companies listed on the Tehran Stock Exchange. Model performance is evaluated using R², Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE), with the optimized Random Forest model achieving the best results based on higher R² and lower error values compared to the other models. The forecasted ROIC values are then used to construct a smart beta portfolio, which is compared to a traditional market-cap-weighted portfolio. The findings demonstrate that a machine learning-enhanced, ROIC-based smart beta strategy can significantly outperform traditional approaches, offering investors a more robust and data-driven method for portfolio construction and risk-adjusted return enhancement.
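The four evaluation metrics named in the abstract (R², MSE, RMSE, and MAE) follow standard definitions. The sketch below shows one way to compute them in plain Python; the function name and the sample values are hypothetical and are not data from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, RMSE and MAE for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                    # residual sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
    }

# Hypothetical ROIC values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

A higher R² together with lower MSE, RMSE, and MAE is the criterion by which the abstract ranks the optimized Random Forest first.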

Risk prediction of investment funds in member countries of the Federation of European and Asian Stock Exchanges - Machine Learning Approaches

Volume 9, Issue 4, 2025, Pages 140-188

https://doi.org/10.30699/ijf.2025.522399.1527

Nashmil Esmaily, Parviz Piri, Ali Ashtab, Mehdi Heydari, Akbar Zavari Rezaei

Abstract The main objective of this study is to compare the predictive accuracy of machine learning models, particularly Random Forest and Artificial Neural Networks, with classical statistical methods (such as Logistic Regression and Linear Discriminant Analysis) in forecasting the risk of Exchange-Traded Funds (ETFs) in member countries of the Federation of European and Asian Stock Exchanges. Furthermore, the study aims to identify the key performance and fundamental variables impacting the risk of these funds. This research adopts a quantitative approach based on secondary data analysis. Data were collected for the years 2015-2023 from the databases of the Federation of European and Asian Stock Exchanges and the Tehran Stock Exchange. After preprocessing, risk prediction models, including Random Forest, Artificial Neural Networks, Logistic Regression, and Linear Discriminant Analysis, were developed and validated for each country using unified evaluation metrics (such as accuracy and AUC). The statistical significance of differences in model performance was tested using non-parametric Mann-Whitney U tests, given the non-normal distribution of accuracy across countries. Sensitivity analysis was then conducted on the two superior machine learning models to determine the impact of independent variables (both performance indicators, such as Jensen's alpha and market return, and fundamental attributes, such as fund size and manager expertise) across different markets. Empirical results indicate that, across most countries and after harmonizing time and geographical dimensions, machine learning models, specifically Random Forest and Artificial Neural Networks, outperform classical statistical approaches in predicting ETF risk, with statistically significantly higher accuracy and AUC values (p<0.05 in Mann-Whitney U tests). The robustness of these findings is confirmed after controlling for heterogeneity among countries. 
Sensitivity analyses further reveal that both performance variables (e.g., Jensen's alpha, market return) and fundamental factors (e.g., fund size, manager expertise) have a significant impact on risk outcomes within these models. At the same time, machine learning methods exhibit a stronger ability to identify and quantify the importance of these variables compared to classical methods. The results highlight the practical advantage of adopting machine learning techniques for risk assessment and management in diverse international financial markets. Overall, the findings of this study reveal that employing machine learning models—especially Random Forest and Artificial Neural Networks—significantly improves the accuracy of ETF risk prediction and enables a more comprehensive identification of key risk factors compared to classical statistical approaches. These models demonstrate superior flexibility and the ability to capture complex, multidimensional data patterns, making them highly advantageous tools for financial risk management. The results suggest that integrating advanced machine learning techniques at both regional and international levels can enhance the responsiveness of investment systems to market changes, providing fund managers and investors with a more solid, data-driven basis for decision-making.
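The Mann-Whitney U test used to compare model accuracies across countries is built on a simple pair-counting statistic. The sketch below computes only the U statistic, not the p-value, and the accuracy values are hypothetical placeholders rather than results from the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: number of (a, b) pairs with a > b.

    Tied pairs contribute 0.5 each. The statistic for sample_b is
    len(sample_a) * len(sample_b) minus this value.
    """
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical per-country accuracies (illustrative only).
ml_accuracy = [0.91, 0.88, 0.93, 0.90]
classical_accuracy = [0.84, 0.86, 0.89, 0.82]

u_ml = mann_whitney_u(ml_accuracy, classical_accuracy)
u_classical = mann_whitney_u(classical_accuracy, ml_accuracy)
print(u_ml, u_classical)
```

A U value close to the number of pairs (here 4 × 4 = 16) indicates that one group's values are almost always larger. In practice a statistics library would be used to turn the statistic into a p-value.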

Comparative Analysis of Machine Learning Algorithms in Predicting Jumps in Stock Closing Price: Case Study of Iran Khodro Using NearMiss and SMOTE Approaches

Volume 9, Issue 3, 2025, Pages 27-54

https://doi.org/10.30699/ijf.2025.491324.1496

Ahmad Jafarnejad, Arman Rezasoltani, Amir Mohammad Khani

Abstract Predicting stock price fluctuations has always been one of the most important financial challenges due to the complexities of financial data and nonlinear market behavior. This research aimed to analyze and compare the performance of machine learning algorithms in predicting jumps in the closing price of Iran Khodro Company shares. Two resampling methods, NearMiss and SMOTE, were used to address the class imbalance in the data. The results showed that the NearMiss method outperformed SMOTE by balancing precision and recall in machine learning models. The CatBoost model was recognized as the best machine learning model in this study due to its stable performance under both the NearMiss and SMOTE methods. The CatBoost model showed a good balance between evaluation indicators under the NearMiss method, with an accuracy of 91.46% and an F1 score of 91.29%. This model also had high precision (93.18%) and acceptable recall (89.52%), indicating that it can correctly detect jumps while avoiding false predictions. On the other hand, under the SMOTE method, the Random Forest model was superior, with an accuracy of 85.08%. These results show that combining methods for handling imbalanced data with advanced machine learning algorithms can significantly improve the accuracy of price volatility prediction. The results of this research can help investors and financial analysts make better decisions in risk management and in optimizing investment strategies.
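The F1 score is the harmonic mean of precision and recall, which is why it is used to judge the balance between the two. The sketch below applies that standard formula to the CatBoost precision and recall quoted in the abstract; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Precision and recall reported for CatBoost under NearMiss.
catboost_f1 = f1_score(0.9318, 0.8952)
print(f"{catboost_f1:.4f}")
```

The result is about 0.913, close to the reported F1 of 91.29%. A small gap of this kind may come from rounding or from the averaging scheme used in the study, neither of which the abstract specifies.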

Predicting the trend of the total index of the Tehran Stock Exchange using an image processing technique

Volume 9, Issue 1, Winter 2025, Pages 1-31

https://doi.org/10.61186/ijf.2024.426626.1442

Roxane Pooresmaeil Niaki, Moslem Peymany foroushani, Seyed Morteza Amini

Abstract This study explores the considerable significance of candlestick chart patterns as a foundational asset within the realm of stock market analysis and prediction. As a graphical representation of historical price movements and patterns, candlestick charts offer a distinct and valuable perspective for understanding how the financial market operates. This perspective assists us in accurately pinpointing the most advantageous times for making decisions to buy or sell financial securities, such as stocks or bonds. These charts provide insights into market trends and potential trading opportunities. We adopt an innovative approach by harnessing image processing techniques to systematically extract and analyze patterns from candlestick charts. Our findings underscore the pivotal role of visual data in financial analysis, particularly in times of market volatility and uncertainty. When confronted with erratic market trends, investors often resort to technical analysis strategies, relying on insights derived from chart-based analysis to guide their decision-making processes. By meticulously extracting essential insights from candlestick charts, our study aims to provide investors with more efficient and less error-prone tools. Ultimately, this endeavor contributes to the enhancement of decision-making precision and the mitigation of risks inherent in participating in the dynamic stock market landscape.

Comparative Analysis of Missing Values Imputation Methods: A Case Study in Financial Series (S&P500 and Bitcoin Value Data Sets)

Volume 8, Issue 1, 2024, Pages 47-70

https://doi.org/10.61186/ijf.2024.414027.1427

Mahdi Goldani

Abstract The accurate imputation of missing values in time series data is paramount for maintaining the integrity and reliability of analyses and predictions. This article investigates the efficacy of various missing values imputation methods, encompassing well-known machine learning and statistical techniques. For a better understanding, the methods were applied to two daily financial time series, the S&P 500 and Bitcoin markets, spanning 2016 to 2023. Starting from complete datasets, controlled missingness was introduced by randomly removing 45 data points. Then, multiple imputation strategies were applied to estimate and substitute these missing values. Experimental evaluation yielded insightful findings regarding the performance of the different methods. The examined machine learning methods, including k-Nearest Neighbors (k-NN), Random Forest, Deep Learning, and Decision Trees, consistently outperformed their statistical counterparts, such as Mean Imputation, Regression Imputation, Hot-Deck Imputation, and Expectation-Maximization Imputation. Notably, Random Forest emerged as the most effective method, showcasing superior performance in terms of accuracy and robustness. Conversely, the Mean Imputation method exhibited comparatively inferior outcomes, suggesting its limited suitability for financial time series data. This research contributes to the ongoing discourse on data integrity within finance analytics and serves as a comprehensive guide for practitioners seeking optimal missing values imputation methods. The empirical evidence provided herein advances the understanding of imputation techniques' relative performance and their application in financial data, facilitating enhanced decision-making processes and yielding more reliable predictions.
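The evaluation protocol described in the abstract (start from a complete series, remove 45 points at random, impute, then score against the known values) can be sketched as follows. This is a minimal illustration using only Mean Imputation, the baseline the abstract reports as weakest. The synthetic random-walk series, the seed, and all function names are assumptions, not the study's data or code.

```python
import math
import random

def mask_random(series, n_missing, seed=0):
    """Replace n_missing randomly chosen points with None; return series and indices."""
    rng = random.Random(seed)
    idx = set(rng.sample(range(len(series)), n_missing))
    masked = [None if i in idx else v for i, v in enumerate(series)]
    return masked, sorted(idx)

def mean_impute(masked):
    """Fill every missing point with the mean of the observed points."""
    observed = [v for v in masked if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in masked]

def rmse_on(indices, truth, imputed):
    """RMSE measured only at the positions that were removed."""
    sq = sum((truth[i] - imputed[i]) ** 2 for i in indices)
    return math.sqrt(sq / len(indices))

# Synthetic random walk standing in for a daily price series (assumption).
rng = random.Random(42)
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

masked, missing_idx = mask_random(prices, n_missing=45)
imputed = mean_impute(masked)
print(f"Mean-imputation RMSE on removed points: {rmse_on(missing_idx, prices, imputed):.3f}")
```

Any other imputation method can be swapped in for `mean_impute` and scored with the same `rmse_on` call, which is what makes the comparison between methods controlled.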

Forecasting Financial Time Series Using Deep Learning Networks: Evidence from Long-Short Term Memory and Gated Recurrent Unit

Volume 6, Issue 4, 2022, Pages 81-94

https://doi.org/10.30699/ijf.2022.313164.1286

Mohammadreza Ghadimpour, Seyed babak Ebrahimi

Abstract The ability to predict the stock market and analyze market trends is invaluable to researchers and anyone interested in investing. However, this task is challenging due to the large number of parameters and the unpredictable noise that may affect the stock price. To overcome this issue, researchers have employed numerous approaches such as Moving Average (MA), Support Vector Machine (SVM), and Neural Networks. With technological advances, deep learning methods have become popular in processing time-series data. In this paper, we compare two recently introduced deep learning models, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), in forecasting daily movements of the Standard & Poor's 500 (S&P 500) index using the daily closing price of this index from 14/5/1991 to 14/5/2021. Results show that both models are effective and accurate in stock market prediction. In this case study, the mean squared error (MSE) and mean absolute error (MAE) for the GRU model are slightly lower than those for the LSTM model; hence, GRU outperformed the LSTM model despite its simpler structure. The results of this study are applicable in various instances where it is challenging to identify patterns among large volumes of unstructured data, such as medical data analysis, text mining, and financial time series modeling.

Hierarchical Risk Parity as an Alternative to Conventional Methods of Portfolio Optimization: (A Study of Tehran Stock Exchange)

Volume 5, Issue 4, Autumn 2021, Pages 1-24

https://doi.org/10.30699/ijf.2021.289848.1242

Marziyeh Nourahmadi, Hojjatollah Sadeqi

Abstract One of the most critical issues investors face is choosing an optimal investment portfolio that balances risk and return so as to maximize investment returns and minimize investment risk. So far, many methods have been introduced to form a portfolio, the most famous of which is the Markowitz approach. The Markowitz mean-variance approach is widely known in the world of finance and marks the foundation of portfolio theory. However, the mean-variance theory has many practical drawbacks due to the difficulty in estimating the expected return and covariance for different asset classes. In this study, we use the Hierarchical Risk Parity (HRP) machine learning technique and compare the results with three methods: Minimum Variance (MVP), Uniform Distribution (UNIF), and Risk Parity (RP). To conduct this research, the adjusted prices of 50 companies listed on the Tehran Stock Exchange from 2018-07-01 to 2020-09-29 were used. 70% of the data are treated as in-sample and the remaining 30% as out-of-sample. We evaluate the results using four criteria: the Sharpe ratio, Maximum Drawdown, the Calmar ratio, and the Sortino ratio. The results show that the MVP and UNIF approaches in-sample, and the UNIF and HRP approaches out-of-sample, perform best on the Sharpe measure.
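The four evaluation criteria can be computed from a series of periodic portfolio returns. The sketch below uses common textbook definitions under simplifying assumptions: no annualization, a zero risk-free rate and zero target return by default, and a Calmar ratio based on the cumulative return over the period. The abstract does not state which variants the study used, and the sample returns are hypothetical.

```python
import math
from statistics import mean, stdev

def sharpe(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst

def sortino(returns, target=0.0):
    """Mean return above target divided by downside deviation."""
    downside = [min(0.0, r - target) ** 2 for r in returns]
    return (mean(returns) - target) / math.sqrt(mean(downside))

def calmar(returns):
    """Cumulative return over the period divided by maximum drawdown."""
    total_return = math.prod(1.0 + r for r in returns) - 1.0
    return total_return / max_drawdown(returns)

# Hypothetical periodic returns, for illustration only.
sample = [0.02, -0.01, 0.03, -0.04, 0.05, 0.01]
print(sharpe(sample), max_drawdown(sample), sortino(sample), calmar(sample))
```

The Sortino and Calmar functions divide by zero if the series has no downside returns or no drawdown, so a production version would need to handle those cases explicitly.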