Document Type : Original Article
Authors
^{1} Ph.D. Candidate in Financial Engineering, Faculty of Economic, Management and Accounting, Yazd University, Yazd, Iran.
^{2} Assistant Prof., Department of Accounting and Finance, Faculty of Humanities and Social Sciences, Yazd University, Yazd, Iran.
Abstract
Keywords
Introduction
Portfolio optimization is a central problem in finance, and the choice of an optimal stock portfolio has long occupied investment professionals. A basic assumption in finance is that, because resources are scarce, every economic choice involves a trade-off. When making an investment decision, a rational investor must choose between the return he wants to earn and the risk he is willing to accept for that return. As a result, an essential step in the investment process is deciding how to allocate financial resources optimally (Bechis et al., 2020).
In the 1950s, Markowitz, the father of modern portfolio theory (MPT), proposed that investors act rationally and as efficiently as possible when allocating resources. Diversification cannot wholly eliminate a portfolio's variance, but there is a rule under which the investor diversifies his funds among all the securities that offer the maximum expected return; this rule assumes that a single portfolio provides both the maximum expected return and the minimum variance.
Markowitz's theory, now known as modern portfolio theory, laid the foundation for the investment literature and for portfolio optimization methods. It formulated an optimal approach to allocating resources among risky securities when investors care only about the mean and variance of stock returns. MPT provides a formal way to find the set of optimal portfolios, called the efficient frontier, which gives the maximum expected return for a given level of risk or the lowest risk for a given level of expected return (Bechis et al., 2020).
The estimation errors and instability of Markowitz's mean-variance method prompted several other academics to look for portfolio solutions that lead to optimal asset allocation. The mean-variance approach has many practical drawbacks because expected returns and covariances are difficult to estimate for different asset classes. The behavior of diversified portfolios during the 2008 credit crunch created the need for the asset management industry to develop new theoretical frameworks with strong empirical results. The new models are risk-based: they estimate risk factors rather than expected returns, which are hard to predict. The resulting portfolio weights do not depend on expected returns, only on the specific risk factors affecting each security in the portfolio.
Despite the prevailing view that diversification failed during the recent credit crunch, risk-parity strategies performed better than traditional portfolios.
The key to risk parity is to diversify across asset classes that behave differently in different economic environments. In general, stocks perform well in high-growth, low-inflation environments; bonds perform well in deflation or recession; and commodities usually perform best in inflationary conditions. Therefore, a balanced basket can deliver much more robust returns. Risk parity portfolios typically invest more in low-volatility securities than traditional asset allocation strategies do. Some of the essential risk-based models are:
Equal Risk Contribution Portfolio (ERC), Risk Parity Portfolio (RP), Global Minimum Variance (GMV), Maximum Diversification Portfolio (MDP), Maximum Sharpe Ratio Portfolio (MSP), Inverse Volatility Strategy (IV), Market-Capitalization-Weighted Portfolio (MCWP).
In this research, we implement the Hierarchical Risk Parity (HRP) approach, which is based on clustering methods, and compare its results with three other methods: Minimum Variance, uniform distribution, and Risk Parity (RP). The data set consists of the top 50 companies of the Tehran Stock Exchange. The next section reviews the theoretical literature and related research, the third section explains HRP theory, and finally we build portfolios with this approach for the top 50 companies of the Tehran Stock Exchange over in-sample and out-of-sample periods.
Literature Review
Portfolio selection is the process by which investors choose how to allocate assets. Markowitz's portfolio theory not only reveals the determinants of portfolio risk but, more importantly, reaches the critical conclusion that "the expected return on an asset is determined by asset risk." Therefore, the price of an asset is determined by its variance or standard deviation.
Investment managers can influence portfolio performance through the three activities that make up the portfolio management process: investment policy, portfolio selection, and market timing. Studies of large US pension plans show that investment policy explains 93.6% of the variation in total plan return, and therefore investment policy, often referred to as strategic allocation, is the most crucial part of portfolio management (Brinson et al., 1986). Investment policy, or strategic allocation, determines which asset classes are selected and with what weights to achieve the investment goal (Brinson et al., 1986). Because each asset class has its own risk and return, the investment manager must decide on risk tolerance, investment horizon, and the level of investment risk when choosing asset classes and their weights (Cochrane, 1999).
Portfolio management can be divided into active and passive management (Al-Aradi & Jaimungal, 2018; Sharpe, 1991).
Portfolio diversification has long interested researchers. To map the subject, the keyword "Diversified Portfolios" was searched in Scopus on April 9, 2021, and the results were analyzed with the bibliometrix package in R. The bibliometric analysis follows the general science mapping workflow introduced by Börner et al. (2003).
Table 1. Descriptive statistics of research 


According to the results obtained from the Scopus portal, 447 studies have been conducted between 1957 and 2021, of which 603 have been reported in the form of articles.
According to Fig 1, research output peaked between 2015 and 2020.
Fig 2 shows the importance of portfolio diversification in research. The network relates authors, keywords, and research topics, and examining it reveals the research fields of interest to researchers. Portfolio selection, portfolio diversification, and risk management are among the essential keywords in this research.


Fig 1. Annual Scientific production 

Fig 2. Financial network, the relationship between authors, keywords, and research topics 
The hierarchical structure of complex financial systems was examined by Nobel laureate Herbert Simon. In the well-known article "The Architecture of Complexity" (Simon, 1991), the author describes a complex system as one composed of many parts that interact with each other in a nonsimple way. In such systems, the whole is greater than the sum of the parts.
He argues that complex financial systems have a hierarchical organization through which the whole system is broken down into distinct subgroups that can be analyzed more easily. A hierarchical system is a system composed of interrelated subsystems, each of which is in turn hierarchical in structure, until the lowest level of elementary subsystem is reached. Thus, a hierarchical structure helps solve complex problems by dividing them into smaller, simpler subproblems whose solutions are then combined.
To estimate a covariance matrix of size N, we need at least N(N+1)/2 independent and identically distributed (i.i.d.) return observations. However, there is ample evidence that asset returns exhibit volatility clustering and heteroskedasticity and that their correlation structure is unstable over long periods, which leads to severe estimation errors that can wipe out the benefits of diversification.
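The data requirement can be made concrete with a small calculation. The sketch below assumes the rule of thumb from López de Prado (2016) that an N x N covariance matrix has N(N+1)/2 distinct entries and therefore needs at least that many i.i.d. observations; the function name is illustrative, and the asset and day counts are the ones reported for this study's sample.

```python
def min_observations(n_assets: int) -> int:
    """Distinct entries of an N x N covariance matrix: N(N+1)/2."""
    return n_assets * (n_assets + 1) // 2

n_assets = 45          # stocks retained in this study
in_sample_days = 381   # in-sample observations reported in this study

required = min_observations(n_assets)
print(f"required observations: {required}")
print(f"enough in-sample data: {in_sample_days >= required}")
```

Under this rule the in-sample window is shorter than the number of free covariance parameters, which is the situation that motivates methods that avoid inverting the covariance matrix.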
To overcome this problem, Marcos López de Prado was the first researcher to propose a hierarchical model for portfolio construction, in his well-known 2016 paper "Building Diversified Portfolios that Outperform Out of Sample". The author uses network theory and machine learning to build a diversified portfolio with the Hierarchical Risk Parity (HRP) approach, which differs significantly from other risk-based portfolio optimization models. The relationships among the securities in the portfolio are organized as a hierarchy in which clusters of similar assets are formed using the correlation coefficient. Replacing the traditional covariance structure with a hierarchical structure achieves three main goals. First, it makes full use of the information in the covariance matrix. Second, it restores stability to the weights. Third, unlike most traditional risk-based asset allocation methods, it does not require inverting the covariance matrix (Bechis et al., 2020).
Several researchers, including Barziy & Chlebus (2020), Snow (2020), Molyboga (2020), and Jaeger et al. (2021), have used the HRP approach. Lohre et al. (2020) examine diversification strategies based on hierarchical clustering. Raffinot (2018) shows that Hierarchical Equal Risk Contribution (HERC) portfolios based on downside risk measures achieve statistically better risk-adjusted performance, especially those based on CDaR. Jain & Jain (2019) examine the effect of misspecifying the covariance matrix on the performance of different allocation methods and then test whether HRP, which is based on machine learning, outperforms portfolios built with traditional risk-based methods.
Research Methodology and Research Findings
In terms of type, this research is a modeling study; in terms of method, it is descriptive; and in terms of purpose, it is applied. Its subject area is the application of machine learning to optimal portfolio selection. The statistical population is the top 50 companies in the stock market index. Five of these companies were recent initial public offerings without enough data and were removed; the remaining 45 were examined over the period 2018-07-01 to 2020-09-29 (554 trading days). 70% of the data (381 days) were used as the in-sample period and 30% (165 days) as the out-of-sample period.
The descriptive statistics of the data are given below:
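A chronological 70/30 split of this kind can be sketched as follows. This is a minimal illustration on synthetic returns, not the authors' code: the function name, the rounding convention, and the placeholder array sizes are assumptions, so the resulting day counts need not match those reported above.

```python
import numpy as np

def chronological_split(returns: np.ndarray, in_sample_fraction: float = 0.7):
    """Split a (days x assets) return matrix by time, without shuffling."""
    cut = int(round(len(returns) * in_sample_fraction))
    return returns[:cut], returns[cut:]

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.02, size=(200, 5))   # synthetic placeholder data

in_sample, out_of_sample = chronological_split(returns)
print(in_sample.shape, out_of_sample.shape)
```

Keeping the rows in time order matters here: the out-of-sample window must lie strictly after the window used to estimate the weights.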
Table 2. Descriptive statistics of data

In the following, we explain the main model of this work, the HRP method.
HRP is based on graph theory and machine learning techniques and can be divided into three main stages: tree clustering, quasi-diagonalization, and recursive bisection. We describe each stage in more detail below. The first stage breaks the portfolio assets down into clusters using a hierarchical tree clustering algorithm. For two assets i and j, the correlation matrix is converted to the correlation-distance matrix D as follows (Burggraf, 2020):
$$D(i,j)=\sqrt{\tfrac{1}{2}\left(1-\rho_{i,j}\right)}$$
(1)
where $\rho_{i,j}$ is the correlation coefficient between the returns of assets i and j.
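The correlation-distance transformation can be sketched in a few lines. The example assumes the standard HRP definition D(i,j) = sqrt(0.5 (1 - rho(i,j))) from López de Prado (2016); the function name and the synthetic return data are illustrative only.

```python
import numpy as np

def correlation_distance(returns: np.ndarray) -> np.ndarray:
    """Return D with D[i, j] = sqrt(0.5 * (1 - rho[i, j]))."""
    rho = np.corrcoef(returns, rowvar=False)
    # Clipping guards against tiny negative values from floating-point error.
    return np.sqrt(np.clip(0.5 * (1.0 - rho), 0.0, None))

rng = np.random.default_rng(1)
returns = rng.normal(size=(250, 4))                           # synthetic returns
returns[:, 1] = returns[:, 0] + 0.1 * rng.normal(size=250)    # near-duplicate asset

D = correlation_distance(returns)
print(np.round(D, 3))
```

Highly correlated assets end up close to each other (distance near 0), uncorrelated assets sit near sqrt(0.5), and perfectly negatively correlated assets would be at distance 1.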






Fig 3. Correlation clustering matrix 

The second step of tree clustering computes the Euclidean distance between every pair of columns of D, which gives the distance matrix $\tilde{D}$:
$$\tilde{D}(i,j)=\sqrt{\sum_{k=1}^{N}\left(D(k,i)-D(k,j)\right)^{2}}$$
(2)
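A minimal sketch of this column-wise Euclidean distance is shown below. The input matrix is a made-up three-asset correlation-distance matrix, and the function name is an assumption introduced for illustration.

```python
import numpy as np

def distance_of_distances(D: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of columns of D."""
    diff = D[:, :, None] - D[:, None, :]    # diff[k, i, j] = D[k, i] - D[k, j]
    return np.sqrt((diff ** 2).sum(axis=0))

# Hypothetical correlation-distance matrix for three assets.
D = np.array([[0.0, 0.1, 0.8],
              [0.1, 0.0, 0.7],
              [0.8, 0.7, 0.0]])

d_tilde = distance_of_distances(D)
print(np.round(d_tilde, 4))
```

Each entry compares two assets through their distances to all other assets, so two assets are close only if they relate to the rest of the universe in a similar way.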
The main difference between the two equations is that equation (1) measures the distance between two securities i and j in the portfolio, whereas equation (2) measures the distance between their columns of D, that is, between each asset's distances to all other assets. Each entry of the Euclidean distance matrix $\tilde{D}$ is therefore a function of the whole correlation matrix rather than of a single correlation pair. The next step forms the first cluster, $U[1]$, from the pair of columns with the smallest distance:
$$U[1]=\underset{(i,j),\,i\neq j}{\arg\min}\ \tilde{D}(i,j)$$
(3)
where U is the set of clusters. Next, the matrix $\tilde{D}$ is updated according to a rule called the "linkage criterion". The distance between the first cluster $U[1]$ and each remaining item i is calculated as follows:
$$\tilde{D}(i,U[1])=\min_{j\in U[1]}\tilde{D}(i,j)$$
(4)
This step is repeated for each stock in the portfolio. Each time a new cluster of assets is formed, the algorithm updates the distance matrix, until only one cluster remains (Bechis et al., 2020).
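The merge-and-update loop can be sketched directly. The example below assumes single (nearest-point) linkage, the criterion used in López de Prado (2016); the article does not name its linkage, and the function name and the four-asset distance matrix are hypothetical.

```python
import numpy as np

def single_linkage(d_tilde: np.ndarray):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = {i: [i] for i in range(len(d_tilde))}
    dist = {(i, j): float(d_tilde[i, j])
            for i in clusters for j in clusters if i < j}
    merges, next_id = [], len(d_tilde)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), d = min(dist.items(), key=lambda kv: kv[1])
        merges.append((clusters[a], clusters[b], d))
        members = clusters.pop(a) + clusters.pop(b)
        # Distance to the merged cluster is the minimum over its two parts.
        new_dist = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dist[(c, next_id)] = min(d_ac, d_bc)
        # Drop every distance that involved the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = members
        next_id += 1
    return merges

# Hypothetical distance matrix: assets 0/1 are close, as are assets 2/3.
d_tilde = np.array([[0.0, 0.2, 0.9, 1.0],
                    [0.2, 0.0, 0.8, 1.1],
                    [0.9, 0.8, 0.0, 0.3],
                    [1.0, 1.1, 0.3, 0.0]])

merges = single_linkage(d_tilde)
for members_a, members_b, d in merges:
    print(members_a, members_b, round(d, 3))
```

The list of merges is the information a dendrogram draws: which groups were joined and at what distance.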

Fig 4. Stock clustering tree 
Next, the covariance matrix is quasi-diagonalized, which reorders the data so that the intrinsic clusters become visible. The rows and columns of the covariance matrix are rearranged so that similar assets are placed together and dissimilar assets are placed far apart. Thus, large covariances lie along the diagonal of the covariance matrix, while smaller covariances lie around it; hence the matrix is called quasi-diagonal.
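One way to obtain this ordering is to read the leaves of the clustering tree from left to right and permute the covariance matrix accordingly. The sketch below uses SciPy's hierarchical clustering routines with single linkage (an assumption, since the linkage is not stated in the article) on a made-up four-asset covariance matrix.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform

def quasi_diagonal_order(d_tilde: np.ndarray) -> np.ndarray:
    """Leaf order of the single-linkage tree built on d_tilde."""
    condensed = squareform(d_tilde, checks=False)
    return leaves_list(linkage(condensed, method="single"))

# Toy covariance: assets 0 and 2 move together, as do assets 1 and 3.
cov = np.array([[1.0, 0.1, 0.9, 0.1],
                [0.1, 1.0, 0.1, 0.9],
                [0.9, 0.1, 1.0, 0.1],
                [0.1, 0.9, 0.1, 1.0]])
vol = np.sqrt(np.diag(cov))
corr = cov / np.outer(vol, vol)
D = np.sqrt(0.5 * (1.0 - corr))
d_tilde = np.sqrt(((D[:, :, None] - D[:, None, :]) ** 2).sum(axis=0))

order = quasi_diagonal_order(d_tilde)
cov_sorted = cov[np.ix_(order, order)]   # reorder rows and columns together
print(order)
print(cov_sorted)
```

After reordering, the two highly correlated pairs sit next to each other, so the large covariances gather in blocks along the diagonal.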

Fig 5. Covariance matrix 
Recursive bisection is the last step of the HRP algorithm and the most important one, since it determines the final weights of the securities in the portfolio. It exploits the property that "inverse-variance allocation is optimal for a diagonal covariance matrix".
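This property can be checked numerically. The sketch below compares the unconstrained, fully invested minimum-variance weights with inverse-variance weights for a diagonal covariance matrix; the three variances are hypothetical.

```python
import numpy as np

variances = np.array([0.04, 0.09, 0.01])   # hypothetical asset variances
cov = np.diag(variances)                   # diagonal: assets are uncorrelated

# Minimum-variance weights: proportional to inv(cov) @ 1, scaled to sum to one.
w_minvar = np.linalg.solve(cov, np.ones(len(variances)))
w_minvar /= w_minvar.sum()

# Inverse-variance weights: w_i proportional to 1 / sigma_i^2.
w_ivp = (1.0 / variances) / (1.0 / variances).sum()

print(np.round(w_minvar, 4))
print(np.round(w_ivp, 4))
```

The two weight vectors coincide, which is why HRP can use inverse-variance weights inside each cluster once the covariance matrix has been made approximately diagonal.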
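Following the tree-clustering process, the algorithm splits each cluster into two subclusters, $V_1$ and $V_2$, starting with the final cluster, which contains all assets. Given the weights $\tilde{w}_j$ assigned within subcluster $V_j$, the variance of each subcluster is calculated as follows:
$$\sigma^{2}_{V_{j}}=\tilde{w}_{j}^{\top}\,\Sigma_{V_{j}}\,\tilde{w}_{j},\qquad j=1,2$$
(5)
where:
$$\tilde{w}_{j}=\frac{\operatorname{diag}\left(\Sigma_{V_{j}}\right)^{-1}}{\operatorname{tr}\left(\operatorname{diag}\left(\Sigma_{V_{j}}\right)^{-1}\right)}$$
(6)
and $\Sigma_{V_j}$ is the covariance matrix of the assets in subcluster $V_j$. Based on these two variances, the algorithm updates the portfolio weights of each subcluster, so only the assets within a cluster are considered when that cluster's allocation is split. The weighting factors $\alpha_1$ and $\alpha_2$ for the two subclusters are as follows:
$$\alpha_{1}=1-\frac{\sigma^{2}_{V_{1}}}{\sigma^{2}_{V_{1}}+\sigma^{2}_{V_{2}}},\qquad \alpha_{2}=1-\alpha_{1}$$
(7)
This top-down assignment of weights is an advantage of HRP over other allocation algorithms: only the assets within one group compete for allocation, instead of all portfolio assets competing with each other. The whole algorithm ensures that $0\le w_i\le 1$ and $\sum_{i=1}^{N}w_i=1$.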
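The recursive bisection step can be sketched as follows, following the procedure in López de Prado (2016): each cluster in the sorted list is cut in half, each half is given an inverse-variance portfolio to measure its variance, and the parent weight is split in inverse proportion to those variances. Function names, the toy covariance matrix, and the asset order are assumptions for illustration.

```python
import numpy as np

def inverse_variance_weights(cov: np.ndarray) -> np.ndarray:
    ivp = 1.0 / np.diag(cov)
    return ivp / ivp.sum()

def cluster_variance(cov: np.ndarray, items: list) -> float:
    """Variance of the inverse-variance portfolio built on one subcluster."""
    sub = cov[np.ix_(items, items)]
    w = inverse_variance_weights(sub)
    return float(w @ sub @ w)

def recursive_bisection(cov: np.ndarray, order) -> np.ndarray:
    weights = np.ones(cov.shape[0])
    clusters = [list(order)]
    while clusters:
        next_clusters = []
        for cluster in clusters:
            if len(cluster) < 2:
                continue                      # single assets cannot be split
            half = len(cluster) // 2
            left, right = cluster[:half], cluster[half:]
            var_left = cluster_variance(cov, left)
            var_right = cluster_variance(cov, right)
            alpha = 1.0 - var_left / (var_left + var_right)
            weights[left] *= alpha            # lower variance gets more weight
            weights[right] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights

# Toy covariance with two correlated pairs, already given in sorted order.
cov = np.array([[1.0, 0.1, 0.9, 0.1],
                [0.1, 1.0, 0.1, 0.9],
                [0.9, 0.1, 1.0, 0.1],
                [0.1, 0.9, 0.1, 1.0]])
order = [0, 2, 1, 3]                          # hypothetical quasi-diagonal order

weights = recursive_bisection(cov, order)
print(np.round(weights, 4), weights.sum())
```

Because every split multiplies the two halves by factors that sum to one, the final weights are non-negative and add up to one without any matrix inversion.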
Figs 6 and 7 show the prices and returns of the data, respectively.

Fig 6. Price chart 

Fig 7. Return chart 
The dendrogram of the companies is as follows.

Fig 8. Dendrogram diagram 
Next, we calculate the optimal portfolio weights using four methods, HRP, Minimum Variance (MinVar), uniform distribution, and Risk Parity (RP), and evaluate the four methods on in-sample and out-of-sample data.
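The three benchmark allocations can be sketched under simple assumptions. The article does not state the constraints or the exact Risk Parity variant it uses, so the example takes the closed-form minimum-variance solution with short sales allowed and the naive inverse-volatility form of risk parity; the covariance matrix and function names are hypothetical.

```python
import numpy as np

def uniform_weights(cov: np.ndarray) -> np.ndarray:
    n = cov.shape[0]
    return np.full(n, 1.0 / n)

def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Closed-form, fully invested solution; short sales are allowed."""
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()

def naive_risk_parity_weights(cov: np.ndarray) -> np.ndarray:
    """Inverse-volatility weights; correlations are ignored."""
    inv_vol = 1.0 / np.sqrt(np.diag(cov))
    return inv_vol / inv_vol.sum()

# Hypothetical covariance matrix of three assets.
cov = np.array([[0.040, 0.006, 0.002],
                [0.006, 0.090, 0.003],
                [0.002, 0.003, 0.010]])

methods = {
    "uniform": uniform_weights,
    "minvar": min_variance_weights,
    "risk parity": naive_risk_parity_weights,
}
results = {name: fn(cov) for name, fn in methods.items()}
for name, w in results.items():
    print(name, np.round(w, 3), "variance:", round(float(w @ cov @ w), 5))
```

A long-only minimum-variance portfolio or an equal-risk-contribution version of risk parity would need a numerical optimizer instead of these closed forms.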


Fig 9. The weight of each stock according to four types of methods 
The following table shows the weight of each stock under the four methods.
Table 3. Weight of each stock according to four types of methods 


In the following, we test the performance of the portfolios generated by the algorithms by examining the in-sample and out-of-sample results.
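An evaluation of this kind estimates the weights on the in-sample window only and then holds them fixed when scoring both windows. The sketch below illustrates the idea on synthetic returns with inverse-variance weights as a stand-in allocation; the data, the split point, and the assumption of daily rebalancing to fixed weights are all illustrative.

```python
import numpy as np

def portfolio_returns(asset_returns: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Daily portfolio returns, assuming daily rebalancing to fixed weights."""
    return asset_returns @ weights

def cumulative_growth(returns: np.ndarray) -> np.ndarray:
    """Value over time of one unit of capital."""
    return np.cumprod(1.0 + returns)

rng = np.random.default_rng(2)
asset_returns = rng.normal(0.0005, 0.02, size=(200, 4))   # synthetic data
in_sample, out_of_sample = asset_returns[:140], asset_returns[140:]

# Weights are estimated from the in-sample window only.
inv_var = 1.0 / in_sample.var(axis=0, ddof=1)
weights = inv_var / inv_var.sum()

# The same weights are then applied to both windows.
growth_in = cumulative_growth(portfolio_returns(in_sample, weights))
growth_out = cumulative_growth(portfolio_returns(out_of_sample, weights))
print(round(growth_in[-1], 4), round(growth_out[-1], 4))
```

Only the out-of-sample curve reflects performance on data the allocation method has not seen.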

Fig 10. In-sample
As can be seen, the weighting method based on uniform distribution performs better than the other methods in both the in-sample and out-of-sample periods.
To evaluate the four methods, we use the Sharpe, Calmar, and Sortino ratios and the Maximum Drawdown for both the in-sample and out-of-sample periods:
Table 4. Evaluation of in-sample optimization methods
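These four measures can be computed from a series of daily portfolio returns. The sketch below uses common textbook conventions that the article does not specify: 252 trading days per year for annualization, a zero risk-free rate and zero Sortino target, and a Calmar ratio defined as annualized compound return over maximum drawdown. All function names are illustrative.

```python
import numpy as np

TRADING_DAYS = 252   # assumed annualization factor

def max_drawdown(returns: np.ndarray) -> float:
    """Largest peak-to-trough loss, as a fraction of the peak."""
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + returns)))
    peak = np.maximum.accumulate(wealth)
    return float(np.max(1.0 - wealth / peak))

def sharpe_ratio(returns: np.ndarray, risk_free: float = 0.0) -> float:
    excess = returns - risk_free
    return float(np.sqrt(TRADING_DAYS) * excess.mean() / excess.std(ddof=1))

def sortino_ratio(returns: np.ndarray, target: float = 0.0) -> float:
    """Like Sharpe, but only returns below the target count as risk."""
    excess = returns - target
    downside = np.sqrt(np.mean(np.minimum(excess, 0.0) ** 2))
    return float(np.sqrt(TRADING_DAYS) * excess.mean() / downside)

def calmar_ratio(returns: np.ndarray) -> float:
    annual = np.prod(1.0 + returns) ** (TRADING_DAYS / len(returns)) - 1.0
    return float(annual / max_drawdown(returns))

r = np.array([0.10, -0.20, 0.05])   # tiny hypothetical return series
print(max_drawdown(r), sharpe_ratio(r), sortino_ratio(r), calmar_ratio(r))
```

The Sortino and Calmar ratios are undefined for a series with no losses, since their denominators are then zero; a production version would need to handle that case.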


According to the in-sample evaluation table, based on the Sortino, Sharpe, and Calmar ratios and the Maximum Drawdown, the MinVar approach and uniform distribution performed better than HRP and Risk Parity.
Fig 11. Out-of-sample
Table 5. Evaluation of out-of-sample optimization methods


According to the out-of-sample evaluation table, based on the Sortino, Sharpe, and Calmar ratios and the Maximum Drawdown, the uniform distribution method and Risk Parity performed better than HRP and MinVar.
Conclusion
Selecting stocks and forming an optimal portfolio have long been among the essential concerns of investors, and many methods for choosing a portfolio have been introduced. In this research, we use the Hierarchical Risk Parity (HRP) machine learning technique and compare its results with three other methods, Minimum Variance, uniform distribution, and Risk Parity (RP), over in-sample and out-of-sample periods for the top 50 companies of the Tehran Stock Exchange. The results show that the Minimum Variance approach performs best in-sample and the uniform distribution approach performs best out-of-sample. It should be noted that the ranking of such optimization methods can change when the period or the split between in-sample and out-of-sample data changes. Therefore, portfolio managers should actively evaluate each of these methods in light of the conditions they face.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest concerning the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.