The Quantitative Diversity Index in Multi-Objective Portfolio Model

The primary goal of investors is to maximize utility, which is characterized by two essential criteria: risk and return. Given investors' uncertainty about the future, one of the main ways to reduce risk is to diversify the investment portfolio. In this research, we propose an index based on Euclidean distance for assessing portfolio diversity. In addition, we design a multi-objective model for selecting optimal stock portfolios that simultaneously considers value at risk (VaR), which is one of the critical indicators of unacceptable risk, portfolio Beta as systematic risk, and portfolio variance as unsystematic risk. The model aims to maximize diversification while minimizing value at risk and stock risks. Furthermore, return maximization is treated as a constraint of the model. Since the proposed model is nonlinear and, in terms of computational complexity, NP-hard, we use the PSO and GA metaheuristic algorithms, adapted for multi-objective problems, to solve it. The results of running the model over multiple iterations show that the average yield of the portfolios selected by the model is higher than the desirable level. The evaluation of stock performance indicators also shows the satisfactory performance of the multi-objective model.


Introduction
The current literature highlights the importance of choosing the optimal set of investments in the capital market to maximize the expected wealth of investors. In doing so, an investor needs proper methods or criteria to identify and measure the potential value of each investment opportunity. These criteria should be sufficiently reliable and accurate so that investors can decide with high confidence and low risk. Risk and return are the two main factors in capital market decisions. The selection of a set of stocks, called a portfolio, is usually driven by the interaction between risk and return. The higher the risk of an investment, the higher the return it demands (Jones, 2010).
Since the inception of modern portfolio theory in the early 1950s, the rate of return and the risk of a portfolio have been recognized as the most important factors for investors in every capital market. Markowitz (1952) showed that risk and return can reach an optimal point by investing in a diversified portfolio of financial assets. The degree of risk-taking among individuals is a tradeoff between risk and expected return, and asset returns are unpredictable or risky. Diversification means choosing different financial assets to reduce the risk of one specific asset. A diversity index is a mathematical measure that shows how the initial wealth is distributed among different assets. In other words, the diversity index helps investors choose the appropriate number of assets to invest in based on their initial wealth. Diversity has many advantages, including aggregate competition in the capital market, maximization of investor wealth, and mitigation of portfolio risk (Chan, Peter, and William, 1989). According to stock portfolio theory, portfolio risk is affected not only by the average standard deviation but also by the diversity of the investment. In other words, the greater the variety in an investment portfolio, the lower the risk (Reilly and Keith, 2002). This goal requires that the variability of the return on a particular asset be adjusted to the variability of the return on the other assets in the portfolio, which reduces the unsystematic risk (Platanakis, Athanasios, and Charles, 2018; Jackwerth, and Anna, 2016). This research seeks to provide an alternative model for selecting the optimal stock portfolio and a useful tool for estimating the degree of diversification by adjusting risk with returns. VaR, which is one of the significant indicators of undesirable risk, is integrated with systematic and unsystematic risks when forming a portfolio.
We also present the Euclidean distance measure for stock portfolio diversification and formulate a multi-objective model to choose optimal stock portfolios. The Euclidean distance is a novel measure for indexing the level of diversity in the portfolio: the index calculates the distance between different assets and is used to optimize diversification. The results of the model are compared with the Sterling diversity index, a well-known index that integrates variety and balance into a dual concept that can make explicit the condition of different parts of society. The model attempts to maximize diversification while minimizing the VaR and stock risk.
Moreover, return maximization is treated as a constraint of this model. A genetic algorithm is used to optimize the nonlinear multi-objective model, and the results for different dimensions are compared to validate the framework. The results show that the average yield of the portfolios selected by the model is higher than the desirable level and confirm the positive performance of the multi-objective model.
According to investment theory, the risks of a financial asset can be classified into systematic risk and unsystematic risk. Unsystematic risk, also referred to as controllable risk, is specific to an asset because it relates to a portion of the return on that asset. This risk is specific to a company or an industry, and it arises from factors such as worker strikes, management practices, advertising competition, and changes in consumer tastes. Systematic risk, an uncontrollable risk, is related to general market conditions such as interest rates, fluctuations in the national currency rate, inflation rates, monetary and financial policies, and political conditions (Gagliardini, and Christian, 2013). Systematic risks cannot be eliminated at all (Kim et al. 2018). Markowitz (1952) suggests that risk and return can reach an optimal point by investing in a diversified portfolio of financial assets. He reformed investment management in two directions, based on the idea that a financial decision is a trade-off between the risk and the return of the stock market. First, he assumed that the investor quantitatively evaluates the risk and return of the stock portfolio and, at the same time, pays attention to how the portfolio return and the movement of the portfolio returns are integrated, which is the main idea of diversification. Second, the financial decision-making process is assumed to be an optimization problem: among the various available combinations, the investor chooses the portfolio with the least variance (Georgalos, Ivan, and David, 2018). The Markowitz approach is a diversification method used in the analysis of investment portfolios. This kind of diversification involves including the covariance between securities and combining less correlated capital assets to reduce portfolio risk without jeopardizing returns.
In other words, lower correlation among the assets in an investment portfolio reduces the risk of the portfolio (Francis, and Dongcheol, 2013).
Other methods introduced include the use of the Hirschman-Herfindahl Index (HHI) and the Shannon entropy index (Chen, Yong, Xianhua, and Lingling, 2014). Oh et al. (2005) used a genetic algorithm to optimize an index-based portfolio. Their goal was to build a portfolio with the same performance as the stock index. The proposed algorithm was applied to one of the Korean stock market indices from 1999 to 2001 and was compared with traditional methods of constructing the index-based basket. The results show that the genetic algorithm has many advantages over traditional methods when market fluctuations are increasing. It is fully effective and shows average performance when the market trend is constant.

Research background
This study contributes to the literature as follows. First, we develop an innovative multi-objective mathematical model that combines VaR and portfolio diversification to optimize the portfolio, stock returns, and risks. Second, we consider both systematic and unsystematic risks in the model formulation and use the Euclidean distance criterion as a tool for the quantitative assessment of the diversity index. Third, we apply a real-world case study to verify the proposed model and solve it using two robust meta-heuristic algorithms. Fourth, the model can be tested in any other capital market in the world.
The application of diversification strategies to portfolio optimization has been considered by many scholars and organizations (Steinberg, 2018; Pola, 2016; Briere, Kim, and Ariane, 2015; Dang, 2019; Kara, Ayşe and Gerhard-Wilhelm, 2019; Paut, Rodolphe, and Marc, 2019; Aluko, Oladapo, and Bolanle, 2018; Beaudreau, Maggie, and Philip, 2018). For instance, Liu (2018) used experimental data on several key cryptocurrencies to study the role of diversity and investment in the digital asset market. Similarly, Kajtazi et al. (2018) examined the effects of adding bitcoin to the ideal portfolio using the mean-CVaR method. They reported that adding bitcoin could play a significant role in portfolio diversification.
Diyarbakırlıoğlu and Satman (2013) offered a new approach for evaluating the diversification risk of an investment portfolio through the covariance matrix of returns. They solved their problem using the Maximum Diversification Index (MDI) with a genetic algorithm. Their approach was verified on existing stock return data, and the results show that the MDI can be applied effectively to define a large set of investable assets. Oyenubi (2016) offered an explanation for the elusiveness of the optimum number of stocks in a portfolio. He used the Portfolio Diversification Index (PDI) to quantify diversification.
Using a novel quantification approach, Oyenubi attempted to quantify the level of interdependency of the stocks in a set, and the problem was solved using a Pareto-based algorithm to find optimal portfolios. Jadhao and Chandra (2017) used sample entropy and approximate entropy indicators to diversify portfolios in a rotation strategy based on size and style. Oloko (2018) investigated the benefits of diversification in Nigeria's stock market. Kalashnikov et al. (2017) suggested a new integrated approach to the Lean Six Sigma project portfolio and formulated the problem as a binary mixed-integer bi-objective quadratic model. Their model was solved with the branch-and-bound solver of the CPLEX software and verified using numerical examples. Chang et al. (2009) examined portfolio optimization under different risk measures using a genetic algorithm. In their research, the genetic algorithm was used because of its ability to solve complex problems in various risk assessments. The results showed that most of the optimization problems, including those with cardinality constraints, can be solved by a genetic algorithm within a reasonable time when mean-variance, semi-variance, and variance with skewness are used as risk measures. They also found that smaller portfolios perform better than larger ones. The stock returns derived from the genetic algorithm are lower than those of other models, but the risk reduction and the risk-adjusted criteria offset the reduction in returns, implying the superiority of the solution from the genetic algorithm.
Investors should consider the following rules when launching their diversification strategy. First, no single stock in the investment portfolio should have a significant impact on the overall strategy. Second, targeting a maximum of 20 stocks in the portfolio results in better control of managerial costs. Third, investing more than 10% in one stock is not recommended because it works against diversification. Fourth, to achieve optimal diversity, investors should avoid focusing on companies that are significantly affected by others, for example because they are affiliated companies or large suppliers or customers. Investing in a large company that is affected by others may prevent diversification.

Assumptions
First, investors are generally risk-averse and prefer higher expected returns for less risk, yet they seek a balance between risk and return. As a result, the stock portfolio is selected by minimizing the value of the fitness function. Second, investors do not tend to invest in small stocks that have low liquidity, and there are no restrictions on transaction costs and taxes. Third, there are no restrictions on the market or on short sales.

A mathematical model that considers portfolio diversity, stock returns, and risks simultaneously is developed from a portfolio analysis viewpoint. To this end, we first provide the assumptions, indices, parameters, and decision variables of the proposed model and then use these components to formulate the mathematical model.

Decision variables
x_j: Percentage of stock j in the investment portfolio
q_j: Equal to 1 if stock j is in the investment portfolio; otherwise 0
y_t: Equal to 1 if the portfolio return in time t is negative; otherwise 0
VaR: The value at risk of the investment portfolio
SR: The systematic risk value of the investment portfolio
USR: The unsystematic risk value of the investment portfolio

Multi-objective model
After describing the relevant components, the proposed multi-objective model can be formulated as follows. The model has three main objective functions, which maximize stock diversification and minimize VaR and stock risks. The three objective functions and their constraints are given in equations (1)-(15).

The objective function (1) maximizes the diversity of the stock portfolio. Since distance is an essential component for measuring the difference between elements, this criterion should play a fundamental role in the classification of investments. Therefore, variety can be calculated using the difference between the elements.
Accordingly, a portfolio of stocks N = {S_1, S_2, …, S_n} is considered, and the Euclidean distance between two selected stocks i and j is defined using equations (16) to (18). Finally, the integrated multi-criteria diversity index based on the Euclidean distance between each pair of elements can be introduced as $D = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} x_i x_j$. Here, a larger value of D means greater variation between the items. The variety of the selected elements is thus estimated as the total of the Euclidean distances between each pair of them.
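As an illustration, the diversity index D can be computed as a quadratic form in the weight vector. The sketch below is only a minimal reading of the definition: it assumes that d_ij is the plain Euclidean distance between the characteristic vectors of stocks i and j without any scaling (equations (16) to (18) may normalize the characteristics), and the function names and the three-stock data are hypothetical.

```python
import numpy as np

def pairwise_distances(features):
    """Euclidean distance d_ij between the characteristic vectors of stocks i and j."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def diversity_index(weights, distances):
    """D = sum_i sum_j d_ij * x_i * x_j, written as the quadratic form x' d x."""
    return float(weights @ distances @ weights)

# Hypothetical data: 3 stocks, each described by 2 characteristics.
features = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
d = pairwise_distances(features)

concentrated = np.array([1.0, 0.0, 0.0])   # everything in one stock
spread = np.array([0.5, 0.0, 0.5])         # split between the two most distant stocks

print(diversity_index(concentrated, d))    # a single-stock portfolio has D = 0
print(diversity_index(spread, d))
```

A portfolio held in a single stock has D = 0 because d_ii = 0, while spreading weight across distant stocks raises D, which is the behavior the objective function (1) rewards.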
On the other hand, the objective function (2) minimizes the systematic and unsystematic risks. Different indices have been proposed for measuring the risk of stock portfolios, including the portfolio standard deviation introduced by Markowitz (1952). To calculate the variance of the stock portfolio, the weight of each share in the stock portfolio (x_j) must be determined. The portfolio variance, which represents the unsystematic risk, is calculated by equation (4). Equation (5) measures the systematic risk of the stock portfolio, and equation (19) is used to calculate the parameter β_j. The objective function (3) minimizes VaR, which represents the risk of loss on investments and estimates how much of the investment may be lost within a given time period when the market condition is stable.
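The two risk terms in objective (2) can be sketched as follows. This is a hedged illustration rather than the authors' exact equations: it assumes that the unsystematic risk is the portfolio variance x'Σx, that the systematic risk is the weighted sum of the stock betas, and that β_j is the usual market-model estimate Cov(r_j, r_m) / Var(r_m). The function names and the sample numbers are hypothetical.

```python
import numpy as np

def beta(stock_returns, market_returns):
    """Assumed form of beta_j: Cov(r_j, r_m) / Var(r_m)."""
    cov = np.cov(stock_returns, market_returns, ddof=1)
    return cov[0, 1] / cov[1, 1]

def unsystematic_risk(weights, cov_matrix):
    """Portfolio variance: sum_i sum_j sigma_ij * x_i * x_j."""
    return float(weights @ cov_matrix @ weights)

def systematic_risk(weights, betas):
    """Portfolio beta, assumed to be the weighted sum of the stock betas."""
    return float(weights @ betas)

# Hypothetical data: a stock that always moves twice as much as the market.
market = np.array([0.01, 0.02, -0.01, 0.03])
stock = 2.0 * market
print(beta(stock, market))

weights = np.array([0.5, 0.5])
cov_matrix = np.array([[0.04, 0.0], [0.0, 0.01]])
betas = np.array([2.0, 1.0])
print(unsystematic_risk(weights, cov_matrix), systematic_risk(weights, betas))
```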
Equation (6) implies that the purchased stocks must use up exactly all available resources. Constraint (7) ensures that the number of selected stocks in the investment portfolio is less than or equal to the allowed maximum. Constraint (8) specifies an upper bound on the weight of each share (x_j) in the stock portfolio if the stock is selected, which can increase the diversification of the stock portfolio. In determining the upper bound for the decision variables, the investor's opinion is decisive: the bound is determined by the minimum number of stocks in which the investor is willing to invest. Constraint (9) relates the portfolio return plus the VaR to y_t. Equation (10) ensures that the variable y_t equals one in α percent of the time. When this variable equals one, the corresponding constraint (9) is redundant; in other words, the portfolio return can be negative in α percent of the time, that is, in [αT] of the periods. For the rest of the time, y_t equals zero, which means that in periods with a negative portfolio return, the return plus the VaR should be greater than zero. In constraint (11), ȓ_j represents the average return of each stock, E_Ω^max is the maximum expected return of the stock portfolio, and the expression (1-λ) E_Ω^max indicates the minimum expected return of the portfolio. Given that the portfolio return is assumed to be the weighted sum of the stock returns, the value of E_Ω^max can be calculated using equation (12). Finally, constraints (13)-(15) define the binary and non-negative decision variables.
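For orientation, the verbal description of equations (1)-(15) is consistent with a formulation of the following shape. This is only one plausible reading assembled from the surrounding text, not the authors' exact equations: the big constant M, the return symbol r_{jt} (return of stock j in period t), the floor in (10), the form of (12), and the additive form of (2) are all assumptions, while MaxNS and U are the parameter names used elsewhere in the text.

```latex
\begin{align}
\max\; Z_1 &= \sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij}\, x_i x_j \tag{1}\\
\min\; Z_2 &= SR + USR \tag{2}\\
\min\; Z_3 &= VaR \tag{3}\\
USR &= \sum_{i=1}^{n}\sum_{j=1}^{n} \sigma_{ij}\, x_i x_j \tag{4}\\
SR &= \sum_{j=1}^{n} \beta_j x_j \tag{5}\\
\sum_{j=1}^{n} x_j &= 1 \tag{6}\\
\sum_{j=1}^{n} q_j &\le MaxNS \tag{7}\\
x_j &\le U q_j \qquad \forall j \tag{8}\\
\sum_{j=1}^{n} r_{jt}\, x_j + VaR &\ge -M y_t \qquad \forall t \tag{9}\\
\sum_{t=1}^{T} y_t &= \lfloor \alpha T \rfloor \tag{10}\\
\sum_{j=1}^{n} \bar{r}_j x_j &\ge (1-\lambda)\, E_{\Omega}^{\max} \tag{11}\\
E_{\Omega}^{\max} &= \max_{j} \bar{r}_j \tag{12}\\
q_j \in \{0,1\},\quad y_t &\in \{0,1\},\quad x_j \ge 0 \tag{13--15}
\end{align}
```

Under this reading, setting y_t = 1 relaxes constraint (9) for period t, so the loss may exceed VaR in at most ⌊αT⌋ periods, which matches the description of constraints (9) and (10).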

Solution approach
To solve the proposed multi-objective model, two well-known meta-heuristic algorithms, a genetic algorithm (GA) and particle swarm optimization (PSO), are used. Moreover, Lingo software is used to obtain the ideal value of each objective function.

Genetic Algorithm (GA)
GA, developed by Holland (1974), is a highly effective and efficient stochastic meta-heuristic optimization method that has been used to solve many complex problems. In this algorithm, the problem variables are first chosen randomly and then combined to generate other points. GA is essentially a computer search method built on gene and chromosome structures.
The algorithm begins with a set of random solutions (chromosomes), known as the initial population, and the value of each chromosome is then determined according to the fitness function. Higher-quality chromosomes have a greater chance of producing offspring; parents are chosen on this basis, and the offspring are created by applying the crossover operator to the parents. Finally, some of the genes of the offspring are changed by the mutation process, and the new offspring replace the weakest chromosomes in the initial population. The main steps for solving an optimization problem via GA are illustrated in Figure 1. The mutation and crossover operators are presented in Holland (1974), to which the reader is referred.
Step 1. Create the initial population:
a) Generate random chromosomes.
Step 2. Perform the main loop:
b) Calculate the fitness function of each chromosome.
c) Select two chromosomes from the initial population using the roulette-wheel operator.
d) Apply the crossover operator.
e) Apply the mutation operator.
f) Repeat steps b to e until enough members have been created to form the next generation.
Step 3. Repeat step 2 until the stopping criterion is satisfied.
Step 4. Display the obtained results.

For multi-objective optimization problems, one solution approach is the LP-metric method, which was introduced by Zelany (1974). This method is one of the compromise programming methods; it works without requiring knowledge from the decision-maker, and it attempts to minimize the distance (deviation) between a reference point and the candidate solution. In this method, the choice of the reference point and of the criterion for measuring the distance is important. In this approach, the problem is first solved separately for each of its k objectives; after normalization, the answers are combined to find the optimal solution, that is, the answer closest to the ideal answer. The mathematical form of this method can be displayed as follows.
Here, p = 1 gives the same weight to all deviations, and increasing p gives more weight to larger deviations. To form a fitness function (goal) for the genetic algorithm, the LP-metric approach is used, where Z_1 is the objective function of greatest diversity (diversification), Z_2 is the objective function of total risk, comprising systematic and unsystematic risks, and Z_3 is the VaR objective. Also, Z_1*, Z_2*, and Z_3* are the optimal values of each objective subject to the problem constraints.
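The GA steps and the LP-metric fitness described above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the normalized-deviation form of the LP-metric is one common variant, the roulette-wheel weights 1 / (1 + fitness), the single-point crossover, the one-gene mutation, and the elitist replacement of the weakest chromosomes are all choices made here, and the toy two-objective problem (RISK, toy_fitness) is hypothetical.

```python
import random

def lp_metric(z, z_ideal, maximize, weights=None, p=1):
    """Weighted L_p distance of the objective vector z from the ideal point.

    Deviations are normalized by the ideal value: (ideal - z) / ideal for a
    maximized objective and (z - ideal) / ideal for a minimized one.
    """
    weights = weights or [1.0] * len(z)
    total = 0.0
    for zk, ideal, is_max, w in zip(z, z_ideal, maximize, weights):
        dev = (ideal - zk) / ideal if is_max else (zk - ideal) / ideal
        total += w * abs(dev) ** p
    return total ** (1.0 / p)

def normalize(genes):
    """Scale a chromosome so that the weights sum to 1."""
    total = sum(genes)
    return [g / total for g in genes]

def roulette_select(population, scores, rng):
    """Roulette wheel for minimization: lower fitness gets a larger slice."""
    slices = [1.0 / (1.0 + s) for s in scores]
    return rng.choices(population, weights=slices, k=1)[0]

def genetic_algorithm(fitness, n_genes, pop_size=30, generations=100,
                      mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of weight vectors.
    population = [normalize([1.0 - rng.random() for _ in range(n_genes)])
                  for _ in range(pop_size)]
    for _ in range(generations):              # Step 3: fixed iteration budget
        scores = [fitness(c) for c in population]
        offspring = []
        while len(offspring) < pop_size:      # Step 2, items b to f
            p1 = roulette_select(population, scores, rng)
            p2 = roulette_select(population, scores, rng)
            cut = rng.randrange(1, n_genes)   # single-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < mutation_rate:  # mutate one randomly chosen gene
                child[rng.randrange(n_genes)] = 1.0 - rng.random()
            offspring.append(normalize(child))
        # Offspring displace the weakest chromosomes (elitist replacement).
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return population[0]                      # Step 4: best chromosome found

# Hypothetical toy problem with one maximized and one minimized objective.
RISK = [0.1, 0.2, 0.3, 0.4]

def toy_fitness(x):
    z1 = 1.0 - sum(w * w for w in x)              # toy diversity, ideal 0.75
    z2 = sum(r * w for r, w in zip(RISK, x))      # toy risk, ideal 0.1
    return lp_metric([z1, z2], [0.75, 0.1], [True, False])

best = genetic_algorithm(toy_fitness, n_genes=4)
print(best, toy_fitness(best))
```

Because the LP-metric is zero only at the ideal point, the GA can treat it as a single fitness value to minimize, which is how the three objectives Z_1, Z_2, and Z_3 are combined in the text.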

Particle Swarm Optimization
Eberhart and Kennedy (1995) proposed the particle swarm optimization algorithm, which is inspired by the social behavior of animals, such as the collective migration of birds and fish. Initially, this algorithm was used to explore the patterns governing the simultaneous flight of birds, their sudden changes of direction, and the optimal deformation of their groups. The change in the location of birds in the search space is influenced by their own experience and knowledge and by those of their neighbors. Therefore, the position of other birds affects the search of a bird. The result of modeling this social behavior is a search process in which birds move toward successful areas. Birds learn from each other and move toward their best neighbors based on their knowledge. The algorithm rests on the principle that, at any given moment, each bird adjusts its location in the search space according to the best place it has ever found and the best place in its entire neighborhood. The following relationships are used to update the velocity and location of each particle.
$V_i(t) = w V_i(t-1) + c_1 \, rand_1 \, (P_{i.best} - x_i(t-1)) + c_2 \, rand_2 \, (P_{g.best} - x_i(t-1))$ (22)
$x_i(t) = x_i(t-1) + V_i(t)$ (23)
Here, w is the inertia weight, which indicates the effect of the velocity vector of the previous iteration, V_i(t-1), on the velocity vector in the current iteration, V_i(t). Also, c_1 and c_2 are the constant learning coefficients for motion in the direction of the best value of the examined particle and of the best value among the whole population, respectively. Moreover, rand_1 and rand_2 are two random numbers with uniform distribution in (0, 1). The vectors x_i(t) and x_i(t-1) represent the position of the particle in the current iteration and the previous iteration, respectively.
The best position found by particle i is denoted by P_i.best, while P_g.best represents the best position found by the best particle among all particles.
To prevent excessive particle speed when particles move from one location to another, and to avoid divergence of the velocity vector, the velocity is limited to V_min ≤ V_i(t) ≤ V_max. The pseudo-code of the particle swarm optimization algorithm is presented in Figure 2.

Initialize particle
For each particle
    Calculate fitness value of the particle f_i
    /* updating particle's best fitness value so far */
    If f_i is better than P_i.best
        Set current value as the new P_i.best
End For
/* updating population's best fitness value so far */
Set P_g.best to the best fitness value of all particles
For each particle
    Calculate particle velocity according to equation (22)
    Update particle position according to equation (23)
    Calculate the fitness value of the particle f_i
    /* updating population's best fitness value so far */
    Set P_g.best to the best fitness value of all particles
End For
End

Figure 2. The pseudo-code of the particle swarm optimization algorithm

The chromosome used for the two algorithms is shown in Figure 3. By way of example, suppose that eight stocks (j = 1, …, 8) exist and MaxNS = 5. The number of active genes of the chromosome must then be less than or equal to MaxNS (see constraint (7)). For instance, suppose the five active genes are genes 1, 3, 4, 6, and 7. The initial chromosome is generated with random values in (0, 1) whose sum equals 1 (based on constraint (6) of the model). The proposed chromosome thus represents the percentage of stock j in the investment portfolio (x_j).
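The chromosome encoding and one PSO update can be sketched as follows. The velocity and position updates follow the standard inertia-weight form with clamping to [-v_max, v_max]; the repair step that restores constraints (6) and (7) after a move is an assumption added here (the text does not describe how infeasible positions are handled), and all function names are hypothetical. The values c1 = 1, c2 = 0.5, and w = 1 in the usage lines are the tuned PSO settings reported in the parameter-setting section.

```python
import random

def random_chromosome(n_stocks, max_active, rng):
    """Random weights in (0, 1] on at most max_active genes, scaled to sum to 1."""
    active = rng.sample(range(n_stocks), rng.randint(1, max_active))
    genes = [0.0] * n_stocks
    for j in active:
        genes[j] = 1.0 - rng.random()      # strictly positive, so the sum is never 0
    total = sum(genes)
    return [g / total for g in genes]

def pso_step(position, velocity, p_best, g_best, w, c1, c2, v_max, rng):
    """One velocity and position update with velocity clamping."""
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, p_best, g_best):
        v_new = w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        v_new = max(-v_max, min(v_max, v_new))    # keep V_min <= V_i(t) <= V_max
        new_velocity.append(v_new)
        new_position.append(x + v_new)
    return new_position, new_velocity

def repair(position, max_active):
    """Assumed repair: clip negatives, keep the largest max_active genes, renormalize."""
    clipped = [max(0.0, x) for x in position]
    order = sorted(range(len(clipped)), key=lambda j: clipped[j], reverse=True)
    keep = set(order[:max_active])
    genes = [clipped[j] if j in keep else 0.0 for j in range(len(clipped))]
    total = sum(genes)
    if total == 0.0:                       # degenerate case: equal weights on kept genes
        return [1.0 / len(keep) if j in keep else 0.0 for j in range(len(genes))]
    return [g / total for g in genes]

rng = random.Random(1)
particle = random_chromosome(8, 5, rng)            # eight stocks, MaxNS = 5
g_best = [0.125] * 8                               # hypothetical global best
position, velocity = pso_step(particle, [0.0] * 8, particle, g_best,
                              w=1.0, c1=1.0, c2=0.5, v_max=0.2, rng=rng)
feasible = repair(position, 5)
print(feasible)
```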

Parameter setting
Here, the parameters of the algorithms are tuned to achieve the best performance; to this end, the Taguchi method is used (Taguchi, 1986). As a design-of-experiments approach, this method tunes the parameters using a set of orthogonal arrays instead of full factorial experiments. Minitab software is used to perform this method, and since the LP-metric approach is used to combine the objective functions, the "smaller is better" form of the Signal-to-Noise ratio is used as the response.
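For reference, the "smaller is better" Signal-to-Noise ratio is commonly written as S/N = -10 log10((1/n) Σ y_i²), where y_i are the responses of the repeated runs of one experiment. The short sketch below computes it; the function name and the sample responses are hypothetical.

```python
import math

def sn_smaller_is_better(responses):
    """S/N = -10 * log10(mean of the squared responses)."""
    mean_square = sum(y * y for y in responses) / len(responses)
    return -10.0 * math.log10(mean_square)

# Hypothetical LP-metric values from repeated runs of two parameter settings.
setting_a = [0.10, 0.20]
setting_b = [1.00, 2.00]
print(sn_smaller_is_better(setting_a), sn_smaller_is_better(setting_b))
```

The setting with the larger S/N value is preferred, since smaller responses (here, smaller LP-metric deviations) yield a larger ratio.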
To do this, the levels of the parameters of each algorithm are first provided in Table 1. Then, by performing the Taguchi method in Minitab, the orthogonal arrays L9 and L27 are chosen for GA and PSO, respectively. Finally, the two algorithms are run using these sets of experiments, and the results are presented in Tables 2-3. Furthermore, to decide on these results, the Signal-to-Noise plots are illustrated in Figures 5-6. Likewise, Figure 5 shows that the values of 1, 0.5, 1, 300, and 200 are selected for C_1, C_2, W, N-pop, and Max iteration, respectively. These values are therefore used for the final runs of the two algorithms.

The Model Test
To implement the model and measure its efficiency, we tested it on data from the top 30 companies in the Tehran Stock Exchange (TSE). Several features of these companies, such as volume, value, number of trades, closing price, and market capitalization, are reported in Table 4. These values are used as S_jk (characteristic k of stock j) to calculate d_ij (the Euclidean distance between two selected stocks i and j). The returns of the stocks in each time period are provided in Table 5, and the average returns, β coefficients, and variances over 12 months are given in Table 6. Furthermore, the values of d_ij and σ_ij can be obtained using their related formulas, and the desired confidence level (α), the percentage of the minimum expected return of the portfolio (λ), the maximum investment in a stock (U), and the maximum number of stocks in the investment portfolio (MaxNp) are assumed to be 0.2, 0.6, 0.1, and 20, respectively. Moreover, to find the ideal value of each objective function, the model is solved using Lingo software with these parameter values. The values 1.637845E+15, 0.04760250, and 0.1019152 are thereby obtained as the goal values for Z_1*, Z_2*, and Z_3*, respectively.

Results
The results of testing the models using various approaches are reported here. They represent the proportions of the budget to be invested in each company's stock. Several criteria are used to compare the three approaches, such as the return rate, the real diversity index, systematic risk, unsystematic risk, number of stocks in the portfolio, VaR, CVaR, Treynor ratio, and Sharpe ratio. The comparison of these approaches is illustrated in Table 7.
It should be noted that all calculations were carried out using the branch and bound solver of the Lingo 9 software and MATLAB software. Solving the problem with the data of the previous section and the mentioned goals yields Tables 7-9. According to Table 7, the GA has the least CPU time and VaR, and the highest portfolio return and sterling diversity index. It can therefore be selected as the best approach, and its decision variables are provided in Table 8. According to Table 8, the following 20 companies can be selected as an optimal portfolio: SROD1, GHND1, LAMI1, STEH1, SKHS1, KFAN1, KPRS1, TAMI1, KSKA1, SGRB1, SHGN1, BSTE1, SRMA1, DMVN1, SPKH1, SMAZ1, SURO1, KHOC1, ABAD1, and SSOF1. The resulting values of Z_1, Z_2, and Z_3 are 5.849442E+12, 1.004157, and 0.1275286, respectively. Furthermore, to show the efficiency of the proposed multi-objective model, it is compared with the Markowitz and diversification-risks models. These values are presented in Table 9. To validate the framework, four models, namely the Markowitz, diversification-risks, diversification-risks-VaR, and diversification-risks-CVaR models, are solved 15 times with different population sizes to display the average returns of the stock portfolios, as shown in Figure 6. In this investigation, we used the conventional method of calculating VaR (VaR = -σ² Z_α - µ), with the assumption that the distribution of returns is normal with mean µ and variance σ² at the α confidence level. Based on these assumptions, the conditional value at risk (CVaR), which is one of the developed versions of the VaR model, is also calculated. This risk measure, proposed in 2000 by Rockafellar and Uryasev, quantifies the tail risk of the portfolio. Since the optimal return rate for investment based on a market index is defined as an interval between 6.5 and 8.5, Figure 6 shows that only in the diversification-risks model is the average return lower than the desired value.
In the other models, the average portfolio returns are higher than the mentioned values. Therefore, it can be concluded that the proposed model can be chosen as a suitable model.
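The parametric VaR and CVaR under the normality assumption can be sketched as follows. The sketch uses the closed forms that are usual for a normal return distribution, with the standard deviation σ multiplying the quantile and α taken as the left-tail probability; both conventions, like the function names and sample values, are assumptions of this illustration and may differ from the exact computation behind the reported tables.

```python
from statistics import NormalDist

def parametric_var(mu, sigma, alpha):
    """VaR = -(mu + sigma * z_alpha), with z_alpha the alpha-quantile of N(0, 1)."""
    z = NormalDist().inv_cdf(alpha)        # negative for alpha < 0.5
    return -(mu + sigma * z)

def parametric_cvar(mu, sigma, alpha):
    """CVaR = -mu + sigma * phi(z_alpha) / alpha: the mean loss beyond the VaR."""
    z = NormalDist().inv_cdf(alpha)
    return -mu + sigma * NormalDist().pdf(z) / alpha

# Hypothetical portfolio: zero mean return, unit standard deviation, 5% tail.
var_5 = parametric_var(0.0, 1.0, 0.05)
cvar_5 = parametric_cvar(0.0, 1.0, 0.05)
print(var_5, cvar_5)
```

CVaR is never smaller than VaR at the same level, because it averages the losses that lie beyond the VaR threshold.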

Conclusion
This study introduces the Euclidean distance criterion as a measure of stock portfolio diversification and uses a multi-objective model to select optimal stock portfolios. The model aims to maximize diversification and minimize VaR and stock risks, including systematic and unsystematic risks. Return maximization is treated as a constraint of the model. Since the proposed model is nonlinear (and, in terms of computational complexity, NP-hard), the study uses two meta-heuristic algorithms to solve it. It further validates portfolio selection using the data of the top 30 active companies in the TSE over 12 months. The findings show that the GA has the least CPU time and VaR, and the highest portfolio return and sterling diversity index. It can therefore be selected as the best approach, and the related decision variables are reported for the 20 active companies in the TSE that were selected as the optimal portfolio. Moreover, to show the effectiveness of the proposed multi-objective model, we compared it with the Markowitz and diversification-risks models. After running our model 15 times with different population sizes, we found that the average portfolio returns are higher than the desirable values derived from the market index.
Various studies have investigated the relationship between return and risk in the selection of stock portfolios. However, no particular mathematical formulation for quantifying the diversity index has been introduced so far. In this research, the Euclidean distance criterion has been tested as a tool for the quantitative assessment of the diversity index. The research model is designed with deterministic parameters, without short selling, and without considering transaction costs. For future studies, it is suggested that, in addition to the uncertainty of the model parameters, the effect of short selling and transaction costs on the diversity index be studied. In addition, different methods of computing VaR can be used in the model and the results compared.