Document Type : Original Article
Authors
 Hossein Naderi ^{1}
 Mehrdad Ghanbari ^{2}
 Babak Jamshidi Navid ^{2}
 Arash Nademi ^{3}
^{1} Ph.D. Candidate, Department of Accounting, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
^{2} Assistant Prof., Department of Accounting, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
^{3} Assistant Prof., Department of Statistics, Ilam Branch, Islamic Azad University, Ilam, Iran.
Abstract
Modeling strategies for buying and selling in stock market investment has been the subject of numerous advances and applications in economic studies, both theoretical and empirical. One popular approach in economic studies is to apply Markov switching models to forecast time series of stock prices. Semiparametric estimators for these models are a popular class of methods that researchers have used extensively to increase the accuracy of estimation. The main part of these estimators is based on kernel functions. Although many kernel functions are suitable for forecasting stock prices, the Gaussian kernel is the one widely used in these estimators. The question is whether other types of kernel function can be used instead. This paper introduces other kernel functions that can be good replacements for the Gaussian kernel and increase the ability of Markov switching models. We first test six popular kernel functions to find the best one based on simulation studies, and then offer a new strategy for buying and selling stocks using the best-selected kernel function on real data.
Introduction
Many investigations in economics and financial mathematics have focused on what makes an investor profitable in the stock market. These studies can help researchers decrease investment risks and increase the opportunities for a high return. One of the important questions in the stock market is when investors should buy stocks and when they should sell them. In the economic research literature, there are two kinds of analysis: fundamental and technical. In fundamental analysis, researchers look for the reasons behind changes in stock prices, such as exogenous geopolitical events, supply disruptions or the financial operation of the companies. Technical analysis, by contrast, concentrates on the statistical and probabilistic rules governing the data-generating process. Within technical analysis, there are many time series models for capturing stock prices. Markov switching models are popular time series models that are widely applied to financial and economic data. These models capture abrupt changes in the behavior of time series data, called switches of regime, where the switching between regimes is controlled by a hidden Markov chain (see Chang, Yongok, & Joon 2017; Von Ganske, 2016; Billio, Casarin, Ravazzolo, & Van Dijk, 2016; Di Persio, & Frigo, 2016; Neale, Clark, Dolan, & Hunter, 2016; Nademi, & Nademi, 2019).
Recently, Markov switching AR–ARCH[1] models have been applied repeatedly to regime-switching processes, and each of them comes with an algorithm for estimating the parameters. Finding the best algorithms for parameter estimation in these models has been the subject of many extensions and applications over the last decade. Numerous algorithms based on parametric and nonparametric methods have been proposed to estimate the parameters of the modeling process. In this respect, combinations of parametric and nonparametric methods, called semiparametric algorithms, are popular and most broadly applied (see Chan, & Wang, 2017; Chang, Tang, & Wu, 2016; Chen, Shen, Wei, & Lin, 2017; Gupta, Cobre, Polpo, & Sinha, 2016; Gu, Ma, & Balasubramanian, 2015; Nademi, & Farnoosh, 2014).
Semiparametric algorithms use a special function called a kernel function. The selection of a proper kernel function is important for estimating the parameters: with a proper kernel, the estimation process can be fast and unbiased. Offering the best kernel functions for estimation algorithms can therefore be essential for the modeling process. In this paper, we first focus on selecting the best kernel function in a special class of Markov switching models, the semiparametric Markov switching model proposed by Nademi & Farnoosh (2014) for modeling time series data, and then offer a new strategy for buying and selling stocks using the best-selected kernel function of this model on real data.
In the next section, the model and its algorithm are introduced. Section 3 discusses the selection of the best kernel function by a simulation study, offers the buying and selling strategy by comparing the different kernel functions, and shows the feasibility of these kernels. Finally, section 4 concludes and introduces opportunities for future studies.
The model and EM algorithm
This section consists of two subsections. The first introduces the Markov switching model of Nademi & Farnoosh (2014); the second reviews their algorithm for estimating the parameters. Note that their semiparametric algorithm is part of a more general procedure, the EM algorithm, which applies to the class of Markov switching models.
where switching between the regimes is controlled by a hidden Markov chain , with values in , and the residuals are i.i.d. random variables with mean 0 and variance 1. are random variables that take the unit vectors as values, i.e. exactly one of the is 1 and the others are 0. The stationary distribution of the hidden regime process is determined by the transition probability matrix A, i.e., . We get the stationary probabilities from the equation .
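As a small numerical illustration of how stationary probabilities follow from a transition matrix, the sketch below solves the balance equations together with the constraint that the probabilities sum to one. It assumes a row-stochastic matrix, with entry (i, j) the probability of moving from regime i to regime j; the function name `stationary_distribution` and the matrix values are hypothetical and are not taken from the paper.

```python
import numpy as np

def stationary_distribution(A):
    """Return pi with pi @ A = pi and sum(pi) = 1 for a row-stochastic matrix A."""
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    # Stack the balance equations (A^T - I) pi = 0 with the constraint sum(pi) = 1.
    lhs = np.vstack([A.T - np.eye(m), np.ones((1, m))])
    rhs = np.concatenate([np.zeros(m), [1.0]])
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

# Hypothetical two-regime transition matrix (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = stationary_distribution(A)
print(pi)  # approximately [0.6667, 0.3333]
```

For an irreducible chain the solution is unique, so the least-squares solve returns the exact stationary vector up to floating-point error.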
is considered as a semiparametric function such that , where is a nonparametric adjustment factor and is a known function of and and is the parametric space. So, we can refer to mean function (2) and rewrite it by the following form:
 The EM algorithm based on the semiparametric method
Given the definitions of and , Nademi & Farnoosh (2014) used a special log-likelihood function, called the "complete" log-likelihood function, of the following form
where and is the normal density with mean and standard deviation . The word "complete" reflects the fact that, had we observed the complete data instead of just , we could maximize the complete-data log-likelihood (see Franke et al. [8]) instead of the ordinary log-likelihood.
Applying this method leads to the Expectation-Maximization (EM) algorithm. The EM algorithm alternates between two phases. In the E-step, the unobserved variables are replaced by their conditional expectations given the observed data and the current parameter estimates. In the M-step, the complete log-likelihood is maximized to update the estimates of or . The two phases are iterated until a stopping criterion is satisfied. The algorithm can be summarized in the following steps.
E-step: Suppose and are given. So, the conditional expectations of the unobserved variables given are calculated by
M-step: Suppose the approximations for the unobserved variables are given. Then the transition probabilities are calculated by
where is the joint conditional probability of and given the entire sequence of observations ( ), estimated by
The probabilities are approximated by
and the functions are estimated by
such that, gets from for where is
and is
where is a Gaussian Kernel function and are estimated by
for where denote the sample residuals. The optimal bandwidth is also estimated by
The parameter estimates are obtained by iterating these two steps (E-step and M-step) until convergence.
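The two steps can be sketched in code for the regime probabilities and the transition matrix. This is a minimal illustration of a generic scaled forward-backward E-step and the usual transition update for a hidden Markov chain, not the authors' implementation: the regime densities use constant means instead of the semiparametric mean functions, and all names (`normal_pdf`, `e_step`, `m_step_transitions`) and numerical values are assumptions made for the example.

```python
import numpy as np

def normal_pdf(y, mean, sd):
    """Normal density evaluated elementwise."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def e_step(dens, A, pi0):
    """Scaled forward-backward pass.

    dens[t, k] is the density of observation t under regime k.
    Returns gamma[t, k] = P(S_t = k | all data) and
    xi[t, i, j] = P(S_t = i, S_{t+1} = j | all data).
    """
    T, M = dens.shape
    alpha = np.zeros((T, M))
    beta = np.ones((T, M))
    scale = np.zeros(T)
    alpha[0] = pi0 * dens[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * dens[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (dens[t + 1] * beta[t + 1])) / scale[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = alpha[:-1, :, None] * A[None, :, :] * (dens[1:] * beta[1:])[:, None, :]
    xi /= xi.sum(axis=(1, 2), keepdims=True)
    return gamma, xi

def m_step_transitions(gamma, xi):
    """Update a_ij as expected transitions i -> j over expected visits to i."""
    return xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]

# Hypothetical data: 100 points around 0 followed by 100 points around 3.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
dens = np.column_stack([normal_pdf(y, 0.0, 1.0), normal_pdf(y, 3.0, 1.0)])
A0 = np.array([[0.9, 0.1], [0.1, 0.9]])
gamma, xi = e_step(dens, A0, np.array([0.5, 0.5]))
A1 = m_step_transitions(gamma, xi)
```

In a full EM loop, the M-step would also update the mean functions and variances, and both steps would repeat until the parameter changes fall below a tolerance.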
In relation (3), they applied the Gaussian kernel function for estimating . The question is whether other types of kernel function can improve the performance of the semiparametric algorithm. By definition, a function , with compact support, is a kernel function if it satisfies
By this definition, other functions that satisfy these conditions can also be used. We try several other kernel functions that are popular in the mathematics literature. Table 1 lists the commonly used kernel functions considered here: Uniform, Triangle, Epanechnikov, Bisquare and Triweight. We also include the Gaussian kernel so that it can be compared with the candidate kernel functions. Figure 1 shows the plots of these functions. We apply each of them to compare how much it improves the EM algorithm.
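To make the role of the kernel concrete, the following sketch estimates a nonparametric adjustment factor for a parametric guess of the mean function. It is an illustration under stated assumptions, not relation (3) of the paper: the factor at a point x0 is taken to be the local constant that minimizes the kernel-weighted squared error between the responses and the parametric guess times the factor. The names `correction_factor`, `g_vals` and `weights` are hypothetical; `weights` stands in for regime membership probabilities.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def correction_factor(x0, x, y, g_vals, h, kernel=gaussian_kernel, weights=None):
    """Local-constant multiplicative correction at the points x0.

    Minimizes sum_t w_t K((x0 - x_t) / h) * (y_t - g_t * c)^2 over c, which gives
    c = sum(w K g y) / sum(w K g^2).
    """
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    g_vals = np.asarray(g_vals, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    k = kernel((x0[:, None] - x[None, :]) / h) * w[None, :]
    num = (k * g_vals * y).sum(axis=1)
    den = (k * g_vals ** 2).sum(axis=1)
    return num / den

# Hypothetical check: if y is exactly twice the parametric guess, the factor is 2.
x = np.linspace(0.5, 2.0, 50)
g_vals = 1.5 * x
y = 2.0 * g_vals
est = correction_factor([0.8, 1.0, 1.7], x, y, g_vals, h=0.2)
```

Any other kernel can be passed through the `kernel` argument. With a compactly supported kernel and a small bandwidth, the denominator can be zero at points with no nearby observations, so a real implementation would need to guard against that case.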
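Table 1. The popular kernel functions.
| Name | Kernel |
| --- | --- |
| Uniform | $K(u) = \frac{1}{2}\, I(\lvert u \rvert \le 1)$ |
| Triangle | $K(u) = (1 - \lvert u \rvert)\, I(\lvert u \rvert \le 1)$ |
| Epanechnikov | $K(u) = \frac{3}{4}(1 - u^2)\, I(\lvert u \rvert \le 1)$ |
| Bisquare | $K(u) = \frac{15}{16}(1 - u^2)^2\, I(\lvert u \rvert \le 1)$ |
| Triweight | $K(u) = \frac{35}{32}(1 - u^2)^3\, I(\lvert u \rvert \le 1)$ |
| Gaussian | $K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}$ |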
Figure 1. The six popular Kernel functions
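The six kernels can be written compactly in code. The sketch below uses the standard textbook forms on the support |u| ≤ 1 (the Gaussian kernel has unbounded support); it is assumed, not confirmed, that these match the forms used in the paper. A simple Riemann sum checks that each kernel integrates to approximately one.

```python
import numpy as np

def uniform(u):
    return 0.5 * (np.abs(u) <= 1)

def triangle(u):
    return (1 - np.abs(u)) * (np.abs(u) <= 1)

def epanechnikov(u):
    return 0.75 * (1 - u ** 2) * (np.abs(u) <= 1)

def bisquare(u):
    return (15 / 16) * (1 - u ** 2) ** 2 * (np.abs(u) <= 1)

def triweight(u):
    return (35 / 32) * (1 - u ** 2) ** 3 * (np.abs(u) <= 1)

def gaussian(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

KERNELS = {
    "Uniform": uniform,
    "Triangle": triangle,
    "Epanechnikov": epanechnikov,
    "Bisquare": bisquare,
    "Triweight": triweight,
    "Gaussian": gaussian,
}

# Numerical check on a fine grid: each kernel should integrate to about 1.
u = np.linspace(-6.0, 6.0, 120001)
du = u[1] - u[0]
integrals = {name: float((k(u) * du).sum()) for name, k in KERNELS.items()}
```

Keeping the kernels in a dictionary makes it easy to loop over them when comparing estimators, which is the kind of comparison carried out in the simulation study.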
Research Background
In this section, we first carry out a simulation study to examine the finite-sample performance of the proposed kernel functions. The semiparametric Markov switching model with the selected kernel function is then applied to financial observations, namely the Automotive Industry Index of the Tehran Stock Market, to find a strategy for buying and selling the stocks of industrial companies. Because the stock prices of automotive companies are highly correlated with the Automotive Industry Index, we focus on the index data to derive the buying and selling strategy.
 Simulation study
We examine the feasibility of the various kernel functions by estimating the parameters of the semiparametric model (1). We generated two data sets of sample size N=500 based on the nonlinear functions and , of the following forms:
,
and it was assumed that and . The transition probability matrix was taken to be
.
We also chose the variance parameters as and . Figures 2(a) and 2(d) show the generated observations based on and , respectively. Figures 2(b, c) and 2(e, f) show the corresponding scatter plots of with and for the two simulated data sets. These plots also indicate that the degree of dependency of the data is 2, which follows from the structure of model (1).
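A data set of this kind can be generated with a short simulation. The sketch below is only illustrative: the mean functions, standard deviations and transition matrix are hypothetical stand-ins, since the paper's own choices are not reproduced here, and the model is simplified to a first-order autoregression with a regime-specific constant variance. Only the sample size N=500 is taken from the text. Regimes are coded 0 and 1 in the code.

```python
import numpy as np

def simulate_ms_ar(n, A, mean_funcs, sigmas, y0=0.0, seed=0):
    """Simulate y_t = m_k(y_{t-1}) + sigma_k * eps_t, where k follows a Markov chain."""
    rng = np.random.default_rng(seed)
    m = len(mean_funcs)
    states = np.zeros(n, dtype=int)
    y = np.zeros(n)
    states[0] = rng.integers(m)          # initial regime drawn uniformly
    prev = y0
    for t in range(n):
        if t > 0:
            # Draw the next regime from the row of A for the current regime.
            states[t] = rng.choice(m, p=A[states[t - 1]])
        k = states[t]
        y[t] = mean_funcs[k](prev) + sigmas[k] * rng.standard_normal()
        prev = y[t]
    return y, states

# Hypothetical settings (not the paper's): bounded nonlinear means keep the series stable.
A = np.array([[0.95, 0.05],
              [0.10, 0.90]])
mean_funcs = [lambda x: 0.5 * np.tanh(x), lambda x: 1.0 + 0.3 * np.sin(x)]
sigmas = [0.2, 0.4]
y_sim, s_sim = simulate_ms_ar(500, A, mean_funcs, sigmas)
```

Because the true regimes `s_sim` are known in a simulation, they can be compared with the regimes assigned by a fitted model to judge its classification ability.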
To compare the models, we use two indices. The first is the root mean squared error (RMSE), defined by
The second is the classification index "Max ", defined as follows:
" belongs to regime k if and only if ."
This index is suitable for evaluating the models, since a proper model should be powerful in assigning the observations to the right regimes.
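Both indices are simple to compute. The sketch below gives a generic RMSE and a maximum-probability classification rule; the function names are hypothetical, and the exact quantities entering the paper's RMSE are an assumption. As a usage example, the RMSE values reported for the first simulated data set (last row of Table 2) are used to pick the kernel with the smallest error.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def classify_by_max_probability(gamma):
    """Assign each observation to the regime with the largest probability.

    gamma[t, k] is the estimated probability that observation t belongs to
    regime k + 1. Returns the regime labels (1-based) and the maximum probability.
    """
    gamma = np.asarray(gamma, dtype=float)
    return gamma.argmax(axis=1) + 1, gamma.max(axis=1)

# RMSE values from the last row of Table 2 (first simulated data set).
rmse_table2 = {
    "Uniform": 0.0889,
    "Triangle": 0.0710,
    "Epanechnikov": 0.0885,
    "Bisquare": 0.0881,
    "Triweight": 0.0781,
    "Gaussian": 0.0789,
}
best_kernel = min(rmse_table2, key=rmse_table2.get)

# Hypothetical regime probabilities for two observations.
regimes, max_prob = classify_by_max_probability([[0.9, 0.1], [0.3, 0.7]])
```

A maximum probability close to 1 means a clear decision for one regime, while values near 0.5 in a two-regime model mean the observation is hard to classify.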
Table 2 shows the estimated parameters of the simulated model based on . Comparing the RMSE values for the six kernel functions, we find that the Triangle kernel, with RMSE 0.0710, is more efficient than the other kernel functions, followed by the Triweight kernel with RMSE 0.0781. Figure 3 shows Max for the six kernel functions. For the Triangle kernel, almost all values are greater than 0.8 except for a few cases, i.e. there is a clear decision for one of the two regimes in the large majority of cases, and a high percentage of the data is correctly classified. When the nonlinear function is changed to for the second simulated data set, the ranking of the kernel functions changes (Table 3): the Uniform kernel (RMSE = 0.0521) has the best performance, followed by the Gaussian kernel with RMSE 0.0584. Figure 4 shows Max for the second simulated data set, which indicates that the Uniform kernel classifies the data better than the others. These results show that the efficiency of a kernel function differs from problem to problem. Consequently, when applying semiparametric algorithms that depend on a kernel function, trying several kernel functions can help to obtain proper estimates.
Table 2. The estimated parameters for the simulated data based on .
| The Parameters | Uniform | Triangle | Epanechnikov | Bisquare | Triweight | Gaussian |
| --- | --- | --- | --- | --- | --- | --- |
|  | 3.0148 | 2.0212 | 2.9824 | 2.3684 | 2.0084 | 2.0461 |
|  | 4.9842 | 5.9810 | 5.2617 | 6.8410 | 6.0316 | 4.1586 |
|  | 0.3245 | 0.4215 | 0.3874 | 0.3841 | 0.3848 | 0.4167 |
|  | 0.5120 | 0.5984 | 0.5361 | 0.4835 | 0.5549 | 0.5684 |
|  | 0.0104 | 0.0012 | 0.0022 | 0.0101 | 0.0010 | 0.0025 |
|  | 0.0032 | 0.0039 | 0.0024 | 0.0213 | 0.0026 | 0.0038 |
|  | 0.0201 | 0.0101 | 0.0191 | 0.0318 | 0.0013 | 0.0198 |
|  | 0.0215 | 0.0312 | 0.0234 | 0.0227 | 0.0301 | 0.0279 |
|  | 0.0110 | 0.0101 | 0.0150 | 0.0012 | 0.0027 | 0.0012 |
|  | 0.0318 | 0.0491 | 0.0284 | 0.0046 | 0.0394 | 0.0394 |
|  | 0.4462 | 0.4374 | 0.3976 | 0.4284 | 0.4504 | 0.4598 |
|  | 0.5538 | 0.5626 | 0.6024 | 0.5716 | 0.5496 | 0.5402 |
|  | 0.5107 | 0.4021 | 0.4462 | 0.4872 | 0.4138 | 0.4463 |
|  | 0.4115 | 0.3126 | 0.2945 | 0.3651 | 0.3391 | 0.3798 |
|  | 0.0262 | 0.0215 | 0.0371 | 0.0315 | 0.0259 | 0.0297 |
|  | 0.0334 | 0.0316 | 0.0502 | 0.0078 | 0.0298 | 0.0467 |
| RMSE | 0.0889 | 0.0710 | 0.0885 | 0.0881 | 0.0781 | 0.0789 |
Figure 2. For nonlinear function : (a) simulated data, (b) scatter plot of , (c) scatter plot of . For nonlinear function : (d) simulated data, (e) scatter plot of , (f) scatter plot of .
Table 3. The estimated parameters for the simulated data based on .
| The Parameters | Uniform | Triangle | Epanechnikov | Bisquare | Triweight | Gaussian |
| --- | --- | --- | --- | --- | --- | --- |
|  | 2.1480 | 3.5942 | 3.4618 | 4.1128 | 2.9916 | 2.1384 |
|  | 5.9257 | 4.3302 | 4.5280 | 5.2648 | 3.5218 | 6.7681 |
|  | 0.4635 | 0.6891 | 0.6457 | 0.5561 | 0.4691 | 0.4954 |
|  | 0.5549 | 0.4697 | 0.4630 | 0.6021 | 0.3559 | 0.6894 |
|  | 0.0017 | 0.0052 | 0.0036 | 0.0021 | 0.0649 | 0.0016 |
|  | 0.0029 | 0.0063 | 0.0074 | 0.0034 | 0.0529 | 0.0048 |
|  | 0.0138 | 0.0627 | 0.0108 | 0.0251 | 0.0024 | 0.0113 |
|  | 0.0204 | 0.0104 | 0.0371 | 0.0319 | 0.0031 | 0.0287 |
|  | 0.0157 | 0.0264 | 0.0349 | 0.0108 | 0.0013 | 0.0015 |
|  | 0.0313 | 0.0129 | 0.0137 | 0.0008 | 0.0062 | 0.0292 |
|  | 0.4462 | 0.4374 | 0.3976 | 0.4284 | 0.4504 | 0.4598 |
|  | 0.5538 | 0.5626 | 0.6024 | 0.5716 | 0.5496 | 0.5402 |
|  | 0.4410 | 0.5649 | 0.5410 | 0.4026 | 0.4952 | 0.4137 |
|  | 0.3217 | 0.2149 | 0.3619 | 0.2237 | 0.2237 | 0.4679 |
|  | 0.0149 | 0.0338 | 0.0246 | 0.0213 | 0.0150 | 0.0346 |
|  | 0.0226 | 0.0108 | 0.0117 | 0.0346 | 0.0357 | 0.0243 |
| RMSE | 0.0521 | 0.0621 | 0.0602 | 0.0703 | 0.0648 | 0.0584 |
Figure 3. Max for six kernel functions based on .
Figure 4. Max for six kernel functions based on .
Results
We consider a data set of the Automotive Industry Index of the Tehran Stock Market for the period March 25, 2018, to May 3, 2021, downloaded from "http://tse.ir/archive.html", with sample size 744. In the first step, we must determine the number of regimes in the observations. This can be done by drawing the sample path of the data and observing changes of trend between increasing and decreasing, or by classifying the data into two classes of high and low volatility (see Nademi, & Farnoosh, 2014; Nademi, 2019). For simplicity and a clear display of the regimes, we draw 200 observations of the data set. Figure 5 (blue line) shows the sample path of the data. According to this plot, we use a step function (red line), with a lower step (regime=1) and an upper step (regime=2), to indicate the regimes, so that decreasing trends and increasing trends correspond to regime 1 and regime 2, respectively. We therefore considered six semiparametric Markov switching models (called MSSEMIK(i), i=1,…,6) based on the six kernel functions and two regimes (M=2).
Figure 5. Automotive Industry Index data and the regimes.
In the second step, we define as the stationary automotive industry index data at time (input variables) and apply the semiparametric model (1) to fit the observations through the following relation:
where , and are defined in section 2.1. In the third step, we apply the EM algorithm described in section 2.2 to estimate the parameters of the model. The EM algorithm is a numerical procedure that starts from initial parameter values and then iterates the E-step and M-step until the parameters converge. In one part of the M-step, a kernel function must be selected to estimate and in the function (relation 3). We therefore try six kernel functions (Gaussian, Uniform, Triangle, Epanechnikov, Bisquare and Triweight) and compare the results by the RMSE index, the best model being the one with minimum RMSE. We also compare the ability of the models to classify the observations into regimes by the index . Finally, in the fourth step, the selected model is used to create the strategy of buying and selling stocks by applying the estimated joint conditional probabilities defined in the M-step in section 2.2.
Table 4 lists the estimated parameters for the six specifications of the semiparametric Markov switching model for the Automotive Industry Index. The RMSE criterion indicates that, among the six specifications, the MSSEMIK(1) model is the most accurate for predicting the Automotive Industry Index. It is followed, in order of forecasting accuracy, by the MSSEMIK(6), MSSEMIK(2), MSSEMIK(5), MSSEMIK(4) and MSSEMIK(3) models. Therefore, the proper kernel function for forecasting these data is the Uniform kernel. Figure 6 shows the regime probabilities based on the index . The values of for the six models, which are all greater than 0.5, show the ability of the semiparametric models to classify the data. Moreover, in the MSSEMIK(1) model the membership probabilities are greater than 0.73, indicating that this model is more powerful than the other models in classifying the observations.
Table 4. The estimated parameters for the Automotive Industry Index data
Figure 6. Max for (a) MSSEMIK(1) model, (b) MSSEMIK(2) model, (c) MSSEMIK(3) model, (d) MSSEMIK(4) model, (e) MSSEMIK(5) model, (f) MSSEMIK(6) model.
Now that the appropriate model has been identified (the MSSEMIK(1) model), we apply the estimated joint conditional probabilities to introduce the strategy of buying and selling stocks. To keep the graphs of the estimated joint conditional probabilities simple, we chose the six observations t=739 to t=744 at the end of the Automotive Industry Index series, covering April 29, 2021, to May 3, 2021, which give five consecutive transitions. Table 5 lists the estimated joint conditional probabilities based on , so that we can write the joint conditional probability matrix for the best model (MSSEMIK(1)). This matrix can suggest a strategy for buying and selling stock in financial markets, because its elements describe the behavior of the data in passing from time "t" to "t+1". Note that we define regime 1 as a decreasing trend and regime 2 as an increasing trend.
Table 5. The estimated joint conditional probability matrix based on the six models
Table 5 shows the estimated joint conditional probability matrix for the observations of the Automotive Industry Index in the selected period, based on the MSSEMIK(1) model. For this best-selected model, the maximum probabilities among the elements of the matrices for the periods (t=739, t+1=740) and (t=740, t+1=741) are and , respectively, which belong to the switch from regime 1 to regime 2 ( ). So we have an increasing trend, which suggests buying the stocks at time t=739.
On the other hand, the maximum probability among the elements of the matrix for the period (t=741, t+1=742) is . This corresponds to the transition from regime 2 to regime 2 ( ), which indicates that the stock may be about to enter a decreasing trend and suggests selling the stocks between t=741 and t=742.
Also, the maximum probabilities of the matrices for the periods (t=742, t+1=743) and (t=743, t+1=744) are and , respectively, which belong to the switches from regime 2 to 1 ( ) and from 1 to 1 ( ). So we have a decreasing trend, which suggests buying the stocks at the end of the period, at t=744.
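The reading of the joint conditional probability matrix described above can be sketched as a small lookup. This is a simplified illustration, not the authors' procedure: the matrix values are hypothetical (the estimated values of Table 5 are not reproduced here), the function names are assumptions, and the mapping from transitions to labels only mirrors the interpretation given in the text, with regime 1 a decreasing trend and regime 2 an increasing trend.

```python
import numpy as np

# Labels follow the interpretation in the text; they are not a tested trading rule.
SIGNALS = {
    (1, 2): "buy",               # switch from decreasing to increasing trend
    (2, 2): "sell",              # staying in the increasing trend, read as a possible reversal
    (2, 1): "decreasing trend",  # switch from increasing to decreasing trend
    (1, 1): "decreasing trend",  # staying in the decreasing trend
}

def dominant_transition(joint):
    """Return the (from, to) regimes of the largest element of a joint probability matrix."""
    joint = np.asarray(joint, dtype=float)
    i, j = np.unravel_index(joint.argmax(), joint.shape)
    return int(i) + 1, int(j) + 1

def signal(joint):
    """Map the dominant transition between t and t+1 to its label."""
    return SIGNALS[dominant_transition(joint)]

# Hypothetical joint probability matrix for one transition (rows: regime at t,
# columns: regime at t+1); the four entries sum to one.
example = np.array([[0.10, 0.60],
                    [0.20, 0.10]])
print(dominant_transition(example), signal(example))
```

Applying `signal` to the matrix of each consecutive pair of dates gives a sequence of labels, which is how the strategy in the text moves from buying at the start of an increasing trend to selling when the increasing regime persists.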
Figure 7 shows the estimated elements of the joint conditional probability matrix for the best-selected model (MSSEMIK(1)). Figure 7(a) shows the sample path of the Automotive Industry Index from April 29, 2021, to May 3, 2021, which has an increasing trend until lag 3 and a decreasing trend up to lag 6. Figure 7(b) shows the estimated elements of the joint conditional probability matrix. The red line shows the switch from a decreasing to an increasing trend ( ), as explained for the periods (t=739, t+1=740) and (t=740, t+1=741). The blue line shows staying in an increasing trend, as explained for the period (t=741, t+1=742). The green line shows the switch from an increasing to a decreasing trend, as explained for the period (t=742, t+1=743), and the yellow line shows staying in a decreasing trend, as explained for the period (t=743, t+1=744). The maximum probability at each of the lags (1–6) can be read from these lines.
Figure 7. (a) The sample path of the Automotive Industry Index for the period April 29, 2021, to May 3, 2021; (b) the estimated elements of the joint conditional probability matrix for the best-selected model (MSSEMIK(1)). The red, blue, green and yellow lines are the joint conditional probabilities of ( ), ( ), ( ) and ( ), respectively.
Conclusion
We have proposed a strategy for buying and selling stock in financial markets using a special class of Markov switching models, based on the joint conditional probability matrix. This strategy depends on the choice of kernel function, and a good choice increases the accuracy of the joint conditional probability matrix. We compared a set of kernel functions within the semiparametric Markov switching model in terms of their ability to capture the parameters in simulation studies. The estimation results indicate that the proper kernel function for estimating the parameters depends on the nature of the data, and there is no single firm choice among these functions. The Triangle kernel performed best on the first simulated data set, while the Uniform kernel was the best on the second. This shows that the Gaussian kernel is not the best choice for every data set. Researchers should therefore try several kernel functions to find the best performance of their estimation algorithms, since using the Gaussian kernel in every algorithm can be misleading.
We also suggest that researchers compare these kernel functions within other semiparametric and nonparametric models to improve current knowledge about the best model for forecasting time series data.
[1] Autoregressive–Autoregressive Conditional Heteroscedasticity models.