The empirical data consist of stocks from the MMI and NASDAQ and three currency exchange rates.
The first data set consists of 20 stocks in the Major Market Index (MMI)¹. We obtain data for most stocks for the period from Jan. 2, 1970 to Dec. 31, 2008, except for AXP and T, whose data start on May 18, 1977 and Jan. 2, 1984, respectively. We choose the stocks from the MMI because they are well-known, highly capitalized stocks representing a broad range of industries, and they generally exhibit a high level of trading activity. Return data are obtained from the daily stock file of the Center for Research in Security Prices (CRSP), accessed through Wharton Research Data Services (WRDS).
¹ The firms in the MMI are American Express (AXP), AT&T (T), Chevron (CHV), Coca-Cola (KO), Disney (DIS), Dow Chemical (DOW), Du Pont (DD), Eastman Kodak (EK), Exxon (XOM), General Electric (GE), General Motors (GM), International Business Machines (IBM), International Paper (IP), Johnson & Johnson (JNJ), McDonald’s (MCD), Merck (MRK), 3M (MMM), Philip Morris (MO), Procter and Gamble (PG), and Sears (S).
The exogenous threshold variable used in this empirical study is the Volatility Index (VIX). The Chicago Board Options Exchange (CBOE) Volatility Index is a key measure of market expectations of near-term (30-day) volatility conveyed by S&P 500 stock index option prices. It is a weighted blend of prices for a range of options on the S&P 500 index, and it is calculated and disseminated in real time by the CBOE. We obtain the data from the CBOE website for the period from Jan. 2, 1990 to Dec. 31, 2008. Since the volatility index measures market expectations of future volatility, it is reasonable to assume independence between VIX and the current volatility. In the data, the sample correlation coefficient between squared returns and VIX for the 20 MMI stocks ranges from 0.03 to 0.09, while the average correlation coefficient between squared returns and volume is around 0.5. We can thus treat VIX as a weakly exogenous variable.
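The correlation check described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `pearson_corr` and `squared_return_corr` are hypothetical, and the sketch assumes the return series and the candidate threshold series (VIX or volume) have already been aligned on the same trading days.

```python
import math

def pearson_corr(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

def squared_return_corr(returns, threshold_var):
    """Correlation between squared returns and a candidate threshold variable.

    Assumption: both inputs are daily series aligned on the same dates.
    """
    squared = [r * r for r in returns]
    return pearson_corr(squared, threshold_var)
```

A value close to zero for a given stock would be consistent with treating the candidate variable as weakly exogenous, while a larger value points to endogeneity.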
The summary statistics for the MMI returns are presented in Table 3.1 in the Appendix. The columns report the sample minimum, maximum, mean, standard deviation, coefficient of skewness, and coefficient of kurtosis. All the return series have large kurtosis compared with a normal distribution, and most of the returns are negatively skewed.
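For reference, the six statistics listed above can be computed as in the sketch below. The text does not state which estimator conventions are used, so the choices here are assumptions: the standard deviation uses the n - 1 divisor, while skewness and kurtosis are the moment-based coefficients m3 / m2^(3/2) and m4 / m2^2, under which a normal distribution has kurtosis 3. The function name `summary_stats` is hypothetical.

```python
import math

def summary_stats(returns):
    """Min, max, mean, std, skewness and kurtosis of a return series.

    Assumptions (not stated in the text): sample standard deviation with
    the n - 1 divisor; moment-based skewness and non-excess kurtosis.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    # Central moments with divisor n.
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for a constant series")
    return {
        "min": min(returns),
        "max": max(returns),
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

Under this convention, a kurtosis well above 3 indicates heavier tails than the normal distribution, and a negative skewness indicates a longer left tail.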
Since the simulation study of the endogenous threshold model suggests that the MLE performs reasonably well for a large endogeneity coefficient, we also apply our model to the volume data. In addition, because the use of the threshold model to describe the conditional variance dynamics is motivated by the volume-volatility correlation, we want to examine whether volume, as an endogenous threshold variable, provides more information on the regime shifts in the conditional variance process. The volume variable reveals the trading activity of individual stocks, whereas VIX gives information only for the market as a whole. The volume data are also obtained from WRDS. To remove the trend in the volume series, we define the adjusted volume series by taking the log of the trading volume and then subtracting the 100-day moving average of the log volume series. The resulting series have an average correlation coefficient with squared returns of around 0.5. Since the volume series, even with the detrending adjustment, is still very noisy, we also search for other endogenous threshold variables. Because the adjusted volume does not provide much information on regime switching, we do not report the estimation results.
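The volume adjustment described above can be sketched as follows. The text does not say how the 100-day moving average is aligned, so this sketch assumes a trailing window that includes the current day and marks the first `window - 1` observations as unavailable (`None`). The function name `adjusted_volume` is hypothetical.

```python
import math

def adjusted_volume(volume, window=100):
    """Log volume minus its trailing moving average.

    Assumption: the moving average covers the current day and the
    preceding window - 1 days; earlier observations are returned as None.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    log_vol = [math.log(v) for v in volume]
    adjusted = []
    running_sum = 0.0
    for t, lv in enumerate(log_vol):
        running_sum += lv
        if t >= window:
            # Drop the observation that just left the window.
            running_sum -= log_vol[t - window]
        if t >= window - 1:
            adjusted.append(lv - running_sum / window)
        else:
            adjusted.append(None)
    return adjusted
```

With this alignment, a series of constant volume maps to zeros, and a volume spike shows up as a positive deviation from the recent average.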
To further explore the usefulness of the endogenous threshold, we also obtain a second data set for the 4 most active stocks on NASDAQ², since number-of-trades data are available only for NASDAQ stocks. The number of trades is available for most of these stocks from Jan. 3, 1995 to Dec. 31, 2010; the exceptions are YHOO and GOOG, for which it is available from Apr. 15, 1996 and Aug. 20, 2004, respectively. Noise in trading activity variables is always a concern, and volume is a particularly noisy one. For the MMI stocks we tried to adjust the volume series by taking logs and removing the time trend. For the NASDAQ stocks we use a different volume variable, defined as the ratio of trading volume to total shares outstanding (Volume/SHOUT). This is the daily turnover of the stock, and it is stationary. For the number of trades, if there is a clear time trend, we detrend the series by removing the best straight-line fit from the series. The data descriptions for return, Volume/SHOUT, and number of trades are given in Table 3.2 and Table 3.3 in the Appendix.
² The 4 most active stocks in NASDAQ: Yahoo! Inc. (YHOO), Apple Inc. (AAPL), Google Inc.
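The two transformations used for the NASDAQ stocks can be sketched as follows. The function names `turnover` and `linear_detrend` are hypothetical, and the sketch assumes the straight-line fit is an ordinary least squares regression on the time index 0, 1, ..., n - 1, which the text does not state explicitly.

```python
def turnover(volume, shares_outstanding):
    """Daily turnover: trading volume divided by total shares outstanding."""
    if len(volume) != len(shares_outstanding):
        raise ValueError("series must have equal length")
    return [v / s for v, s in zip(volume, shares_outstanding)]

def linear_detrend(series):
    """Remove the least-squares straight-line fit from a series.

    Assumption: the line is fitted by OLS against the time index
    0, 1, ..., n - 1.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]
```

The residuals from `linear_detrend` sum to zero by construction, so the detrended number-of-trades series is centered around zero.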
Since we only obtain daily price data for the MMI and NASDAQ stocks, we use the squared daily return as a proxy for the actual volatility when evaluating the forecasting performance. How the true volatility should be measured is a major concern in the volatility forecasting literature. It has been shown that the daily squared return is a very noisy approximation of the actual daily volatility, even though it is an unbiased estimator of the daily variance. Taking this into account, besides using the daily squared return, we also compare our volatility forecasts with realized volatility constructed from intra-day high-frequency data. Thanks to Dinghai Xu, we are able to obtain intra-day high-frequency data for IBM and GE stocks from Mar. 3, 2005 to Sep. 24, 2008, as well as for three currency exchange rates, namely CAD/USD, USD/JPY, and GBP/USD. The high-frequency intra-day transaction prices for the currencies are available from Apr. 13, 1998 to July 28, 2006. We summarize the sample statistics for the three currency exchange returns and the realized volatility constructed from the high-frequency data in Table 3.4 to Table 3.6.
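The two volatility proxies discussed above can be illustrated with the sketch below. The text does not give the sampling frequency or the exact definition used to construct realized volatility, so this sketch assumes the common definition of daily realized variance as the sum of squared intra-day log returns. The function names are hypothetical.

```python
import math

def squared_return_proxy(close_prev, close_today):
    """Squared daily log return, a noisy proxy for the daily variance."""
    r = math.log(close_today / close_prev)
    return r * r

def realized_variance(intraday_prices):
    """Sum of squared intra-day log returns for one trading day.

    Assumption: intraday_prices holds the prices of a single day, sampled
    at a fixed frequency chosen by the user (not specified in the text).
    """
    if len(intraday_prices) < 2:
        raise ValueError("need at least two intra-day prices")
    log_returns = [
        math.log(p1 / p0)
        for p0, p1 in zip(intraday_prices, intraday_prices[1:])
    ]
    return sum(r * r for r in log_returns)
```

The contrast between the two functions shows why the squared daily return is noisy: a day on which the price moves substantially but closes where it opened has a squared daily return of zero, while its realized variance is positive.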