• No results found

Dynamic Stock Correlation Network

N/A
N/A
Protected

Academic year: 2021

Share "Dynamic Stock Correlation Network"

Copied!
10
0
0

Loading.... (view fulltext now)

Full text

(1)

Procedia Computer Science 60 ( 2015 ) 1826 – 1835

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Peer-review under responsibility of KES International doi: 10.1016/j.procs.2015.08.293

ScienceDirect

19th International Conference on Knowledge Based and Intelligent Information and Engineering

Systems

Dynamic Stock Correlation Network

Yuta Arai

a

, Takeo Yoshikawa

b

, Hiroshi Iyetomi

c

aRecruit Sumai Company Ltd., Tokyo 100-6640, Japan

bGraduate School of Science and Technology, Niigata University, Ikarashi, Niigata 950-2181, Japan cFaculty of Science, Niigata University, Ikarashi, Niigata 950-2181, Japan

Abstract

Financial markets are the outcome of highly complex interactions among a number of agents. Such a complex system possibly contains all sorts of features, e.g., not only static but also dynamic. Here we study dynamic correlations hidden in the S&P 500 market by adopting a combined method of the Complex Principal Component Analysis (CPCA) and the Random Matrix Theory (RMT). The CPCA is entirely dependent on complexification of time series using the Hilbert transformation and enables us to extract correlations between stock prices moving with different phases to one another. The RMT serves as a null hypothesis for distinguishing true correlations from noisy financial data. The extracted information on dynamic correlations of the market is projected onto a correlation network in which pairs of stocks with phase difference smaller than certain threshold are linked with strength of their correlations as weight. We then detect communities of comoving stocks in the network and also elucidate lead-lag relationship between those communities.

c

2015 The Authors. Published by Elsevier B.V. Peer-review under responsibility of KES International.

Keywords: Complex principal component analysis, Hilbert transformation, Dynamic correlation, Correlation network, Community

1. Introduction

Quantifying correlations between different stocks is important to our understanding of the finance as a complex dynamical system, and also for practical reasons such as asset allocation and portfolio risk estimation. To uncover the structure of interactions among the elements in a stock market, physicists primarily focus on spectral properties of the correlation matrix of stock price movements, which is known as Principal Component Analysis (PCA) in statistics. In a random case, the distributions of the eigenvalues and eigenvector components of the correlation matrix can be analytically derived using Random Matrix Theory (RMT). By comparing the eigenvalues of the empirical correlation matrix with the corresponding results of the RMT, one can distinguish between meaning information and noise in the complicated behaviors of stock returns1,2,3,4,5

.

The most dominant principal component represents a collective motion of all stocks in markets. This is rather a trivial result since such an Index behavior of stock markets had been well established in business. In the second and subsequent significant principal components, however, formation of stock groups are clearly observed and many of

Corresponding author. Tel.:+81-3-6705-4973

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

(2)

them are well characterized by industry sectors. The results can be applied to improvement of risk management of portfolio in financial engineering.

Unlike the previous studies, we have recently applied6 an extended PCA called Complex PCA (CPCA)7,8,9 to

stock market data to extract genuine dynamical correlations between the price fluctuations of different stocks. The conventional PCA cannot deal with such dynamical features of stock markets because it entirely depends on static correlations between different stocks. One can express any real time series of a system as a sequence of complex numbers projected onto the real axis using the Hilbert transform. The new time series having envelop and instanta-neous phase at the same time carry more information on dynamics of the system than the original one. The CPCA stands on a correlation matrix constructed from such complexified time series. It thus allows us to detect dynamic cor-relations involved in financial markets. Accordingly the RMT was generalized6so as to accommodate to the CPCA.

The generalized method is also applicable to other multivariate data including world-wide financial data of markets and currencies10and individual prices constituting the consumer price index (CPI) in Japan.11

Along the line that we have laid out, in this paper, we shed light on dynamic correlations in S&P 500 adopting a network-theoretic approach. As the previous works, we purify the correlation matrix by eliminating random compo-nents in the spectral decomposition of the correlation matrix. And we discard the market mode to focus on statistically meaningful group correlations between stock prices. Regarding the correlation matrix thus obtained as an adjacency matrix, we can construct a stock correlation network. The peculiarity of our network model is that the weight associ-ated with each link is a complex number with amplitude and phase. So the characteristics of the dynamic correlations between stock movements are reflected in the network structure. Especially we emphasize communities of comoving stocks embedded in the network and lead-lag relationship between those communities.

2. Complex Principal Component Analysis

In this study, we use 483 (=N) stocks for the listed companies in the S&P 500 market index in the period of 2008 to 2011. In this period, the total number of business days is 1009 (= T+1) days. We begin with introducing the correlation matrix which is a basis for the CPCA. LetSα(t) be a price of stock αat time t(α = 1,2,· · ·,N,t = 1,2,· · ·,T+1). First we calculate the logarithmic return of stockαas

Rα(t)=lnSα(t+1)−lnSα(t). (1)

We then derive complex time seriesξα(t) by combiningRα(t) and its Hilbert transform ˜Rα(t) as

ξα(t)=Rα(t)+iR˜α(t), (2)

whereirepresents the imaginary unit. The Hilbert transform ˜Rα(t) ofRα(t) is given by

˜ Rα(t)=−π1P ∞ −∞ Rα(z) tzdz, (3)

where P dzdenotes Cauchy’s principal value integral1. Since each stock has its own volatility, we standardize the

complex time series according to

Ξα(t)= ξα(t)− ξαt

σα , (4)

with the standard deviationσαofξαgiven by σα=ξαξ∗ αt= ⎛ ⎜⎜⎜⎜⎜ ⎝T1 T t=1 ξα(t)ξα(t)∗ ⎞ ⎟⎟⎟⎟⎟ ⎠ 1/2 , (5)

1 Numerical computations of the Hilbert transformation were actually performed in the Fourier space; cosωtand sinωtare transformed to

(3)

where· · · tdenotes time average and the superscript∗stands for the complex conjugate operation. The envelopAα(t) and the instantaneous phaseφ(t) ofΞα(t) are defined in the polar representation as

Ξα(t)=Aα(t)exp (iφα(t)). (6)

Finally we derive theN×Ncomplex correlation matrixCwith the following elements: Cαβ=T1

T

t=1

Ξα(tβ∗(t)=rαβexp(iθαβ), (7) where we also representCαβin polar coordinates. SinceCis a Hermitian and positive-definite matrix, its eigenvalues are real and positive. The CPCA utilizes the spectral decomposition ofCgiven as

C= 1 2N N =1 λuu†, (8) with u†·u=2N, (9)

whereλanduare the-th largest eigenvalue and its associated eigenvector forC, respectively, andu† is Hermite conjugate ofu.

Let us consider sequences of Gaussian random numbers,xα(t) (α=1,· · ·,N;t=1,· · ·,T), corresponding to the actual stock data. The complex random correlation matrix is calculated as

Crand=

1 TXX

, (10)

whereXis theN×Tdata matrix. ForCrand, in the limitN,T→ ∞withη=2N/T(<1) fixed, the probability density

ρ(λ) of eigenvalueλcan be analytically calculated as ρ(λ)= 1 2πη √ (λ+−λ)(λ−λ−) λ , λ±= 1±√η2 . (11)

On the other hand, the eigenvector components obey a two-dimensional Gaussian distribution: ρ(u,u )= 1 2πexp −u2 2 − u 2 2 , (12)

whereu andu are the real and the imaginary parts of the eigenvector components, respectively. Equations (11) and (12) are valid only in the case that the matrix size is infinite and the components of data matrix are mutually independent. However, empirical data often have limited size and also may contain autocorrelations, leading to modification of the RMT criterion used to single out genuine cross-correlations12,13. The time series data studied here

are large enough to be able to neglect the finite-size effect and are almost free from autocorrelations. 3. Complex Correlation Network

Application of the CPCA to the S&P 500 data detected 7 principal components whose eigenvalues are beyond the upper boundλ+ =3.916 predicted by the RMT. The largest eigenvalueλ1 = 242.11, corresponding to a collective

motion of the whole market, explains half of the total fluctuations of stock prices; note that total sum of the eigenvalues amounts toN=483. The second through the seventh largest eigenvalues are 17.71, 13.96, 8.975, 5.819, 4.990, and 4.665 (the eighth one is 3.990). The principal components associated with those eigenvalues represent stock group correlations well characterized by industrial sectors. And we identify the remaining components just as random fluctuations. The three classifications of the eigenvalues arrange the spectral formula (8) for the correlation matrix as C=Cmarket+Cgroup+Crandom. (13)

(4)

If one regardCas an adjacency matrix A, one can construct a stock correlation network. Here we adoptCgroup

forAby discardingCrandomand alsoCmarketin (13); it turns out that we consider stock price dynamics relative to the

motion of the whole market. To illuminate comoving stocks, we select pairs of stocksαandβfor links of a network only if their phase difference θαβ satisfies the condition|θαβ| ≤ θth. And furthermore we associate the links with

the strengthrαβ of correlation between the selected pairs of stocks as weight. In this way we can construct a stock correlation network, an undirected weighted network, of S&P 500 market.

We suppose that the network thus constructed is well decomposable into communities, that is, groups of stocks in which stocks are synchronous and strongly correlated. To detect such communities in a network, maximization of modularityQis often used14. Here we use the Fast Unfolding method15with modularity maximization to obtain an

optimized community structure. 4. Community Structures

Table 1. Summary of the community decomposition of the stock correlation network at five values ofθth; the size of each community, measured by the number of stocks within it, is listed together with the maximized modularity value.

θth 0.1 0.075 0.05 0.025 0.01 Q 0.60 0.62 0.63 0.65 0.65 comm.1 165 161 168 179 179 comm.2 146 149 142 146 145 comm.3 98 98 101 93 93 comm.4 74 75 72 65 65 comm.5 0 0 0 0 1 (isolated)

We repeated the community decomposition for the stock correlation network constructed with varied values of the threshold θth. Table 1 summarizes the results obtained for five values of θth in the range 0.01 ≤ θth ≤ 0.1.

The maximized molularity, which takes 0.60 even at the largestθth, indicates the community decomposition is highly

effective. The networks are always partitioned into 4 communities of comoving stocks and the size of each community shows no strong dependence onθth; the communities are numbered in descending order of their size. We thus see

that the community structure is very stable against the order of magnitude change ofθth. This finding is ascertained

by Fig. 1, where the stocks grouped according to the Global Industry Classification Standard (GICS) are classified by the communities obtained withθth = 0.1,0.05,and 0.01. Figure 1 shows that the classification of stocks by the

communities is almost independent ofθthand each community is well characterized by industry groups:

• Comm.1: Information Technology (Software & Services; Technology Hardware & Equipment; Semiconductors & Semiconductor Equipment), Retailing, Consumer Services, Transportation

• Comm.2: Consumer Staples (Food, Beverage & Tobacco; Household & Personal Products), Health Care ( Health Care Equipment & Services; Pharmaceuticals, Biotechnology & Life Sciences), Telecommunication Services, Utilities

• Comm. 3: Financials (Banks; Diversified Financials; Insurance; Real Estate) • Comm. 4: Energy, Materials

To demonstrate the stability of the community decomposition more quantitatively, we invoke the Jaccard Index. We definecmandcnas sets of nodes belonging to communitiesmandnrespectively. The Jaccard Index between the two communities is defined as the size of their intersection divided by that of their union:

Jmn= | cmcn| |cmcn|

(5)

(a)

(b)

(c)

Fig. 1. Decomposition of the GICS-grouped stocks by the communities detected here. The panels (a), (b), and (c) are the results obtained at

θth=0.1,0.05,and 0.01, respectively.

IfJmn = 1,cmis identical tocn. In the case that Jmntakes a value well less than 1, however, it does not necessarily mean that the two communities do not match. In fact, the Jaccard index works well only when comparing between two sets of similar size. Let us suppose that the size ofcnis much smaller than that ofcm. This case always gives a small value of Jmn, and we hence would conclude the two sets are dissimilar. However, the conclusion is not appropriate whencncm. To resolve this problem as regards the Jaccard index, we introduce a normalized version of it which is defined as ˆ Jmn= |cmcn| |cmcn| |c n| |cm| (0≤Jˆmn≤1), (15)

(6)

Fig. 2. Schematic diagram illustrating the ideas of (a) the conventional Jaccard index and (b) the normalized Jaccard index newly defined in this paper.

where we assume the size ofcnis smaller than that ofcm(|cn|<|cm|) . Figure 2 schematically compares the idea of the conventional Jaccard index with that of the normalized Jaccard index. Whencncm, the new index gives ˆJmn=1 even if|cn| |cm|as being desired.

Tables 2 through 5 show how the community decomposition changes with step-by-step decreasing ofθthfrom 0.1

to 0.01. We confirm that the community structure does not change appreciably even for such a wide variation ofθth.

In what follows we thereby concentrate on the community structure obtained withθth=0.1.

Table 2. Normalized Jaccard index (15) between the two sets of communities atθth=0.1 andθth=0.075.

comm.1 comm.2 comm.3 comm.4

comm.1 1.00 0.01 0.00 0.01

comm.2 0.00 1.00 0.00 0.00

comm.3 0.00 0.00 1.00 0.00

comm.4 0.00 0.00 0.00 1.00

Table 3. Normalized Jaccard index (15) between the two sets of communities atθth=0.075 andθth=0.05.

comm.1 comm.2 comm.3 comm.4

comm.1 0.94 0.00 0.02 0.01

comm.2 0.03 1.00 0.00 0.00

comm.3 0.01 0.00 0.98 0.00

comm.4 0.04 0.00 0.00 0.97

Figure 3 plots the correlation coefficientsCαβ’s between stocks within or across communities on complex plane. Distribution ofCαβ’s within communities is stretched out along the positivexaxis. As forCαβ’s crossing different communities, on the other hand, a large portion of them are spread to the opposite side on thexaxis. We also show histograms of the phase differencesθαβ’s between stocks within or across communities in Fig. 4 corresponding to Fig. 3. From these results, broadly speaking, we see that the stocks belonging to the same community move in phase

(7)

Table 4. Normalized Jaccard index (15) between the two sets of communities atθth=0.05 andθth=0.025.

comm.1 comm.2 comm.3 comm.4

comm.1 0.99 0.00 0.00 0.00

comm.2 0.00 1.00 0.00 0.00

comm.3 0.03 0.02 1.00 0.00

comm.4 0.07 0.00 0.00 1.00

Table 5. Normalized Jaccard index (15) between the two sets of communities atθth=0.025 andθth=0.01.

comm.1 comm.2 comm.3 comm.4

comm.1 0.87 0.02 0.03 0.03

comm.2 0.02 0.90 0.02 0.00

comm.3 0.04 0.01 0.86 0.00

comm.4 0.02 0.01 0.00 0.91

(as they should be according to the present definition of community) and those belonging to different communities move out of phase.

We have recently carried16,17,18,19out a community analysis on stock networks constructed from the group

corre-lations in S&P 500 and Tokyo Stock Exchange (TSE), where the conventional PCA assisted by the RMT was used instead of the CPCA. The networks have links between pairs of stocks with positive or negative weights depending on whether the pairs are correlated or anticorrelated; both positive and negative correlations were treated on the same footing. We then optimized decomposition of stocks into communities in which stocks are synchronized with positive correlations. Since negative links are pushed away from communities, in return, communities thus detected have ten-dency to be negatively correlated. Finally we were successful in unveiling the fact that the stocks in S&P 500 and TSE are decomposed into communities related to each other through unbalanced triangle relationship, that is,the enemy of my enemy is my enemy, not my friend!

The present analysis thus confirms our previous findings of the frustrated correlation structure in S&P 500. We emphasize such a fascinating relationship among the stock price fluctuations is dynamically stable.

(8)
(9)

1.0 0.5 0.0 0.5 1.0 0 1000 2000 3000 4000 5000 6000 Com.1 Com.1 1.0 0.5 0.0 0.5 1.0 0 1000 2000 3000 4000 Com.1 Com.2 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 Com.1 Com.3 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.1 Com.4 1.0 0.5 0.0 0.5 1.0 0 1000 2000 3000 4000 Com.2 Com.1 1.0 0.5 0.0 0.5 1.0 0 1000 2000 3000 4000 5000 6000 7000 Com.2 Com.2 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 2500 3000 3500 Com.2 Com.3 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.2 Com.4 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 Com.3 Com.1 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 2500 3000 3500 Com.3 Com.2 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 2500 3000 3500 Com.3 Com.3 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.3 Com.4 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.4 Com.1 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.4 Com.2 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 Com.4 Com.3 1.0 0.5 0.0 0.5 1.0 0 500 1000 1500 2000 2500 Com.4 Com.4

(10)

5. Conclusion

We studied dynamic correlations hidden in the S&P 500 market by combining the CPCA and the RMT. Complex-ification of time series using the Hilbert transformation generalizes the conventional PCA to the CPCA. The CPCA allows us to extract correlations between stock prices moving with different phases to one another. The RMT serves as a null hypothesis for distinguishing true correlations from noisy financial data. We projected the extracted informa-tion on dynamic correlainforma-tions of the market onto a correlainforma-tion network in which pairs of stocks with phase difference smaller than certain threshold are linked with strength of their correlations as weight. We then detected communities of comoving stocks in the network and also elucidated lead-lag relationship between those communities. The obtained results affirm the frustrated triangle relationship among stock groups which was reported in our previous work based on a static correlation network with the PCA. However, it is still curious how the frustrated correlation structure latent in the stock market is dynamically stabilized. It may require us to go beyond pairwise correlation.

Acknowledgements

This work was supported by the JSPS Grant-in-Aid for JSPS Fellows, No. 247270 and the JSPS Grant-in-Aid for Scientific Research (C), No. 25400393.

References

1. L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters, Noise dressing of financial correlation matrices, Phys. Rev. Lett.83, 1467 (1999). 2. V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. Nunes Amaral, and H. E. Stanley, Universal and nonuniversal properties of cross correlations

in financial time series, Phys. Rev. Lett.83, 1471 (1999).

3. V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, T. Guhr, and H. E. Stanley, Random matrix approach to cross correlations in financial data, Phys. Rev. E65066126 (2002).

4. A. Utsugi, K. Ino, and M. Oshikawa, Random matrix theory analysis of cross correlations in financial markets, Phys. Rev. E70, 026110 (2004).

5. D.-H. Kim and H. Jeong, Systematic analysis of group identification in stock markets, Phys. Rev. E72, 046133 (2005).

6. Y. Arai, T. Yoshikawa and H. Iyetomi, Complex principal component analysis of dynamic correlations in financial markets, Frontiers in Artificial Intelligence and Applications, Vol.255, pp. 111-119 (2013).

7. T. P. Barnett, Interaction of the monsoon and pacific trade wind system at interannual time scales part I: The equatorial zone, Monthley Weather Review111, 756 (1983).

8. J D Horel, Complex principal component analysis: Theory and examples, Journal of Climate and Applied Meteorology, 23, pp. 1660-1673 (1984).

9. K. Stein, A. Timmemann and N. Schneider, Phase synchronization of the El Ni˜no-southern oscillation with the annual cycle, Phys. Rev. Lett. 107, 128501 (2011).

10. I. Vodenska, H. Aoyama, Y. Fujiwara, H. Iyetomi, Y. Arai, and H. E. Stanley, Interdependencies and causalities in coupled financial networks (2014). Available at SSRN: http://ssrn.com/abstract=2477242

11. H. Yoshikawa, H. Aoyama, H. Iyetomi, and Y. Fujiawa, Deflation/inflation dynamics: Analysis based on micro prices, RIETI Discussion Paper Series, 15-E-026, pp. 1-60 (2015). http://www.rieti.go.jp/jp/publications/dp/15e026.pdf

12. Y. Arai, K. Okunishi, and H. Iyetomi, Numerical study of random correlation matrices: Finite-size effects, Intelligent Decision Technologies, SIST 10, edited by J. Watada et al. (Springer-Verlag Berlin Heidelberg) pp. 557-565 (2011).

13. Y. Arai and H. Iyetomi, Numerical study of generalized random correlation matrices: Autocorrelation effects, Prog. Theor. Phys. Suppl.194, 84 (2012).

14. M. E. J. Newman, Fast algorithm for detecting community structure in networks, Phys. Rev. E69, 066113 (2004).

15. V. D. Blondel, J.-L. Guillaume, R. Lambiotte and E. Lefebvre, Fast unfolding of communities in large networks, J. Stat. Mech.: Theory and Experiment2008, P10008 (2008).

16. T. Yoshikawa, T. Iino and H. Iyetomi, Market structure as a network with positively and negatively weighted links, Intelligent Decision Tec hnologies, SIST 10, edited by J. Watada et al. (Springer-Verlag, Berlin), pp. 511-518 (2011).

17. T. Yoshikawa, T. Iino and H. Iyetomi, Observation of frustrated correlation structure in a well-developed financial market, Prog. Theor. Phys. Suppl.194, 55 (2012).

18. T. Yoshikawa, T. Iino and H. Iyetomi, Detection of comoving groups in a financial market, Intelligent Decision Technologies, Vol.1, SIST 15 edited by J. Watada et al. (Springer-Verlag, Berlin), pp. 481-486 (2012).

19. T. Yoshikawa, Y. Arai and H. Iyetomi, Comparative study of correlations in financial markets, Frontiers in Artificial Intelligence and Appli-cations Vol.255, pp. 104-110 (2013).

Figure

Table 1. Summary of the community decomposition of the stock correlation network at five values of θ th ; the size of each community, measured by the number of stocks within it, is listed together with the maximized modularity value.
Fig. 1. Decomposition of the GICS-grouped stocks by the communities detected here. The panels (a), (b), and (c) are the results obtained at θ th = 0.1, 0.05, and 0.01, respectively.
Fig. 2. Schematic diagram illustrating the ideas of (a) the conventional Jaccard index and (b) the normalized Jaccard index newly defined in this paper.
Table 4. Normalized Jaccard index (15) between the two sets of communities at θ th = 0.05 and θ th = 0.025.
+3

References

Related documents

Daily returns on these six stock exchange indices were computed and used to calculate dynamic conditional correlations (DCCs) between the markets.. The re- sults indicate that the

The derivation of DCC in equation (12) from a vector random coefficient moving average process is important as it: (i) demonstrates that DCC is, in fact, a dynamic

We show that dynamic correlations of business cycles of OECD countries and China are negative at business-cycle frequencies and positive for short-run

Delay and energy efficient data collection scheme based matrix filling theory for dynamic traffic IoT RESEARCH Open Access Delay and energy efficient data collection scheme based

Figure 1 displays the dynamic conditional correlations between the monthly excess returns on the winner portfolio (Blue dash line) and loser portfolio (Red solid line) portfolios

Macroeconomic factors such as the interest rate, GDP growth, employment rate, and stock market growth are statistically significant predictors of the estimated dynamic

Economical Physics, Mathematical Economy, Financial Risk, Time Series, Crossed Correlation Matrices, Random Matrix Theory, Applied Nuclear Chaos, Stock Market

The proposed system is adopting blowfish which least complex among many cryptographic algorithms by using such algorithm we can achieve better security by using dynamic