A Quantum-inspired Entropic Kernel for Multiple Financial Time Series Analysis


This is a repository copy of A Quantum-inspired Entropic Kernel for Multiple Financial Time Series Analysis.

White Rose Research Online URL for this paper:

http://eprints.whiterose.ac.uk/159746/

Version: Published Version

Conference or Workshop Item:

Bai, Lu, Cui, Lixin and Hancock, Edwin R orcid.org/0000-0003-4496-2028 (Accepted: 2019) A Quantum-inspired Entropic Kernel for Multiple Financial Time Series Analysis. In: International Joint Conference on Artificial Intelligence, 11-17 Jul 2020. (In Press)


A Quantum-inspired Entropic Kernel for Multiple Financial Time Series Analysis

Lu Bai$^{1,2}$, Lixin Cui$^{1,2*}$, Yue Wang$^{1,3}$, Yuhang Jiao$^{1}$ and Edwin R. Hancock$^{4}$

$^{1}$ Central University of Finance and Economics, Beijing, China

$^{2}$ China Center for Fintech Studies, Central University of Finance and Economics, Beijing, China

$^{3}$ State Key Laboratory of Cognitive Intelligence, iFLYTEK, Hefei, China

$^{4}$ Department of Computer Science, University of York, York, UK

{bailucs, cuilixin, wangyuecs}@cufe.edu.cn, [email protected]

Abstract

Network representations are powerful tools for the analysis of time-varying financial complex systems consisting of multiple co-evolving financial time series, e.g., stock prices. In this work, we develop a new kernel-based similarity measure between dynamic time-varying financial networks. Our idea is to transform each original financial network into a quantum-based entropy time series and to compute the similarity measure based on the classical dynamic time warping framework associated with the entropy time series. The proposed method bridges the gap between graph kernels and the classical dynamic time warping framework for multiple financial time series analysis. Experiments on time-varying networks abstracted from financial time series of the New York Stock Exchange (NYSE) database demonstrate that our approach can effectively discriminate the abrupt structural changes associated with extreme financial events.

1 Introduction

Network representations are powerful tools for analyzing the financial market, which can be considered as a time-varying complex system consisting of multiple co-evolving financial time series [Zhang and Small, 2006; Nicolis et al., 2005; Shimada et al., 2008; Silva et al., 2015], e.g., the stock market with the trade price. This is based on the idea that the structure of the so-called time-varying financial networks [Bullmore and Sporns, 2009] inferred from the corresponding time series of the system can represent richer physical interactions between system entities than the original individual time series. One main objective of existing approaches is to detect the extreme financial events that can significantly influence the network structures [Bai et al., 2020].

In machine learning, graph kernels have been widely employed for analyzing structured data represented by graphs or networks [Xu et al., 2018]. The main advantage of employing graph kernels is that they offer an effective way of mapping the network structures into a high dimensional space, so that the standard kernel machinery for vectorial data is applicable to network analysis. Most existing graph kernels are based on the idea of decomposing graphs or networks into substructures and then measuring pairs of isomorphic substructures [Haussler, 1999], e.g., graph kernels based on counting pairs of isomorphic a) paths [Borgwardt and Kriegel, 2005], b) walks [Kashima et al., 2003], and c) subgraphs [Bai et al., 2015b] or subtrees [Shervashidze et al., 2009]. Unfortunately, directly adopting these graph kernels to analyze the time-varying financial networks inferred from original vectorial time series tends to be elusive. This is because these financial network structures are by nature complete weighted graphs [Ye et al., 2015; Bai et al., 2020], where each vertex represents an individual time series of a stock and is adjacent to all remaining vertices, and each edge represents the interaction (e.g., the correlation or distance) between a pair of co-evolving financial time series. It is difficult to decompose a complete weighted graph into the required substructures, which limits the effectiveness of most existing graph kernels for financial network analysis.

$^{*}$ Corresponding Author

One way to address the aforementioned problem is to construct sparse structures of the original time-varying financial networks. In this vein, Cui et al. [Cui et al., 2018] have used the well-known threshold-based approach to preserve the weighted edges falling into the largest 10% of the weights, and employed the classical graph kernels associated with the resulting sparse structures for financial network analysis. Bai et al. [Bai et al., 2020] have abstracted the minimum or maximum spanning trees associated with the commute time matrix of the original complete weighted financial networks, and developed a novel quantum graph kernel over the spanning trees of the financial networks. Although both approaches overcome the restriction of employing graph kernels for time-varying financial network analysis, the required sparse structures also lead to significant information loss, since many weighted edges of the original complete weighted financial networks are discarded. In summary, analyzing time-varying financial networks with graph kernels remains challenging.

The aim of this paper is to overcome the aforementioned problems by developing a new kernel measure between time-varying networks for multiple co-evolving financial time series analysis. Overall, the main contributions are threefold.

First, for the time-varying financial networks, we commence by computing the average mixing matrix [Godsil, 2013] to summarize the time-averaged behaviour of continuous-time quantum walks (CTQW) evolved on the network structures. The reason for using the CTQW is that it not only accommodates complete weighted graphs, but also better reflects richer financial network characteristics than the classical random walks [Bai et al., 2015a] (see details in Section 2.1). We show how the average mixing matrix of the CTQW allows us to compute a quantum-based entropy for each vertex of the financial networks and to represent the original networks as quantum entropy time series.

Second, with each pair of time-varying financial networks to hand, we define a Quantum-inspired Entropic Kernel between their quantum entropy time series through the classical dynamic time warping framework. The proposed kernel not only accommodates the complete weighted graphs through the entropy time series, but also bridges the gap between graph kernels and the classical dynamic time warping framework for time series analysis (see details in Section 3.2).

Third, we evaluate the proposed kernel on time-varying financial networks abstracted from multiple co-evolving financial time series of the New York Stock Exchange (NYSE) database. Experiments demonstrate that the proposed approach can effectively discriminate the abrupt structural changes associated with extreme financial events.

2 Preliminary Concepts

In this section, we briefly review some preliminary concepts.

2.1 The Average Mixing Matrix of the CTQW

The continuous-time quantum walk (CTQW) is the quantum analogue of the classical continuous-time random walk (CTRW) [Farhi and Gutmann, 1998]. The CTQW models a Markovian diffusion process over the vertices of a graph through their transition information. Assume a sample graph is $G(V, E)$, where $V$ is the vertex set and $E$ is the edge set. Similar to the classical CTRW, the state space of the CTQW is the vertex set $V$, and its state at time $t$ is a complex linear combination of the basis states $|u\rangle$, i.e., $|\psi(t)\rangle = \sum_{u \in V} \alpha_u(t) |u\rangle$, where the amplitude $\alpha_u(t) \in \mathbb{C}$ and the state $|\psi(t)\rangle \in \mathbb{C}^{|V|}$ are both complex. Furthermore, $\alpha_u(t)\alpha_u^*(t)$ indicates the probability of the CTQW visiting vertex $u$ at time $t$, with $\sum_{u \in V} \alpha_u(t)\alpha_u^*(t) = 1$ and $\alpha_u(t)\alpha_u^*(t) \in [0, 1]$ for all $u \in V$, $t \in \mathbb{R}^+$. Unlike the classical CTRW, the CTQW evolves based on the Schrödinger equation

$$\frac{\partial}{\partial t} |\psi_t\rangle = -iH |\psi_t\rangle, \qquad (1)$$

where $H$ denotes the system Hamiltonian. In this work, we employ the adjacency matrix as the Hamiltonian. When a CTQW evolves on the sample graph $G(V, E)$, the behaviour of the walk at time $t$ can be summarized using the mixing matrix [Godsil, 2013]

$$M(t) = U(t) \circ U(-t) = e^{iHt} \circ e^{-iHt}, \qquad (2)$$

where $\circ$ denotes the Schur-Hadamard product of two matrices, i.e., $[A \circ B]_{uv} = A_{uv} B_{uv}$. Since $U$ is unitary, $M(t)$ is a doubly stochastic matrix and each entry $M(t)_{uv}$ indicates the probability of the CTQW visiting vertex $v$ at time $t$ when the walk initially starts from vertex $u$. However, $M(t)$ cannot converge, because $U(t)$ is also norm-preserving. To overcome this problem, we can enforce convergence by taking a time average. Specifically, we take the Cesàro mean and define the average mixing matrix as $Q = \lim_{T \to \infty} \frac{1}{T} \int_0^T M(t)\,dt$, where each entry $Q_{uv}$ of the average mixing matrix $Q$ represents the average probability for a CTQW to visit vertex $v$ starting from vertex $u$, and $Q$ is still a doubly stochastic matrix. Godsil [Godsil, 2013] has indicated that the entries of $Q$ are rational numbers. We can easily compute $Q$ from the spectrum of the Hamiltonian $H$, which can be the adjacency matrix $A$ of $G$. Let $\lambda_1, \ldots, \lambda_{|V|}$ represent the $|V|$ distinct eigenvalues of $H$ and $P_j$ be the matrix representation of the orthogonal projection on the eigenspace associated with $\lambda_j$, i.e., $H = \sum_{j=1}^{|V|} \lambda_j P_j$. Then, we rewrite the average mixing matrix $Q$ as

$$Q = \sum_{j=1}^{|V|} P_j \circ P_j. \qquad (3)$$
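The spectral formula in Eq.(3) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `average_mixing_matrix`, the tolerance `tol` used to group numerically equal eigenvalues into one eigenspace, and the toy weight matrix are all assumptions made for illustration.

```python
import numpy as np


def average_mixing_matrix(H, tol=1e-8):
    """Return Q = sum_j P_j o P_j for a real symmetric Hamiltonian H (Eq. (3))."""
    H = np.asarray(H, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues come back in ascending order
    n = eigvals.shape[0]
    Q = np.zeros((n, n))
    start = 0
    for end in range(1, n + 1):
        # Close the current group once the next eigenvalue differs by more than tol
        # (assumption: eigenvalues closer than tol belong to the same eigenspace).
        if end == n or eigvals[end] - eigvals[start] > tol:
            V = eigvecs[:, start:end]   # orthonormal basis of one eigenspace
            P = V @ V.T                 # orthogonal projector P_j
            Q += P * P                  # Schur-Hadamard square of P_j
            start = end
    return Q


# Toy complete weighted graph on three vertices (weights are made up).
A = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])
Q = average_mixing_matrix(A)
print(Q.sum(axis=0), Q.sum(axis=1))  # rows and columns should each sum to one
```

Grouping eigenvalues matters when the spectrum is degenerate (for example, an unweighted complete graph), because the projectors in Eq.(3) are taken per eigenspace rather than per eigenvector.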

Remarks: The CTQW has been successfully employed to develop novel approaches in machine learning and data mining [Bai et al., 2014; Bai et al., 2016], because of its richer structure compared with its classical counterpart. The reason for utilizing the CTQW in this work is that the state vector of the CTQW is complex-valued and its evolution is governed by a time-varying unitary matrix. By contrast, the state vector of the classical CTRW is real-valued and its evolution is governed by a doubly stochastic matrix. As a result, the behaviour of the CTQW is significantly different from that of its classical counterpart and possesses a number of important properties. For instance, the CTQW allows interference to take place, and thus reduces the tottering problem arising in the classical CTRW. Furthermore, since the evolution of the CTQW is not dominated by the low frequency components of the Laplacian spectrum, it has a better ability to distinguish different graph structures. Finally, the CTQW can accommodate the complete weighted graph, since the Hamiltonian of the CTQW can be the complete weighted adjacency matrix.

2.2 The Dynamic Time Warping Framework

We review the global alignment kernel based on the dynamic time warping framework proposed in [Cuturi, 2011]. Let $\mathcal{T}$ be a set of discrete time series that take values in a space $\mathcal{X}$. For a pair of discrete time series $P = (p_1, \ldots, p_m) \in \mathcal{T}$ and $Q = (q_1, \ldots, q_n) \in \mathcal{T}$ with lengths $m$ and $n$ respectively, the alignment $\pi$ between $P$ and $Q$ is defined as a pair of increasing integral vectors $(\pi_p, \pi_q)$ of length $l \le m + n - 1$, where $1 = \pi_p(1) \le \cdots \le \pi_p(l) = m$ and $1 = \pi_q(1) \le \cdots \le \pi_q(l) = n$, such that $(\pi_p, \pi_q)$ is assumed to have unitary increments and no simultaneous repetitions. Note that each element of $P$ and $Q$ can be an observation vector with fixed dimensions at a time step. For any index $1 \le i \le l - 1$, the increment vector of $\pi = (\pi_p, \pi_q)$ satisfies

$$\begin{pmatrix} \pi_p(i+1) - \pi_p(i) \\ \pi_q(i+1) - \pi_q(i) \end{pmatrix} \in \left\{ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}. \qquad (4)$$

Within the framework of the classical dynamic time warping [Cuturi, 2011], the coordinates $\pi_p$ and $\pi_q$ of the alignment $\pi$ define the warping function. Let $\mathcal{A}(m, n)$ denote the set of all possible alignments between $P$ and $Q$. Cuturi [Cuturi, 2011] has proposed a dynamic time warping inspired kernel, namely the Global Alignment Kernel, by considering all the possible alignments in $\mathcal{A}(m, n)$. The kernel is defined as

$$k_{\mathrm{GA}}(P, Q) = \sum_{\pi \in \mathcal{A}(m,n)} e^{-D_{P,Q}(\pi)}, \qquad (5)$$

where $D_{P,Q}(\pi)$ is the alignment cost given by

$$D_{P,Q}(\pi) = \sum_{i=1}^{|\pi|} \phi(p_{\pi_p(i)}, q_{\pi_q(i)}), \qquad (6)$$

and is defined through a local divergence $\phi$ that quantifies the discrepancy between each pair of elements $p_i \in P$ and $q_j \in Q$. In general, $\phi$ is defined as the squared Euclidean distance. Note that the kernel $k_{\mathrm{GA}}$ measures the quality of both the optimal alignment and all other alignments $\pi \in \mathcal{A}(m, n)$, and it is positive definite under mild conditions on $\phi$ [Cuturi, 2011]. Moreover, $k_{\mathrm{GA}}$ provides richer statistical measures of similarity by encapsulating the overall spectrum of the alignment costs $\{D_{P,Q}(\pi), \pi \in \mathcal{A}(m, n)\}$.
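The sum over all alignments in Eq.(5) need not be enumerated explicitly. Because every alignment reaches cell $(i, j)$ from one of the three moves in Eq.(4), the kernel can be accumulated by dynamic programming. The sketch below assumes the squared Euclidean local divergence; the function names and the toy series are hypothetical, and the brute-force routine is included only to cross-check the recursion on tiny inputs.

```python
import math

import numpy as np


def squared_euclidean(p, q):
    """Local divergence phi, assumed here to be the squared Euclidean distance."""
    return float(np.sum((np.asarray(p, dtype=float) - np.asarray(q, dtype=float)) ** 2))


def global_alignment_kernel(P, Q, phi=squared_euclidean):
    """k_GA(P, Q) of Eq. (5) via dynamic programming in O(m * n) local evaluations."""
    m, n = len(P), len(Q)
    M = np.zeros((m + 1, n + 1))
    M[0, 0] = 1.0  # empty prefix: one (empty) alignment with weight 1
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            kappa = math.exp(-phi(P[i - 1], Q[j - 1]))
            # An alignment arrives at (i, j) by one of the three moves of Eq. (4).
            M[i, j] = kappa * (M[i - 1, j] + M[i, j - 1] + M[i - 1, j - 1])
    return M[m, n]


def brute_force_gak(P, Q, phi=squared_euclidean):
    """Reference value obtained by walking through every alignment explicitly."""
    m, n = len(P), len(Q)

    def walk(i, j):
        weight = math.exp(-phi(P[i], Q[j]))
        if i == m - 1 and j == n - 1:
            return weight
        total = 0.0
        for di, dj in ((0, 1), (1, 0), (1, 1)):
            if i + di < m and j + dj < n:
                total += walk(i + di, j + dj)
        return weight * total

    return walk(0, 0)


series_p = [[0.0], [1.0], [0.5]]
series_q = [[0.2], [0.9]]
print(global_alignment_kernel(series_p, series_q), brute_force_gak(series_p, series_q))
```

For long series the products of exponentials can underflow, so a practical variant would typically run the same recursion on log values.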

Remarks: The dynamic time warping based Global Alignment Kernel $k_{\mathrm{GA}}$ is a powerful tool for analyzing vectorial time series [Mikalsen et al., 2018; Jain, 2019]. To extend $k_{\mathrm{GA}}$ into graph kernel domains, Bai et al. [Bai et al., In Press] have developed a nested graph kernel by measuring $k_{\mathrm{GA}}$ between the depth-based complexity traces of graphs [Bai and Hancock, 2014]. Specifically, the complexity trace of each graph is computed by measuring the entropies on a family of $K$-layer expansion subgraphs rooted at its centroid vertex. Although the nested graph kernel outperforms local substructure based graph kernels [Johansson et al., 2014] on graph classification tasks, the financial networks are by nature complete weighted graphs, and it is difficult to decompose such graphs into the required expansion subgraphs rooted at the centroid vertex. Thus, directly performing the dynamic time warping inspired graph kernel on time-varying financial networks remains challenging.

3 Kernels for Time-varying Networks

In this section, we propose a Quantum-inspired Entropic Kernel between time-varying networks for multiple co-evolving financial time series analysis. We commence by characterizing each financial network as a discrete quantum entropy time series through the CTQW. We then define the new kernel associated with the entropy time series, in terms of the classical dynamic time warping framework [Cuturi, 2011].

3.1 The Quantum Entropy Time Series

We introduce how to characterize each financial network structure as the quantum entropy time series through the CTQW. Assume $\mathbf{G} = \{G_1, \ldots, G_p, \ldots, G_q, \ldots, G_T\}$ denotes a family of time-varying financial networks extracted from a complex financial system $S$ with a specific set of $N$ co-evolving financial time series, i.e., the system has a fixed number of components (e.g., stocks) co-evolving with time. $G_p(V_p, E_p, A_p)$ is the sample network extracted from the system at time step $p$. For $G_p$, each vertex $v \in V_p$ represents the time series of a different stock (e.g., the stock price), each edge $e \in E_p$ represents the interaction (e.g., distances or correlations) between a pair of time series, and $A_p$ is the interaction based weighted adjacency matrix. This is a popular way of modelling multiple co-evolving financial time series as network structures [Silva et al., 2015; Bai et al., 2020]. Note that, since the vertices of each financial network $G_p \in \mathbf{G}$ correspond to the same $N$ components of the system $S$, all the networks in $\mathbf{G}$ have the same vertex set, whereas the edge sets $E_t$ vary with time $t$.

Specifically, for each financial network $G_p(V_p, E_p, A_p)$ from $\mathbf{G}$ at time $p$, we first compute the average mixing matrix $Q^p$ associated with the CTQW evolved on $G_p$. For each $i$-th vertex $v_i \in V_p$, the $i$-th row of $Q^p$ gives the time-averaged probability distribution $\mathcal{P}_i$ for the CTQW to visit vertices $v_1, \ldots, v_N \in V_p$ ($|V_p| = N$) starting from $v_i$, i.e.,

$$\mathcal{P}_i = \{\mathcal{P}_i(v_1), \ldots, \mathcal{P}_i(v_j), \ldots, \mathcal{P}_i(v_N)\}, \qquad (7)$$

where $\mathcal{P}_i(v_j) = Q^p_{i,j}$ is the time-averaged probability of the CTQW visiting $v_j$ from $v_i$. The quantum based Shannon entropy [Bai et al., 2016] of vertex $v_i$ can be defined as

$$H_S(v_i) = -\sum_{v_j \in V_p} \mathcal{P}_i(v_j) \log \mathcal{P}_i(v_j). \qquad (8)$$

As a result, the entropy characteristic vector of $G_p$ associated with the entropies over all its vertices can be defined as

$$\mathbf{E}_p = \{H_S(v_1), \ldots, H_S(v_i), \ldots, H_S(v_N)\}^{\top}, \qquad (9)$$

where $H_S(v_i)$ is the quantum Shannon entropy of the $i$-th vertex $v_i$ of $G_p$ associated with the time-averaged probability distribution residing on the $i$-th row of $Q^p$.
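The row-wise entropy of Eq.(8) and the characteristic vector of Eq.(9) can be sketched as follows. The function name `vertex_entropies`, the use of the natural logarithm (the text does not state the base) and the convention $0 \log 0 = 0$ are assumptions made for this illustration, as are the two toy matrices.

```python
import numpy as np


def vertex_entropies(Q, eps=1e-12):
    """Shannon entropy of every row of a (doubly stochastic) matrix Q, as in Eq. (8).

    Assumptions: natural logarithm, and entries below eps are treated as zero
    probability so that they contribute nothing to the sum.
    """
    Q = np.asarray(Q, dtype=float)
    safe = np.where(Q > eps, Q, 1.0)           # log(1) = 0 removes the zero entries
    return -np.sum(Q * np.log(safe), axis=1)   # one entropy per vertex, Eq. (9)


# A uniform row has maximal entropy log(N); a one-hot row has zero entropy.
uniform = np.full((4, 4), 0.25)
identity = np.eye(4)
print(vertex_entropies(uniform))   # each entry equals log(4)
print(vertex_entropies(identity))  # each entry equals 0
```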

We slide a time window of $w$ time steps over all the time-varying networks of the financial system $S$ to construct a quantum entropy time series for each network $G_p$ at time $p$. In this work, we set the value of $w$ as 28. Specifically, for each network $G_p$, we compute its quantum entropy time series $\mathbf{S}_p$ associated with its time window as

$$\mathbf{S}_p = \{\mathbf{E}_{p-w+1} | \mathbf{E}_{p-w+2} | \ldots | \mathbf{E}_s | \ldots | \mathbf{E}_p\}, \qquad (10)$$

where each column $\mathbf{E}_s$ of $\mathbf{S}_p$, with $s \in \{p-w+1, p-w+2, \ldots, p\}$, is the entropy characteristic vector of the network $G_s \in \mathbf{G}$ at time $s$ and is defined by Eq.(9). Obviously, the quantum entropy time series $\mathbf{S}_p$ of $G_p$ encapsulates a family of $w$ time-varying entropy characteristic vectors from $G_{p-w+1}$ at time $p-w+1$ to $G_p$ at time $p$.
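The windowing in Eq.(10) amounts to slicing $w$ consecutive entropy characteristic vectors and stacking them as columns. The sketch below assumes the vectors are stored row-wise in an array of shape (T, N) and that time steps are 1-based as in the text; the function name and the toy data are hypothetical.

```python
import numpy as np


def entropy_time_series(entropy_vectors, p, w):
    """Return S_p of Eq. (10) as an (N, w) matrix whose columns are E_{p-w+1}, ..., E_p.

    entropy_vectors: array of shape (T, N); row s - 1 holds E_s (1-based time s).
    """
    E = np.asarray(entropy_vectors, dtype=float)
    if p < w or p > E.shape[0]:
        raise ValueError("need w <= p <= T so that the whole window exists")
    window = E[p - w:p]   # rows for times p-w+1, ..., p
    return window.T       # columns are the entropy characteristic vectors


# Toy data: T = 10 time steps, N = 3 vertices (values are placeholders).
toy = np.arange(30, dtype=float).reshape(10, 3)
S5 = entropy_time_series(toy, p=5, w=3)
print(S5.shape)  # (3, 3): N rows, w columns
```

With $w = 28$ the first network that has a complete window is $G_{28}$, which is why the function rejects $p < w$.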

3.2 The Quantum-inspired Entropic Kernel

We develop a new kernel for analyzing time-varying financial networks based on the classical dynamic time warping framework. For a pair of time-varying networks $G_p \in \mathbf{G}$ and $G_q \in \mathbf{G}$ at time $p$ and $q$ respectively, we commence by computing their associated quantum entropy time series as

$$\mathbf{S}_p = \{\mathbf{E}_{p-w+1} | \mathbf{E}_{p-w+2} | \ldots | \mathbf{E}_p\}$$

and

$$\mathbf{S}_q = \{\mathbf{E}_{q-w+1} | \mathbf{E}_{q-w+2} | \ldots | \mathbf{E}_q\}$$

based on the definition in Section 3.1. The proposed Quantum-inspired Entropic Kernel $k_{\mathrm{QEK}}$ between $G_p$ and $G_q$ is

$$k_{\mathrm{QEK}}(G_p, G_q) = k_{\mathrm{GA}}(\mathbf{S}_p, \mathbf{S}_q) = \sum_{\pi \in \mathcal{A}(w,w)} e^{-D_{p,q}(\pi)}, \qquad (11)$$

where $k_{\mathrm{GA}}$ is the dynamic time warping inspired Global Alignment Kernel (GAK) defined in Eq.(5), $\pi$ is the warping alignment between the entropy time series of $G_p$ and $G_q$, $\mathcal{A}(w, w)$ is the set of all possible alignments, and $D_{p,q}(\pi)$ refers to the alignment cost obtained via Eq.(6). Note that the proposed kernel $k_{\mathrm{QEK}}$ is positive definite, because it is based on the positive definite kernel $k_{\mathrm{GA}}$.

Figure 1: Color Path of Financial Networks Over All Trading Days. Panels: (a) Path for QEK, (b) Path for GAK, (c) Path for WLSK, (d) Path for DTQK. Axes: First Eigenvalue, Second Eigenvalue, Third Eigenvalue.
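Putting the pieces together, Eq.(11) can be sketched end to end: adjacency matrices are mapped to entropy characteristic vectors, windowed, and compared with the global alignment recursion. This is an illustrative sketch and not the authors' code. The helper names, the eigenvalue tolerance, the natural logarithm, the squared Euclidean local divergence and the random toy networks are all assumptions, and the kernel is returned in the log domain to avoid underflow.

```python
import numpy as np


def entropy_vector(A, tol=1e-8):
    """Vertex entropies (Eq. (8)) from the average mixing matrix (Eq. (3)) of A."""
    vals, vecs = np.linalg.eigh(np.asarray(A, dtype=float))
    Q = np.zeros_like(vecs)
    start = 0
    for end in range(1, len(vals) + 1):
        if end == len(vals) or vals[end] - vals[start] > tol:
            P = vecs[:, start:end] @ vecs[:, start:end].T   # eigenspace projector
            Q += P * P
            start = end
    safe = np.where(Q > 1e-12, Q, 1.0)                       # treats 0 * log 0 as 0
    return -np.sum(Q * np.log(safe), axis=1)


def log_gak(X, Y):
    """log k_GA for sequences X, Y of shape (length, dim), squared Euclidean phi."""
    m, n = len(X), len(Y)
    L = np.full((m + 1, n + 1), -np.inf)
    L[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            phi = np.sum((X[i - 1] - Y[j - 1]) ** 2)
            prev = np.logaddexp(np.logaddexp(L[i - 1, j], L[i, j - 1]), L[i - 1, j - 1])
            L[i, j] = prev - phi
    return L[m, n]


def log_qek(adjacencies, p, q, w):
    """log k_QEK(G_p, G_q) of Eq. (11) for 1-based time steps p, q >= w."""
    E = np.array([entropy_vector(A) for A in adjacencies])   # shape (T, N)
    S_p, S_q = E[p - w:p], E[q - w:q]                        # windows, shape (w, N)
    return log_gak(S_p, S_q)


# Toy sequence of T = 6 complete weighted networks on N = 4 vertices.
rng = np.random.default_rng(0)
nets = []
for _ in range(6):
    W = rng.uniform(0.1, 1.0, size=(4, 4))
    W = (W + W.T) / 2.0
    np.fill_diagonal(W, 0.0)
    nets.append(W)
print(log_qek(nets, p=3, q=6, w=3))
```

In a full experiment the entropy vectors would be computed once per network and cached, since every network appears in up to $w$ windows.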

Remarks: Although the proposed kernel $k_{\mathrm{QEK}}$ is related to the general principles of the GAK kernel, it still possesses two theoretical differences from the GAK kernel. First, the original GAK kernel is only developed for vectorial time series and thus cannot capture structural relationships between time series. By contrast, the proposed kernel $k_{\mathrm{QEK}}$ is explicitly proposed for time-varying financial networks that encapsulate physical interactions between pairs of time series. Second, unlike the GAK kernel, the proposed kernel $k_{\mathrm{QEK}}$ is defined based on the quantum entropy time series that is developed through the average mixing matrix of the CTQW. As we have stated in Section 2.1, the CTQW can accommodate the complete weighted graph and better distinguish different network structures, since its evolution is not dominated by the low frequency components of the Laplacian spectrum. Thus, the proposed kernel $k_{\mathrm{QEK}}$ can not only reflect the physical interactions between the original vectorial financial time series, but also capture richer structure information than the GAK kernel associated with the original time series. On the other hand, as we have stated, the state-of-the-art graph kernels mentioned in Section 1 and Section 2.2 cannot directly accommodate complete weighted graphs. Thus, it is difficult to directly perform these graph kernels on the complete weighted financial networks, unless one transforms these networks into sparse versions. By contrast, the proposed kernel $k_{\mathrm{QEK}}$ can encapsulate the whole structural information residing on all weighted edges. In summary, the proposed kernel $k_{\mathrm{QEK}}$ bridges the gap between state-of-the-art graph kernels and the classical dynamic time warping framework for time-varying networks, providing an effective way to analyze multiple co-evolving financial time series.

Time Complexity: For a pair of networks each having $n$ vertices, computing the kernel $k_{\mathrm{QEK}}$ associated with a time interval of $w$ steps requires time complexity $O(n^3 + w^2)$. This is because computing the entropy time series relies on the spectral decomposition of CTQWs, and thus has time complexity $O(n^3)$, while computing all possible alignments between the entropy time series has time complexity $O(w^2)$. As a result, $k_{\mathrm{QEK}}$ has a polynomial time complexity $O(n^3 + w^2)$.

Figure 2: The 3D kPCA Embeddings of Different Kernels for Dot-com Bubble Burst. Panels: (a) Dot-com Bubble for QEK, (b) Dot-com Bubble for GAK, (c) Dot-com Bubble for WLSK, (d) Dot-com Bubble for DTQK. Axes: First Eigenvalue, Second Eigenvalue, Third Eigenvalue. Legend: Before Dot-com Bubble Burst, Dot-com Bubble Burst, After Dot-com Bubble Burst. Annotated date: 13.03.2000.

4 Experiments of Time Series Analysis

We establish a NYSE dataset that consists of a series of time-varying financial networks based on the New York Stock Exchange (NYSE) database [Silva et al., 2015; Ye et al., 2015]. The NYSE database encapsulates 347 stocks and their associated daily prices over 6004 trading days from January 1986 to February 2011, i.e., the market system has 347 co-evolving time series in terms of the daily stock prices. The prices are all collected from the Yahoo financial dataset (http://finance.yahoo.com). To extract the network representations, we use a time window of 28 days and move this window along time to obtain a sequence (from day 29 to day 6004) in which each temporal window contains a time series of the daily return stock prices over a period of 28 days. To represent trades between different stocks as a network, for each window we compute the Euclidean distance between the time series of each pair of stocks as their connection (edge) weight, following the same setting in [Bai et al., 2020]. It has been empirically shown that the financial networks associated with the Euclidean distance are more effective than those associated with the Pearson correlation. Clearly, this operation yields a time-varying financial network with a fixed number of 347 vertices and varying edge weights for each of the 5976 trading days. Each network is a complete weighted graph.
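The network construction described above can be sketched as follows. The exact return definition is not given in the text, so simple daily returns are assumed here; the function name `build_networks` and the random toy prices are hypothetical. Under this assumption, 6004 price days give 6003 returns and 6003 - 28 + 1 = 5976 windows, which agrees with the count stated above.

```python
import numpy as np


def build_networks(prices, window=28):
    """Turn a (num_days, num_stocks) price array into complete weighted networks.

    Assumption: simple daily returns r_t = price_t / price_{t-1} - 1.
    Each network holds the Euclidean distance between the windowed return
    series of every pair of stocks.
    """
    prices = np.asarray(prices, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0
    networks = []
    for end in range(window, returns.shape[0] + 1):
        block = returns[end - window:end]               # (window, num_stocks)
        diff = block[:, :, None] - block[:, None, :]    # pairwise differences per day
        networks.append(np.sqrt((diff ** 2).sum(axis=0)))
    return np.stack(networks)


# Toy data: 40 days of made-up prices for 5 stocks.
rng = np.random.default_rng(0)
toy_prices = rng.uniform(10.0, 20.0, size=(40, 5))
toy_networks = build_networks(toy_prices)
print(toy_networks.shape)  # (12, 5, 5): 39 returns give 39 - 28 + 1 windows
```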

4.1 Kernel Embeddings from kPCA

We evaluate the performance of the proposed Quantum-inspired Entropic Kernel (QEK) on time-varying networks of the NYSE dataset. Specifically, we analyze whether the proposed QEK kernel can distinguish the structural changes of the network evolution with time. Furthermore, we also compare the proposed QEK kernel with three state-of-the-art kernel methods, that is, the dynamic time warping inspired Global Alignment Kernel (GAK) for original vectorial time series [Cuturi, 2011] and two graph kernels for time-varying financial networks. The graph kernels for comparison include the Weisfeiler-Lehman Subtree Kernel (WLSK) [Shervashidze et al., 2009] and the Discrete-time Quantum Walk Kernel (DTQK) [Bai et al., 2020]. For the GAK kernel, we also utilize a time window of 28 days for each trading day. For the WLSK kernel, since it can only accommodate undirected and unweighted graphs, we transform each original network into a minimum spanning tree and ignore the weights on the preserved edges, following the same setting in the work [Bai et al., 2020]. Since the DTQK kernel can accommodate edge weights, we straightforwardly perform this kernel on the original financial networks. We perform kernel Principal Component Analysis (kPCA) [Witten et al., 2011] on the kernel matrices associated with different kernels, and embed the financial networks or the original time series into a vectorial pattern space. We visualize the embedding results using the first three principal components in Fig.1(a), Fig.1(b), Fig.1(c), and Fig.1(d) respectively.

Figure 3: The 3D kPCA Embeddings of Different Kernels for Enron Incident. Panels: (a) Enron Incident for QEK, (b) Enron Incident for GAK, (c) Enron Incident for WLSK, (d) Enron Incident for DTQK. Axes: First Eigenvalue, Second Eigenvalue, Third Eigenvalue. Legend: Before Enron Incident, Enron Incident, After Enron Incident. Annotation in panel (a): Prosecution against Arthur Andersen (11.03.2002).
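The embedding step can be illustrated with a minimal kPCA sketch on a precomputed kernel matrix: the matrix is double-centred, eigendecomposed, and the leading eigenvectors are scaled by the square roots of their eigenvalues. The function name `kernel_pca` and the toy linear kernel are assumptions; any standard kPCA implementation that accepts a precomputed kernel could be used instead.

```python
import numpy as np


def kernel_pca(K, n_components=3):
    """Embed n samples into n_components dimensions from an (n, n) kernel matrix K."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    Kc = J @ K @ J                               # kernel of the centred feature vectors
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(lam)      # rows are the embedded samples


# Sanity check with a linear kernel on made-up 3-dimensional points:
# the embedding should preserve all pairwise distances.
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
Y = kernel_pca(X @ X.T, n_components=3)
print(Y.shape)  # (10, 3)
```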

Fig.1 exhibits the paths of the time-varying financial networks (or the original vectorial time series) in different kernel spaces, and the color bar of each subfigure indicates the date in the time series. We observe that the embeddings from the proposed QEK kernel exhibit a better manifold structure. Moreover, only the proposed QEK kernel generates a clear time-varying trajectory, in which networks that are neighbors in time are close together in the embedding principal space. By contrast, the alternative methods hardly result in a trajectory and their embeddings tend to distribute as clusters. To further demonstrate the effectiveness of the QEK kernel, we compare the distance stress (DS) of the network embeddings from different kernels. Specifically, the DS is defined as

$$DS = \frac{\sum_t \|x_t - x_{t-1}\|_2}{\sum_t \|x_t - x_{t_n}\|_2}, \qquad (12)$$

where $t = 2, 3, \ldots, n$, $x_t$ is the network embedding vector at time $t$, and $x_{t_n}$ is the nearest network embedding vector of $x_t$ in the pattern space. Since the nearest embedding vector is never farther from $x_t$ than $x_{t-1}$, the DS value is at least 1. For each embedding vector $x_t$ at time $t$, if the nearest embedding vector is always the embedding vector at the previous time step (i.e., $x_{t-1}$), the value of DS will be 1. In other words, a DS value nearer to 1 indicates a better performance of the embeddings in forming a clear time-varying trajectory. The DS value of each kernel is shown in Table 1.

Methods           QEK      GAK      WLSK     DTQK
Distance Stress   1.0992   2.9677   5.7053   4.7174

Table 1: The Distance Stress of the Network Embeddings
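The distance stress of Eq.(12) can be computed as in the following sketch. Two points are assumptions rather than statements from the text: the norm is read as the plain Euclidean norm, and the nearest embedding $x_{t_n}$ is searched among all other embeddings (earlier and later in time). The function name and the toy trajectories are hypothetical.

```python
import numpy as np


def distance_stress(X):
    """Distance stress of Eq. (12) for embeddings X of shape (n, d), ordered in time."""
    X = np.asarray(X, dtype=float)
    consecutive_sum = 0.0
    nearest_sum = 0.0
    for t in range(1, X.shape[0]):
        d = np.linalg.norm(X - X[t], axis=1)  # distances from x_t to every embedding
        d[t] = np.inf                         # exclude x_t itself
        consecutive_sum += d[t - 1]           # distance to the previous time step
        nearest_sum += d.min()                # distance to the nearest embedding
    return consecutive_sum / nearest_sum


# A smooth trajectory gives DS = 1; a trajectory that jumps back and forth does not.
smooth = np.array([[0.0], [1.0], [2.0], [3.0]])
jumpy = np.array([[0.0], [10.0], [0.5], [10.5]])
print(distance_stress(smooth), distance_stress(jumpy))
```

The loop keeps memory linear in the number of embeddings, which matters for several thousand trading days.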

Clearly, only the DS value of the proposed QEK kernel is near 1, indicating a better performance in preserving the ordinal arrangement of the time-varying networks.

To take our study one step further, we explore the embeddings during different periods of three well-known financial events, i.e., the Black Monday period (from 15th Jun 1987 to 17th Feb 1988), the Dot-com Bubble period (from 3rd Jan 1995 to 31st Dec 2001), and the Enron Incident period (from 16th Oct 2001 to 11th Mar 2002). For the different kernels, Fig.2 corresponds to the Dot-com Bubble period and Fig.3 to the Enron Incident period. Due to limited space, we do not exhibit the embeddings for Black Monday, for which a phenomenon similar to that in Fig.2 and Fig.3 can be observed. These figures indicate that Black Monday (19th Oct 1987), the Dot-com Bubble Burst (13th Mar 2000) and the Enron Incident period are all crucial financial events, significantly influencing the structural time-varying evolution of the financial networks or the original vectorial financial time series. Excluding the GAK kernel, the embedding points of the remaining kernels before and after these events are well separated into distinct clusters, and the points corresponding to the crucial events are midway between the clusters.

Figure 4: Kernel Matrices Visualizations. Panels: (a) For QEK on Dot-com, (b) For DTQK on Dot-com. Both axes: Number of Views (10 to 100). Colour bar: 0 to 1.

Methods          QEK      GAK      WLSK     DTQK
Black Monday     0.9050   0.6673   0.5667   0.6506
Dotcom Bubble    0.6473   0.8882   0.5279   0.7903
Enron Incident   0.8504   0.4992   0.5001   0.7042
Average Rand     0.8009   0.6849   0.5315   0.7150

Table 2: The Rand Index for K-means Clustering on the Embedding Points of 100 Trading Days around Each Financial Crisis.

To place our analysis of the kernel embedding clusters on a more quantitative footing, for each kernel we select the kernel embedding points of 100 trading days around each financial crisis, i.e., we select embedding points of 50 trading days before and after each crisis date respectively. We apply the K-means method to the kernel embeddings of the 100 trading days for each kernel to explore whether the clusters can be correctly separated in terms of the trading days before and after each financial crisis. We calculate the Rand Index for the resulting clusters, and the Rand Index of each kernel is listed in Table 2. The results indicate that the embedding points associated with the proposed QEK kernel produce the best clusters on average and for the Black Monday and Enron Incident periods, i.e., the embedding points before and after these financial crises are separated better than with the other kernels, whereas the GAK and DTQK kernels achieve higher Rand Index values for the Dot-com Bubble period.
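The Rand Index used for this evaluation counts the fraction of point pairs on which two labelings agree (both put the pair in the same group, or both separate it). The sketch below assumes the cluster labels come from any K-means implementation with two clusters; the function name and the toy labelings are hypothetical and do not reproduce the values in Table 2.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of pairs on which two labelings agree (same/same or different/different)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Ground truth for 100 trading days: 50 before (0) and 50 after (1) a crisis date.
truth = [0] * 50 + [1] * 50
perfect = [1] * 50 + [0] * 50      # cluster ids swapped, partition identical
one_cluster = [0] * 100            # everything lumped into a single cluster
print(rand_index(truth, perfect), rand_index(truth, one_cluster))
```

The index is invariant to how cluster ids are numbered, so no matching between K-means cluster ids and the before/after labels is required.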

4.2 Evaluations of the Kernel Matrix

Based on the earlier evaluation, we find that the DTQK kernel is the most competitive kernel with the proposed QEK kernel. To further reveal the effectiveness of the proposed QEK kernel, we visualize the kernel matrices of both kernels.

Due to limited space, we only compute the kernel matrices between the networks belonging to the Dot-com Bubble period, where the period encapsulates 100 trading days. In fact, similar phenomena can be observed if we compute the kernel matrices for other financial event periods. Specifically, the kernel matrices are visualized in Fig.4, where both the x-axis and y-axis represent the time steps. Note that, to compare the two kernels in the same scaled Hilbert space, we consider the normalized version of both kernels as

$$k_n(G_p, G_q) = \frac{k(G_p, G_q)}{\sqrt{k(G_p, G_p)\, k(G_q, G_q)}}, \qquad (13)$$

where $k_n$ is the normalized kernel, and $k$ is either the QEK or the DTQK kernel. As a result, the kernel values are all bounded between 0 and 1, and the colour bar beside each subfigure indicates the kernel value of the kernel matrix. Fig.4 indicates that the kernel values tend to decrease when the elements of the kernel matrix are far away from the main diagonal. This is because such elements are computed between time-varying networks separated by long time spans, and more structural changes occur when the network evolves over a long time. Thus, both the QEK and DTQK kernels reflect structural evolutions of financial networks with time. However, the kernel value of the DTQK kernel tends to drop more quickly when the element is only a little far from the diagonal. By contrast, the kernel value of the QEK kernel tends to decrease more slowly as the element gets farther away from the diagonal. This observation explains why only the proposed QEK kernel can form a clear trajectory with time variation and generate better clusters before and after a financial crisis, i.e., the proposed QEK kernel can better distinguish and understand the structural changes of the network structures evolving over a long time period.
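The normalization in Eq.(13) can be applied to a whole kernel matrix at once by dividing each entry by the square roots of the corresponding diagonal entries. The function name and the toy matrix in this sketch are assumptions for illustration.

```python
import numpy as np


def normalize_kernel(K):
    """Apply Eq. (13) entry-wise: K[p, q] / sqrt(K[p, p] * K[q, q])."""
    K = np.asarray(K, dtype=float)
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)


# Made-up positive definite 2 x 2 kernel matrix.
K_toy = np.array([[4.0, 2.0],
                  [2.0, 9.0]])
K_norm = normalize_kernel(K_toy)
print(K_norm)  # unit diagonal, off-diagonal 2 / (2 * 3)
```

After normalization the diagonal equals 1, so kernels with very different raw scales can be shown on the same colour scale.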

5 Conclusion

In this paper, we have developed a new Quantum-inspired Entropic Kernel for time-varying complex networks. The proposed kernel bridges the gap between graph kernels and the classical dynamic time warping framework for time series analysis. Experimental analysis of NYSE financial time series demonstrates the effectiveness of the new kernel.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant nos. 61976235 and 61602535), the Program for Innovation Research in Central University of Finance and Economics, and the Youth Talent Development Support Program by Central University of Finance and Economics (No. QYP1908). This work is also partially supported by the Foundation of State Key Laboratory of Cognitive Intelligence (Grant No. COGOSC-20190002), iFLYTEK, China. Corresponding Author: Lixin Cui ([email protected]).


References

[Bai and Hancock, 2014] Lu Bai and Edwin R. Hancock. Depth-based complexity traces of graphs. Pattern Recognition, 47(3):1172–1186, 2014.

[Bai et al., 2014] Lu Bai, Luca Rossi, Horst Bunke, and Edwin R. Hancock. Attributed graph kernels using the Jensen-Tsallis q-differences. In Proceedings of ECML-PKDD, pages 99–114, 2014.

[Bai et al., 2015a] Lu Bai, Luca Rossi, Andrea Torsello, and Edwin R. Hancock. A quantum Jensen-Shannon graph kernel for unattributed graphs. Pattern Recognition, 48(2):344–355, 2015.

[Bai et al., 2015b] Lu Bai, Zhihong Zhang, Chaoyan Wang, Xiao Bai, and Edwin R. Hancock. A graph kernel based on the Jensen-Shannon representation alignment. In Proceedings of IJCAI, pages 3322–3328, 2015.

[Bai et al., 2016] Lu Bai, Luca Rossi, Lixin Cui, and Edwin R. Hancock. A novel entropy-based graph signature from the average mixing matrix. In Proceedings of ICPR, pages 1339–1344, 2016.

[Bai et al., 2020] Lu Bai, Luca Rossi, Lixin Cui, Jian Cheng, and Edwin R. Hancock. A quantum-inspired similarity measure for the analysis of complete weighted graphs. IEEE Trans. Cybern., 50(3):1264–1277, 2020.

[Bai et al., In Press] Lu Bai, Lixin Cui, Luca Rossi, Lixiang Xu, Xiao Bai, and Edwin R. Hancock. Local-global nested graph kernels using nested complexity traces. In Pattern Recognition Letters, In Press.

[Borgwardt and Kriegel, 2005] Karsten M. Borgwardt and Hans-Peter Kriegel. Shortest-path kernels on graphs. In Proceedings of the IEEE International Conference on Data Mining, pages 74–81, 2005.

[Bullmore and Sporns, 2009] Ed Bullmore and Olaf Sporns. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience, 10(3):186–198, 2009.

[Cui et al., 2018] Lixin Cui, Lu Bai, Luca Rossi, Zhihong Zhang, Yuhang Jiao, and Edwin R. Hancock. A preliminary survey of analyzing dynamic time-varying financial networks using graph kernels. In Proceedings of S+SSPR, pages 237–247, 2018.

[Cuturi, 2011] Marco Cuturi. Fast global alignment kernels. In Proceedings of ICML, pages 929–936, 2011.

[Farhi and Gutmann, 1998] E. Farhi and S. Gutmann. Quantum computation and decision trees. Physical Review A, 58:915, 1998.

[Godsil, 2013] Chris Godsil. Average mixing of continuous quantum walks. Journal of Combinatorial Theory, Series A, 120(7):1649–1662, 2013.

[Haussler, 1999] David Haussler. Convolution kernels on discrete structures. In Technical Report UCS-CRL-99-10, Santa Cruz, CA, USA, 1999.

[Jain, 2019] Brijnesh J. Jain. Making the dynamic time warping distance warping-invariant. Pattern Recognition, 94:35–52, 2019.

[Johansson et al., 2014] Fredrik D. Johansson, Vinay Jethava, Devdatt P. Dubhashi, and Chiranjib Bhattacharyya. Global graph kernels using geometric embeddings. In Proceedings of ICML, pages 694–702, 2014.

[Kashima et al., 2003] Hisashi Kashima, Koji Tsuda, and Akihiro Inokuchi. Marginalized kernels between labeled graphs. In Proceedings of ICML, pages 321–328, 2003.

[Mikalsen et al., 2018] Karl Øyvind Mikalsen, Filippo Maria Bianchi, Cristina Soguero-Ruíz, and Robert Jenssen. Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recognition, 76:569–581, 2018.

[Nicolis et al., 2005] G. Nicolis, A. G. Cantu, and C. Nicolis. Dynamical aspects of interaction networks. International Journal of Bifurcation and Chaos, 15:3467, 2005.

[Shervashidze et al., 2009] N. Shervashidze, S.V.N. Vishwanathan, K. Mehlhorn, T. Petri, and K. M. Borgwardt. Efficient graphlet kernels for large graph comparison. Journal of Machine Learning Research, 5:488–495, 2009.

[Shimada et al., 2008] Y. Shimada, T. Kimura, and T. Ikeguchi. Analysis of chaotic dynamics using measures of the complex network theory. In Proceedings of ICANN, pages 61–70, 2008.

[Silva et al., 2015] Filipi N. Silva, Cesar H. Comin, Thomas K. Peron, Francisco A. Rodrigues, Cheng Ye, Richard C. Wilson, Edwin R. Hancock, and Luciano da F. Costa. Modular dynamics of financial market networks. arXiv preprint arXiv:1501.05040, 2015.

[Witten et al., 2011] Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2011.

[Xu et al., 2018] Lixiang Xu, Xiaoyi Jiang, Lu Bai, Jin Xiao, and Bin Luo. A hybrid reproducing graph kernel based on information entropy. Pattern Recognition, 73:89–98, 2018.

[Ye et al., 2015] Cheng Ye, César H. Comin, Thomas K. Peron, Filipi N. Silva, Francisco A. Rodrigues, Luciano da F. Costa, Andrea Torsello, and Edwin R. Hancock. Thermodynamic characterization of networks using graph polynomials. Physical Review E, 92(3):032810, 2015.

[Zhang and Small, 2006] J. Zhang and M. Small. Complex network from pseudoperiodic time series: Topology
