
2.3 Graphlet Kernels for Large Graph Comparison

2.3.1 Graph Reconstruction

We start our exposition by summarizing the graph reconstruction conjecture and the matrix reconstruction theorem, and conclude this section by building a bridge between these reconstruction ideas and graph kernels.

Reconstruction of Graphs

Graph reconstruction is a classic open problem in graph theory [Kelly, 1957, Hemminger, 1969]: Let G = (V, E) be an undirected graph of size n. For each v ∈ V, let G_v denote the node-deleted subgraph of G, i.e., the graph obtained by deleting node v and all edges incident on it from G. Can G be reconstructed, up to isomorphism, from its set of node-deleted subgraphs {G_v}_{v∈V}? Intuitively, one asks: Given a graph G on n nodes, is G determined uniquely up to isomorphism by its subgraphs of size n−1? Put differently, are there two non-isomorphic graphs with identical subgraphs of size n−1?
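To make the definition of G_v concrete, the following minimal sketch computes all node-deleted subgraphs of a small graph. The representation (a node list plus a set of frozenset edges) and the function names node_deleted_subgraph and deck are assumptions made for illustration only; they do not come from the cited works.

```python
def node_deleted_subgraph(nodes, edges, v):
    """Return (nodes, edges) of G_v: G without node v and its incident edges."""
    remaining_nodes = [u for u in nodes if u != v]
    remaining_edges = {e for e in edges if v not in e}
    return remaining_nodes, remaining_edges


def deck(nodes, edges):
    """Return the family {G_v} for all v in V, keyed by the deleted node v."""
    return {v: node_deleted_subgraph(nodes, edges, v) for v in nodes}


# Path graph on three nodes: 0 - 1 - 2
nodes = [0, 1, 2]
edges = {frozenset({0, 1}), frozenset({1, 2})}
cards = deck(nodes, edges)
```

Deleting the middle node 1 leaves two isolated nodes, while deleting either end node leaves a single edge. The reconstruction question asks whether such a family, taken up to isomorphism, always pins down the original graph.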


Kelly [Kelly, 1957] showed that if G = (V, E) and G' = (V', E') are trees and g : V → V' is an isomorphism function such that G_v is isomorphic to G'_{g(v)} for all v ∈ V, then G is isomorphic to G'. He conjectured that the following theorem is true for arbitrary graphs:

Theorem 12 (Graph Reconstruction Conjecture) Let G and G' be graphs of size greater than 2 and g : V → V' be an isomorphism function such that G_v is isomorphic to G'_{g(v)} for all v ∈ V. Then G is isomorphic to G'.

Kelly [Kelly, 1957] verified his conjecture by enumeration of all possible graphs for 2 < n ≤ 6, which was later extended to 2 < n ≤ 11 by [McKay, 1997]. Special classes of graphs, such as regular graphs and disconnected graphs, have also been shown to be reconstructible [Kelly, 1957]. The general case, however, remains a conjecture. It is widely believed, though, that if a counterexample to the graph reconstruction problem exists, then it will be of size n > 11 [McKay, 1997].
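The enumeration argument can be illustrated for very small n. The sketch below is not the procedure used in the cited works; it is a brute-force check, written under the assumption that isomorphism classes may be identified by trying all node relabelings, which is only feasible for a handful of nodes. The names canonical_key, deck_key and reconstructible are hypothetical.

```python
from itertools import combinations, permutations


def canonical_key(n, edges):
    """Isomorphism-invariant key: smallest sorted edge list over all relabelings."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return (n, best)


def deck_key(n, edges):
    """Sorted multiset of isomorphism classes of all node-deleted subgraphs."""
    cards = []
    for v in range(n):
        # Delete v, then relabel the remaining nodes as 0..n-2.
        keep = [u for u in range(n) if u != v]
        index = {u: i for i, u in enumerate(keep)}
        sub_edges = [(index[a], index[b]) for a, b in edges if v not in (a, b)]
        cards.append(canonical_key(n - 1, sub_edges))
    return tuple(sorted(cards))


def reconstructible(n):
    """True if no two non-isomorphic graphs on n nodes share the same deck."""
    pairs = list(combinations(range(n), 2))
    classes_by_deck = {}
    for mask in range(1 << len(pairs)):
        edges = [p for i, p in enumerate(pairs) if mask >> i & 1]
        classes = classes_by_deck.setdefault(deck_key(n, edges), set())
        classes.add(canonical_key(n, edges))
    return all(len(classes) == 1 for classes in classes_by_deck.values())


results = {n: reconstructible(n) for n in (2, 3, 4)}
# results == {2: False, 3: True, 4: True}
```

The case n = 2 fails because the graph with one edge and the graph with no edge both yield two single-node subgraphs, which is why the conjecture is stated for graphs of size greater than 2.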

Reconstruction of Matrices

While graph reconstruction remains a conjecture for general graphs, reconstruction of matrices has been resolved [Manvel and Stockmeyer, 1971]. We need some terminology to make this result clearer. Let M be any n×n matrix. We call the submatrix obtained by deletion of its k-th row and k-th column the k-th principal minor, and denote it as M_k. The following theorem due to [Manvel and Stockmeyer, 1971] asserts that the principal minors determine the matrix:

Theorem 13 Any n×n matrix M with n ≥ 5 can be reconstructed from its list of principal minors {M_1, . . . , M_n}.
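As a rough illustration of why principal minors carry enough information, the sketch below rebuilds a matrix from the submatrices M_k. It assumes that the index k of every M_k is known, which may be a simpler setting than the one the theorem addresses; under this assumption n ≥ 3 already suffices, because every entry survives in at least one M_k. The function names are hypothetical and indices are 0-based.

```python
def principal_submatrix(M, k):
    """Delete row k and column k (0-based) from the square matrix M."""
    return [[x for j, x in enumerate(row) if j != k]
            for i, row in enumerate(M) if i != k]


def reconstruct(minors):
    """Rebuild M from minors[k] = M with row/column k deleted, k = 0..n-1.

    Assumption: the position of each minor in the list tells us which
    row/column was deleted.
    """
    n = len(minors)
    if n < 3:
        raise ValueError("need n >= 3 so that every entry survives in some minor")
    M = [[None] * n for _ in range(n)]
    for k, Mk in enumerate(minors):
        kept = [i for i in range(n) if i != k]  # original indices present in Mk
        for a, i in enumerate(kept):
            for b, j in enumerate(kept):
                M[i][j] = Mk[a][b]
    return M


original = [[5 * i + j for j in range(5)] for i in range(5)]
minors = [principal_submatrix(original, k) for k in range(5)]
rebuilt = reconstruct(minors)
```

Entry (i, j) is read from any M_k with k different from both i and j, so the loop simply overwrites each entry with the same value several times.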

The adjacency matrix of a graph is not invariant to reordering of the nodes, but if the graph is node-ordered, then its adjacency matrix is unique. For such graphs, the following corollary is particularly relevant:

Corollary 14 Any graph G = (V, E) of size n ≥ 5 whose nodes are ordered as v_1, . . . , v_n can be reconstructed from its set of maximal subgraphs {G_{v_1}, . . . , G_{v_n}}, if their nodes are ordered in the same order as those of G.

The condition that the nodes of all node-deleted subgraphs of G have to be ordered in the same way as those of G implies that the nodes of G must be sorted according to a global canonical vertex ordering. We will explain what we mean by a global canonical vertex ordering in the following. For this purpose, we first have to clarify two concepts: complete graph invariant (see Section 1.3.3) and canonical form.

A function f of a graph is called a complete graph invariant if G ≃ G' is equivalent to f(G) = f(G'). If, in addition, f(G) is a graph isomorphic to G, then f is called a canonical form for graphs [Koebler and Verbitsky, 2006, Gurevich, 2001]. [Gurevich, 2001] showed that graphs have a polynomial-time computable canonical form if, and only if, they have a polynomial-time computable complete invariant.


Recall that a vertex ordering π maps every node of a graph to a unique number in {1, . . . , n}. If π is invariant to isomorphism, then it defines a complete graph invariant. With some abuse of terminology, in the sequel, we will refer to such a vertex ordering as canonical. This is justified because π can indeed be used to define a canonical form for graphs. Every vertex ordering also induces a vertex ordering on the subgraphs. This is because every subset of an ordered set is also ordered. We denote this induced ordering by π_G. Note that even if π is canonical, it does not guarantee that the induced vertex orderings π_G are canonical. If every induced vertex ordering of π is also canonical, then π is said to be globally canonical.

Unfortunately, computing a global canonical vertex ordering is an NP-hard problem because, given a solution to this problem, one can solve subgraph isomorphism, an NP-complete problem, in polynomial time [Garey and Johnson, 1979]. Nevertheless, for many graphs of practical importance, it is easy to compute a global canonical vertex ordering, especially in the field of databases. If we are dealing with a graph whose nodes are distinct objects from the same database, then we can order the nodes in this graph according to their keys in the database. Ordering via database keys obviously results in a global canonical vertex ordering.
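The database-key ordering can be sketched as follows. The example assumes that keys are sortable strings and that a graph is given as a list of node keys plus a list of key pairs; the keys "p2", "p5", "p7" and the function names adjacency_by_key and induced are made up for illustration.

```python
def adjacency_by_key(node_keys, edges):
    """Adjacency matrix with nodes ordered by their (sortable) database keys."""
    keys = sorted(node_keys)
    pos = {k: i for i, k in enumerate(keys)}
    A = [[0] * len(keys) for _ in keys]
    for u, v in edges:
        A[pos[u]][pos[v]] = A[pos[v]][pos[u]] = 1
    return keys, A


def induced(keys, A, subset):
    """Induced subgraph on `subset`, keeping the order inherited from `keys`."""
    idx = [i for i, k in enumerate(keys) if k in subset]
    return [keys[i] for i in idx], [[A[i][j] for j in idx] for i in idx]


# The same graph entered twice with different node and edge orders.
keys1, A1 = adjacency_by_key(["p7", "p2", "p5"], [("p7", "p2"), ("p2", "p5")])
keys2, A2 = adjacency_by_key(["p5", "p7", "p2"], [("p5", "p2"), ("p2", "p7")])

# A subgraph inherits its node order from the key order of the full graph.
sub_keys, sub_A = induced(keys1, A1, {"p2", "p7"})
```

Both inputs yield the same key order and the same adjacency matrix, and every subgraph is ordered by the same keys, which is the property the corollary requires of the node-deleted subgraphs.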

Graph Similarity via Graph Reconstruction

Why is the graph reconstruction conjecture interesting for graph kernels? Because it deals with a question that is implicitly asked when designing graph kernels: Which substructures of a graph determine a graph up to isomorphism? If the graph reconstruction conjecture were true, this question could be answered: A graph is determined uniquely up to isomorphism by its size-(n−1) subgraphs. Although the conjecture has not been proven in general, it has been shown for certain classes of graphs, in particular for graphs with global canonical vertex ordering. We will exploit these results to define novel graph kernels in the following.