Novel approaches for solving large-scale optimization problems on graphs

A Dissertation by

SVYATOSLAV TRUKHANOV

Submitted to the Office of Graduate Studies of Texas A&M University

in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

August 2008

Approved by:

Chair of Committee, Sergiy I. Butenko

Committee Members, Lewis Ntaimo

Wilbert E. Wilhelm

Huafei Yan

Head of Department, Brett A. Peters

ABSTRACT

Novel Approaches for Solving Large-scale Optimization Problems on Graphs. (August 2008)

Svyatoslav Trukhanov, B.S., Kyiv Taras Shevchenko University;

M.S., Kyiv Taras Shevchenko University

Chair of Advisory Committee: Dr. Sergiy I. Butenko

This dissertation considers a class of closely related NP-hard optimization problems on graphs that arise in many important applications, including network-based data mining, analysis of the stock market, social networks, coding theory, fault diagnosis, molecular biology, biochemistry and genomics. In particular, the problems of interest include the classical maximum independent set problem (MISP) and maximum clique problem (MCP), their vertex-weighted versions, as well as novel optimization models that can be viewed as practical relaxations of their classical counterparts.

The concept of a clique has been a popular instrument in the analysis of networks and is, essentially, an idealized model of a “closely connected group”, or a cluster. At the same time, the restrictive nature of the definition of a clique makes the clique model impractical in many applications. This motivated the development of clique relaxation models that relax different properties of a clique. On the one hand, while still possessing some clique-like properties, clique relaxations are not as “perfect” as cliques; on the other hand, they do not exhibit the disadvantages associated with a clique. Using clique relaxations allows one to compromise between perfectness and flexibility, between ideality and reality, which is a usual issue that an engineer deals with when applying theoretical knowledge to solve practical problems in industry. The clique relaxation models studied in this dissertation were first proposed in the literature on social network analysis; however, they have not been well investigated from a mathematical programming perspective.

This dissertation considers new techniques for solving the maximum weight independent set problem (MWISP) and clique relaxation problems and investigates their effectiveness from theoretical and computational perspectives. The main results obtained in this work include (i) developing a scale-reduction approach for the MWISP based on the concept of critical set and comparing it theoretically with other approaches; (ii) obtaining theoretical complexity results for clique relaxation problems; (iii) developing algorithms for solving the clique relaxation problems exactly; (iv) carrying out computational experiments to demonstrate the performance of the proposed approaches; and, finally, (v) applying the obtained theoretical results to several real-life problems.

ACKNOWLEDGMENTS

I would like to express great thanks to Dr. Sergiy Butenko, my committee chair, for his continuous support, encouragement and patience through my doctoral research studies at Texas A&M University. From the first days he has been a patient advisor, a knowledgeable colleague and a real friend. I feel really happy to have an experience of conducting research under his supervision and would like to express my respect and gratitude to him.

Also, I would like to express my thanks to my committee members, Dr. Lewis Ntaimo, Dr. Wilbert Wilhelm, and Dr. Catherine Yan, for their guidance and support throughout the course of this research.

I would like to thank Dr. Balabhaskar “Baski” Balasundaram, Uanny Brens, Sera Kahruman, Reza Seyedshohadaie, and Oleksii Ursulenko for being wonderful colleagues and friends at Texas A&M University. Also, I would like to thank my external colleagues and collaborators, especially Dr. Vladimir Boginski from the University of Florida, Dr. Illya Hicks from Rice University and Dr. Andrew Schaefer from the University of Pittsburgh.

Thanks also go to all the faculty of the Industrial and Systems Engineering Department, for making my doctoral studies at Texas A&M University a great experience; Dr. Brett Peters and Dr. Guy Curry, for providing me with several opportunities to teach and for their guidance; the friendly staff, especially Lesley Bell, Michele Bork and Judy Meeks, for helping me with administrative issues; and the professional technical specialists Dennis Allen, Mark Henry and Mark Hopcus, for their assistance with the computational part of my research.

Also, I would like to thank all the faculty of the Computer Science Department at Kyiv Taras Shevchenko University, especially my undergraduate advisor Dr. Stavrovskiy and my graduate advisor Dr. Koval, for providing me with the base knowledge required for my doctoral study.

Special thanks go to Mr. Kovalyov and Mrs. Osinskaya, my high school teachers, who taught me the basics of mathematics and information technology and played an important role in my future career decisions.

Finally, thanks to my parents for their encouragement and making me who I am. Last, but not least, I would like to thank my wife for her great patience and love.

IV RELATIONSHIP BETWEEN THE CRITICAL SET METHOD AND OTHER SCALE-REDUCTION TECHNIQUES
IV.1. Scale Reduction in the Maximum Weight Independent Set Problem
IV.2. IP Relaxation Based Approach
IV.3. Approach Based on Roof Duality
IV.3.1. Roof Duality Essentials
IV.3.2. Roof Duality and the Maximum Weight Independent Set Problem
IV.4. Crown Structure Elimination
IV.5. Approach Based on Critical Sets
IV.6. Relation Between Approaches
IV.7. Extensions and Differences
IV.8. t-Hat Structures and Maximum Weight Independent Set

V CLIQUE RELAXATION MODELS
V.2. Complexity Results
V.3. Mathematical Programming Formulations
V.3.1. Maximum k-clique Problem
V.3.2. Maximum k-club Problem
V.3.3. Maximum k-plex Problem
V.4. Maximum 2-club Problem
V.5. Numerical Results

VI EXACT ALGORITHM FOR THE MAXIMUM WEIGHT k-PLEX PROBLEM
VI.1. Algorithm Implementation
VI.2. k-plex Verification Routine
VI.3. Preprocessing Techniques
VI.3.1. Weight Based Ordering
VI.3.2. Degree Based Ordering
VI.3.3. Coloring Based Approach
VI.4. Special Cases
VI.4.1. Maximum Weight 2-plex Problem
VI.4.2. k-plex in Sparse Graphs
VI.5. Numerical Experiments
VI.6. Comparison with Existing Approaches

VII PORTFOLIO SELECTION VIA IDENTIFYING WEIGHTED k-PLEXES IN FINANCIAL NETWORKS
VII.1. Introduction
VII.2. Problem Setup
VII.2.1. Constructing the Market Graph
VII.2.2. Weighted Market Graphs
VII.3. Computational Experiments

VIII CONCLUSION AND FUTURE WORK

REFERENCES

APPENDIX A

APPENDIX B

LIST OF TABLES

TABLE

1 Results of experiments with Sanchis graphs
2 Results of experiments with Erdős networks
3 Results of experiments with coloring problem instances
4 S. Cerevisiae. Vertices: 2114; Edges: 2203; Connected components: 417
5 H. Pylori. Vertices: 1570; Edges: 1403; Connected components: 858
6 Clique, 2-Clique, 2-Club, 3-Clique, 3-Club numbers of S. Cerevisiae and H. Pylori protein maps
7 Top 10 most used orderings
8 Parameters of calculated weighted diversified portfolios corresponding to 500-day trading periods
9 Graph parameters for small test-bed
10 Number of k-plex verification routine calls
11 Original and incremental k-plex verification routine
12 5-plex in ln2-san-100-40w
13 Running time for weight based ordering
14 Running time for Östergård ordering
15 Running time for Östergård like ordering using double neighborhood
16 Running time for defective coloring
18 Running time for weight based defective coloring
19 Running time for weight based defective coloring, reverse order
20 General and special algorithm for k = 2
21 Running time with N2 based pruning
22 DIMACS and Sanchis instances information
23 DIMACS and Sanchis instances running time
24 Real-life networks information
25 Real-life network instances running time
26 Running time comparison with McClosky’s algorithm
27 Running time comparison with Balasundaram’s algorithms

LIST OF FIGURES

FIGURE

1 College football schedule graph for season 2005
2 A graph with no 2-clans
3 College football schedule graph clustered on 4-plexes
4 Critical set to network flow reduction
5 Crown structure
6 t-hat structure
7 2-club and 2-clique example
8 Triangle free 4-plex
9 An illustration to the proof of NP-completeness of the k-Club problem, for k = 5
10 An illustration to the proof of Theorem 12 for k = 4
11 Degree distribution, in logarithmic scale, for the protein network of S. Cerevisiae, Xk is the number of vertices of degree k
12 Degree distribution, in logarithmic scale, for the protein network of H. Pylori, Xk is the number of vertices of degree k
13 Protein-protein interaction map of H. Pylori
14 A maximum 2-club and 2-clique of S. Cerevisiae
15 A maximum 2-club and 2-clique of H. Pylori
16 A maximum 3-clique and 3-club of S. Cerevisiae
18 Branching strategies in maximum weight clique algorithm
19 Market graph order over time
20 Market graph for 10 randomly chosen stocks

CHAPTER I

INTRODUCTION

Informally, a graph or a network is defined by a set of dots (vertices, nodes) and links (edges) between them. Since its first introduction in 1735 by L. Euler in his famous Königsberg Bridges problem [55, 62], the concept of a graph has served for more than 250 years as a convenient, effective, simple and easily understandable mathematical abstraction for modeling many real-life problems. In practical applications, a vertex of a graph usually represents an entity, and an edge represents the relationship of interest between two entities.

For example, in chemistry a molecule can be naturally considered as a graph, where atoms are vertices and bonds between pairs of atoms are edges. Since the valency of an atom may be greater than one, a molecule is an example of a graph with multiedges, where two vertices may be connected by more than one edge. A geographical map is also a graph, with the vertices corresponding to the cities and the edges being the roads between cities. In biology, gene co-expression networks are graphs where vertices are genes and an edge exists between two vertices if the corresponding genes are co-expressed with correlation higher than a specified threshold, and a protein interaction network is represented by a graph with the proteins as vertices and known interactions between pairs of vertices as edges. These are just two examples of many biological structures that may be modeled as graphs [16, 40].

Many applications of graphs are found in industry, such as the phone call graph constructed in the following way: each phone number is represented by a vertex, and each call placed during a specified time period defines an edge between the corresponding pair of phone numbers. Experiments with the call graph based on a 12-hour time period in 1997 are described in [3, 121], with the corresponding phone call graph having over 53 million vertices and over 170 million edges. The U.S. stock market was modeled as a graph, named the stock-market graph, in [25]. Here a vertex represents a stock, and two vertices are connected by an edge if the correlation of price fluctuations for the corresponding pair of stocks calculated over a certain period of time exceeds a specified threshold. The Internet can be modeled as a graph at a detailed level, where vertices correspond to individual computers or other networking devices that have an IP address and edges correspond to the physical links between such devices, as well as at a macro level, where nodes are autonomous systems (usually Internet service providers or large companies and organizations) and an edge represents an entry from the global routing table or is obtained by using traceroute or ping probes. The resulting networks are extremely large; even at the macro level the whole network consists of more than 100,000 nodes [9, 45].

Finally, graphs may be used to model various social phenomena. In social networks the vertices usually represent people and the edges represent a certain type of relationship between them. The well-known “Erdős Number Project” is an example of a collaboration network, where two mathematicians are connected if they have published a paper as co-authors [76]. In another example, Figure 1 shows the 2005 college football schedule graph, where the vertices are Division I-A college football teams, and two vertices are connected by an edge if the corresponding teams played each other during the considered season.

When appropriate, various attributes may be assigned to the graph’s vertices and edges. Attributes may differ in nature; e.g., in the college football schedule graph one obvious attribute of vertices is the college football team name, and a possible edge attribute is the final score of the corresponding game. In mathematical programming, only attributes with numerical values are usually considered, and they are called weights. Weight functions may be associated with vertices as well as edges.

Fig. 1 College football schedule graph for season 2005

In different studies of real-life networks, one is required to find the closely connected groups in the graph, which are also called cohesive subgroups or clusters. There are many ways to define closeness within a group, but the first model that was used in social network analysis utilized the concept of a clique. A clique is defined as a subset of vertices inducing a subgraph that has all possible edges, i.e., all vertices in a clique are connected to each other. Obviously, a clique represents a “perfectly” connected subgroup, thus it is an ideal model for cohesive subgroups. As an example, a clique in the aforementioned phone call graph represents a group of people who called each other; thus, most probably, these people are closely connected and share similar interests and preferences. The opposite of the clique structure is an independent (stable) set, which is defined as a subset of vertices with no edges in between. These two models are closely related: a clique in a graph corresponds to an independent set in the graph’s complement, so the problem of finding a clique may be reduced to the problem of finding an independent set in the complement graph and vice versa, and the two models share many properties. In terms of the stock market graph, an independent set represents a diversified portfolio, diversification being one of the key properties of portfolios sought by investors.

The problem of finding a clique of the maximum size in the given graph is called the maximum clique problem. When a graph has vertex weights, the problem can be extended to the maximum weight clique problem, which requires one to find a clique of the maximum weight in the given graph, where the weight of a clique is defined as the sum of its vertex weights. The complementary problems are the maximum independent set problem and the maximum weight independent set problem, respectively. These problems have many important applications, including network-based data mining, analysis of the stock market, social networks, coding theory, fault diagnosis, molecular biology, biochemistry and genomics. Moreover, these problems often arise as subproblems of more complicated problems, and many other combinatorial problems may be reduced to these problems. It is well known that all four of these problems are hard to solve, as they belong to the class of NP-hard problems, meaning that no polynomial-time algorithm solves them in general unless P = NP. However, in practice one still needs to find a way to solve real-life instances of these problems. Sometimes, a provably optimal solution of the problem is required, in which case an exact algorithm is applied to solve the problem. In other cases a non-optimal solution is acceptable, and a heuristic algorithm may be applied that does not guarantee the optimal solution, but provides some “good” solution, usually much faster than the exact algorithm. An efficient way to speed up the problem solving process is preprocessing, that is, an algorithm executed before the main algorithm with the aim of improving the main algorithm’s performance. Also, the properties of graphs arising in real-life problems, such as low edge density, help to solve the problems efficiently. In our research, we consider a new scale-reduction technique for solving the maximum weight independent set problem, which is particularly efficient on sparse graphs.

While the popularity of cliques in network-based studies can be explained by the fact that a clique represents an ideal “closely connected group”, due to its restrictive nature the clique model becomes impractical in many cases, and it has been criticized in the social networks literature. This motivated the development of the clique relaxation models that generalize the clique definition and relax some of the clique requirements. On the one hand, while still possessing some clique-like properties, clique relaxation models are not as “perfect” as cliques; on the other hand, they do not exhibit the disadvantages that the clique has been criticized for. Using clique relaxations allows one to compromise between perfectness and flexibility, between ideality and reality, which is a usual issue that an engineer deals with when applying theoretical knowledge to solve practical problems in industry. Using clique relaxations in social networks, such as the phone call graph, allows one to find groups of people that may not necessarily be friends, but are still closely connected. Depending on the clique relaxation model used, the way people within a group are connected may also be different, e.g., they may know each other through a third person, or they may know only some people from the group, etc. The clique relaxation models were first proposed in social network studies a rather long time ago [97], but they still have not been well investigated from a mathematical programming perspective. In our research, we emphasize three possible relaxations of the clique: the k-clique, k-club and k-plex models. Of course, these three clique relaxation models do not cover all possible ways to relax the clique requirements, but they are the most well-known models in social network analysis. First of all, we defined the optimization problems corresponding to the introduced models, provided their mathematical programming formulations and showed that the problems are NP-hard. Next, we concentrated on methods for solving these problems. We developed an exact algorithm for the maximum weight k-plex problem, as well as different scale-reduction techniques for the maximum k-clique, k-club and k-plex problems. Finally, we demonstrated the application of the k-plex model to real-life problems through extensive computational experiments.

The remainder of this dissertation is organized as follows. Chapter II provides background on the networks arising in real-world problems; introduces the required terms and definitions from graph theory; defines the optimization problems of interest; and provides the relevant literature review. Chapter III concentrates on the scale-reduction technique developed for the MWISP based on the concept of critical weight set. The theoretical results provided in this chapter are used to develop the algorithm. The efficacy of the approach is demonstrated by extensive numerical experiments with large-scale problem instances. Chapter IV investigates the relation between this technique and other scale-reduction approaches for the MWISP. Three other techniques are considered, their similarities and differences are established, and an extension to one of these approaches is proposed.

In Chapter V, we switch from cliques and independent sets to their relaxation models. We define the k-clique, k-club and k-plex formally and consider the relationship between the defined models. Next, the corresponding optimization problems are formulated, and their complexity and mathematical programming formulations are established. Finally, we concentrate our attention on one of the relatively easily solvable cases of the introduced problems, the maximum 2-club problem, and develop the corresponding algorithm. Chapter VI is dedicated to the development of the exact algorithm for the maximum weight k-plex problem. The chapter presents the general idea of the algorithm as well as implementation details and discusses possible improvements. The numerical results reported in this chapter allow one to evaluate the algorithm’s performance and show its superiority to existing approaches.

Finally, Chapter VII demonstrates the application of clique relaxation models in the real world, using instances of the market graph. The approach extends the application of combinatorial optimization methods to the market graph presented in earlier work [27]. Chapter VIII concludes this dissertation and discusses possible future work.

Some of the results presented in Chapters III and V have appeared in [39] and [16], respectively, and papers based on the results of Chapters IV, VI and VII will be submitted for publication.

All figures in this dissertation were generated using Graphviz software [75] with the dot2tex converter [57], and all plots were built using the pgfplots package [60].

CHAPTER II

BACKGROUND

This chapter introduces the definitions and notations used throughout this dissertation and provides some background information. Section II.1 discusses graph theory basics. The problem definitions, properties, complexity results and existing solution approaches are discussed in Section II.2. Finally, Section II.3 reviews selected applications of the considered problems in solving real-world optimization problems, points out some issues arising in such applications, and introduces the clique relaxation models that may address the issues raised.

II.1. Graph Theory

Let G = (V, E) be a graph with the vertex (node) set V = {1, 2, . . . , n} and the edge set E ⊆ V × V, where (i, j) ∈ E if vertices i and j are adjacent. The number of vertices of the graph, n = |V|, is called the order of the graph, and the number of edges, m = |E|, is called the size of the graph [53]. Here and later, unless specified explicitly, all considered graphs are assumed to be loopless (edges like (u, u) are not allowed), undirected (edges (u, v) and (v, u) are not distinguishable), and with no multiedges (two vertices may not be connected by more than one edge). The notations V(G) and E(G) denote the vertex and edge sets of graph G, respectively. Given the graph G, its complement graph Ḡ is the graph with the same vertex set that has all possible edges not present in G and no edges present in G. A graph is called complete if it contains all possible edges. The complete graph of order n is denoted by K_n. The subgraph induced by S ⊆ V is the graph G[S] = (S, E ∩ (S × S)) that has S as its vertex set, and its edge set contains all edges from the original graph that connect pairs of vertices from S. Given a vertex v ∈ V and a vertex subset S ⊆ V, let V − v and V − S denote the graphs induced by V \ {v} and V \ S, respectively. For a set S ⊆ V, its neighborhood N(S) is the set of all vertices of G that are adjacent to at least one vertex of S. The closed neighborhood of S is N[S] = N(S) ∪ S. If S = {s} is a single vertex set, then instead of writing N({s}) and N[{s}], we will simply write N(s) and N[s], respectively, and will speak about the node’s neighborhood. By deg_G(v) we understand the degree of vertex v in graph G, which is |N(v)|. The maximum and minimum degrees of a vertex in graph G are denoted by ∆(G) and δ(G), respectively. A bipartite graph is a graph whose vertices can be divided into two disjoint sets called bipartitions V_1 and V_2, such that every edge connects a vertex in V_1 to a vertex in V_2. The complete bipartite graph with bipartitions of sizes p and q is denoted by K_{p,q} and is defined as the bipartite graph with all possible edges between vertices from V_1 and V_2. In particular, the graph K_{1,n} is called a star. A path in a graph is a sequence of vertices such that each two consecutive vertices in the sequence are connected by an edge. A cycle is a path in which the first and the last vertices are the same. The cycle and the path on n vertices are denoted by C_n and P_n, respectively. A graph is called connected if there is a path between every pair of distinct vertices u and v in the graph. A connected component of G is defined as a maximal by inclusion connected subgraph of G.
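The definitions above map directly onto an adjacency-set representation. The following Python sketch is only an illustration and is not taken from this dissertation; the helper names (`make_graph`, `complement`, `induced_subgraph`, `neighborhood`, `closed_neighborhood`) are hypothetical and chosen for readability.

```python
def make_graph(n, edges):
    """Adjacency sets of a simple undirected graph on vertices 1..n (hypothetical helper)."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        if u == v:
            raise ValueError("loops are not allowed")
        adj[u].add(v)  # storing both directions makes (u, v) and (v, u) the same edge;
        adj[v].add(u)  # using sets rules out multiedges
    return adj


def complement(adj):
    """Complement graph: same vertices, exactly the edges missing from the original."""
    vertices = set(adj)
    return {v: vertices - adj[v] - {v} for v in adj}


def induced_subgraph(adj, S):
    """G[S]: keep only the vertices of S and the edges with both endpoints in S."""
    S = set(S)
    return {v: adj[v] & S for v in S}


def neighborhood(adj, S):
    """N(S): all vertices adjacent to at least one vertex of S."""
    result = set()
    for v in S:
        result |= adj[v]
    return result


def closed_neighborhood(adj, S):
    """N[S] = N(S) united with S."""
    return neighborhood(adj, S) | set(S)


# The path P_4 with edges 1-2, 2-3, 3-4.
path = make_graph(4, [(1, 2), (2, 3), (3, 4)])
max_degree = max(len(nbrs) for nbrs in path.values())  # Delta(G)
min_degree = min(len(nbrs) for nbrs in path.values())  # delta(G)
```

For this path, the complement contains the edges 1-3, 1-4 and 2-4, and the subgraph induced by {1, 2, 4} keeps only the edge 1-2.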

For two vertices u and v from V(G), the distance d_G(u, v) between u and v in G is the length of the shortest path between u and v, where the length of a path is measured in the number of edges in this path, i.e., is one less than the number of vertices in the sequence defining the path. The largest distance between any two vertices in G is called the diameter of G and is denoted by d(G). For graphs that are not connected, the diameter is assumed to be +∞. For a positive number k, the k-th power of graph G is the graph G^k with the same vertex set V(G) and the edge set given by E(G^k) = {(u, v) : d_G(u, v) ≤ k}.

Let w ∈ R^{|V|} (w ∈ R^{|E|}) be a vector defining weights for the graph’s vertices (edges), which will be assumed to be nonnegative, unless specified explicitly. By w(v) or w_v we denote the weight associated with vertex v, and by w(e), w_e or w((u, v)) we denote the weight of edge e = (u, v). w(S) = Σ_{i∈S} w(i) denotes the weight of a vertex or edge subset S, and w(G) = Σ_{i∈V(G)} w(i) denotes the weight of graph G, which is the sum of weights of all graph vertices.
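As a small illustration of the distance-based definitions, the sketch below computes distances with breadth-first search and derives the diameter and the k-th power from them. It assumes a graph stored as a dictionary of adjacency sets; the function names are hypothetical and the code is not part of the original work.

```python
import math
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: d_G(source, v) for every vertex v reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Largest distance between two vertices; +infinity if the graph is not connected."""
    best = 0
    for u in adj:
        dist = distances_from(adj, u)
        if len(dist) < len(adj):  # some vertex is unreachable from u
            return math.inf
        best = max(best, max(dist.values()))
    return best


def graph_power(adj, k):
    """k-th power: u and v (u != v) are adjacent whenever d_G(u, v) <= k."""
    power = {}
    for u in adj:
        dist = distances_from(adj, u)
        power[u] = {v for v, d in dist.items() if v != u and d <= k}
    return power


def weight(w, S):
    """w(S): sum of the weights of the elements of S."""
    return sum(w[i] for i in S)


# The path P_4 with edges 1-2, 2-3, 3-4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
square = graph_power(path, 2)
```

Here the diameter of the path is 3, its square joins every pair of vertices except 1 and 4, and its third power is the complete graph K_4.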

II.2. Independent Sets and Cliques

A subset I of V is called an independent (stable) set if the subgraph G[I] induced by I has no edges. A subset C of V is called a clique if the subgraph G[C] is a complete graph. An independent set (clique) I is called maximal if it cannot be extended by adding vertices from V \ I. A maximum independent set (clique) of G is an independent set (clique) of the largest cardinality in G. When speaking about vertex subsets of a graph, the terms maximal and maximum will be distinguished analogously throughout this work, i.e., maximal means maximal by inclusion, while maximum means the largest by cardinality or weight. The same is true for the terms minimal and minimum, i.e., minimal by inclusion and minimum by cardinality. The cardinality of a maximum independent set is called the independence (stability) number and is denoted by α(G). The cardinality of a maximum clique is called the clique number and is denoted by ω(G). Obviously, if I is an independent set in G then I is a clique in Ḡ and vice versa, so α(G) = ω(Ḡ). Therefore, all properties of independent sets in a graph G are also valid for cliques in the complement graph Ḡ.
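The difference between maximal and maximum, and the identity relating the independence number of a graph to the clique number of its complement, can be checked on a tiny example. The sketch below is a brute-force illustration only (exponential in the number of vertices); all names are hypothetical and the graph is assumed to be stored as a dictionary of adjacency sets.

```python
from itertools import combinations


def is_clique(adj, S):
    """Every pair of vertices of S is adjacent."""
    return all(v in adj[u] for u, v in combinations(S, 2))


def is_independent_set(adj, S):
    """No pair of vertices of S is adjacent."""
    return all(v not in adj[u] for u, v in combinations(S, 2))


def is_maximal(adj, S, predicate):
    """Maximal by inclusion: S satisfies the predicate and no single vertex can be added."""
    S = set(S)
    return predicate(adj, S) and not any(
        predicate(adj, S | {v}) for v in set(adj) - S
    )


def largest(adj, predicate):
    """Maximum by cardinality, found by trying subsets from largest to smallest."""
    for size in range(len(adj), 0, -1):
        for S in combinations(sorted(adj), size):
            if predicate(adj, S):
                return set(S)
    return set()


def complement(adj):
    vertices = set(adj)
    return {v: vertices - adj[v] - {v} for v in adj}


# The star K_{1,3}: center 1, leaves 2, 3, 4.
star = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
alpha = len(largest(star, is_independent_set))  # independence number of the star
omega_bar = len(largest(complement(star), is_clique))  # clique number of its complement
```

In the star, the set {1} is a maximal independent set (no leaf can be added), but it is not maximum: the three leaves form an independent set of cardinality 3, which is also a clique in the complement graph.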

For a vertex-weighted graph G, a maximum weight independent (stable) set is an independent set of the largest weight, and a maximum weight clique is a clique with the largest weight in this graph. The weighted clique and stability numbers are redefined correspondingly. Let us note that the notion of a maximal weight independent set (clique) has no meaning of its own; thus maximality by inclusion will be considered in exactly the same way as in the unweighted case. The maximum weight independent set problem (MWISP) may be formulated as follows: given a graph G with the vertex weights defined by vector w, find an independent set of maximum weight in G [28]. The maximum weight clique problem (MWCP) asks one to find a clique of the maximum weight in the given graph.

The independent set (clique) problem is defined as follows: given a graph G and a positive integer k, does G have an independent set (clique) of cardinality at least k? This is the decision version of an optimization problem, the maximum independent set (clique) problem. From complexity theory [66, 113], it is a well-known fact that the independent set and clique problems are NP-complete and the maximum independent set and clique problems are NP-hard [66], which means that there are no polynomial-time (also called efficient) algorithms for these problems, under the assumption that P ≠ NP. These complexity results are easily extendable to the weighted versions of the problems.

The algorithms for any combinatorial optimization problem may be classified by the type of the solution they provide. In such a classification, the algorithms that provide exact solutions belong to the class of exact algorithms. For the MWISP and MWCP, the exact algorithms ensure finding an optimal solution, but require exponential running time in the worst case (unless P = NP), since the problems are NP-hard. Heuristic algorithms, or simply heuristics, unlike the exact approaches, do not guarantee that an optimal solution will be found. In fact, they do not even guarantee that the solution obtained is close to optimal, but they are constructed with the purpose of finding sufficiently good solutions in reasonable time [6, 74]. When a solution is obtained, even if it seems to be of good quality, there is no proof that it is not arbitrarily far away from the optimum. Despite the lack of a mathematically strict conclusion about the solution, heuristics are very popular in engineering, since they may provide a practical solution when the exact solution is hard to find or, in the case of large-scale real-life problems, virtually impossible to compute due to time limitations. After such a solution is obtained, it is evaluated from the engineering point of view and may be used in real life, even without the optimality guarantee.
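The contrast between exact and heuristic algorithms can be seen on a four-vertex instance of the MWISP. The sketch below is an illustration under stated assumptions and is not an algorithm from this dissertation: the greedy rule (repeatedly pick the vertex with the largest ratio of weight to degree plus one, then delete its closed neighborhood) is one simple heuristic chosen for the example, and the exact method is plain enumeration of all vertex subsets. The function names are hypothetical.

```python
from itertools import combinations


def greedy_mwis(adj, w):
    """Heuristic: fast, always returns an independent set, but with no optimality guarantee."""
    remaining = set(adj)
    chosen = set()
    while remaining:
        # Assumed selection rule for this sketch: weight / (degree in remaining graph + 1).
        v = max(remaining, key=lambda u: w[u] / (len(adj[u] & remaining) + 1))
        chosen.add(v)
        remaining -= adj[v] | {v}  # delete the closed neighborhood N[v]
    return chosen


def exact_mwis(adj, w):
    """Exact: enumerates all subsets, so the running time is exponential in |V|."""
    best, best_weight = set(), 0
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for S in combinations(vertices, size):
            if all(b not in adj[a] for a, b in combinations(S, 2)):
                total = sum(w[i] for i in S)
                if total > best_weight:
                    best, best_weight = set(S), total
    return best


# Star K_{1,3} with a heavy center: the greedy rule is attracted to the center.
star = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
w = {1: 5, 2: 2, 3: 2, 4: 2}
heuristic_set = greedy_mwis(star, w)
optimal_set = exact_mwis(star, w)
```

On this instance the greedy rule selects the center (ratio 5/4 against 2/2 for each leaf) and stops with weight 5, while enumeration finds the three leaves with total weight 6.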

The approximation algorithms are the middle choice between the exact and heuristic ones. Unlike exact algorithms, the approximation algorithms are not guaranteed to find the exact solution, but unlike heuristics, which usually do not provide any performance guarantees, the approximation algorithms come with a bound on how far away the value of the computed solution may be from the optimal one. For ε > 0, an ε-approximation algorithm has the property that any computed approximate solution cannot have a value less (for a maximization problem) than (1 − ε) times the optimal value. Since approximation algorithms provide solutions that may not be exact, one may want to know the complexity of computing an approximate solution. This question has different answers for different problems. While for some problems, such as the traveling salesman problem [24, 124] with Euclidean distance, there exists a polynomial-time approximation scheme (PTAS), i.e., a polynomial-time ε-approximation algorithm can be found for any ε > 0 [10, 132], the maximum weight independent set and clique problems cannot be approximated within any constant factor unless P = NP [11, 83, 114].

Another important class of algorithms consists of the algorithms that compute bounds on the optimal value. Of course, any heuristic or approximation algorithm that finds a feasible solution provides a lower bound, but some algorithms may provide bounds (upper or lower) without even computing a feasible solution. Such algorithms play an important role when one needs to estimate the optimal value, either in real-life problems or as a subroutine of an exact algorithm, such as a branch-and-bound based algorithm [94, 96].

Next we review some well-known approaches for the maximum weight independent set and clique problems. One class of exact algorithms is based on mathematical programming methods. The advantage of such methods is that they make it possible to apply a wide variety of powerful integer, quadratic or non-linear programming techniques to the problem. For each vertex i ∈ V, let xi be a boolean variable. Then for a vertex set S, the vector (x1, x2, ..., xn), where xi = 1 if and only if i ∈ S, is called the characteristic vector of S. The MWISP may be formulated as an integer programming (IP) problem in many possible ways. One of the first and most well-known formulations is the edge formulation provided by Nemhauser and Trotter in 1975 [110]:

\[
\alpha(G) = \max \sum_{i \in V} w_i x_i \quad \text{s.t.} \quad x_i + x_j \le 1 \;\; \forall (i,j) \in E, \qquad x_i \in \{0,1\} \;\; \forall i \in V, \tag{2.1}
\]

where the constraints ensure that two adjacent vertices cannot both be included in the set. Another interesting IP formulation was recently found by Balasundaram [14] when investigating an IP formulation of the maximum k-plex problem:

\[
\alpha(G) = \max \sum_{i \in V} w_i x_i \quad \text{s.t.} \quad \sum_{j \in N(i)} x_j \le \deg_G(i)\,(1 - x_i) \;\; \forall i \in V, \qquad x_i \in \{0,1\} \;\; \forall i \in V. \tag{2.2}
\]

The first formulation has as many constraints as there are edges in the graph, and each constraint involves only two variables, so the corresponding constraint matrix is large and sparse. The second formulation, in contrast, has only one constraint per vertex, each involving a vertex together with all of its neighbors, so its constraint matrix is relatively small but dense.
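The relation between the two formulations can be checked on a toy instance. The following is a minimal sketch, assuming a hypothetical 5-vertex path with made-up weights; the names `feasible_edge`, `feasible_neighborhood` and `best_value` are illustrative only. It enumerates all 0/1 vectors and confirms that the constraints of (2.1) and (2.2) accept exactly the same vectors on this instance, while using a different number of rows.

```python
from itertools import product

# Hypothetical instance: the path 0-1-2-3-4 with made-up vertex weights.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
weights = [3, 4, 2, 5, 1]
n = len(weights)

neighbors = {i: set() for i in range(n)}
for i, j in edges:
    neighbors[i].add(j)
    neighbors[j].add(i)


def feasible_edge(x):
    """Constraints of (2.1): one inequality per edge."""
    return all(x[i] + x[j] <= 1 for i, j in edges)


def feasible_neighborhood(x):
    """Constraints of (2.2): one inequality per vertex."""
    return all(
        sum(x[j] for j in neighbors[i]) <= len(neighbors[i]) * (1 - x[i])
        for i in range(n)
    )


def best_value(feasible):
    """Brute force over all 0/1 vectors; only sensible for tiny graphs."""
    return max(
        sum(w * xi for w, xi in zip(weights, x))
        for x in product((0, 1), repeat=n)
        if feasible(x)
    )


same_feasible_region = all(
    feasible_edge(x) == feasible_neighborhood(x) for x in product((0, 1), repeat=n)
)
edge_rows, neighborhood_rows = len(edges), n  # number of constraints in each model
print(same_feasible_region, best_value(feasible_edge), best_value(feasible_neighborhood))
```

On a path the two row counts are close; on dense graphs the edge formulation has many more rows, which is the trade-off described above.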

A wide variety of IP techniques may be applied to such formulations of the MWISP, including the general, classical IP approaches [49, 67, 73, 109, 137, 138], as well as specialized approaches designed specifically for the MWISP. For example, Warrier et al. [133] used a branch-and-price (BP) approach to decompose the original graph into smaller subgraphs, solve the MWISP in such subgraphs (which is much easier than solving the original problem), and use the obtained information to generate columns in a BP framework for the original graph. By ignoring the integrality requirement for the decision variables, one obtains the linear programming (LP) relaxation, which provides an upper bound on the optimal value. The LP relaxation is just a linear program that may be solved efficiently using the classic LP approaches [22, 47, 51].

Other well-known mathematical programming formulations of the MWISP are quadratic boolean optimization formulations. The advantage of these formulations is the absence of the constraints on the decision variables, except for binary restrictions. One such formulation is studied by Hammer et al. [32, 33]:

\[
\alpha(G) = \max_{x \in \{0,1\}^n} \left( \sum_{i \in V} w_i x_i - W \sum_{(i,j) \in E} x_i x_j \right), \tag{2.3}
\]

where W = w(V) is the total weight of the vertices of G. Given an optimal solution x∗ to this problem, the set I = {v : x∗_v = 1} is a maximum weight independent set of G.
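The role of the penalty W in (2.3) can be seen on a small example. The sketch below assumes a hypothetical 4-vertex graph with made-up weights and uses brute-force enumeration; the function names are illustrative. It maximizes the unconstrained quadratic objective and compares the result with the best weight over independent sets.

```python
from itertools import product

# Hypothetical instance: the 4-cycle 0-1-2-3-0 with the chord (0, 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
weights = [2, 3, 4, 3]
n = len(weights)
W = sum(weights)  # W = w(V), as in (2.3)


def linear_weight(x):
    return sum(w * xi for w, xi in zip(weights, x))


def quadratic_objective(x):
    """Objective of (2.3): weight minus W for every edge with both ends chosen."""
    return linear_weight(x) - W * sum(x[i] * x[j] for i, j in edges)


def is_independent(x):
    return all(x[i] * x[j] == 0 for i, j in edges)


all_x = list(product((0, 1), repeat=n))
best_unconstrained = max(all_x, key=quadratic_objective)
best_independent = max(linear_weight(x) for x in all_x if is_independent(x))
print(best_unconstrained, quadratic_objective(best_unconstrained), best_independent)
```

Because a single violated edge costs W = w(V), no vector that selects two adjacent vertices can have a positive objective value here, so the unconstrained maximizer is the characteristic vector of an independent set.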

In addition, there exist numerous continuous optimization formulations of the maximum independent set problem, one of which was provided in [14]:

\[
\alpha(G) = \max_{x \in [0,1]^n} \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j}, \tag{2.4}
\]

where N(i) = {v : (i, v) ∈ E} is the neighborhood of vertex i. Note that, unlike the two previous formulations, this formulation allows one to find the size of the maximum independent set α(G) but not the set itself. However, any global maximum of this formulation corresponds to a subset of vertices inducing a subgraph whose connected components are cliques (the so-called independent union of cliques), and a maximum independent set can be obtained by taking one vertex from each such clique. More information on non-linear programming approaches for the MWISP may be found in [36, 68, 108, 115].
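The remark about independent unions of cliques can be illustrated by evaluating the objective of (2.4) at a few 0/1 points. This is a minimal sketch on a hypothetical 4-vertex graph (a triangle with a pendant vertex, for which α(G) = 2); it evaluates the objective exactly with rational arithmetic and does not attempt to optimize over [0,1]^n.

```python
from fractions import Fraction

# Hypothetical instance: triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}


def continuous_objective(x):
    """Objective of (2.4), evaluated exactly with rational arithmetic."""
    return sum(
        Fraction(x[i]) / (1 + sum(Fraction(x[j]) for j in neighbors[i]))
        for i in neighbors
    )


independent_point = continuous_objective([1, 0, 0, 1])   # the independent set {0, 3}
clique_union_point = continuous_objective([1, 1, 0, 1])  # edge {0, 1} plus vertex 3
all_ones_point = continuous_objective([1, 1, 1, 1])
print(independent_point, clique_union_point, all_ones_point)
```

Both the independent set {0, 3} and the vertex set {0, 1, 3}, which induces an edge and an isolated vertex, give the value 2 = α(G); picking one vertex from each clique of the latter recovers an independent set of size 2.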

All known exact combinatorial algorithms for solving the problems of interest have exponential time complexity. Many such algorithms utilize branch-and-bound ideas. Based on a simple heuristic strategy, Carraghan and Pardalos [41] designed one of the best-known exact algorithms for the maximum clique problem, which performs particularly well on sparse graphs. Using a different idea, which is also heuristic in nature, Östergård developed an algorithm for the maximum weight clique problem that is considered the state of the art at the moment [111, 112]. The benefit of the combinatorial algorithms is their higher performance compared to the IP approach, which is due to their narrow specialization.

A wide variety of heuristic techniques has been successfully employed to solve the maximum weight independent set and clique problems [118]. Even simple local search heuristics [2] may be applied to the problem [21]. Tabu search [69–71] was applied to the problems of interest in [63, 128]; Feo and Resende [58, 59] proposed the Greedy Randomized Adaptive Search Procedure (GRASP) for the maximum independent set problem; simulated annealing, introduced in [1, 92], is also applicable to these problems [29, 30]. Finally, neural networks [84, 143] and genetic algorithms [72] were successfully used for solving the maximum independent set problem in [87, 88] and [31, 82, 101], respectively. In addition to the classic approaches, there are many special heuristics developed for the problem, such as the relatively new but very promising Global Equilibrium Search by Shylo et al. [116, 126].

The importance and popularity of the MWISP explain why so many approaches have been developed for this problem; at the same time, the problem also serves as a challenging benchmark for evaluating the performance of general-purpose approaches.

The methods described above may be applied to all instances of the MWISP and, due to the NP-hardness of the problem, they have exponential time complexity. From the practical point of view, this means excessively long running times in general. However, in many cases, the graphs of interest have some properties that may be exploited in solving the maximum independent set problem. Hence, many studies of the MWISP are restricted to special classes of graphs. In some cases, such studies include rigorous theoretical results and provide exact algorithms (for example, Minty showed that the maximum independent set problem can be solved in polynomial time on claw-free graphs and provided an algorithm for finding a maximum independent set that is applicable to this graph class only [106]). Other studies provide theoretical results without practical algorithms (e.g., Hunt et al. [85] established the existence of a polynomial-time approximation scheme for the maximum independent set problem in unit disk graphs, but presented no algorithm that can utilize this fact to solve the problem faster). And, finally, many results have been justified by numerical study only, without any theoretical proof (e.g., based on numerical experiments, it is known that the Carraghan-Pardalos [41] and Östergård [111, 112] algorithms perform extremely well on many instances, but there are no additional theoretical results related to the complexity of the algorithms, except that they have exponential complexity in general).

Preprocessing may be defined as an optional additional step that is performed before the main algorithm or one of its iterations is executed. Preprocessing could either reduce the size of the input instance (this is sometimes referred to as kernelization [43, 46]), or speed up the main algorithm [111]. Such techniques are usually very specific to a particular instance and algorithm. Preprocessing procedures exist both for combinatorial algorithms [39, 43, 46] and for approaches based on mathematical programming formulations [32, 33, 110].

II.3. Real Life Networks and Clique Relaxations∗

The study of biological networks and other complex networks such as the Internet and the World Wide Web, introduced earlier in Chapter I, has received special attention from scientists because of their interesting properties and the information they hold. In this respect, the concept of scale-free networks [8, 18] is a recent development. It has been observed that the degree distributions of a large number of such complex networks follow a power law. As a consequence, the average degree is no longer representative: a majority of the nodes have few neighbors, while a small number of nodes have very high degrees.

The principle of preferential attachment, which suggests that new nodes have a higher probability of linking to nodes that already have a high degree, is used to explain the power-law degree distribution of such scale-free graphs. In addition, these networks are also hierarchical in the sense that they can be partitioned into a collection of functional modules. Analysis of several biological networks provides strong evidence that biological networks are both scale-free and modular. Identifying large clusters or functional modules in biological networks can aid different objectives depending on the nature of these networks. Clique models have been most popular in this area as they represent tight clusters in a network. Cliques have been used to cluster gene co-expression networks [90, 119]. Cliques and high density subgraphs have also been used to cluster protein interaction networks in [65, 129]. However, clique models could be overly restrictive in describing clusters in such networks. Graph theoretic clique relaxations that are used in social network analysis for identifying cohesive subgroups can provide interesting insights into these networks and provide more information than what is revealed by cliques. Relaxing the restrictions imposed by clique models could reveal new protein interactions. In particular, structures where interactions of proteins occur through a central protein, which are likely to be found in similar biological processes, can be identified [13] by the models suggested in this paper.

∗ Parts of this section are reprinted with permission from Balasundaram, B., Butenko, S., Trukhanov, S.: Novel approaches for analyzing biological networks. Journal of Combinatorial Optimization 10(1), 23–39 (2005) © Springer.

Besides biological networks, cohesive subgroups can be used to cluster airline networks where reachability is a critical issue. An important classical application of cohesive subgroups is the study of terrorist and other criminal networks [12, 42, 52]. More recently, these models have been used to study web graphs in Internet research [130] to facilitate organization and faster retrieval of information from the web. These approaches have also been used in clustering wireless networks [93] and for other graph based data mining applications [48, 61, 134].

Among other applications, cliques are often used to represent clusters of similar elements. For example, in social networks, a clique represents a group of people such that any two of them have a certain kind of relationship (friendship, acquaintance, etc.) with each other [107]. In fact, some of the earliest works addressing the concept of cliques and methods of their detection were motivated by applications in sociometry [78, 97, 98]. Social network analysis requires three properties in a cohesive subgroup model: familiarity, reachability and robustness, which translate to degree, distance/diameter and connectivity in graph theory. A clique is ideal with respect to these three properties: it provides the maximum possible familiarity (degree) among clique members; it has the smallest possible pairwise distance between members (the best reachability), as well as the smallest possible diameter of the whole subgroup; finally, a clique has the maximum possible robustness (connectivity).

The clustering problems studied in this dissertation deal with relaxations of the idea of a clique in which the requirement that any two vertices be adjacent is replaced with a weaker condition on the distance between them. We first state the corresponding definitions of k-clique, k-clan and k-club as they originally appeared in the literature. We then point out some drawbacks of these definitions and modify them according to standard definitions of similar concepts in graph theory. It is not surprising that the clustering concepts of interest first appeared in studying cohesive subgroups in social networks, where the vertices correspond to actors in a social network and an edge indicates a relationship between two actors [135].

Luce [97] defines a k-clique of G as a subset of vertices C ⊆ V such that d_G(u, v) ≤ k for all u, v ∈ C and this subset is maximal by inclusion. In other words, a k-clique C is a set of vertices in which any two vertices are at distance at most k from each other in G, and no vertex outside C is within distance k of every vertex in C. Thus, if two vertices u, v ∈ V belong to a k-clique C, then d_G(u, v) ≤ k; however, this does not imply that d_{G(C)}(u, v) ≤ k, where G(C) is the subgraph induced by C. Hence, the concept of k-clique lacks the requirement of tightness in the group corresponding to the vertices of a k-clique, while such a requirement is essential to applications in social networks. This observation motivated Alba [7] to introduce the concept of a sociometric clique, which was later renamed k-clan by Mokken [107]. A k-clique C is called a k-clan if the diameter of the induced subgraph G(C) is no more than k. Finally, Mokken [107] defines a k-club to be a maximal (by inclusion) subset of vertices D ⊆ V such that the diameter of the induced subgraph G(D) is at most k. A study of relations between cliques, clans and clubs in a graph can be found in [107].
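The difference between measuring distances in G and in the induced subgraph can be made concrete with a short sketch. The code below is an illustration only: it checks just the distance conditions and deliberately ignores the maximality-by-inclusion requirement of the original definitions, and the function names and the 5-cycle example are assumptions made for this illustration.

```python
from collections import deque


def bfs_distances(adj, source, allowed=None):
    """Shortest-path lengths from source, optionally restricted to `allowed` vertices."""
    allowed = set(adj) if allowed is None else set(allowed)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in allowed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pairwise_within(adj, subset, k, induced):
    """True if all pairs in subset are within distance k (in G or in G(subset))."""
    allowed = subset if induced else None
    for u in subset:
        dist = bfs_distances(adj, u, allowed)
        if any(dist.get(v, float("inf")) > k for v in subset):
            return False
    return True


def is_k_clique(adj, subset, k):
    return pairwise_within(adj, subset, k, induced=False)  # distances in G


def is_k_club(adj, subset, k):
    return pairwise_within(adj, subset, k, induced=True)   # distances in G(subset)


# Hypothetical example: the 5-cycle 0-1-2-3-4-0.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
C = {0, 1, 2, 3}
print(is_k_clique(cycle5, C, 2), is_k_club(cycle5, C, 2))
```

In the 5-cycle, vertices 0 and 3 are at distance 2 in G through vertex 4, but at distance 3 in the subgraph induced by {0, 1, 2, 3}, so the set satisfies the 2-clique distance condition without satisfying the diameter condition of a 2-clan or 2-club.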

Even though the concepts just defined are used quite extensively in social network analysis and are even covered in standard textbooks (see, e.g., [135]), their definitions have some deficiencies from the mathematical viewpoint. One considerable drawback of the k-clan definition is that for some graphs a k-clan may not exist. This point is illustrated in Figure 2, which shows a graph with two 2-cliques, {1,2,3,4,5,6,7} and {1,2,3,5,6,7,8}, neither of which is a 2-clan.

Fig. 2 A graph with no 2-clans

Some other difficulties arise from the requirement of maximality (by inclusion) in all three definitions. In particular, this requirement makes checking whether a given subset of vertices is a k-club a nontrivial matter. Indeed, to check that C is a k-clique, it suffices to show that there is no vertex outside C that could be added to C without violating the requirement that all pairwise distances between vertices do not exceed k. A similar criterion would not work for a k-club, however, since in this case maximality by inclusion is not equivalent to the nonexistence of a single vertex that could be added to increase the size of the k-club [107].


Taking into account that the above definitions of 1-clique, 1-clan and 1-club all correspond to the standard definition of a maximal clique, we proposed to modify the definitions of k-clique and k-club accordingly [16]. Namely, by a k-clique of a graph G = (V, E) we will mean a subset of vertices C such that d_G(u, v) ≤ k for any u, v ∈ C. Similarly, by a k-club we will understand a subset of vertices D such that diam(G(D)) ≤ k. A similar definition of k-clan becomes redundant. The example in Figure 2 suggests the impracticality of such a concept, so we do not consider k-clans in the further discussion.

Finally, a degree-based relaxation, known as a k-plex, was defined by Seidman and Foster [125]. A k-plex is a subset of vertices S such that, for each vertex v ∈ S, the degree of v in the induced subgraph satisfies deg_{G[S]}(v) ≥ |S| − k. If k = 1, then the k-plex, like the previous relaxations, corresponds to the standard definition of a clique.
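The degree condition of a k-plex translates directly into code. The following is a minimal sketch; the function name `is_k_plex` and the example graph (a complete graph on four vertices with one edge removed) are assumptions made for illustration.

```python
def is_k_plex(adj, subset, k):
    """Check deg_{G[S]}(v) >= |S| - k for every v in S."""
    s = set(subset)
    return all(len(s & set(adj[v])) >= len(s) - k for v in s)


# Hypothetical example: K4 on {0, 1, 2, 3} with the edge (0, 1) removed.
adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
S = {0, 1, 2, 3}
print(is_k_plex(adj, S, 1), is_k_plex(adj, S, 2), is_k_plex(adj, {0, 2, 3}, 1))
```

The whole vertex set is not a 1-plex (it is not a clique), but it is a 2-plex, since each vertex misses at most one other member; the triangle {0, 2, 3} is a 1-plex, i.e., a clique.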

To demonstrate the advantage of the clique relaxation models over the regular clique, consider as an example the college football schedule graph shown in Figure 1. Even though this graph has a maximum clique of order 9, this does not provide much useful information about the graph structure. On the other hand, Figure 3 presents the same graph, with vertices grouped according to the maximal 4-plexes found in the graph. In such a representation, one may easily observe the structure of the college football schedule, which corresponds exactly to the conference structure: all teams are divided into 11 conferences of 8 to 12 teams each, and there are 4 independent teams that do not belong to any conference.

Fig. 3 College football schedule graph clustered on 4-plexes

An interesting question about cohesive subgroups is what happens if one or more members leave the subgroup: will the subgroup preserve its structure? For the clique, independent set, k-clique and k-plex this is true, but not for the k-club. Formally, a graph property Π is called hereditary on induced subgraphs if, whenever G is a graph with property Π, the deletion of any nodes does not produce a graph violating Π. Property Π is nontrivial if it is true for a single-node graph and is not satisfied by all graphs. Finally, a property is interesting if there are arbitrarily large graphs satisfying Π. The maximum Π problem asks to find the maximum (or maximum weight) induced subgraph that does not violate property Π. Yannakakis [140] obtained a general complexity result for such properties Π, which can be restated under our definitions as follows.

Theorem 1 (Yannakakis, 1978). The maximum Π problem for nontrivial, interesting graph properties that are hereditary on induced subgraphs is NP-hard.
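The notion of heredity used in the theorem can be probed on a single small instance. The sketch below is not a proof; it merely enumerates all nonempty subsets of one vertex set in a hypothetical star graph and reports whether the diameter condition of a 2-club and the degree condition of a 3-plex survive every vertex deletion. All names are illustrative.

```python
from collections import deque
from itertools import combinations


def induced_diameter(adj, subset):
    """Diameter of G[subset]; infinity if the induced subgraph is disconnected."""
    s = set(subset)
    worst = 0
    for src in s:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in s and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(s):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst


def is_k_club(adj, subset, k):
    return induced_diameter(adj, subset) <= k


def is_k_plex(adj, subset, k):
    s = set(subset)
    return all(len(s & set(adj[v])) >= len(s) - k for v in s)


def holds_for_all_subsets(adj, subset, check):
    """True if every nonempty subset of `subset` passes `check`."""
    items = sorted(subset)
    return all(
        check(adj, set(c))
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    )


# Hypothetical example: a star with center 0 and leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
S = {0, 1, 2, 3}
club_survives = holds_for_all_subsets(star, S, lambda a, s: is_k_club(a, s, 2))
plex_survives = holds_for_all_subsets(star, S, lambda a, s: is_k_plex(a, s, 3))
print(is_k_club(star, S, 2), club_survives, plex_survives)
```

The star is a 2-club, but deleting its center leaves the leaves pairwise disconnected, so the 2-club condition is lost; the 3-plex condition, in contrast, holds for every subset in this example.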


Clearly, in social network studies one is interested in nontrivial and interesting properties only, so it may seem that being hereditary on induced subgraphs is the source of the hardness of the problem. But one should not think that if a property is not hereditary on induced subgraphs, then the problem becomes easier. In fact, the k-club is not hereditary on induced subgraphs, yet even verifying maximality by inclusion (which is usually easier than finding the maximum subgraph) is a nontrivial problem. Even though this is not stated explicitly, cohesive subgroups are intended to be connected. Cliques and k-clubs are always connected, but k-cliques and k-plexes may not be. For example, a graph consisting of two disjoint cliques of order k−1 should be considered as two separate cliques rather than as a k-plex. Accordingly, a restricted version of the problem, the maximum connected Π problem, may be formulated as follows: given a graph, find a maximum connected subgraph with the property Π. The effect of connectivity was also considered by Yannakakis [141], and the corresponding result may be stated as follows.

Theorem 2 (Yannakakis, 1979). The maximum connected Π problem for graph properties that are hereditary on induced (connected) graphs, nontrivial and interesting on connected graphs is NP-hard.

Admittedly, these two complexity results do not cover all possible definitions of cohesive subgroups, but they suggest that problems of finding cohesive subgroups are not easy to solve in general.

All clique relaxation models considered above were based on relaxing the requirements on vertex properties of a clique. There are also models based on relaxing edge properties of a clique. A clique has the maximum possible number of edges between its members, which is equal to n(n−1)/2 for a clique of order n. For a given 0 ≤ γ ≤ 1, the authors of [4] defined the γ-clique, or quasi-clique, as a vertex set S ⊆ V(G) such that the graph G[S] is connected and has edge density at least γ, where the edge density is defined in the natural way as |E(G[S])|/(|S|(|S|−1)/2). There is some confusion in terminology, since other authors define a quasi-clique based on vertex degrees. For example, [117] defines a γ-complete graph as a connected graph G such that every vertex in the graph has degree at least γ(|V(G)|−1). They define a γ-quasi-clique as a maximal by inclusion γ-complete subgraph of G. For γ = 1, both definitions reduce to the regular clique. Both variants of the quasi-clique model are widely used in biological and social network analysis, along with many other models that are out of the scope of this dissertation [44, 56, 144].
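The two quasi-clique notions can disagree on the same vertex set. The following sketch, with illustrative function names and a hypothetical 5-vertex graph, checks the density-based condition and the degree-based condition side by side; the maximality requirement of the second definition is not checked.

```python
from collections import deque


def is_connected(adj, subset):
    s = set(subset)
    if not s:
        return True
    start = next(iter(s))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in s and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == s


def edge_density(adj, subset):
    """|E(G[S])| / (|S|(|S|-1)/2); a single vertex is treated as density 1."""
    s = set(subset)
    edges_inside = sum(1 for u in s for v in adj[u] if v in s) // 2
    pairs = len(s) * (len(s) - 1) // 2
    return edges_inside / pairs if pairs else 1.0


def is_density_quasi_clique(adj, subset, gamma):
    return is_connected(adj, subset) and edge_density(adj, subset) >= gamma


def is_gamma_complete(adj, subset, gamma):
    s = set(subset)
    enough_degree = all(len(s & set(adj[v])) >= gamma * (len(s) - 1) for v in s)
    return is_connected(adj, s) and enough_degree


# Hypothetical example: K4 on {0, 1, 2, 3} plus a pendant vertex 4 attached to 3.
adj = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4}, 4: {3}}
S = {0, 1, 2, 3, 4}
print(edge_density(adj, S), is_density_quasi_clique(adj, S, 0.6), is_gamma_complete(adj, S, 0.6))
```

Here the set has 7 of the 10 possible edges, so it passes the density test for γ = 0.6, but the pendant vertex has degree 1 < 0.6 · 4, so the degree-based test fails.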

Even though the concept of cohesive subgroups is borrowed from social network analysis, these ideas are applicable to any network, and finding these cohesive subgroups can reveal several important structural aspects of the networks. Despite a number of important practical applications, the combinatorial optimization problems concerned with finding large k-cliques, k-clubs and k-plexes have not been well studied analytically or computationally. In fact, little has been known about the complexity aspects of such problems, and mathematical programming approaches to these problems are in their infancy.


CHAPTER III

SCALE REDUCTION APPROACH FOR THE MAXIMUM WEIGHT

INDEPENDENT SET PROBLEM∗

In this chapter, we develop a method that utilizes the polynomially solvable critical weight independent set problem for solving the maximum weight independent set problem on graphs with a nonempty critical weight independent set. Section III.1 provides definitions and background on critical weighted sets and critical weight independent sets. Next, in Section III.2, we establish the relationship between critical independent and maximum independent sets that allows us to develop the algorithm in Section III.3. The effectiveness of the proposed approach on large graphs with large independence number is demonstrated through extensive numerical experiments in Section III.4.

III.1. Critical and Critical Independent Sets

Let G = (V, E) be a simple undirected graph with vertex weights w : V → R+. Zhang [142] introduced critical sets and critical independent sets as follows:

A vertex set Uc ⊆ V is called critical if

µc(G) = |Uc| − |N(Uc)| = max{|U| − |N(U)| : U ⊆ V}. (3.1)

The number µc(G) is called the critical number of G. An independent set Ic ⊆ V is called critical if

αc(G) = |Ic| − |N(Ic)| = max{|I| − |N(I)| : I is an independent set of G}. (3.2)

The corresponding number αc(G) is called the critical independence number of G. Later, Ageev [5] generalized these definitions to graphs with vertex weights. A vertex set Uc ⊆ V is called critical weighted if

µc(G) = w(Uc) − w(N(Uc)) = max{w(U) − w(N(U)) : U ⊆ V}. (3.3)

An independent set Ic ⊆ V is a critical weight independent set if

αc(G) = w(Ic) − w(N(Ic)) = max{w(I) − w(N(I)) : I is an independent set of G}. (3.4)

The main result provided by Ageev [5] is that the problems of finding a critical weighted set and a critical weight independent set are polynomially solvable; moreover, αc(G) = µc(G). Obviously, αc(G) ≤ µc(G). Let Uc be a critical weighted set in G, so µc(G) = w(Uc) − w(N(Uc)). Let A ⊆ Uc be the set of non-isolated vertices of G[Uc]. Then A ⊆ N(Uc) and hence A ⊆ Uc ∩ N(Uc). Since the vertices of Uc \ A have no neighbors in Uc, we also have N(Uc \ A) ⊆ N(Uc) \ A, and in this case

w(Uc \ A) − w(N(Uc \ A)) ≥ w(Uc) − w(A) − (w(N(Uc)) − w(A)) = w(Uc) − w(N(Uc)) = µc(G).

On the other hand, Uc is a critical weighted set, so w(Uc \ A) − w(N(Uc \ A)) = µc(G); moreover, Uc \ A is independent by construction, so Ic = Uc \ A is a critical weight independent set of G and αc(G) = µc(G). This proof also provides a way to find a critical weight independent set from a known critical weighted set: just take all isolated vertices in G[Uc], and the resulting set will be a critical weight independent set of G.

∗ Parts of this chapter are reprinted with permission from Butenko, S., Trukhanov S.: Using critical sets for the maximum independent set problem solving. Operations Research Letters 35(4), 519–524 (2007) © Elsevier B.V.

To show that the problem of finding a critical weighted set is solvable in polynomial time, first consider its IP formulation:

\[
\mu_c(G) = \max \sum_{v \in V} w_v x_v - \sum_{v \in V} w_v y_v \quad \text{s.t.} \quad y_v \ge x_u \;\; \forall (u,v) \in E, \qquad x_v, y_v \in \{0,1\} \;\; \forall v \in V. \tag{3.5}
\]

The formulation means that we try to maximize the difference between the weights of two vertex subsets such that a vertex from the second subset is chosen whenever at least one of its neighbors is chosen in the first subset. Ageev has shown that the parts x∗ and y∗ of an optimal solution correspond to the critical weighted set and its neighborhood, respectively. Next, let us make the substitution z_v = 1 − y_v for all v ∈ V; then problem (3.5) transforms to

\[
\mu_c(G) = -w(V) + \max \sum_{v \in V} w_v x_v + \sum_{v \in V} w_v z_v \quad \text{s.t.} \quad x_u + z_v \le 1 \;\; \forall (u,v) \in E, \qquad x_v, z_v \in \{0,1\} \;\; \forall v \in V, \tag{3.6}
\]

which is, up to the additive constant −w(V), an IP formulation for an instance of the maximum weight independent set problem on B(G), the so-called bidual graph of G, whose vertex set is V(B(G)) = V ∪ V′, where V′ is a copy of the vertex set V, and whose edge set is E(B(G)) = {(u, v′), (u′, v) : (u, v) ∈ E(G)}. Simply speaking, B(G) is a bipartite graph, each of the two parts of which is a copy of the vertex set of the original graph, and an edge between vertices from different parts exists if and only if the corresponding vertices in the original graph are adjacent. The problem of finding a maximum weight independent set in a bipartite graph is known to be polynomially solvable [66] by reduction to the maximum flow (minimum cut) problem on B′(G), which is a directed graph with edge weights. B′(G) is obtained from B(G) by adding artificial vertices s (source) and s′ (sink), connecting s to all vertices from V, connecting all vertices from V′ to s′, and directing all edges of B(G) from V to V′. The weights on edges (upper limits for flow) are equal to w_v for edges {(s, v) : v ∈ V} and {(v′, s′) : v′ ∈ V′}, while edges from V to V′ have no limits (i.e., their weights are equal to +∞). Vertices that are not incident to the edges of the minimum cut in B′(G) form a maximum weight independent set in B(G) [77] and thus provide a way to find a critical weighted set in the original graph G. This is not necessarily the best way to find a critical weighted set, but it establishes the complexity result and plays an important role in further theoretical research. Figure 4 shows the original graph and illustrates the corresponding network flow problem on B′(G).
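The reduction to a network flow problem can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation used in this work: it builds a network in the spirit of B′(G) with the capacities described above, runs a plain Edmonds-Karp augmenting-path scheme (a technique chosen here for brevity), and reads a critical weighted set off the source side of a minimum cut, following the standard closure-style interpretation of problem (3.5). The function names and the star instance are hypothetical, and the result is cross-checked by brute force.

```python
from collections import deque
from itertools import combinations


def neighborhood(adj, subset):
    return set().union(*(adj[v] for v in subset))


def set_value(adj, w, subset):
    """w(U) - w(N(U)) for a vertex set U."""
    return sum(w[v] for v in subset) - sum(w[v] for v in neighborhood(adj, subset))


def critical_weighted_set(adj, w):
    """Critical weighted set from a minimum cut (Edmonds-Karp augmenting paths)."""
    source, sink, inf = "s", "s'", float("inf")
    cap = {}

    def add_arc(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # residual arc

    for v in adj:
        add_arc(source, ("V", v), w[v])       # s -> v with capacity w_v
        add_arc(("V'", v), sink, w[v])        # v' -> s' with capacity w_v
        for u in adj[v]:
            add_arc(("V", v), ("V'", u), inf)  # uncapacitated arc from V to V'

    while True:
        parent = {source: None}
        queue = deque([source])
        while queue:  # breadth-first search in the residual network
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= delta
            cap[v][u] += delta

    # Copies in V still reachable from the source form a critical weighted set.
    return {v for v in adj if ("V", v) in parent}


def brute_force_mu(adj, w):
    vertices = list(adj)
    return max(
        set_value(adj, w, subset)
        for r in range(len(vertices) + 1)
        for subset in combinations(vertices, r)
    )


# Hypothetical instance: a star with center 0 (weight 4) and three leaves of weight 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
w = {0: 4, 1: 3, 2: 3, 3: 3}
U_c = critical_weighted_set(star, w)
print(U_c, set_value(star, w, U_c), brute_force_mu(star, w))
```

For the star, the three leaves have total weight 9 and their only neighbor has weight 4, so the leaves form a critical weighted set with value 5, which matches the exhaustive search.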

III.2. Relation Between Critical Independent and Maximum Independent Sets

This section establishes the results relating the critical weighted independent set problem to the maximum weight independent set problem. In particular, the main theorem states that any critical weighted independent set is a subset of a maximum weight independent set. This fact is then used to develop a scale-reduction procedure for the maximum independent set problem in graphs with a nonempty critical independent set.

Lemma 3. If Ic is a critical weighted independent set and a maximal independent set, then Ic is a maximum weight independent set.

Proof. Since Ic is a maximal independent set, we have N(Ic) = V \ Ic. Assume that there exists an independent set I with weight w(I) > w(Ic). Then w(V \ I) < w(V \ Ic) and

w(I) − w(N(I)) ≥ w(I) − w(V \ I) > w(Ic) − w(V \ Ic) = w(Ic) − w(N(Ic)),

which contradicts the fact that Ic is a critical weighted independent set.

Fig. 4 Critical set to network flow reduction: (a) original graph G; (b) graph B′(G) for the network flow problem

Corollary 1. If Ic is a critical weighted independent set and a maximal independent set, then w(Ic)≥w(V)/2.

Proof. Since Ic is a maximal independent set, N(Ic) = V \ Ic, so the claim follows from the non-negativity of w(Ic) − w(N(Ic)) = 2w(Ic) − w(V).

Theorem 4. If Ic is a critical weighted independent set, then there exists a maximum weight independent set I, such that Ic ⊆I.

Proof. Let J be a maximum weight independent set in G, and let Ic be a critical weighted independent set. Put

I = (J ∪ Ic) \ N(Ic).

Then I is an independent set, and Ic ⊆ I. To prove that I is a maximum weight independent set, it suffices to show that w(I) ≥ w(J). Assume that w(I) < w(J); then w(J \ I) > w(I \ J). Since I \ J = Ic \ J and J \ I = N(Ic) ∩ J, we obtain

w(N(Ic) ∩ J) > w(Ic \ J).

Using the last inequality and the inequality

w(N(Ic)) ≥ w(N(Ic) ∩ J) + w(N(Ic ∩ J)),

we have

w(Ic) − w(N(Ic)) = w(Ic \ J) + w(Ic ∩ J) − w(N(Ic))
< w(N(Ic) ∩ J) + w(Ic ∩ J) − w(N(Ic) ∩ J) − w(N(Ic ∩ J))
= w(Ic ∩ J) − w(N(Ic ∩ J)).

We obtain a contradiction with the fact that Ic is a critical weighted independent set.
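The statement of Theorem 4 and the construction used in its proof can be checked exhaustively on tiny graphs. The sketch below is only a sanity check under assumptions: both instances are hypothetical, unit weights are used, and all sets are found by enumeration, so it scales to a handful of vertices at most.

```python
from itertools import combinations


def check_theorem_4(adj, w):
    """Brute-force check of Theorem 4 and of I = (J ∪ Ic) \\ N(Ic) on a small graph."""

    def neighborhood(subset):
        return set().union(*(adj[v] for v in subset))

    def weight(subset):
        return sum(w[v] for v in subset)

    independent_sets = [
        set(c)
        for r in range(len(adj) + 1)
        for c in combinations(sorted(adj), r)
        if not set(c) & neighborhood(c)
    ]
    best_gap = max(weight(I) - weight(neighborhood(I)) for I in independent_sets)
    critical_sets = [
        I for I in independent_sets if weight(I) - weight(neighborhood(I)) == best_gap
    ]
    best_weight = max(weight(I) for I in independent_sets)
    maximum_sets = [I for I in independent_sets if weight(I) == best_weight]

    # Every critical independent set is contained in some maximum one ...
    extends = all(any(Ic <= J for J in maximum_sets) for Ic in critical_sets)
    # ... and the set built in the proof is itself a maximum weight independent set.
    construction_ok = all(
        ((J | Ic) - neighborhood(Ic)) in maximum_sets
        for Ic in critical_sets
        for J in maximum_sets
    )
    return critical_sets, maximum_sets, extends, construction_ok


# Hypothetical instance: leaves 1 and 2 hang off vertex 0, vertex 0 is joined
# to vertex 3, and {3, 4, 5} is a triangle; all weights are 1.
graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4, 5}, 4: {3, 5}, 5: {3, 4}}
result = check_theorem_4(graph, {v: 1 for v in graph})

# A single edge, where every independent set is critical (the maximum gap is 0).
edge = {0: {1}, 1: {0}}
edge_result = check_theorem_4(edge, {0: 1, 1: 1})
print(result, edge_result)
```

In the first instance the only critical independent set is {1, 2}, and it is contained in each of the three maximum independent sets; in the single-edge instance the construction turns J = {1} and Ic = {0} into the maximum independent set {0}.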

Theorem 4 states that a critical weighted independent set is always a part of some maximum weight independent set; therefore, a nonempty critical weighted independent set can be used to reduce the number of vertices to be analyzed when solving the maximum weight independent set problem. The following two lemmas provide some elementary properties of critical weighted sets and critical weighted independent sets in a graph that will be used in the reduction algorithm.

Lemma 5. Let U be a critical weighted set of the simple undirected graph G = (V, E). Then U′ = U ∪ W, where W = V \ (U ∪ N(U)), is also a critical weighted set of G and U′ ∪ N(U′) = V.

Proof. By the definition of W, U ∩ W = ∅ and N(U) ∩ W = ∅, hence N(W) ⊆ V \ U ⊆ W ∪ N(U). Since U′ = U ∪ W, we have N(U′) = N(U) ∪ N(W) ⊆ N(U) ∪ W ∪ N(U) = W ∪ N(U). Thus,

w(U′) − w(N(U′)) = w(U) + w(W) − w(N(U′)) ≥ w(U) + w(W) − w(W ∪ N(U)) = w(U) − w(N(U)),

so U′ is also a critical weighted set of G.

Note that U′ ∪ N(U′) = U ∪ W ∪ N(U) ∪ N(W) = V.

Lemma 6. Let U be a critical weighted set of G = (V, E) such that U ∪ N(U) = V, and let I ⊆ U be a critical weighted independent set obtained from U by taking the isolated vertices of G(U). Then

I = U \ N(U), N(I) = N(U) \ U, and V \ (I ∪ N(I)) = U ∩ N(U) = U \ I.

Proof. We first show that I = U \ N(U). Consider v ∈ I. By the definition of I, v ∈ U. Assume that v ∈ N(U); then there exists w ∈ U, w ≠ v, such that (v, w) ∈ E, so v is not isolated in G(U), and therefore v ∉ I. So, I ⊆ U \ N(U). On the other hand, if v ∈ U \ N(U), then v ∈ U and v is isolated in G(U), so v ∈ I. Thus, U \ N(U) ⊆ I, so I = U \ N(U).

Next we show that N(I) = N(U) \ U. Note that since U ∪ N(U) = V and N(I) ∩ U = ∅, we have N(I) ⊆ N(U) \ U. Recall that from [5] we know that w(I) − w(N(I)) = w(U) − w(N(U)), but

w(U) − w(N(U)) = w(U \ N(U)) + w(U ∩ N(U)) − w(N(U) \ U) − w(U ∩ N(U)) = w(I) − w(N(U) \ U).

So, w(N(I)) = w(N(U) \ U) and, since N(I) ⊆ N(U) \ U, we have N(I) = N(U) \ U.

Finally,

V \ (I ∪ N(I)) = (U ∪ N(U)) \ ((U \ N(U)) ∪ (N(U) \ U)) = (U ∪ N(U)) \ (U △ N(U)) = U ∩ N(U) = U \ I,

where △ denotes the symmetric difference.
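The identities in Lemmas 5 and 6 can be verified by enumeration on a small graph. The following is a hedged sketch assuming unit weights and a hypothetical 6-vertex instance; the helper names `N` and `gap` are illustrative, and the check says nothing about graphs other than the one tested.

```python
from itertools import combinations

# Hypothetical instance with unit weights: leaves 1 and 2 hang off vertex 0,
# vertex 0 is joined to vertex 3, and {3, 4, 5} is a triangle.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4, 5}, 4: {3, 5}, 5: {3, 4}}
V = set(adj)


def N(subset):
    return set().union(*(adj[v] for v in subset))


def gap(subset):
    return len(subset) - len(N(subset))  # w(U) - w(N(U)) with unit weights


all_sets = [set(c) for r in range(len(V) + 1) for c in combinations(sorted(V), r)]
mu_c = max(gap(U) for U in all_sets)
critical = [U for U in all_sets if gap(U) == mu_c]

lemma5_holds, lemma6_holds = True, True
for U in critical:
    W = V - (U | N(U))
    U1 = U | W  # the enlarged set U' of Lemma 5
    lemma5_holds &= gap(U1) == mu_c and (U1 | N(U1)) == V
    I = {v for v in U1 if not adj[v] & U1}  # isolated vertices of G(U')
    lemma6_holds &= (
        I == U1 - N(U1)
        and N(I) == N(U1) - U1
        and (V - (I | N(I))) == (U1 & N(U1)) == (U1 - I)
    )
print(mu_c, critical, lemma5_holds, lemma6_holds)
```

In this instance the critical sets are {1, 2} and {1, 2, 3, 4, 5}; enlarging the first one as in Lemma 5 yields the second, whose isolated vertices are exactly {1, 2}.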

III.3. Scale-reduction Algorithm

Theorem 4 allows one to reduce the size of the maximum weight independent set problem and could be used as a preprocessing step before the main algorithm is applied. This idea is implemented in Algorithm 1:

Algorithm 1 Critical set based scale-reduction algorithm
procedure CriticalReduction(G)
    I ← ∅
    repeat
        Uc ← Critical(G)
        Ic ← Uc \ N(Uc)
        I ← I ∪ Ic
        V ← V \ (Ic ∪ N(Ic))
        G ← G[V]
    until Ic = ∅
    Gr ← G
    return I ∪ MIS(Gr)
end procedure

1. Initialize I = ∅.

2. Compute a critical set Uc of G. To find a critical set Uc in the unweighted case, first use a reduction of the critical set problem to the maximum matching problem in a bipartite graph as proposed by Ageev [5] and then apply a standard algorithm for computing a maximum matching in a bipartite graph in O(|E||V|) time (see, e.g., the proof of König's theorem in [49]). In the general case, the critical weighted set problem may be reduced to the selection problem [5], which is equivalent to finding a minimum cut (or maximum flow) in a bipartite graph [17, 122].

3. Compute a critical weight independent set of G by putting Ic = Uc \ N(Uc), which is equivalent to removing all non-isolated vertices from G(Uc) [5, 39].

4. If Ic ≠ ∅, put I = I ∪ Ic, V = V \ (Ic ∪ N(Ic)), G = G(V) and go to step 2.

5. Denote the remaining graph G by Gr = (Vr, Er) and output Gr.

6. Find a maximum weight independent set in Gr. Any exact algorithm (e.g., [41, 111]) may be used for this purpose.

7. The maximum weight independent set of G will be the union of I and the maximum weight independent set of Gr.
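The steps above can be sketched in Python as follows. This is a toy illustration under explicit assumptions, not the C implementation used in the experiments: the routines `critical_set` and `max_weight_independent_set` are brute-force stand-ins for Critical(G) and for the exact algorithm of step 6, all names are hypothetical, and the instance is a made-up 6-vertex graph with unit weights.

```python
from itertools import combinations


def neighborhood(adj, subset):
    return set().union(*(adj[v] for v in subset))


def all_subsets(vertices):
    items = sorted(vertices)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield set(c)


def critical_set(adj, w):
    """Brute-force stand-in for Critical(G); exponential, for tiny graphs only."""
    return max(
        all_subsets(adj),
        key=lambda U: sum(w[v] for v in U) - sum(w[v] for v in neighborhood(adj, U)),
    )


def max_weight_independent_set(adj, w):
    """Brute-force stand-in for the exact algorithm of step 6."""
    independent = (S for S in all_subsets(adj) if not (S & neighborhood(adj, S)))
    return max(independent, key=lambda S: sum(w[v] for v in S))


def critical_reduction(adj, w):
    """Steps 1-5: return the fixed set I and the reduced graph Gr."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    I = set()
    while True:
        Uc = critical_set(adj, w)
        Ic = Uc - neighborhood(adj, Uc)  # isolated vertices of G(Uc)
        if not Ic:
            break
        I |= Ic
        keep = set(adj) - Ic - neighborhood(adj, Ic)
        adj = {v: adj[v] & keep for v in keep}  # induced subgraph G(V)
    return I, adj


# Hypothetical instance with unit weights: leaves 1 and 2 hang off vertex 0,
# vertex 0 is joined to vertex 3, and {3, 4, 5} is a triangle.
graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4, 5}, 4: {3, 5}, 5: {3, 4}}
w = {v: 1 for v in graph}

I, reduced = critical_reduction(graph, w)
solution = I | max_weight_independent_set(reduced, w)  # steps 6 and 7
print(I, sorted(reduced), solution)
```

On this instance the reduction fixes the two leaves, removes their common neighbor, and leaves only the triangle for the exact algorithm; the combined solution has the same weight as a maximum weight independent set found directly on the original graph.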

III.4. Numerical Experiments

We tested the critical weighted set approach to the maximum weight independent set problem on graphs G = (V, E) with unit vertex weight and α(G) > |V|/2, since in this case a critical independent set Ic is guaranteed to be nonempty. Note that it is easy to prove that the maximum independent set problem remains NP-hard even if restricted to graphs with α(G)>|V|/2.
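One way to see the nonemptiness guarantee is the following short argument, supplied here as a sketch that uses only the definitions above with unit weights. Let S be a maximum independent set of G. Since S is independent, N(S) ⊆ V \ S, and therefore

```latex
|S| - |N(S)| \;\ge\; \alpha(G) - \bigl(|V| - \alpha(G)\bigr) \;=\; 2\alpha(G) - |V| \;>\; 0 .
```

Hence the maximum of |U| − |N(U)| over all U ⊆ V is positive, so every critical set Uc is nonempty, and Ic = Uc \ N(Uc) attains the same positive value |Ic| − |N(Ic)| and must be nonempty as well.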

To test the proposed approach, we generated a number of graphs with α(G) > |V|/2 using the Sanchis generator of maximum clique instances available from the DIMACS ftp server at ftp://dimacs.rutgers.edu/pub/challenge/ (see also [91]). Details of the Sanchis graph generator may be found in [79, 123]. We took complements of the graphs obtained using the generator and tested them for connectivity. All graphs considered in our experiments were connected. Since the Sanchis generator produces graphs with a predetermined maximum clique size ω(G), the size α(Ḡ) of the maximum independent set of its complement Ḡ is known: α(Ḡ) = ω(G). The number of vertices in the graphs used in our computations ranged from 1000 to 18000. Due to their large size, the maximum independent set problem in these graphs cannot be solved using standard exact algorithms. However, the critical set approach presented in this chapter was extremely effective in all of the considered cases. The results of computations are summarized in Table 1. Here each row corresponds to one graph G = (V, E), and the other notations used in the table are defined as follows. Uc denotes the computed critical set, Ic is the critical independent set obtained from Uc by taking the isolated vertices of G(Uc), and, as before, αc(G) = |Ic| − |N(Ic)|. By Gr = (Vr, Er) we denote the output of our algorithm, i.e., the graph obtained from G after recursively removing a critical independent set and its neighborhood. Finally, the last column reports the CPU time (in seconds) of execution of the proposed algorithm implemented in the C programming language. The programs were compiled with the GCC 3.3.6 compiler and run on a Dell Inspiron 8600 computer running Linux 2.6 and configured with a Pentium-4M 1400 MHz processor and 512 MB of RAM.

Table 1 Results of experiments with Sanchis graphs

  |V|       |E|   α(G)   |Uc|   |Ic|  αc(G)  |Vr|  |Er|  Time (s)
 1000    181256    524    524    524    476     0     0      0.07
 1000    186723    505    505    505    495     0     0      0.04
 2000    686341   1103   1103   1103    897     0     0      0.97
 2000    711955   1067   1067   1067    933     0     0      0.44
 3000    944175   1535   1535   1535   1465     0     0      0.58
 3000    954717   1563   1563   1563   1437     0     0      0.87
 4000   1014603   2069   2069   2069   1931     0     0      7.08
 4000   1090563   2309   2309   2309   1691     0     0     11.27
 5000   1533472   2717   2717   2717   2283     0     0     11.43
 5000    720845   3132   3132   3132   1868     0     0     47.74
 6000   1775988   3302   3305   3259   2741    46    44     86.97
 6000   1815973   3412   3412   3412   2588     0     0     33.64
 7000    890777   4493   4493   4493   2507     0     0    154.07
 8000   3193335   4394   4394   4394   3606     0     0    103.30
 8000    481800   5249   5249   5249   2751     0     0    263.44
 9000   4040615   4927   4930   4887   4113    43    41    273.70
 9000    681131   5899   5899   5899   3101     0     0    381.69
10000   3775385   5811   5813   5799   4201    14    12    625.58
10000   4908379   5507   5508   5478   4522    30    29    249.76
11000   2546883   6862   6862   6862   4138     0     0    594.24
11000   6528244   5901   5902   5868   5132    34    33    220.11
12000   4862197   7098   7097   7075   4925    22    20   1041.04
12000   5549355   6973   6973   6973   5027     0     0    463.30
13000   1339999   8474   8474   8474   4526     0     0   1131.41
13000   5638263   7698   7705   7640   5358    67    58   1865.82
14000  10772525   7417   7423   7346   6654    77    72   1045.77
14000   3371180   8844   8844   8844   5156     0     0   1330.66
15000   4207335   9413   9417   9386   5614    31    27   2288.17
15000   6912205   8993   8994   8983   6017    11    10   1523.01
16000  14346706   8401   8407   8360   7640    47    41   2923.90
16000   4807361  10042  10042  10042   5958     0     0   2215.34
17000  10748092   9898   9901   9862   7138    39    36   2666.40
17000    913028  11239  11239  11239   5761     0     0   2596.95
18000   2038675  11782  11782  11782   6218     0     0   3249.14
18000   5106081  11412  11412  11402   6594    14    10   3830.01

In another set of numerical tests we experimented with Erdös collaboration networks available from Batagelj’s Networks/Pajek Graph Files website [20], as well as complements of the graph coloring problem instances from Trick’s graph coloring page [131]. The results of these experiments are presented in Tables 2 and 3, where all the notations used are the same as in Table 1. In the Erdös collaboration networks considered, the vertices represent authors who are “connected” to Paul Erdös through a short path of co-authors [19]. More specifically, Table 2 considers instances with names in the form ERDOS.x.y, where x represents the last two digits of the year for which the network was constructed, and y represents the largest Erdös number of an author represented by a vertex in the graph. For example, the vertices in graph ERDOS.99.1 represent 492 co-authors of Paul Erdös, and the vertices in graph ERDOS.99.2 correspond to 6100 researchers who co-authored a paper either with Erdös or with at least one of his co-authors. We considered such networks for years 1997-1999 and y = 1 and 2. Note that in all considered instances of Erdös collaboration networks the independence number exceeds half of the number of vertices, which is typical for social networks, as well as for large sparse networks arising in many other applications [3, 80, 81]. Naturally, the approach proposed in this chapter is a very effective step in solving the maximum independent set problem for such networks. On the other hand, complements of most of the standard DIMACS maximum clique instances have relatively small independence numbers [91] and empty critical independent sets, thus the critical independent set approach is useless for these instances. The same can be said about the collections of test instances for the maximum independent set problem available online at

http://www.research.att.com/~njas/doc/graphs.html and

http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/graph-benchmarks.htm

For all of these instances, with the exception of several instances that had isolated vertices, the computed critical independent sets were empty. The results reported in Table 3 show that for a critical independent set to be nonempty, the independence number does not necessarily need to be very large, as for some of the considered graphs α(G) < |V|/2.

Table 2 Results of experiments with Erdös networks

Graph        |V|   |E|  α(G)  |Uc|  |Ic|  αc(G)  |Vr|  |Er|  Time (s)
ERDOS.97.1   472  1314   254   317   179     58   158   271     11.96
ERDOS.97.2  5488  8972  5047  5034  5034   4606    26    13    183.52
ERDOS.98.1   485  1381   261   366   152     58   229   526      1.00
ERDOS.98.2  5822  9505  5368  5356  5356   4914    24    12    213.02
ERDOS.99.1   492  1417   263   375   144     56   246   614      0.42
ERDOS.99.2  6100  9939  5639  5629  5629   5178    20    10    239.61


Table 3 Results of experiments with coloring problem instances

Graph        |V|    |E|  α(G)  |Uc|  |Ic|  αc(G)  |Vr|   |Er|  Time (s)
anna         138    493    80    75    61     29    45     45      0.01
david         87    406    36    81    13      9    70    305      0.01
fpsol2.i.1   496  11654   307   496   227    227   269  11654      0.31
fpsol2.i.2   451   8691   261   348   172    120   223    977      0.33
fpsol2.i.3   425   8688   238   322   146     94   223    974      0.27
huck          74    301    27    34    16      4    46    143      0.01
inithx.i.1   864  18707   566   741   430    363   367  11079      2.21
inithx.i.2   645  13979   365   383   257    144   273    642      0.97
inithx.i.3   621  13969   360   349   259    132   221    423      0.86
jean          80    254    38    39    26     15    35    108      0.01
zeroin.i.1   211   4100   120   211    85     85   126   3775      0.02
zeroin.i.2   211   3541   127   143   111     59    48    207      0.04
zeroin.i.3   206   3540   123   140   108     56    46    206      0.04


CHAPTER IV

RELATIONSHIP BETWEEN THE CRITICAL SET METHOD AND OTHER SCALE-REDUCTION TECHNIQUES

This chapter compares several approaches for reducing the size of an instance of the maximum independent set problem that have been proposed in the literature since 1975. Section IV.1 introduces the problem and general characteristic of considered
