A study of general and security Stackelberg game formulations


HAL Id: hal-01917798

https://hal.inria.fr/hal-01917798

Submitted on 9 Nov 2018


Carlos Casorrán, Bernard Fortz, Martine Labbé, Fernando Ordóñez

To cite this version:

Carlos Casorrán, Bernard Fortz, Martine Labbé, Fernando Ordóñez. A study of general and security Stackelberg game formulations. European Journal of Operational Research, Elsevier, 2019, 278 (3), pp. 855-868. doi:10.1016/j.ejor.2019.05.012. hal-01917798


Carlos Casorrán (a,b,c,*), Bernard Fortz (a,b), Martine Labbé (a,b), and Fernando Ordóñez (c)

(a) Département d'Informatique, Université Libre de Bruxelles, Brussels, Belgium
(b) INOCS, INRIA Lille Nord-Europe, Lille, France
(c) Departamento de Ingeniería Industrial, Universidad de Chile, Santiago, Chile
(*) Corresponding author. E-mail: [email protected]

Abstract

In this paper, we analyze different mathematical formulations for general Stackelberg games (GSGs) and Stackelberg security games (SSGs). We consider GSGs in which a single leader commits to a utility maximizing strategy knowing that one of $p$ possible followers optimizes its own utility taking this leader strategy into account. SSGs are a type of GSG that arise in security applications where the strategies of the leader consist in protecting subsets of targets and the strategies of the $p$ followers consist in attacking a single target. We compare existing mixed integer linear programming (MILP) formulations for GSGs, sorting them according to the tightness of their linear programming (LP) relaxations. We show that SSG formulations are projections of GSG formulations and exploit this link to derive a new SSG MILP formulation that i) has the tightest LP relaxation known among SSG MILP formulations and ii) has an LP relaxation that coincides with the convex hull of feasible solutions in the case of a single follower. We present computational experiments empirically comparing the difficulty of solving the formulations in the general and security settings. The new SSG MILP formulation is computationally efficient, in particular as the problem size increases.

Keywords: Integer programming, discrete optimization, game theory, bilevel optimization.

1 Introduction

Stackelberg games model situations where players strive to optimize their individual objectives in a single sequential encounter. These models assume a player, referred to as the leader, can commit to a strategy that optimizes its utility function and then players that respond to the leader's decision, referred to as followers, take this decision into account when deciding how to optimize their own utility functions. Stackelberg games were introduced to model market competition [von Stackelberg, 2011] and have been used in diverse applications since, such as traffic equilibrium [Krichene et al., 2014], network toll setting [Labbé et al., 1998], and security [Brown et al., 2006, Jain et al., 2010].

In this work we consider normal form Stackelberg games with finite sets of actions for the leader and followers. We refer to these as general Stackelberg games (GSGs). The utility functions of GSGs are described by matrices, where each combination of actions for the leader and follower gives a reward value for each participant. Selecting a single action corresponds to a pure strategy, while a mixed strategy corresponds to a probability distribution over the set of actions for the player. Therefore, for GSGs the utility functions are bilinear functions of the players' mixed strategies.

Stackelberg games can be expressed as bilevel optimization problems, where the top level represents the leader's decision problem and includes the followers' responses as the optimal solution to the second level problem [Colson et al., 2007]. Mixed integer formulations of GSGs have been introduced thanks to the bilinear objective functions and the linearization of the second level optimality conditions with the use of integer variables [Bard, 1998]. The manner in which the bilinear objectives and second level optimality conditions are linearized gives rise to the different mixed integer linear programming (MILP) formulations considered in this work. For instance, using big M constraints to linearize both the leader objective and the second level optimality conditions gives rise to the (D2) formulation [Kiekintveld et al., 2009]. The (DOBSS) formulation considers a single big M constraint but introduces new variables representing the product of the leader and follower strategies [Paruchuri et al., 2008]. Finally, (MIP-p-G) is a formulation without big M constraints [Yin and Tambe, 2012]. Which of these MILP formulations of the bilevel Stackelberg game problem is more convenient for computational efficiency is an underlying question of this work.

When the leader in a GSG faces a single follower the problem can be solved in polynomial time, see [Conitzer and Sandholm, 2006]. The same reference shows that if there are multiple followers then the problem is NP-hard. A solution for the multiple followers problem can be obtained by using the algorithm for the single follower instance on a Harsanyi transformation of the problem [Harsanyi and Selten, 1972], which combines the multiple adversaries into a single adversary with exponentially many actions. Solution methods based on mixed integer formulations of the multiple follower problem have been presented, for example, by [Jain et al., 2011] and [Yang et al., 2013].

Recent work has applied Stackelberg games in security settings where a leader has a limited budget to protect a set of targets while a follower aims to attack a single target. In this domain, the payoff matrices are structured with only two payoff values for every participant depending on whether or not the defender strategy protects the target attacked. We refer to problems that have this structure as Stackelberg security games (SSGs), which are introduced in detail in Section 2. Some SSG applications have included assigning Federal Air Marshals to transatlantic flights [Jain et al., 2010], determining randomized port and waterways patrols for the U.S. Coast Guard [Shieh et al., 2012], preventing fare evasion in public transport systems [Yin et al., 2012], and protecting endangered wildlife [Yang et al., 2014]. The SSG models considered are closely related to the interdiction games literature [McMasters and Mustin, 1970], especially when there is a fortification step. Such fortification-interdiction problems are multi-level optimization problems where a defender decides a limited fortification of a network, so that an interdictor (attacker) blocks a number of edges in the network and an operator tries to maximize flow or minimize a path over the network. If the optimal operation response can be subsumed in the interdictor's decision problem, then the problem has the structure of a Stackelberg security game. There are many variants and extensions of such fortification-interdiction games that allow multiple/sequential interdictions and problem specific formulations and algorithms, see reviews in [Smith and Lim, 2008, Snyder et al., 2016, Fischetti et al., 2018]. However, to the best of our knowledge there is no polyhedral study of different mixed integer optimization formulations that arise due to the bilevel nature of the interaction between the defender and the attacker.

In this paper we focus on the polyhedral analysis of different mixed integer formulations for GSGs and SSGs. In particular we provide the following four key contributions. First, we provide an exhaustive comparative study of existing MILP formulations for Stackelberg games. Starting from the natural bilevel representation of Stackelberg games, we use well-known integer programming techniques such as Fourier-Motzkin elimination [Dantzig and Eaves, 1973] and the Reformulation Linearization Technique [Sherali and Adams, 1994] to derive known MILP formulations. Our study leads to a ranking of these MILP formulations in terms of the strength of their linear programming (LP) relaxations. Second, we make explicit a formal link, through projections of variables, between the polyhedra of the LP relaxations of the GSG formulations and those of SSGs. This allows us to extend our study of GSG formulations to the security setting, leading to a comparison of SSG MILP formulations. Third, we derive (SDOBSS$_{q,y,s}$) and (MIP-p-S$_{q,y}$), two new SSG MILP formulations. We show that (MIP-p-S$_{q,y}$) is the MILP formulation with the tightest linear relaxation among SSG formulations. We further show that if we restrict (MIP-p-S$_{q,y}$) to a single attacker type, its LP relaxation provides a complete linear description of the convex hull of its feasible solutions. Fourth, we provide computational experiments that compare solution times of the MILP formulations in both settings. Our experiments show that the formulations with the tightest LP relaxations have faster solution times as the problem size increases. In particular (MIP-p-S$_{q,y}$) scales better than competing formulations, being able to tackle larger-sized instances.

The remainder of this paper is organized as follows. In Section 2, we define general and security Stackelberg games. In Section 3, we derive GSG formulations from the literature. We provide theoretical results comparing the formulations presented. In Section 4, we describe and analyze computational experiments for the formulations in Section 3. In Section 5, we present SSG formulations using projections, in the appropriate space of variables, of the formulations in Section 3, and derive (SDOBSS$_{q,y,s}$) and (MIP-p-S$_{q,y}$), new MILP formulations for SSGs. We then extend our theoretical comparisons of the general formulations to the security formulations. In Section 6, we describe and analyze the computational experiments for the security formulations. We conclude with some closing remarks in Section 7.

2 Notation and definition of the problem

In this section, we provide a formal definition of the two types of problems we study.

2.1 General Stackelberg games–GSGs

Let $K$ be the set of $p$ followers. We denote by $I$ the set of leader pure strategies and by $J$ the set of follower pure strategies. The leader has a known probability of facing follower $k \in K$, denoted by $\pi^k \in [0,1]$. We denote the $n$-dimensional simplex by $S^n = \{a \in [0,1]^n : \sum_{h=1}^{n} a_h = 1\}$. A mixed strategy for the leader consists in a vector $x \in S^{|I|}$ such that for $i \in I$, $x_i$ is the probability with which the leader plays pure strategy $i$. Analogously, a mixed strategy for a follower $k \in K$ is a vector $q^k \in S^{|J|}$ such that $q^k_j$ is the probability with which follower $k$ replies with pure strategy $j \in J$. The rewards or payoffs for the leader and each follower, resulting from their choice of strategy, are encoded in a different matrix for each follower. These payoff matrices are denoted by $(R^k, C^k)$, where $R^k \in \mathbb{R}^{|I| \times |J|}$ is the leader's reward matrix when facing follower $k \in K$ and $C^k \in \mathbb{R}^{|I| \times |J|}$ is the reward matrix for follower $k$. The expected reward of the leader and follower $k$, respectively, can be expressed as follows:

\[ \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} \pi^k R^k_{ij} x_i q^k_j, \tag{1} \]

\[ \sum_{i \in I} \sum_{j \in J} C^k_{ij} x_i q^k_j, \quad \forall k \in K. \tag{2} \]
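As a small illustration of expressions (1) and (2), the following sketch evaluates both expected rewards in plain Python. The function names, the nested-list layout `R[k][i][j]`, and the toy payoff values are assumptions made for this example only; they do not come from the paper.

```python
# Illustrative sketch: plain-Python evaluation of expressions (1) and (2).
# Function names, data layout and the toy numbers below are assumptions.

def leader_expected_reward(pi, R, x, q):
    """Expression (1): sum over i, j, k of pi[k] * R[k][i][j] * x[i] * q[k][j]."""
    return sum(
        pi[k] * R[k][i][j] * x[i] * q[k][j]
        for k in range(len(pi))
        for i in range(len(x))
        for j in range(len(q[k]))
    )


def follower_expected_reward(C_k, x, q_k):
    """Expression (2) for one follower k: sum over i, j of C_k[i][j] * x[i] * q_k[j]."""
    return sum(
        C_k[i][j] * x[i] * q_k[j]
        for i in range(len(x))
        for j in range(len(q_k))
    )


# Toy instance: 2 leader strategies, 2 follower strategies, 2 follower types.
pi = [0.5, 0.5]
R = [[[2, 0], [0, 1]], [[1, 3], [2, 0]]]
C = [[[0, 1], [1, 0]], [[1, 0], [0, 2]]]
x = [0.5, 0.5]          # leader mixed strategy
q = [[1, 0], [0, 1]]    # each follower replies with a pure strategy

leader_value = leader_expected_reward(pi, R, x, q)
follower_values = [follower_expected_reward(C[k], x, q[k]) for k in range(len(pi))]
```

Both expressions are bilinear in $(x, q)$, which is why the single level formulations later need either big M constraints or product variables to linearize them.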

For all $k \in K$, we define the function $B^k : S^{|I|} \to S^{|J|}$ as the function that, given the leader's mixed strategy $x$, returns a best response $q^k$ for each follower $k$. The solution concept used in these games is the Strong Stackelberg Equilibrium (SSE), introduced in [Leitman, 1978] and defined below.

Definition 1. A profile of mixed strategies $(x, \{B^k(x)\}_{k \in K})$ forms an SSE if:

1. The leader always plays a payoff-maximizing strategy:

\[ x^T R^k B^k(x) \ge {x'}^T R^k B^k(x') \quad \forall x' \in S^{|I|}, \forall k \in K. \]

2. Each follower always plays a best response, $B^k(x) \in F^k(x)$, where, $\forall k \in K$,

\[ F^k(x) = \arg\max_{q^k} \{ x^T C^k q^k : q^k \in S^{|J|} \} \]

is the set of best responses for each follower.

3. Each follower breaks ties optimally in favor of the leader:

\[ x^T R^k B^k(x) \ge x^T R^k q^k \quad \forall q^k \in F^k(x). \]

An SSE assumes that the follower breaks ties in favor of the leader by choosing, when indifferent between different follower strategies, the strategy that maximizes the payoff of the leader. An SSE is in practice always achievable as the leader can always induce one by selecting a sub-optimal mixed strategy arbitrarily close to the equilibrium, causing the follower to prefer the desired strategy [von Stackelberg, 2011].

Proposition 1. For any leader strategy $x$ and any $k \in K$, there is a best response to the $k$-th follower's problem that is given by a vector $q^k \in \{0,1\}^{|J|}$ such that $\sum_{j \in J} q^k_j = 1$.

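Proof. Assume that $B^k(x) = \bar{q}^k \notin \{0,1\}^{|J|}$. We show that any canonical vector $e^k_j$ such that $\bar{q}^k_j > 0$ is also a best response vector, i.e., $e^k_j \in F^k(x)$ and $x^T R^k e^k_j \ge x^T R^k q^k$ for all $q^k \in F^k(x)$. Since $\bar{q}^k = \sum_{j \in J} \bar{q}^k_j e^k_j$, with $e^k_j \in S^{|J|}$, and $x^T C^k e^k_j \le x^T C^k \bar{q}^k$ for all $j \in J$, we have that

\[ x^T C^k \bar{q}^k = \sum_{j \in J} \bar{q}^k_j (x^T C^k e^k_j) \le \sum_{j \in J} \bar{q}^k_j (x^T C^k \bar{q}^k) = x^T C^k \bar{q}^k. \]

This implies that for any $\bar{q}^k_j > 0$ we have $x^T C^k e^k_j = x^T C^k \bar{q}^k$, giving $e^k_j \in F^k(x)$. A similar argument shows that for any $j$ such that $\bar{q}^k_j > 0$ we have $x^T R^k e^k_j = x^T R^k \bar{q}^k$; hence, $e^k_j$ is a best response vector.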

This result shows that we can restrict the follower’s best response only to pure strategies without influencing the SSE solution concept, as done in [Paruchuri et al., 2008].
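To make the restriction to pure best responses concrete, the sketch below computes, for a fixed leader strategy $x$, a pure best response of one follower with ties broken in favor of the leader, as required by Definition 1. The function name `strong_best_response`, the tolerance parameter and the toy matrices are assumptions for illustration.

```python
# Illustrative sketch: pure best response of follower k with leader-favorable
# tie-breaking. Names, tolerance and toy data are assumptions.

def strong_best_response(R_k, C_k, x, tol=1e-9):
    """Return the index j of a pure strategy in F^k(x) maximizing the leader payoff."""
    n_i, n_j = len(x), len(C_k[0])
    # Expected follower and leader payoffs of each pure follower strategy j.
    follower = [sum(C_k[i][j] * x[i] for i in range(n_i)) for j in range(n_j)]
    leader = [sum(R_k[i][j] * x[i] for i in range(n_i)) for j in range(n_j)]
    best = max(follower)
    # All pure strategies that are (numerically) optimal for the follower.
    ties = [j for j in range(n_j) if follower[j] >= best - tol]
    # Among them, pick the one the leader prefers.
    return max(ties, key=lambda j: leader[j])


x = [0.5, 0.5]
R_k = [[1, 3], [0, 2]]

# Follower is indifferent between both columns: the leader-preferred one is chosen.
j_tie = strong_best_response(R_k, [[1, 1], [0, 0]], x)

# Follower strictly prefers column 0, whatever the leader would like.
j_strict = strong_best_response(R_k, [[2, 0], [0, 1]], x)
```

Here `j_tie` is column 1 (leader payoff 2.5 versus 0.5) and `j_strict` is column 0.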

In mathematical optimization, Stackelberg games are formulated as bilevel programming (BP) problems [Bracken and McGill, 1973]. In BP the optimization problems have two levels where the top level problem considers some variables that are the optimal solution to another, second level, optimization problem. Important BP surveys are those by [Kolstad, 1985, Savard, 1989, Anandalingam and Friesz, 1992, Labbé and Violin, 2016]. In our setting, the first level problem corresponds to the leader's decision problem and the nested problem corresponds to the follower's decision problem. The following model, (BIL-p-G$_{x,q}$), is a bilevel program for the general Stackelberg game problem:

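(BIL-p-G$_{x,q}$)
\begin{align}
\max_{x,q} \quad & \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} \pi^k R^k_{ij} x_i q^k_j \tag{3} \\
\text{s.t.} \quad & x \in S^{|I|} \tag{4} \\
& q^k \in \arg\max_{r^k} \Big\{ \sum_{i \in I} \sum_{j \in J} C^k_{ij} x_i r^k_j \Big\} && \forall k \in K, \tag{5} \\
& r^k_j \in \{0,1\} && \forall j \in J, \forall k \in K, \tag{6} \\
& \sum_{j \in J} r^k_j = 1 && \forall k \in K. \tag{7}
\end{align}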

The objective function maximizes the leader's expected reward. Condition (4) characterizes the mixed strategies considered by the leader. The second level problem defined by (5)-(7) indicates that the follower maximizes its own payoff by giving a best response with a pure strategy to the leader's mixed strategy. Recall that such a pure strategy always exists as shown in Proposition 1. If there are multiple optimal strategies for the follower, the main level problem selects the one that benefits the objective of the leader.

2.2 Stackelberg security games–SSGs

In a Stackelberg security game (SSG) the defender allocates security resources to protect a subset of targets. Let $J$ be the set of $n$ targets that could be attacked and assume there are security resources to protect up to $m < n$ of these targets. The set $I$ of defender pure strategies is composed of all $\sum_{i=1}^{m} \binom{n}{i}$ subsets of at most $m$ targets of $J$ that the defender can protect simultaneously. With a slight abuse of notation, we refer to $i \in I$ in this context as both the index running through the set of defender pure strategies $I$ and as $i \subset J$, the corresponding subset of $J$ with at most $m$ targets that are protected by security resources. Similar to GSGs, the elements $j \in J$ constitute the pure strategies of each attacker, which for SSGs represent the single target attacked by the follower. In SSGs, payoffs for the players only depend on whether the target attacked is protected or not. This means that many of the strategies have identical payoffs. The authors in [Kiekintveld et al., 2009] use this fact to construct a compact representation of the payoffs.

We denote by $D^k$ the utility of the defender when facing attacker $k \in K$ and by $A^k$ the utility of attacker $k$. Associated with each target and each player there are two payoffs depending on whether or not the target is protected, see Table 1.

            Protected      Unprotected
Defender    $D^k(j|p)$     $D^k(j|u)$
Attacker    $A^k(j|p)$     $A^k(j|u)$

Table 1: Payoff structure in an SSG when target $j$ is attacked by an attacker $k$

[Kiekintveld et al., 2009] take advantage of the aforementioned compact representation to define a protection vector $c$ whose components, $c_j$, represent the frequency with which target $j$ is protected. The components of the vector $c$ satisfy

\[ c_j = \sum_{i \in I : j \in i} x_i \quad \forall j \in J, \tag{8} \]

i.e., the frequency with which target $j$ is protected is expressed as the sum of all probabilities of the strategies that protect that target. Variables $q^k_j$ indicate whether an attacker $k$ strikes a target $j$.

The defender's and attacker $k$'s expected rewards are, respectively:

\[ \sum_{j \in J} \sum_{k \in K} \pi^k q^k_j \{ c_j D^k(j|p) + (1 - c_j) D^k(j|u) \}, \tag{9} \]

\[ \sum_{j \in J} q^k_j \{ c_j A^k(j|p) + (1 - c_j) A^k(j|u) \}, \quad \forall k \in K. \tag{10} \]
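The compact representation can be illustrated with a short sketch that enumerates the defender pure strategies, computes the coverage vector $c$ of equation (8) from a mixed strategy $x$, and evaluates expressions (9) and (10). All function names and payoff numbers are assumptions chosen for this example; `D_p[k][j]` stands for $D^k(j|p)$, `D_u[k][j]` for $D^k(j|u)$, and similarly for the attacker.

```python
# Illustrative sketch of the SSG compact representation (equations (8)-(10)).
# Names and toy payoffs are assumptions, not taken from the paper.
from itertools import combinations


def defender_pure_strategies(n, m):
    """All subsets of sizes 1..m of the n targets, matching the sum of binomials."""
    return [s for size in range(1, m + 1) for s in combinations(range(n), size)]


def coverage(strategies, x, n):
    """Equation (8): c[j] is the total probability of strategies protecting target j."""
    c = [0.0] * n
    for subset, prob in zip(strategies, x):
        for j in subset:
            c[j] += prob
    return c


def defender_reward(pi, q, c, D_p, D_u):
    """Expression (9)."""
    return sum(
        pi[k] * q[k][j] * (c[j] * D_p[k][j] + (1 - c[j]) * D_u[k][j])
        for k in range(len(pi))
        for j in range(len(c))
    )


def attacker_reward(q_k, c, A_p_k, A_u_k):
    """Expression (10) for one attacker k."""
    return sum(
        q_k[j] * (c[j] * A_p_k[j] + (1 - c[j]) * A_u_k[j]) for j in range(len(c))
    )


n, m = 3, 2
strategies = defender_pure_strategies(n, m)   # 3 singletons + 3 pairs
x = [0.0, 0.0, 0.0, 0.5, 0.5, 0.0]            # mix of {0,1} and {0,2}
c = coverage(strategies, x, n)

pi = [1.0]
q = [[0, 1, 0]]                               # the single attacker strikes target 1
D_p, D_u = [[1, 2, 1]], [[-1, -4, -1]]
A_p, A_u = [[-2, -3, -2]], [[1, 5, 1]]

defender_value = defender_reward(pi, q, c, D_p, D_u)
attacker_value = attacker_reward(q[0], c, A_p[0], A_u[0])
```

Note that the number of pure strategies grows combinatorially with $n$ and $m$, whereas $c$ always has $n$ components, which is what makes the compact representation attractive.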

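As with GSGs, such a game can be modeled by means of bilevel programming.

(BIL-p-S$_{x,c,q}$)
\begin{align*}
\max \quad & \sum_{j \in J} \sum_{k \in K} \pi^k q^k_j \{ c_j D^k(j|p) + (1 - c_j) D^k(j|u) \} \\
\text{s.t.} \quad & (4), (8), \\
& q^k \in \arg\max_{r^k} \Big\{ \sum_{j \in J} r^k_j \big( c_j A^k(j|p) + (1 - c_j) A^k(j|u) \big) \Big\} && \forall k \in K, \\
& r^k_j \in \{0,1\} && \forall j \in J, \forall k \in K, \\
& \sum_{j \in J} r^k_j = 1 && \forall k \in K.
\end{align*}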

The objective function maximizes the defender's expected reward. Constraints (4) and (8) characterize the exponentially many mixed strategies considered by the defender and relate them to the frequencies with which targets are protected. The remaining constraints constitute the second level optimization problem which ensures that the attacker maximizes its profit by attacking a single target that is the best response to the defender's selected strategy. Notice that a more compact formulation, one involving a polynomial number of variables and constraints, can be obtained if projecting out the exponentially many $x$ variables does not lead to exponentially many constraints. This would give a polynomial size formulation involving only the $c$ and the $q$ variables. Given an optimal solution to this compact formulation, that is, an optimal protection vector $c$ and an optimal attack vector $q$, a probability vector $x$, solution to this game in extensive form, can be obtained by solving the system of linear inequalities defined by conditions (4) and (8). As this system involves $n+1$ equalities, there exists a solution in which the number of variables $x_i$ with a positive value is not larger than $n+1$, i.e., the output size of an SSG, under extensive form, is polynomial in the input size. See Section 5 for more details.

3 General Stackelberg games–GSGs

In Section 3.1, we present equivalent MILP formulations for the $p$-follower GSG. In Section 3.2 we compare the polyhedra of the LP relaxations for the different formulations.

3.1 General Stackelberg games: single level formulations

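[Paruchuri et al., 2008] tackle the problem of solving the bilevel formulation presented earlier, (BIL-p-G$_{x,q}$), by using a MILP reformulation. They replace the second level nested optimization problem, described by (5)-(7), by the following set of constraints:

\begin{align}
& \sum_{j \in J} q^k_j = 1 && \forall k \in K, \tag{11} \\
& q^k_j \in \{0,1\} && \forall j \in J, \forall k \in K, \tag{12} \\
& 0 \le \Big( s^k - \sum_{i \in I} C^k_{ij} x_i \Big) \le (1 - q^k_j) \cdot M && \forall j \in J, \forall k \in K, \tag{13}
\end{align}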

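where $s^k \in \mathbb{R}$ for all $k \in K$ and $M$ is an arbitrarily large positive constant. The two inequalities in constraints (13) ensure that $q^k_j = 1$ only for a pure strategy that maximizes the follower's payoff. The problem defined by (3)-(4) and (11)-(13) is referred to as (QUAD$_{x,q,s}$). It is possible to eliminate the nonlinearity in the objective function of (BIL-p-G$_{x,q}$) by adding additional variables that represent the product between $x$ and $q$. To be more precise, use $z^k_{ij} = x_i q^k_j$ for all $i \in I$, $j \in J$ and $k \in K$. This gives rise to formulation (DOBSS$_{q,z,s}$) introduced in [Paruchuri et al., 2008]:

(DOBSS$_{q,z,s}$)
\begin{align}
\max \quad & \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} \pi^k R^k_{ij} z^k_{ij} \notag \\
\text{s.t.} \quad & (11), (12), \notag \\
& \sum_{j \in J} z^k_{ij} = \sum_{j \in J} z^1_{ij} && \forall i \in I, \forall k \in K, \tag{14} \\
& \sum_{i \in I} z^k_{ij} = q^k_j && \forall j \in J, \forall k \in K, \tag{15} \\
& z^k_{ij} \ge 0 && \forall i \in I, \forall j \in J, \forall k \in K, \tag{16} \\
& 0 \le s^k - \sum_{i \in I} \sum_{j' \in J} C^k_{ij} z^k_{ij'} \le (1 - q^k_j) \cdot M && \forall j \in J, \forall k \in K, \tag{17} \\
& s \in \mathbb{R}^{|K|}. \notag
\end{align}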

Alternatively, the quadratic term in the objective of (BIL-p-G$_{x,q}$) can be addressed by adding $|K|$ new variables and introducing a second family of constraints involving a big M constant. This gives rise to formulation (D2$_{x,q,s,f}$) below (a DOBSS variant with 2 big M constraints that has appeared in [Kiekintveld et al., 2009]):

(D2$_{x,q,s,f}$)
\begin{align}
\max \quad & \sum_{k \in K} \pi^k f^k \tag{18} \\
\text{s.t.} \quad & (4), (11)-(13), \notag \\
& f^k \le \sum_{i \in I} R^k_{ij} x_i + (1 - q^k_j) \cdot M && \forall j \in J, \forall k \in K, \tag{19} \\
& s, f \in \mathbb{R}^{|K|}. \notag
\end{align}

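Additionally, we project the real variables $s^k$ in constraints (13) and (17) out by using Fourier-Motzkin elimination [Dantzig and Eaves, 1973]. This gives rise to constraints:

\begin{align}
& \sum_{i \in I} (C^k_{ij} - C^k_{i\ell}) x_i \le (1 - q^k_\ell) \cdot M && \forall j, \ell \in J, \forall k \in K, \tag{20} \\
& \sum_{i \in I} \sum_{j' \in J} (C^k_{ij} - C^k_{i\ell}) z^k_{ij'} \le (1 - q^k_\ell) \cdot M && \forall j, \ell \in J, \forall k \in K. \tag{21}
\end{align}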

Replacing (13) by (20) in (D2$_{x,q,s,f}$) and (17) by (21) in (DOBSS$_{q,z,s}$) yields (D2$_{x,q,f}$) and (DOBSS$_{q,z}$). We analyze the behavior of these last two new formulations compared to that of (D2$_{x,q,s,f}$) and (DOBSS$_{q,z,s}$) to see if removing variables $s$ at the expense of adding constraints is worthwhile.
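For readers less familiar with Fourier-Motzkin elimination, the following worked step sketches how (20) arises from (13); it is a reading aid written for this purpose rather than a derivation quoted from the original text. The same pairing applied to (17) gives (21).

```latex
% Fix k in K. Constraint (13) bounds s^k from below (written for strategy j)
% and from above (written for strategy \ell):
\begin{align*}
\sum_{i \in I} C^k_{ij} x_i &\le s^k
  && \text{(left inequality of (13) for } j\text{)}, \\
s^k &\le \sum_{i \in I} C^k_{i\ell} x_i + (1 - q^k_\ell) \cdot M
  && \text{(right inequality of (13) for } \ell\text{)}.
\end{align*}
% Pairing every lower bound with every upper bound eliminates s^k:
\[
\sum_{i \in I} \bigl( C^k_{ij} - C^k_{i\ell} \bigr) x_i \le (1 - q^k_\ell) \cdot M
\qquad \forall j, \ell \in J,
\]
% which is constraint (20).
```

Each of the $|J|$ lower bounds is combined with each of the $|J|$ upper bounds, so one variable $s^k$ is traded for $|J|^2$ constraints per follower.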

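Another equivalent MILP formulation for the $p$-follower GSG can be obtained by replacing constraints (17) with the following set of constraints:

\[ \sum_{i \in I} (C^k_{ij} - C^k_{i\ell}) z^k_{ij} \ge 0 \quad \forall j, \ell \in J, \forall k \in K. \tag{22} \]

These constraints are derived by multiplying constraints (20) by $q^k_\ell$, reorganizing and replacing the nonlinear terms $x_i q^k_j$ by $z^k_{ij}$. This leads to (MIP-p-G$_{q,z}$):

(MIP-p-G$_{q,z}$)
\begin{align*}
\max \quad & \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} \pi^k R^k_{ij} z^k_{ij} \\
\text{s.t.} \quad & (11), (12), (14)-(16), (22).
\end{align*}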

The linear relaxation of (MIP-p-G$_{q,z}$) appears in [Yin and Tambe, 2012]. The MILP formulation is a $p$-follower extension to the single follower formulation (MIP-1-G$_{q,z}$), due to [Conitzer and Korzhyk, 2011]. Formal proofs that the formulations seen thus far are equivalent MILP formulations, i.e., that they are valid for the $p$-follower GSG, appear in [Paruchuri et al., 2008] for (DOBSS$_{q,z,s}$) and [Paruchuri et al., 2008] and [Kiekintveld et al., 2009] for (D2$_{x,q,s,f}$). These proofs show that each of them is equivalent to (QUAD$_{x,q,s}$). The equivalence of (DOBSS$_{q,z}$) and (D2$_{x,q,f}$) is obtained from the Fourier-Motzkin elimination procedure [Dantzig and Eaves, 1973]. The equivalence proof for (MIP-p-G$_{q,z}$) is analogous to the proof used to show the equivalence for (DOBSS$_{q,z,s}$) and is omitted here.

[Paruchuri et al., 2008] state that the big M constants used are arbitrarily large. To be as computationally competitive as possible, we provide the tightest value for each big M constant in the formulations discussed thus far.

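Proposition 2. The tightest values for the positive constants $M$ are:

1. In (19), $M = \max_{i \in I} \{ \max_{\ell \in J} R^k_{i\ell} - R^k_{ij} \}$ $\forall j \in J, \forall k \in K$.

2. In (13) and (17), $M = \max_{i \in I} \{ \max_{\ell \in J} C^k_{i\ell} - C^k_{ij} \}$ $\forall j \in J, \forall k \in K$.

3. In (20) and (21), $M = \max_{i \in I} \{ C^k_{ij} - C^k_{i\ell} \}$ $\forall j, \ell \in J, \forall k \in K$.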
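The values in Proposition 2 are cheap to compute directly from the payoff matrices. The sketch below does so for one follower; the function names and the toy follower matrix are assumptions, and the leader matrix is the 3x3 reward matrix $R$ used in the strictness example of Section 3.2.

```python
# Illustrative sketch: tight big-M constants of Proposition 2 for one follower k.
# Function names and the toy matrix C_k are assumptions.

def tight_big_m_19(R_k, j):
    """M in (19): max over i of (max over l of R_k[i][l]) - R_k[i][j]."""
    return max(max(row) - row[j] for row in R_k)


def tight_big_m_13(C_k, j):
    """M in (13) and (17): max over i of (max over l of C_k[i][l]) - C_k[i][j]."""
    return max(max(row) - row[j] for row in C_k)


def tight_big_m_20(C_k, j, l):
    """M in (20) and (21): max over i of C_k[i][j] - C_k[i][l]."""
    return max(row[j] - row[l] for row in C_k)


R_k = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
C_k = [[3, 1], [0, 5]]

m19 = [tight_big_m_19(R_k, j) for j in range(3)]
m13 = [tight_big_m_13(C_k, j) for j in range(2)]
m20 = {(j, l): tight_big_m_20(C_k, j, l) for j in range(2) for l in range(2)}
```

The constants depend on $j$ (and on $\ell$ for (20)-(21)), so each constraint gets its own value rather than one global $M$.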

3.2 Comparison of the formulations

Given a formulation $F$, we denote by $\overline{F}$ its linear (continuous) relaxation and by $P(F)$ the polyhedral feasible region of $\overline{F}$. Further, let $Q = \{ (x, z) \in \mathbb{R}^n \times \mathbb{R}^m : Ax + Bz \le d \}$. Then the projection of $Q$ into the $x$-space, denoted $Proj_x Q$, is the polyhedron given by $Proj_x Q = \{ x \in \mathbb{R}^n : \exists z \in \mathbb{R}^m \text{ for which } (x, z) \in Q \}$, see [Pochet and Wolsey, 2006].

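First, we introduce an additional formulation which we denote by (DOBSS$_{x,q,z,s,f}$). This formulation is equivalent to (DOBSS$_{q,z,s}$), in the sense that the values of their LP relaxations coincide. In this formulation, we introduce variables $f^k$ for all $k \in K$ to rewrite the objective function so that it matches the objective function of (D2$_{x,q,s,f}$). We also add variables $x_i$ for all $i \in I$ by rewriting (14) as $\sum_{j \in J} z^k_{ij} = x_i$ for all $i \in I$ and $k \in K$. Under this condition, we can simplify (17) to (13). The formulation (DOBSS$_{x,q,z,s,f}$) is as follows.

(DOBSS$_{x,q,z,s,f}$)
\begin{align}
\max \quad & \sum_{k \in K} \pi^k f^k \notag \\
\text{s.t.} \quad & (11)-(13), (15), (16), \notag \\
& f^k = \sum_{i \in I} \sum_{j \in J} R^k_{ij} z^k_{ij} && \forall k \in K, \tag{23} \\
& \sum_{j \in J} z^k_{ij} = x_i && \forall i \in I, \forall k \in K, \tag{24} \\
& s \in \mathbb{R}^{|K|}. \notag
\end{align}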

Further, note that from the Fourier-Motzkin elimination procedure we have that

\[ P(\text{D2}_{x,q,f}) = Proj_{x,q,f} P(\text{D2}_{x,q,s,f}) \quad \text{and} \quad P(\text{DOBSS}_{q,z}) = Proj_{q,z} P(\text{DOBSS}_{q,z,s}). \]

Proposition 3. $Proj_{x,q,s,f} P(\text{DOBSS}_{x,q,z,s,f}) \subseteq P(\text{D2}_{x,q,s,f})$. Further, there exist instances for which the inclusion is strict.

Proof. Note that all the constraints of $P(\text{D2}_{x,q,s,f})$ can be found in the description of $P(\text{DOBSS}_{x,q,z,s,f})$ except for constraints (4) and (19). Constraints (4) are implied by constraints (11), (15), (16) and (24).

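Further, the projection of $P(\text{DOBSS}_{x,q,z,s,f})$ on the $(x, q, s, f)$-space can be obtained by applying Farkas' Lemma [Farkas, 1902]. Constraints (15), (16), (23) and (24) are the only ones involving variables $z^k_{ij}$ and are separable by $k \in K$. For a fixed $k \in K$ the projection is given by:

\[ A^k = \Big\{ (x, q, f) : \alpha f^k + \sum_{i \in I} \beta_i x_i + \sum_{j \in J} \gamma_j q^k_j \ge 0 \quad \forall (\alpha, \gamma, \beta) : \alpha R^k_{ij} + \beta_i + \gamma_j \ge 0 \ \forall i \in I, \forall j \in J \Big\}. \tag{25} \]

For a fixed $j \in J$, define $\alpha = -1$, $\beta_i = R^k_{ij}$ for all $i \in I$, $\gamma_j = 0$ and $\gamma_\ell = \max_{i \in I} (R^k_{i\ell} - R^k_{ij})$ for all $\ell \in J$ with $\ell \ne j$. This definition of the parameters satisfies $\alpha R^k_{i\ell} + \beta_i + \gamma_\ell \ge 0$ for all $i \in I$, $\ell \in J$. Substituting these parameters in the generic constraints of $A^k$ yields

\[ f^k \le \sum_{i \in I} R^k_{ij} x_i + \sum_{\ell \in J : \ell \ne j} \max_{i \in I} (R^k_{i\ell} - R^k_{ij}) q^k_\ell \quad \forall j \in J, \forall k \in K. \tag{26} \]

Constraints (26) imply constraints (19) for the tight value of $M$ provided in Proposition 2 since for all $j \in J$ and $k \in K$,

\[ \sum_{\ell \in J : \ell \ne j} \max_{i \in I} (R^k_{i\ell} - R^k_{ij}) q^k_\ell \le \max_{i \in I} \Big\{ \max_{\ell \in J} R^k_{i\ell} - R^k_{ij} \Big\} \sum_{\ell \in J : \ell \ne j} q^k_\ell = \max_{i \in I} \Big\{ \max_{\ell \in J} R^k_{i\ell} - R^k_{ij} \Big\} (1 - q^k_j). \]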

This proves the inclusion. To show that the inclusion may be strict, consider the following example where $|I| = |J| = 3$ and $|K| = 1$. Let the payoff matrix for the game be

\[ (R, C) = \begin{pmatrix} (1,0) & (0,0) & (0,0) \\ (0,0) & (1,0) & (0,0) \\ (0,0) & (0,0) & (0,0) \end{pmatrix} \]

and consider the point defined by $x = (1, 0, 0)^T$, $q = (1/3, 1/3, 1/3)^T$, $s = 10$ and $f = 2/3$. Such a point is feasible for $\overline{(\text{D2}_{x,q,s,f})}$ but violates constraints (26) for $j = 2$ and is therefore infeasible for $Proj_{x,q,s,f} P(\text{DOBSS}_{x,q,z,s,f})$.
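The strictness example can be checked numerically. The sketch below evaluates the right-hand sides of (19), with the tight $M$ of Proposition 2, and of (26) at the point $x = (1,0,0)$, $q = (1/3,1/3,1/3)$, $f = 2/3$, using exact fractions. Variable and function names are assumptions, and indices are zero-based, so the paper's $j = 2$ is index 1 here.

```python
# Illustrative check of the strictness example: (19) holds for every j,
# while (26) fails for the paper's j = 2 (zero-based index 1).
from fractions import Fraction as Fr

R = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # leader rewards from the example
x = [1, 0, 0]
q = [Fr(1, 3), Fr(1, 3), Fr(1, 3)]
f = Fr(2, 3)
I, J = range(3), range(3)


def rhs_19(j):
    """Right-hand side of (19) with the tight M of Proposition 2."""
    big_m = max(max(R[i]) - R[i][j] for i in I)
    return sum(R[i][j] * x[i] for i in I) + (1 - q[j]) * big_m


def rhs_26(j):
    """Right-hand side of (26)."""
    return sum(R[i][j] * x[i] for i in I) + sum(
        max(R[i][l] - R[i][j] for i in I) * q[l] for l in J if l != j
    )


holds_19 = [f <= rhs_19(j) for j in J]
holds_26 = [f <= rhs_26(j) for j in J]
```

For the paper's $j = 2$ the bound in (26) evaluates to $1/3$, which is below $f = 2/3$, while the bound in (19) evaluates to exactly $2/3$.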

Next, we compare the polyhedra $P(\text{MIP-p-G}_{q,z})$ and $Proj_{q,z} P(\text{DOBSS}_{q,z,s})$.

Theorem 1. $P(\text{MIP-p-G}_{q,z}) \subseteq P(\text{DOBSS}_{q,z}) = Proj_{q,z} P(\text{DOBSS}_{q,z,s})$. Further, there exist instances for which the inclusion is strict.

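Proof. The description of $P(\text{DOBSS}_{q,z})$ differs from that of $P(\text{MIP-p-G}_{q,z})$ by only one set of constraints: (21) must hold instead of (22). Hence, the remainder of the proof consists in showing that (21) are implied by (11), (14)-(16), (22) and the nonnegativity of the $q$ variables. The LHS of (21) can be rewritten as:

\begin{align*}
& \sum_{i \in I} (C^k_{ij} - C^k_{i\ell}) z^k_{i\ell} + \sum_{i \in I} \sum_{j' \in J : j' \ne \ell} (C^k_{ij} - C^k_{i\ell}) z^k_{ij'} \\
& \quad \le \sum_{i \in I} \sum_{j' \in J : j' \ne \ell} (C^k_{ij} - C^k_{i\ell}) z^k_{ij'}, && \text{using (22),} \\
& \quad \le \max_{i \in I} \{ C^k_{ij} - C^k_{i\ell} \} \sum_{j' \in J : j' \ne \ell} \sum_{i \in I} z^k_{ij'} \\
& \quad \le M \sum_{j' \in J : j' \ne \ell} q^k_{j'}, && \text{given Proposition 2 and (15),} \\
& \quad = M (1 - q^k_\ell), && \text{by (11).}
\end{align*}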

To show that the inclusion may be strict, consider the $p$-follower GSG between a leader and a fixed follower $k \in K$ where the payoff bimatrix is:

\[ (R^k, C^k) = \begin{pmatrix} (0,1) & (1,0) \\ (0,0) & (0,0) \end{pmatrix}. \]

The point with coordinates $x = (1/2, 1/2)^T$, $q^k = (1/2, 1/2)^T$ and

\[ z^k = \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix} \]

has an objective value of $1/4$ and is feasible in $P(\text{DOBSS}_{q,z})$. However, it is not a feasible point in $P(\text{MIP-p-G}_{q,z})$ as it does not satisfy constraints (22) when $j = 2$ and $\ell = 1$.
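This example can also be verified by direct evaluation. The sketch below checks (15) and (21) at the given point and then tests the family (22), under the assumption that (22) reads $\sum_{i \in I} (C^k_{ij} - C^k_{i\ell}) z^k_{ij} \ge 0$ for all $j, \ell$, which is the form consistent with the derivation described in the text and with the indices $j = 2$, $\ell = 1$ quoted above. Names are assumptions and indices are zero-based.

```python
# Illustrative check of the Theorem 1 example: the point satisfies (15) and (21)
# but violates (22), assumed here to be sum_i (C[i][j] - C[i][l]) * z[i][j] >= 0.
from fractions import Fraction as Fr

R = [[0, 1], [0, 0]]
C = [[1, 0], [0, 0]]
q = [Fr(1, 2), Fr(1, 2)]
z = [[Fr(1, 4), Fr(1, 4)], [Fr(1, 4), Fr(1, 4)]]
I, J = range(2), range(2)

objective = sum(R[i][j] * z[i][j] for i in I for j in J)

# Constraint (15): column sums of z equal q.
holds_15 = all(sum(z[i][j] for i in I) == q[j] for j in J)


def holds_21(j, l):
    """Constraint (21) with the tight M of Proposition 2."""
    lhs = sum((C[i][j] - C[i][l]) * z[i][jp] for i in I for jp in J)
    big_m = max(C[i][j] - C[i][l] for i in I)
    return lhs <= (1 - q[l]) * big_m


def lhs_22(j, l):
    """Left-hand side of the assumed form of (22)."""
    return sum((C[i][j] - C[i][l]) * z[i][j] for i in I)


all_21 = all(holds_21(j, l) for j in J for l in J)
violated_22 = [(j, l) for j in J for l in J if lhs_22(j, l) < 0]
```

The only violated pair is `(1, 0)`, that is, the paper's $j = 2$, $\ell = 1$, where the left-hand side equals $-1/4$.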

From an interpretation point of view, (MIP-p-G$_{q,z}$) can be seen as the result of applying the Reformulation Linearization Technique (RLT) [Sherali and Adams, 1994] to (DOBSS$_{q,z}$). Indeed, by multiplying both sides of constraints (20) by variable $q^k_\ell$ and noticing that $q^k_\ell (1 - q^k_\ell) = 0$ since $q$ is binary, one obtains $\sum_{i \in I} (C^k_{ij} - C^k_{i\ell}) x_i q^k_\ell \le 0$ which, once linearized by introducing variables $z^k_{i\ell}$, yields (22).

For a given formulation $F$, we denote its optimal value by $v(F)$ and the optimal value of its LP relaxation by $v(\overline{F})$. Since (D2$_{x,q,s,f}$) and (DOBSS$_{x,q,s,f}$) and (DOBSS$_{q,z}$) and (MIP-p-G$_{q,z}$) have the same objective function, the following corollary holds.

Corollary 1. $v(\overline{\text{MIP-p-G}_{q,z}}) \le v(\overline{\text{DOBSS}_{q,z}}) = v(\overline{\text{DOBSS}_{x,q,s,f}}) \le v(\overline{\text{D2}_{x,q,s,f}})$.

Finally, when (MIP-p-G) is restricted to a single follower type, [Conitzer and Korzhyk, 2011] showed that the integrality constraints are redundant, i.e., the remaining constraints in (MIP-1-G) provide a complete linear description of the convex hull of feasible solutions.

4 Computational experiments for GSGs

Here, we present computational experiments for the formulations in Section 3. The machine used for these experiments is an Intel Core i7-4930K CPU, 3.40GHz, equipped with 64 GB of RAM, 6 cores, 12 threads and running the Ubuntu operating system release 12.10 (kernel Linux 3.5.0-41-generic). The experiments were coded in the programming language Python and GUROBI version 6.5.1 was the optimization solver used with a 3 hour solution time limit.

The instances solved in the computational experiments are randomly generated. We consider two different ways of randomly generating the payoff matrices for the leader and the different follower types. First, we consider matrices where all the elements are randomly generated between 0 and 10 and second, we consider matrices where 90% of the values are between 0 and 10 but we allow for 10% of the data to deviate between 0 and 100. In the first case we say that there is no variability in the payoff matrices, in the sense that all the data is uniformly distributed, whereas in the second case, we refer to the payoff matrices as matrices with variability.
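One possible way to generate such payoff matrices is sketched below. It is not the authors' generator: the function names, the use of real-valued uniform draws, and the choice to let each entry deviate independently with probability 0.1 (rather than fixing exactly 10% of the entries) are all assumptions.

```python
# Illustrative sketch of random GSG payoff generation, with and without
# variability. Names and sampling details are assumptions.
import random


def random_payoff_matrix(n_rows, n_cols, variability, rng):
    """Entries uniform on [0, 10]; with variability, each entry is drawn
    from [0, 100] instead with probability 0.1."""
    matrix = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_cols):
            upper = 100.0 if variability and rng.random() < 0.1 else 10.0
            row.append(rng.uniform(0.0, upper))
        matrix.append(row)
    return matrix


def random_gsg_instance(n_i, n_j, n_k, variability=False, seed=0):
    """Return leader matrices R[k] and follower matrices C[k] for n_k follower types."""
    rng = random.Random(seed)
    R = [random_payoff_matrix(n_i, n_j, variability, rng) for _ in range(n_k)]
    C = [random_payoff_matrix(n_i, n_j, variability, rng) for _ in range(n_k)]
    return R, C


R_plain, C_plain = random_gsg_instance(3, 4, 2, variability=False, seed=1)
R_var, C_var = random_gsg_instance(3, 4, 2, variability=True, seed=1)
```

Fixing the seed makes the instances reproducible, which helps when the same instance must be solved with several formulations.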

A general Stackelberg game instance is defined by three parameters: $|I|$, the number of leader pure strategies, $|J|$, the number of follower pure strategies and $|K|$, the number of follower types. For the purpose of these experiments, we have considered instances where $|I| \in \{10, 20, 30\}$, $|J| \in \{10, 20, 30\}$ and $|K| \in \{2, 4, 6\}$. For each instance size, 5 instances are generated without variability in the payoff matrices and 5 are generated with variability. In total, we consider 135 instances without variability and 135 instances with variability.

Performance profiles summarize our results with respect to the following four measures: total running time employed to solve the integer problem, running time employed to solve the linear relaxation of the integer problem, total number of nodes explored in the branch and bound (B&B) tree and percentage optimality gap at the root node. The percentage optimality gap at the root node is calculated by comparing the optimal value $v(F)$ of the formulation and the optimal value $v(\overline{F})$ of its LP relaxation: $\frac{v(\overline{F}) - v(F)}{v(F)} \cdot 100$. A performance profile graph plots the total percentage of problems solved for each value of these measures.
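The two summary quantities can be sketched in a few lines. The code below assumes the gap is measured relative to the integer optimum and that a performance profile reports, for each threshold, the percentage of instances whose measure does not exceed it; the function names and sample values are assumptions.

```python
# Illustrative sketch: root-node percentage gap and a simple performance profile.
# Names, the gap convention and the sample data are assumptions.

def root_gap_percent(v_lp, v_int):
    """Percentage gap between the LP relaxation value and the integer optimum."""
    return (v_lp - v_int) / v_int * 100.0


def performance_profile(values, thresholds):
    """For each threshold, the percentage of instances with measure <= threshold."""
    n = len(values)
    return [100.0 * sum(v <= t for v in values) / n for t in thresholds]


gap = root_gap_percent(3.0, 2.0)

solve_times = [0.5, 2.0, 30.0, 400.0]      # made-up running times in seconds
profile = performance_profile(solve_times, [1, 10, 100, 1000])
```

With these made-up numbers the gap is 50% and the profile rises from 25% to 100% across the four thresholds.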

We study the behavior of (D2$_{x,q,s,f}$), (D2$_{x,q,f}$), (DOBSS$_{q,z,s}$), (DOBSS$_{q,z}$) and (MIP-p-G$_{q,z}$). Figures 1 and 2 compare the performance profiles when the payoff matrices are generated without variability and with variability, respectively.

We observe that the instances where variability is introduced in the payoff matrices solve faster than those where no variability is considered. When there is no variability, (DOBSS$_{q,z,s}$) and (MIP-p-G$_{q,z}$) are the two most competitive formulations. (D2$_{x,q,s,f}$) can also be solved efficiently for the mid-range instances but slows down for the more difficult instances. Introducing variability in the payoff matrices, however, leads to a dominance of (MIP-p-G$_{q,z}$) with (DOBSS$_{q,z,s}$) coming in a close second and (D2$_{x,q,s,f}$) becoming noncompetitive for these instances. Regarding the time spent solving the linear relaxation of the problems, formulation (MIP-p-G$_{q,z}$) is the hardest to solve due to the fact that it has the most variables and constraints, $O(|K||J|^2)$. On the other hand, (D2$_{x,q,s,f}$), with $O(|K||J|)$ variables and constraints, is the fastest. With respect to the number of nodes and gap percentage, our theoretical findings are corroborated: (MIP-p-G$_{q,z}$) is the tightest formulation and therefore uses the fewest nodes. This is even more the case when variability is introduced.

[Figure 1: four performance profile panels plotting the % of problems solved against the time to solve the integer problem (s.), the time to solve the LP relaxation (s.), the number of nodes in the B&B tree, and the % optimality gap at the root node.]

Figure 1: GSGs: $|I| \in \{10, 20, 30\}$, $|J| \in \{10, 20, 30\}$, $|K| \in \{2, 4, 6\}$, without variability.

[Figure 2: performance profile panels plotting the % of problems solved.]

Figure 2: GSGs: $|I| \in \{10, 20, 30\}$, $|J| \in \{10, 20, 30\}$, $|K| \in \{2, 4, 6\}$, with variability.

Table 2 summarizes the mean percentage optimality gap at the root node obtained across the instances solved. Finally, note that the formulations obtained through Fourier-Motzkin, (D2x,q,f) and (DOBSSq,z), explore slightly fewer nodes in the B&B tree than their counterparts, (D2x,q,s,f) and (DOBSSq,z,s), but because of the increase in the number of constraints, the time to solve each linear relaxation increases. This increases the overall solution time of the Fourier-Motzkin formulations.

| | (D2x,q,s,f) | (DOBSSq,z,s) | (MIP-p-Gq,z) |
|---|---|---|---|
| Mean % opt. gap (no variability) | 117.68 | 23.01 | 9.94 |
| Mean % opt. gap (with variability) | 103.44 | 40.74 | 5.17 |
| Total mean % opt. gap | 110.56 | 31.88 | 7.56 |

Table 2: Mean percentage optimality gap at the root node recorded for GSG formulations.

5 Stackelberg security games-SSGs

In this section, we derive three SSG formulations: (ERASERc,q,s,f), due to [Kiekintveld et al., 2009], and (SDOBSSq,y,s) and (MIP-p-Sq,y). We derive these formulations by exploring the inherent link between the general setting, considered up to now, and the security setting, defined in Section 2.2. In this setting, the defender pure strategies i ∈ I correspond to the different ways in which up to m targets can be protected simultaneously. With a slight abuse of notation, i ∈ I refers both to the index running through the set of pure strategies I and to the subset of at most m targets protected by pure strategy i ∈ I. Recall that the payoff matrices of SSGs satisfy:

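$$
R^k_{ij} = \begin{cases} D^k(j|p) & \text{if } j \in i \\ D^k(j|u) & \text{if } j \notin i \end{cases} \qquad (27)
$$

$$
C^k_{ij} = \begin{cases} A^k(j|p) & \text{if } j \in i \\ A^k(j|u) & \text{if } j \notin i \end{cases} \qquad (28)
$$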

When the leader commits to a pure strategy i ∈ I and a follower of type k ∈ K responds by selecting strategy j ∈ J, the payoff for the leader is either a reward, if pure strategy i protects the attacked target j, or a penalty, if strategy i does not protect target j. The same argument explains the link between payoffs for the attackers.
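Relations (27) and (28) can be illustrated with a short Python sketch that expands the target-level payoffs of one attacker type into the general-game matrices. The function name and the list-based data layout are assumptions made for this illustration only.

```python
from itertools import combinations


def ssg_payoff_matrices(D_p, D_u, A_p, A_u, m):
    """Expand target payoffs of one attacker type into matrices R and C.

    D_p[j], D_u[j]: defender payoff when target j is protected / unprotected.
    A_p[j], A_u[j]: attacker payoff when target j is protected / unprotected.
    Pure strategies are all subsets of at most m targets.
    """
    n = len(D_p)
    strategies = [frozenset(s)
                  for size in range(m + 1)
                  for s in combinations(range(n), size)]
    # Row i, column j: protected payoff if j is in subset i, else unprotected.
    R = [[D_p[j] if j in i else D_u[j] for j in range(n)] for i in strategies]
    C = [[A_p[j] if j in i else A_u[j] for j in range(n)] for i in strategies]
    return strategies, R, C
```

The number of rows grows combinatorially with the number of targets, which is why the formulations derived below avoid enumerating the set I explicitly.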

5.1 Stackelberg security games: single level formulations

The first formulation we derive is based on (D2x,q,s,f). Consider (D2c,x,q,s,f), an extended description of (D2x,q,s,f) where we introduce the c variables through constraints (8) (see Section 2.2). We further use relations (27) and (28) to adapt the payoff structure:

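$$
\begin{aligned}
(\mathrm{D2}_{c,x,q,s,f}) \quad \max\ & \sum_{k \in K} \pi^k f^k \\
\text{s.t.}\ & (4), (8), (11), (12), \\
& 0 \le s^k - A^k(j|p)\,c_j - A^k(j|u)(1 - c_j) \le (1 - q^k_j) \cdot M \quad \forall j \in J,\ \forall k \in K, \qquad (29) \\
& f^k \le D^k(j|p)\,c_j + D^k(j|u)(1 - c_j) + (1 - q^k_j) \cdot M \quad \forall j \in J,\ \forall k \in K, \qquad (30) \\
& s, f \in \mathbb{R}^{K}.
\end{aligned}
$$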

This extended formulation is equivalent to (D2x,q,s,f) because, even though they are defined in different spaces of variables, the values of their LP relaxations coincide.

The formulation above has a large number of non-negative variables since, in the security setting, the set I of all defender pure strategies is exponential in the number of targets: it contains all subsets of at most m targets of J that the defender can protect simultaneously. In order to avoid having exponentially many non-negative variables in our formulation, we project out variables $x_i$, i ∈ I, from the formulation. Note that only constraints (4) and (8) involve said variables.

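Proposition 4. Consider the following two sets:

$$
A = \mathrm{Proj}_c \left\{ (x, c) \in \mathbb{R}^{|I|} \times \mathbb{R}^{|J|} : (4), (8) \right\}, \qquad
B = \Big\{ c \in \mathbb{R}^{|J|} : \sum_{j \in J} c_j \le m,\ c_j \in [0,1]\ \forall j \in J \Big\}.
$$

Then, A = B.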

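Proof. Observe first that using Farkas' Lemma [Farkas, 1902]:

$$
A = \Big\{ c \in \mathbb{R}^{|J|} : \sum_{j \in J} \alpha_j c_j + \alpha_{|J|+1} \ge 0 \quad \forall \alpha \in \mathbb{R}^{|J|+1} : \sum_{j \in J : j \in i} \alpha_j + \alpha_{|J|+1} \ge 0 \ \ \forall i \in I : |i| \le m \ \text{ and } \ \alpha_{|J|+1} \ge 0 \Big\}.
$$

Thus A ⊆ B. Indeed, the following 2|J| + 1 vectors in $\mathbb{R}^{|J|+1}$:

$$
\begin{aligned}
&\forall j \in J,\ e^j \in \mathbb{R}^{|J|+1} : e^j_j = 1,\ e^j_k = 0\ \forall k \in J : k \ne j \ \text{ and } \ e^j_{|J|+1} = 0, \\
&\forall j \in J,\ f^j \in \mathbb{R}^{|J|+1} : f^j_j = -1,\ f^j_k = 0\ \forall k \in J : k \ne j \ \text{ and } \ f^j_{|J|+1} = 1, \ \text{ and} \\
&g \in \mathbb{R}^{|J|+1} : g_j = -1\ \forall j \in J \ \text{ and } \ g_{|J|+1} = m,
\end{aligned}
$$

satisfy $\sum_{j \in J : j \in i} \alpha_j + \alpha_{|J|+1} \ge 0$ and $\alpha_{|J|+1} \ge 0$. Additionally, when we substitute the above vectors into the generic constraints defining A, they yield all the constraints defining B.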

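To show that A = B, it remains to show that any other inequality

$$
\sum_{j \in J} \alpha_j c_j + \alpha_{|J|+1} \ge 0 \qquad (31)
$$

such that α satisfies

$$
\sum_{j \in J : j \in i} \alpha_j + \alpha_{|J|+1} \ge 0 \quad \forall i \in I : |i| \le m \ \text{ and } \ \alpha_{|J|+1} \ge 0, \qquad (32)
$$

is dominated by some nonnegative linear combination of the constraints defining B. First, note that we can restrict our attention to constraints such that $\alpha_j \le 0$ for all j ∈ J. If there exists $\hat{j} \in J$ such that $\alpha_{\hat{j}} > 0$, since α must satisfy (32) and $|i \setminus \{\hat{j}\}| \le |i| \le m$, it follows that $\bar{\alpha}$ with $\bar{\alpha}_{\hat{j}} = 0$ and $\bar{\alpha}_j = \alpha_j$ for all $j \in J \setminus \{\hat{j}\}$ also satisfies (32) and, since c ≥ 0, we have that

$$
\sum_{j \in J} \bar{\alpha}_j c_j + \bar{\alpha}_{|J|+1} \le \sum_{j \in J} \alpha_j c_j + \alpha_{|J|+1}.
$$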

Therefore, the constraint defined by α is dominated by the constraint defined by ¯α. We thus distinguish two cases of α satisfying (32):

Case 1. $|\{j : \alpha_j < 0\}| = k \le m$, and

Case 2. $|\{j : \alpha_j < 0\}| = k > m$.

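In Case 1, by considering a linear combination of inequalities $c_j \le 1$ for 1 ≤ j ≤ k with respective weights $-\alpha_j \ge 0$, we obtain that:

$$
0 \le \sum_{j=1}^{k} \alpha_j c_j - \sum_{j=1}^{k} \alpha_j \le \sum_{j \in J} \alpha_j c_j + \alpha_{|J|+1},
$$

since $\alpha_j = 0$ for all j > k and α satisfies (32) for i = {1, . . . , k}.

For Case 2, assume w.l.o.g. that $\alpha_1 \le \alpha_2 \le \ldots \le \alpha_k < 0$ and $\alpha_j = 0$ for all j > k. Then, build a linear combination of inequality $\sum_{j \in J} c_j \le m$ with weight $-\alpha_m \ge 0$ and inequalities $c_j \le 1$ for 1 ≤ j ≤ m with respective weights $\alpha_m - \alpha_j \ge 0$. The valid inequality thus obtained is:

$$
\begin{aligned}
0 &\le \sum_{j=1}^{m} \alpha_j c_j + \sum_{j > m} \alpha_m c_j - \sum_{j=1}^{m} \alpha_j \\
&\le \sum_{j \in J} \alpha_j c_j - \sum_{j=1}^{m} \alpha_j, && \text{since } \alpha_j \ge \alpha_m \text{ for all } j > m \\
&\le \sum_{j \in J} \alpha_j c_j + \alpha_{|J|+1}, && \text{since } \alpha \text{ satisfies (32) for } i = \{1, \ldots, m\}.
\end{aligned}
$$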
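The inclusion B ⊆ A can also be seen constructively. The following Python sketch is not the Farkas-based argument used in the proof above; it is an alternative illustration based on systematic ("comb") sampling. It assumes that constraints (4) state that x is a probability distribution over I and that constraints (8) state $c_j = \sum_{i \in I : j \in i} x_i$, since both are defined outside this section. The function name is hypothetical.

```python
from fractions import Fraction
from math import floor


def decompose_coverage(c, m):
    """Mixed strategy over subsets of at most m targets with marginals c.

    c: coverage vector with 0 <= c[j] <= 1 and sum(c) <= m.
    Returns {frozenset of targets: probability}.
    """
    c = [Fraction(v) for v in c]
    assert all(0 <= v <= 1 for v in c) and sum(c) <= m
    # Lay the targets side by side as intervals [lo_j, hi_j) of length c[j].
    bounds, acc = [], Fraction(0)
    for v in c:
        bounds.append((acc, acc + v))
        acc += v
    # An offset u in [0,1) selects every target whose interval contains
    # u + t for some integer t. The selection only changes when u crosses
    # the fractional part of an interval endpoint.
    cuts = {Fraction(0), Fraction(1)}
    for lo, hi in bounds:
        cuts.update((lo - floor(lo), hi - floor(hi)))
    cuts = sorted(cuts)
    mix = {}
    for a, b in zip(cuts, cuts[1:]):
        u = (a + b) / 2  # representative offset inside the piece (a, b)
        chosen = frozenset(
            j for j, (lo, hi) in enumerate(bounds)
            if any(lo <= u + t < hi for t in range(m + 1))
        )
        mix[chosen] = mix.get(chosen, Fraction(0)) + (b - a)
    return mix
```

Each interval has length at most one, so it is hit by at most one point of the comb, and target j is selected with probability exactly $c_j$; the comb has at most m points inside the total length, so every selected subset has at most m targets.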

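Proposition 4 leads to the following formulation based on (D2c,x,q,s,f):

$$
\begin{aligned}
(\mathrm{ERASER}_{c,q,s,f}) \quad \max\ & \sum_{k \in K} \pi^k f^k \\
\text{s.t.}\ & (11), (12), (29), (30), \\
& \sum_{j \in J} c_j \le m, \\
& 0 \le c_j \le 1 \quad \forall j \in J, \\
& s, f \in \mathbb{R}^{K}.
\end{aligned}
$$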

The above formulation involves a polynomial number of variables and constraints and was presented in [Kiekintveld et al., 2009]. The next result is also an immediate consequence of Proposition 4.
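To see what constraints (29) and (30) encode, the sketch below evaluates a fixed coverage vector c rather than optimizing over it. It assumes, since (11) and (12) are defined outside this section, that each attacker type selects exactly one target and that ties among attacker best responses are broken in favor of the defender, which is what maximizing $f^k$ over the feasible q does. The function name and data layout are hypothetical.

```python
def evaluate_coverage(c, D_p, D_u, A_p, A_u, pi, tol=1e-9):
    """Objective value sum_k pi[k] * f[k] induced by a coverage vector c.

    D_p[k][j], D_u[k][j], A_p[k][j], A_u[k][j]: payoffs for attacker type k
    and target j (protected / unprotected). pi[k]: probability of type k.
    Returns (objective, s, f, attacked).
    """
    n = len(c)
    s, f, attacked = [], [], []
    for k in range(len(pi)):
        # Expected payoffs per target, as they appear in (29) and (30).
        att = [A_p[k][j] * c[j] + A_u[k][j] * (1 - c[j]) for j in range(n)]
        dfn = [D_p[k][j] * c[j] + D_u[k][j] * (1 - c[j]) for j in range(n)]
        s_k = max(att)  # (29): s^k equals the attacker's best-response value
        best = [j for j in range(n) if att[j] >= s_k - tol]
        j_star = max(best, key=lambda j: dfn[j])  # tie-break for the defender
        s.append(s_k)
        f.append(dfn[j_star])  # (30) is tight at the attacked target
        attacked.append(j_star)
    objective = sum(pi[k] * f[k] for k in range(len(pi)))
    return objective, s, f, attacked
```

Solving the formulation amounts to searching over all c with $\sum_j c_j \le m$ and $0 \le c_j \le 1$ for the vector that maximizes this objective; the sketch only evaluates one candidate.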

Corollary 2. $\mathrm{Proj}_{c,q,s,f}\, P(\mathrm{D2}_{c,x,q,s,f}) = P(\mathrm{ERASER}_{c,q,s,f})$.

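We now derive new SSG formulations based on (DOBSSq,z,s) and (MIP-p-Gq,z). We first present extended descriptions of both formulations by considering $y^k_{\ell j}$ variables satisfying:

$$
y^k_{\ell j} = \sum_{i \in I : \ell \in i} z^k_{ij} \quad \forall \ell, j \in J,\ \forall k \in K. \qquad (33)
$$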

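We use (27) and (28) to adapt the payoffs to the security setting leading to:

$$
\begin{aligned}
(\mathrm{DOBSS}_{q,z,y,s}) \quad \max\ & \sum_{j \in J} \sum_{k \in K} \pi^k \left( D^k(j|p)\, y^k_{jj} + D^k(j|u)\,(q^k_j - y^k_{jj}) \right) \qquad (34) \\
\text{s.t.}\ & (11), (12), (14)-(16), (33), \\
& 0 \le s^k - A^k(j|p) \sum_{j' \in J} y^k_{jj'} - A^k(j|u) \Big( 1 - \sum_{j' \in J} y^k_{jj'} \Big) \le (1 - q^k_j) \cdot M \quad \forall j \in J,\ \forall k \in K, \qquad (35) \\
& s \in \mathbb{R}^{|K|}. \qquad (36)
\end{aligned}
$$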

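$$
\sum_{j \in J} y^k_{\ell j} = \sum_{j \in J} y^1_{\ell j} \quad \forall \ell \in J,\ \forall k \in K, \qquad (38)
$$

and let us define the following polyhedra C and D:

$$
\begin{aligned}
C &:= \{ (q, z, y, s) \in [0,1]^{|K||J|} \times [0,1]^{|K||I||J|} \times [0,1]^{|K||J|^2} \times \mathbb{R}^{|K|} : (11), (15), (16), (33), (35), (36), (38) \}, \\
D &:= \{ (q, z, y) \in [0,1]^{|K||J|} \times [0,1]^{|K||I||J|} \times [0,1]^{|K||J|^2} : (11), (15), (16), (33), (37), (38) \}.
\end{aligned}
$$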

Lemma 1. $C \supseteq P(\mathrm{DOBSS}_{q,z,y,s})$ and $D \supseteq P(\text{MIP-p-G}_{q,z,y})$.

Proof. Consider constraints (14) and sum over all i ∈ I such that ℓ ∈ i:

$$
\sum_{i \in I : \ell \in i} \sum_{j \in J} z^k_{ij} = \sum_{i \in I : \ell \in i} \sum_{j \in J} z^1_{ij} \quad \forall \ell \in J,\ \forall k \in K. \qquad (39)
$$

Applying (33) to (39) yields (38) and the result follows.

We now project the z variables from the larger polyhedra C and D. Said variables only appear in constraints (15), (16) and (33).

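Lemma 2. Consider the following two sets:

$$
\begin{aligned}
X &= \mathrm{Proj}_{q,y} \left\{ (q, z, y) \in \mathbb{R}^{|K||J|^2 + |K||J| + |I||J||K|} : (15), (16), (33) \right\}, \\
Y &= \Big\{ (q, y) \in \mathbb{R}^{|K||J|^2 + |K||J|} : \sum_{\ell \in J} y^k_{\ell j} \le m\, q^k_j \ \ \forall j \in J,\ \forall k \in K, \quad 0 \le y^k_{\ell j} \le q^k_j \ \ \forall j, \ell \in J,\ \forall k \in K \Big\}.
\end{aligned}
$$

Then, X = Y.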

Proof. Note that constraints (15), (16) and (33) can be treated independently for each k ∈ K and each j ∈ J. First consider the case where $q^{\hat{k}}_{\hat{j}} = 0$ for $\hat{j} \in J$ and $\hat{k} \in K$. Constraints (15) then imply that for all i ∈ I, $z^{\hat{k}}_{i\hat{j}} = 0$ and constraints (33) force $y^{\hat{k}}_{\ell \hat{j}} = 0$ for all ℓ ∈ J and the result holds. For all j ∈ J, k ∈ K such that $q^k_j \ne 0$, consider $x_i = z^k_{ij} / q^k_j$ and $c_\ell = y^k_{\ell j} / q^k_j$ and apply Proposition 4. The result follows.

Consider $\mathrm{Proj}_{q,y,s}\, C$ and $\mathrm{Proj}_{q,y}\, D$ as the feasible regions of the linear relaxations of two MILP formulations, (SDOBSSq,y,s) and (MIP-p-Sq,y), where we maximize the objective function (34) under the additional requirement that the q variables be binary. Hence, we present (SDOBSSq,y,s), a security formulation based on (DOBSSq,z,y,s),

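$$
\begin{aligned}
(\mathrm{SDOBSS}_{q,y,s}) \quad \max\ & \sum_{j \in J} \sum_{k \in K} \pi^k \left( D^k(j|p)\, y^k_{jj} + D^k(j|u)\,(q^k_j - y^k_{jj}) \right) \\
\text{s.t.}\ & (11), (12), (35), (38), \\
& \sum_{\ell \in J} y^k_{\ell j} \le m\, q^k_j \quad \forall j \in J,\ \forall k \in K, \qquad (40) \\
& 0 \le y^k_{\ell j} \le q^k_j \quad \forall j, \ell \in J,\ \forall k \in K, \qquad (41) \\
& s \in \mathbb{R}^{|K|}.
\end{aligned}
$$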

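And we also present (MIP-p-Sq,y), a security formulation based on (MIP-p-Gq,z,y),

$$
\begin{aligned}
(\text{MIP-p-S}_{q,y}) \quad \max\ & \sum_{j \in J} \sum_{k \in K} \pi^k \left( D^k(j|p)\, y^k_{jj} + D^k(j|u)\,(q^k_j - y^k_{jj}) \right) \\
\text{s.t.}\ & (11), (12), (37), (38), (40), (41).
\end{aligned}
$$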

The following corollaries are an immediate consequence of Lemmas 1 and 2.

Corollary 3. $\mathrm{Proj}_{q,y,s}\, P(\mathrm{DOBSS}_{q,z,y,s}) \subseteq P(\mathrm{SDOBSS}_{q,y,s})$.

In addition, note that if we restrict (MIP-p-Gq,z,y) to a single type of follower, constraints (14) disappear and one thus obtains the following corollary.

Corollary 5. $\mathrm{Proj}_{q,y}\, P(\text{MIP-1-G}_{q,z,y}) = P(\text{MIP-1-S}_{q,y})$.

The above corollary immediately leads to the following theorem.

Theorem 2. (MIP-1-Sq,y) is a linear description of the convex hull of feasible solutions for the Stackelberg security game with a single type of attacker.

Proof. The result follows from Corollary 5 and from [Conitzer and Korzhyk, 2011] showing that (MIP-1-Gq,z) is a linear description for general games.

As in general games, we use Fourier-Motzkin elimination on constraints (29) and (35) to project out the s variables from formulations (ERASERc,q,s,f) and (SDOBSSq,y,s), respectively. This leads to the following two families of inequalities:

Replacing constraints (29) by (42) in (ERASERc,q,s,f) and (35) by (43) in (SDOBSSq,y,s) leads to (ERASERc,q,f) and (SDOBSSq,y).
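The elimination step applied to (29) can be sketched as follows. This is a reconstruction from constraints (29) alone, using the shorthand $U^k_j(c)$ introduced here for illustration; the resulting family is consistent with the index pattern of the tight constant given for (42) in Proposition 5 below, but the exact form and indexing of the original inequality (42) may differ.

```latex
% Shorthand (introduced for this sketch): attacker k's expected payoff at target j.
U^k_j(c) := A^k(j|p)\,c_j + A^k(j|u)\,(1 - c_j).

% Constraints (29) bound s^k from below and from above:
U^k_j(c) \;\le\; s^k \;\le\; U^k_\ell(c) + (1 - q^k_\ell)\, M
\qquad \forall j, \ell \in J,\ \forall k \in K.

% Fourier-Motzkin elimination pairs every lower bound with every upper bound,
% which removes s^k:
U^k_j(c) - U^k_\ell(c) \;\le\; (1 - q^k_\ell)\, M
\qquad \forall j, \ell \in J,\ \forall k \in K.
```

The same pairing applied to (35), with $\sum_{j' \in J} y^k_{jj'}$ in place of $c_j$, would give the analogous family for (SDOBSSq,y,s). The number of inequalities grows from $O(|J||K|)$ to $O(|J|^2|K|)$, which matches the trade-off between fewer variables and more constraints reported for the Fourier-Motzkin formulations.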

In the same spirit as Proposition 2, we present the following proposition, establishing the tightest values for the big M constants in the formulations seen so far:

Proposition 5. The tightest values for the positive constants M are:

1. In (30), $M = \max_{\ell \in J} \{ D^k(\ell|p), D^k(\ell|u) \} - \min \{ D^k(j|p), D^k(j|u) \}$, ∀j ∈ J, k ∈ K.

2. In (29), (35), $M = \max_{\ell \in J} \{ A^k(\ell|p), A^k(\ell|u) \} - \min \{ A^k(j|p), A^k(j|u) \}$, ∀j ∈ J, k ∈ K.

3. In (42), (43), $M = \max \{ A^k(j|p), A^k(j|u) \} - \min \{ A^k(\ell|p), A^k(\ell|u) \}$, ∀j, ℓ ∈ J, k ∈ K.
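The three constants of Proposition 5 are straightforward to compute from the payoff vectors of a single attacker type. The sketch below uses a hypothetical function name and list-based layout.

```python
def tight_big_m(D_p, D_u, A_p, A_u):
    """Tight big-M constants of Proposition 5 for one attacker type k.

    D_p[j], D_u[j], A_p[j], A_u[j]: payoffs at target j when it is
    protected / unprotected.
    Returns (m30, m29, m42): m30[j] for (30), m29[j] for (29) and (35),
    and m42[j][l] for (42) and (43).
    """
    n = len(D_p)
    d_max = max(max(D_p), max(D_u))
    a_max = max(max(A_p), max(A_u))
    m30 = [d_max - min(D_p[j], D_u[j]) for j in range(n)]
    m29 = [a_max - min(A_p[j], A_u[j]) for j in range(n)]
    m42 = [[max(A_p[j], A_u[j]) - min(A_p[l], A_u[l]) for l in range(n)]
           for j in range(n)]
    return m30, m29, m42
```

Using one constant per constraint, rather than a single global M, keeps the LP relaxation as tight as the big-M structure allows.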

5.2 Comparison of the formulations

First, we introduce an additional formulation which we denote by (SDOBSSc,q,y,s,f). This formulation is equivalent to (SDOBSSq,y,s), in the sense that the values of their LP relaxations coincide. In this formulation, we introduce variables $f^k$ for all k ∈ K to rewrite the objective function so that it matches the objective function of (ERASERc,q,s,f). We also add variables $c_\ell$ for all ℓ ∈ J and rewrite constraints (38) as $\sum_{j \in J} y^k_{\ell j} = c_\ell$ for all ℓ ∈ J and all k ∈ K. Using this last condition we can simplify (35) to (29). The formulation (SDOBSSc,q,y,s,f) is as follows.

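$$
\begin{aligned}
(\mathrm{SDOBSS}_{c,q,y,s,f}) \quad \max\ & \sum_{k \in K} \pi^k f^k \\
\text{s.t.}\ & (11), (12), (29), (40), (41), \\
& f^k = \sum_{j \in J} \left\{ y^k_{jj} \left( D^k(j|p) - D^k(j|u) \right) + q^k_j D^k(j|u) \right\} \quad \forall k \in K, \qquad (44) \\
& \sum_{j \in J} y^k_{\ell j} = c_\ell \quad \forall \ell \in J,\ \forall k \in K, \qquad (45) \\
& s \in \mathbb{R}^{|K|}.
\end{aligned}
$$

Note that

$$
P(\mathrm{ERASER}_{c,q,f}) = \mathrm{Proj}_{c,q,f}\, P(\mathrm{ERASER}_{c,q,s,f}) \quad \text{and} \quad P(\mathrm{SDOBSS}_{q,y}) = \mathrm{Proj}_{q,y}\, P(\mathrm{SDOBSS}_{q,y,s}).
$$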

Proposition 6. $\mathrm{Proj}_{c,q,s,f}\, P(\mathrm{SDOBSS}_{c,q,y,s,f}) \subseteq P(\mathrm{ERASER}_{c,q,s,f})$. Further, there exist instances for which the inclusion is strict.

Proof. The projection of $P(\mathrm{SDOBSS}_{c,q,y,s,f})$ onto the (c, q, s, f)-space is obtained by applying Farkas' Lemma. Constraints (40)-(41) and (44)-(45) are the only ones involving variables $y^k_{\ell j}$ and are separable by k ∈ K. For a fixed k ∈ K, the projection is given by:

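$$
\begin{aligned}
A^k = \Big\{ (c, q, f) :\ & \alpha \Big( f^k - \sum_{j \in J} D^k(j|u)\, q^k_j \Big) + \sum_{\ell \in J} \beta_\ell c_\ell + m \sum_{j \in J} \gamma_j q^k_j + \sum_{j \in J} \sum_{\ell \in J} \delta_{\ell j}\, q^k_j \ge 0 \\
& \forall (\alpha, \beta, \gamma, \delta) : \gamma, \delta \ge 0, \quad \beta_\ell + \gamma_j + \delta_{\ell j} \ge 0 \ \ \forall \ell, j \in J : \ell \ne j, \\
& \text{and } \alpha \left( D^k(j|p) - D^k(j|u) \right) + \beta_j + \gamma_j + \delta_{jj} \ge 0 \ \ \forall j \in J \Big\} \qquad (46)
\end{aligned}
$$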

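Consider, for each k ∈ K, the following set $B^k$:

$$
\begin{aligned}
B^k = \Big\{ (c, q, f) :\ & c_\ell \le \sum_{j \in J} q^k_j \quad \forall \ell \in J, && (47) \\
& c_\ell \ge 0 \quad \forall \ell \in J, && (48) \\
& \sum_{\ell \in J} c_\ell \le m \sum_{j \in J} q^k_j, && (49) \\
& f^k \le c_j \left( D^k(j|p) - D^k(j|u) \right) + \sum_{\ell \in J : \ell \ne j} q^k_\ell D^k(\ell|p) + q^k_j D^k(j|u) \quad \forall j \in J, && (50) \\
& q^k_j \ge 0 \quad \forall j \in J,\ \forall k \in K \Big\}.
\end{aligned}
$$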

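Let us see that $A^k \subseteq B^k$ for all k ∈ K. First note that if we set α = 0, the following definitions of the parameters β, γ and δ comply with the conditions in (46):

$$
\begin{aligned}
&\beta = e_h, \quad \gamma = \{0\}_{j \in J}, \quad \delta = \{0\}_{\ell, j \in J}, \quad \forall h \in J, \\
&\beta = -e_\ell, \quad \gamma = \{0\}_{j \in J}, \quad \delta_\ell = \{1\}_{j \in J}, \quad \forall \ell \in J, \\
&\beta = \{-1\}_{\ell \in J}, \quad \gamma = \{1\}_{j \in J}, \quad \delta = \{0\}_{\ell, j \in J}, \\
&\beta = \{0\}_{\ell \in J}, \quad \gamma = \{0\}_{j \in J}, \quad \delta_1 = \{e_j\}, \quad \forall j \in J.
\end{aligned}
$$

Substituting these valid parameters in the generic constraints in $A^k$ produces all of the constraints in $B^k$ except (50). Further, for a fixed j ∈ J, consider α = −1, $\beta_\ell = 0$ and $\gamma_\ell = \frac{1}{m} \left( D^k(\ell|p) - D^k(\ell|u) \right)$ for all ℓ ∈ J such that ℓ ≠ j, $\beta_j = D^k(j|p) - D^k(j|u)$ and $\gamma_j = 0$. Finally, set $\delta_{\ell j} = 0$ for all ℓ, j ∈ J. This definition of parameters is valid as it satisfies the conditions in (46). Substituting in the generic constraints in $A^k$ yields (50).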

It remains to show that for all k ∈ K, constraints (50) imply (30) for the tight value of M shown in Proposition 5. The implication holds because

$$
\sum_{\ell \in J : \ell \ne j} q^k_\ell D^k(\ell|p) \le \max_{\ell \in J} \{ D^k(\ell|p) \} \sum_{\ell \in J : \ell \ne j} q^k_\ell = (1 - q^k_j) \max_{\ell \in J} \{ D^k(\ell|p) \} \quad \forall j \in J,\ \forall k \in K.
$$
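The implication can also be checked numerically by comparing the right-hand sides of (50) and (30) on random points. The sketch below makes the following assumptions, labeled here because they rely on definitions outside this section: the q variables of one attacker type sum to one (taken to be the content of (11)), rewards satisfy $D^k(j|p) \ge D^k(j|u)$, and M takes the tight value from item 1 of Proposition 5. Function names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
import random


def rhs_50(j, c, q, D_p, D_u):
    """Right-hand side of (50) for target j and one attacker type."""
    return (c[j] * (D_p[j] - D_u[j])
            + sum(q[l] * D_p[l] for l in range(len(q)) if l != j)
            + q[j] * D_u[j])


def rhs_30(j, c, q, D_p, D_u):
    """Right-hand side of (30) with the tight M of Proposition 5, item 1."""
    big_m = max(max(D_p), max(D_u)) - min(D_p[j], D_u[j])
    return D_p[j] * c[j] + D_u[j] * (1 - c[j]) + (1 - q[j]) * big_m


def implication_holds(trials=200, n=4, seed=0):
    """True if rhs_50 <= rhs_30 at every target on every random point."""
    rng = random.Random(seed)
    for _ in range(trials):
        D_u = [Fraction(rng.randint(0, 5)) for _ in range(n)]
        D_p = [u + rng.randint(0, 5) for u in D_u]  # ensures D_p >= D_u
        w = [rng.randint(0, 9) for _ in range(n)]
        w[0] += 1  # guarantees a positive total weight
        q = [Fraction(x, sum(w)) for x in w]  # nonnegative, sums to one
        c = [Fraction(rng.randint(0, 10), 10) for _ in range(n)]
        if any(rhs_50(j, c, q, D_p, D_u) > rhs_30(j, c, q, D_p, D_u)
               for j in range(n)):
            return False
    return True
```

A bound of the form $f^k \le$ `rhs_50` is therefore never weaker than the corresponding bound $f^k \le$ `rhs_30`, and it can be strictly stronger.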

Hence, $\mathrm{Proj}_{c,q,s,f}\, P(\mathrm{SDOBSS}_{c,q,y,s,f}) \subseteq P(\mathrm{ERASER}_{c,q,s,f})$. To show that the inclusion may be strict, consider the following example where m = 1, |J| = 3 and |K| = 1. Let the reward and penalty matrices for the defender and attacker be D(·|p) = [1,0,0], D(·|u) = [0,0,0], A(·|p) = [0,0,0] and A(·|u) = [0,0,0]. Consider the point defined by $q = (1/3, 1/3, 1/3)^T$, $c = (1, 0, 0)^T$, s = 10 and f = 2/3. Such a point is feasible for (ERASERc,q,s,f) but violates constraints (50) when j = 2, since (50) then requires $f \le q_1 D(1|p) = 1/3$, and is therefore infeasible for $\mathrm{Proj}_{c,q,s,f}\, P(\mathrm{SDOBSS}_{c,q,y,s,f})$.

Based on Theorem 1 we can present the following theorem comparing the polyhedra $P(\text{MIP-p-S}_{q,y})$ and $\mathrm{Proj}_{q,y}\, P(\mathrm{SDOBSS}_{q,y,s})$:

Theorem 3. $P(\text{MIP-p-S}_{q,y}) \subseteq P(\mathrm{SDOBSS}_{q,y}) = \mathrm{Proj}_{q,y}\, P(\mathrm{SDOBSS}_{q,y,s})$.

Proof. The inclusion is a consequence of Theorem 1, the relations between the payoffs described in (27) and (28) and the relation between the z and y variables described in (33). To show that the inclusion may be strict, consider the following game. We set m = 2, |J| = 2 and |K| = 1. The reward and penalty payoff matrices for both the defender and the attacker are given by D(·|p) = [1,0], D(·|u) = [0,0], A(·|p) = [0,0] and A(·|u) = [0,1].

Additionally, the point with coordinates

$$
c^T = (1/2, 1/2), \quad q^T = (1/2, 1/2) \quad \text{and} \quad y^k = \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix}
$$

has an objective value of 1/4 and is a valid feasible solution of $P(\mathrm{SDOBSS}_{q,y})$. However, it is not feasible in $P(\text{MIP-p-S}_{q,y})$ as it does not verify constraints (37) when j = 1 and ℓ = 2.

Observe that (MIP-p-Sq,y) can be obtained by applying RLT [Sherali and Adams, 1994] to (SDOBSSq,y). Multiplying both sides of constraints (42) by variable $q^k_\ell$ and noticing that $q^k_\ell (1 - q^k_\ell) = 0$, since $q^k_\ell$ is binary, one obtains constraints that, once linearized by introducing variables $y^k_{\ell j}$, yield (37).

Since (ERASERc,q,s,f) and (f-SDOBSSc,q,s,f) and (SDOBSSq,y) and (MIP-p-Sq,y) have the same objective function, the following corollary holds.

Corollary 6. $v(\text{MIP-p-S}_{q,y}) \le v(\mathrm{SDOBSS}_{q,y}) = v(\mathrm{SDOBSS}_{c,q,s,f}) \le v(\mathrm{ERASER}_{c,q,s,f})$.

6 Computational experiments for SSGs

Our security experiments are run on randomly generated instances. For each instance, four payoff matrices have to be generated that satisfy $D^k(\cdot|p) \ge D^k(\cdot|u)$ and $A^k(\cdot|u) \ge A^k(\cdot|p)$. We consider two ways of generating these matrices. First, we generate matrices where the values for the penalty matrices ($D^k(\cdot|u)$ and $A^k(\cdot|p)$) are randomly generated between 0 and 5 and all values for the reward matrices ($D^k(\cdot|p)$ and $A^k(\cdot|u)$) are randomly generated between 5 and 10. We refer to these as matrices with no variability. Second, we consider an alternative where 90% of the values for the penalty matrices are randomly generated between 0 and 5 (between 5 and 10 for the reward matrices) and 10% of the values for the penalty matrices are randomly generated between 0 and 50 (between 50 and 100 for the reward matrices). We refer to these as matrices with variability. We use a solution limit of 3 hours.
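The generation scheme described above can be sketched as follows. Several details are assumptions made for this illustration and are not stated in the text: values are drawn from a continuous uniform distribution, each entry falls in the wide range independently with probability 0.1 (rather than exactly 10% of the entries), and the wide range is applied to the penalty and the reward of the same target together so that the required orderings between reward and penalty hold. The function name and the dictionary layout are hypothetical.

```python
import random


def random_ssg_instance(n_targets, n_types, variability=False, seed=None):
    """Random SSG payoffs with reward >= penalty at every target.

    Returns a dict with keys "D_p", "D_u", "A_p", "A_u", each a list of
    n_types lists of n_targets values.
    """
    rng = random.Random(seed)
    inst = {"D_p": [], "D_u": [], "A_p": [], "A_u": []}
    for _ in range(n_types):
        # Defender: penalty D(.|u), reward D(.|p).
        # Attacker: penalty A(.|p), reward A(.|u).
        for pen_key, rew_key in (("D_u", "D_p"), ("A_p", "A_u")):
            pen, rew = [], []
            for _ in range(n_targets):
                # Assumption: one draw decides the range of both entries.
                wide = variability and rng.random() < 0.1
                scale = 10 if wide else 1
                pen.append(rng.uniform(0, 5 * scale))
                rew.append(rng.uniform(5 * scale, 10 * scale))
            inst[pen_key].append(pen)
            inst[rew_key].append(rew)
    return inst
```

Passing a fixed `seed` makes an instance reproducible, which is convenient when the same instance has to be solved with several formulations.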

A Stackelberg security game instance is defined by |J|, the number of targets, |K|, the number of attacker types, and m, the number of security resources available to the defender. Recall from the computational experiments for GSGs that using payoff matrices with variability amounts to endowing the game with more structure, thus making it somewhat easier to solve. We have encountered the same phenomenon in SSGs. For games whose payoff matrices have variability, we have considered |J| ∈ {30,40,50,60,70}, |K| ∈ {6,8,10,12} and we have allowed m to be either 25%, 50% or 75% of the number of targets. For games whose payoff matrices do not have variability, we have had to be less ambitious in order to solve all instances to optimality within the stipulated time limit and have considered |J| ∈ {10,20,30,40,50}, |K| ∈ {2,4,6,8}, while still considering m to be either 25%, 50% or 75% of the number of targets. In either case, for each instance size, we generate 5 random

instances as described above. In total, we consider 300 randomly generated instances. We study the behavior of (ERASERc,q,s,f), (SDOBSSq,y,s) and (MIP-p-Sq,y). For the sake of clarity, we no longer consider the Fourier-Motzkin formulations (ERASERc,q,f) and (SDOBSSq,y). Performance-wise, (ERASERc,q,s,f) and (SDOBSSq,y,s) compare to their Fourier-Motzkin formulations in a similar way to how (D2x,q,s,f) and (DOBSSq,z,s) compared to theirs in Section 4 (results not shown). We plot performance profile graphs in Figures 3 and 4.

Note that for the experiments with variability, (ERASERc,q,s,f) is the fastest formulation for most of the instances. However, we see that for the more difficult instances, its solution time increases significantly, eventually surpassing the solution time of (MIP-p-Sq,y). This indicates that for these instances (ERASERc,q,s,f) ceases to be competitive and (MIP-p-Sq,y) is the formulation that solves the fastest. As for the instances whose payoff matrices have no variability, and are thus harder to solve, we observe that (ERASERc,q,s,f) runs faster than the other two formulations for 80% of the instances. However, for the most difficult instances, (MIP-p-Sq,y) is faster than the other two formulations. For the last 5% of the instances, (ERASERc,q,s,f) is the worst formulation.

In terms of size of the formulations, (ERASERc,q,s,f) is the formulation with the fewest constraints and variables: $O(|J||K|)$. Observe that (MIP-p-Sq,y) and (SDOBSSq,y,s) have $O(|J|^2|K|)$ constraints and variables. Thus, these formulations have larger LP relaxations and take longer to solve than (ERASERc,q,s,f) does. However, Figures 3 and 4 confirm our theoretical findings: (MIP-p-Sq,y) has the tightest LP relaxation and this translates into a clear dominance with respect to node usage in the B&B tree.

[Figure 3 image not reproduced. Performance profiles, including a panel for the time to solve the integer program (s.). Vertical axis: % of problems solved.]

Figure 3: SSGs: K = {6,8,10,12}, J = {30,40,50,60,70} – with variability

[Figure 4 image not reproduced. Performance profiles, including panels for the time to solve the integer program (s.) and the % optimality gap at root node. Vertical axis: % of problems solved.]

Figure 4: SSGs: K = {2,4,6,8}, J = {10,20,30,40,50} – without variability

Based on our results, we observe a trend that indicates that for difficult instances, particularly in the case of payoff matrices with no variability, one could expect (ERASERc,q,s,f) and (SDOBSSq,y,s) to perform very poorly compared to (MIP-p-Sq,y). To analyze this, we consider instances where the payoff matrices have no variability and where K = {6,8,10,12}, J = {30,40,50,60,70} and m is 25%, 50% and 75% of the targets. We generate 5 random instances for each size. In addition, for practical reasons, we consider a time limit of 30 minutes. The computational results for these instances are shown in Figure 5. Note that (MIP-p-Sq,y) is able to solve 95% of the 300 instances within the stipulated time limit, outperforming (SDOBSSq,y,s) and (ERASERc,q,s,f), which are only able to solve 56% and 45% of the instances, respectively, within the same time frame. For the 45% of instances which can be solved by the three formulations, we observe that (MIP-p-Sq,y) offers a much tighter percentage optimality gap than the other two formulations. Because of this, the node usage in the B&B tree is significantly smaller in (MIP-p-Sq,y) compared to (ERASERc,q,s,f) and (SDOBSSq,y,s). Table 3 records the mean percentage optimality gap at the root node across all the instances for the three formulations under study. Observe that the LP relaxation of (MIP-p-Sq,y) is significantly tighter than those of the other formulations. We may thus conclude that for the payoff matrices without variability, (MIP-p-Sq,y) is the fastest formulation for the most difficult instances. On the other hand, (ERASERc,q,s,f) is the fastest formulation when we endow the security game with further structure by allowing matrices to experience variability. Even then, (ERASERc,q,s,f) loses ground to (MIP-p-Sq,y). This is due to the fact that (MIP-p-Sq,y) has the tightest LP relaxation. The quality of the upper bound obtained from (MIP-p-Sq,y) translates into a smaller B&B tree and this translates into reaching optimality of the integer problem faster in many cases.

[Figure 5 image not reproduced. Performance profiles, including a panel for the % optimality gap at root node. Vertical axis: % of problems solved.]

Figure 5: SSGs: K = {6,8,10,12}, J = {30,40,50,60,70} – without variability

| | (ERASERc,q,s,f) | (SDOBSSq,y,s) | (MIP-p-Sq,y) |
|---|---|---|---|
| Mean % opt. gap (no variability) | 241.26 | 38.87 | 3.09 |
| Mean % opt. gap (with variability) | 168.37 | 18.66 | 0.35 |
| Total mean % opt. gap | 204.82 | 28.76 | 1.72 |

Table 3: Mean percentage optimality gap at the root node recorded for SSG formulations.

7 Conclusions and future work

In this paper, we consider Stackelberg games in two different settings: the general Stackelberg setting, which models a hierarchical competitive game between different agents, and the specific Stackelberg security setting, where an agent must secure subsets of targets from attackers.

In the general setting, we have studied known MILP formulations and have ordered them with respect to the strength of their linear relaxations. We have presented a formal theoretical link between GSG formulations and SSG formulations involving the projection of variables. Exploiting this link has allowed us to i) derive two new SSG MILP formulations, (SDOBSSq,y,s) and (MIP-p-Sq,y); and ii) extend our study of GSG formulations to SSG formulations, leading to a ranking of the security formulations with respect to the strength of their linear relaxations, where (MIP-p-S) has been shown to be the strongest SSG formulation. Further, we have shown its single type of attacker restriction, (MIP-1-Sq,y), to be an ideal formulation.

Our computational studies have shown that (MIP-p-Gq,z) and (MIP-p-Sq,y), the tightest formulations in each setting, are highly competitive with respect to solving time. Further, in the case of (MIP-p-S), we have seen it scales significantly better than competing formulations when tackling instances with no variability in their payoff structure. Formulation (MIP-p-S) represents a significant theoretical and practical improvement over previously existing SSG formulations.

However, the obvious bottleneck, at this time, is solving the tighter but larger LP relaxations for (MIP-p-Gq,z) and (MIP-p-Sq,y). The main challenge is to provide an efficient way of solving these tight formulations. It is our contention that this can be done by exploiting the inherent problem structure in the Stackelberg paradigm to develop either decomposition or cutting plane approaches.

While this paper focuses on the polyhedral analysis of general normal form Stackelberg games and Stackelberg security games, the work of developing efficient algorithms by conducting similar polyhedral analysis of the bilevel interaction could be carried out for Stackelberg games in specific security applications. In particular, extensions to problems that consider multiple attacks by followers, dynamic settings, imperfect information, or non-rational response would be interesting lines of future research.

Acknowledgements

Casorrán wishes to acknowledge the FNRS for funding his PhD research through a FRIA grant. Fortz and Labbé have been partially supported by the Fonds de la Recherche Scientifique - FNRS under Grant(s) no PDR T0098.18. Ordóñez acknowledges the support of CONICYT through grant FONDECYT-1171419 and the Complex Engineering Systems Institute through grant CONICYT-PIA-FB0816. The authors would also like to thank the anonymous reviewers, whose comments have helped to elevate the quality of this paper.

References

[Anandalingam and Friesz, 1992] Anandalingam, G. and Friesz, T. L. (1992). Hierarchical optimization: An introduction. Annals of Operations Research, 34(1):1–11.

[Bard, 1998] Bard, J. F. (1998). Practical Bilevel Optimization: Algorithms and Applications. Kluwer Academic Publishers.

[Bracken and McGill, 1973] Bracken, J. and McGill, J. T. (1973). Mathematical programs with optimization problems in the constraints. Operations Research, 21(1):37–44.

[Brown et al., 2006] Brown, G., Carlyle, M., Salmerón, J., and Wood, K. (2006). Defending critical infrastructure. Interfaces, 36:530–544.

[Colson et al., 2007] Colson, B., Marcotte, P., and Savard, G. (2007). An overview of bilevel optimization. Annals of Operations Research, 153:235–256.

[Conitzer and Korzhyk, 2011] Conitzer, V. and Korzhyk, D. (2011). Commitment to correlated strategies. In Burgard, W. and Roth, D., editors, AAAI. AAAI Press.

[Conitzer and Sandholm, 2006] Conitzer, V. and Sandholm, T. (2006). Computing the optimal strategy to commit to. In ACM, editor, Proceedings of the 7th ACM Conference on Electronic Commerce, EC ’06, pages 82–90, New York, NY, USA. ACM.

[Dantzig and Eaves, 1973] Dantzig, G. B. and Eaves, C. B. (1973). Fourier-Motzkin Elimination and Its Dual. J. Comb. Theory, Ser. A, 14(3):288–297.

[Farkas, 1902] Farkas, J. (1902). Theorie der einfachen ungleichungen. Journal für die reine und angewandte Mathematik, 124:1–27.

[Fischetti et al., 2018] Fischetti, M., Ljubic, I., Monaci, M., and Sinnl, M. (2018). Interdiction games and monotonicity. INFORMS Journal on Computing. to appear.

[Harsanyi and Selten, 1972] Harsanyi, J. C. and Selten, R. (1972). A Generalized Nash Solution for Two-Person Bargaining Games with Incomplete Information. Management Science, 18(5):80–106.

[Jain et al., 2011] Jain, M., Kiekintveld, C., and Tambe, M. (2011). Quality-bounded solutions for finite bayesian stackelberg games: Scaling up. In International Conference on Autonomous Agents and Multiagent Systems - Volume 3, pages 997–1004.

[Jain et al., 2010] Jain, M., Tsai, J., Pita, J., Kiekintveld, C., Rathi, S., Tambe, M., and Ordóñez, F. (2010). Software assistants for randomized patrol planning for the lax airport police and the federal air marshal service. Interfaces, 40(4):267–290.

[Kiekintveld et al., 2009] Kiekintveld, C., Jain, M., Tsai, J., Pita, J., Ordóñez, F., and Tambe, M. (2009). Computing optimal randomized resource allocations for massive security games. In International Conference on Autonomous Agents and Multiagent Systems - Volume 1, pages 689–696.

[Kolstad, 1985] Kolstad, C. (1985). A review of the literature on bi-level mathematical programming. Technical report, Los Alamos Nat. Lab.

[Krichene et al., 2014] Krichene, W., Reilly, J. D., Amin, S., and Bayen, A. M. (2014). Stackelberg routing on parallel networks with horizontal queues. IEEE Transactions on Automatic Control, 59(3):714–727.

[Labbé et al., 1998] Labbé, M., Marcotte, P., and Savard, G. (1998). A bilevel model of taxation and its application to optimal highway pricing. Management science, 44(12-part-1):1608–1622.

[Labbé and Violin, 2016] Labbé, M. and Violin, A. (2016). Bilevel programming and price setting problems. Annals of Operations Research, 240(1):141–169.

[Leitman, 1978] Leitman, G. (1978). On generalized stackelberg strategies. J. Optim. Theory Appl., 26(4):637–643.

[McMasters and Mustin, 1970] McMasters, A. W. and Mustin, T. M. (1970). Optimal interdiction of a supply network. Naval Research Logistics Quarterly, 17(3):261–268.

[Paruchuri et al., 2008] Paruchuri, P., Pearce, J. P., Marecki, J., Tambe, M., Ordóñez, F., and Kraus, S. (2008). Playing games for security: An efficient exact algorithm for solving bayesian stackelberg games. In International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2, pages 895–902.

[Pochet and Wolsey, 2006] Pochet, Y. and Wolsey, L. A. (2006). Production Planning by Mixed Integer Programming (Springer Series in Operations Research and Financial Engineering). Springer-Verlag New York, Inc.

[Savard, 1989] Savard, G. (1989). Contributions à la programmation mathématique à deux niveaux. PhD thesis, École Polytechnique, Université de Montréal.

[Sherali and Adams, 1994] Sherali, H. D. and Adams, W. P. (1994). A hierarchy of relaxations and convex hull characterizations for mixed-integer zero-one programming problems. Discrete Applied Mathematics, 52(1):83–106.

[Shieh et al., 2012] Shieh, E., An, B., Yang, R., Tambe, M., Baldwin, C., DiRenzo, J., Maule, B., and Meyer, G. (2012). Protect: A deployed game theoretic system to protect the ports of the united states. In International Conference on Autonomous Agents and Multiagent Systems - Volume 1, volume 1, pages 13–20.

[Smith and Lim, 2008] Smith, J. and Lim, C. (2008). Algorithms for network interdiction and fortification games. In Chinchuluun, A., Pardalos, P., Migdalas, A., and Pitsoulis, L., editors, Pareto Optimality, Game Theory and Equilibria, volume 17. Springer Optimization and its Applications.

[Snyder et al., 2016] Snyder, L. V., Atan, Z., Peng, P., Rong, Y., Schmitt, A. J., and Sinsoysal, B. (2016). OR/MS models for supply chain disruptions: a review. IIE Transactions, 48(2):89–109.

[von Stackelberg, 2011] von Stackelberg, H. (2011). Market Structure and Equilibrium. Springer. Translated by Bazin, D., Urch, L. and Hill, R.

[Yang et al., 2014] Yang, R., Ford, B., Tambe, M., and Lemieux, A. (2014). Adaptive resource allocation for wildlife protection against illegal poachers. In International Conference on Autonomous Agents and Multiagent Systems, pages 453–460.

[Yang et al., 2013] Yang, R., Jiang, A. X., Tambe, M., and Ordóñez, F. (2013). Scaling-up security games with boundedly rational adversaries: A cutting-plane approach. In Rossi, F., editor, IJCAI, pages 404–410. IJCAI/AAAI.

[Yin et al., 2012] Yin, Z., Jiang, A. X., Tambe, M., Kiekintveld, C., Leyton-Brown, K., Sandholm, T., and Sullivan, J. P. (2012). Trusts: Scheduling randomized patrols for fare inspection in transit systems using game theory. AI Magazine, 33(4):59–72.

[Yin and Tambe, 2012] Yin, Z. and Tambe, M. (2012). A unified method for handling discrete and continuous uncertainty in bayesian stackelberg games. In Conitzer, W. and van der Hoek, editors, AAMAS.