
CHAPTER 3 CONSENSUS-BASED MULTI-AGENT OPTIMIZATION

3.8 Discussion

In this chapter, we introduced the problem of multi-agent optimization in the presence of Byzantine agents and characterized the fundamental limits on the output quality of any algorithm. By exploiting Byzantine broadcast, Algorithms 2 and 3 essentially solve a centralized optimization problem with n cost component functions, up to f of which are injected by the system adversary. We also proposed a much simpler distributed algorithm that achieves the optimal fault tolerance using only local communication. As a trade-off, the simpler algorithm has a somewhat weaker convergence property than the algorithms in Section 3.5. In particular, while the algorithms in Section 3.5 ensure that the estimates at non-faulty agents have a limit, the simpler algorithm in Section 3.7 ensures only consensus among the non-faulty agents and does not necessarily ensure that the estimates have a limit.

Many extensions of these results are possible.

When the underlying communication channel is a broadcast channel (over which all transmissions are received correctly and identically by all agents), the results presented in this report can be proved for n ≥ 2f + 1.

We have also obtained a comparable set of results for the scenario when the cost functions are redundant in some manner (e.g., cost function of agent 3 may equal a convex combination of cost functions of agents 1 and 2), or the optimal sets of the local cost functions are guaranteed to overlap. These results can be found in our report [61].

So far we have focused on the unconstrained version of the optimization problem in (3.2). However, our results also generalize to the constrained version of problem (3.2) [64]. In particular, let X ⊆ ℝ be a nonempty, closed, and convex set. Then the constrained version of (3.2) is stated below in (3.76). Observe that the output is now constrained to be in the set X.

output x_o such that there exists a weight vector α for which

$$x_o \in \arg\min_{x \in X} \frac{1}{|N|} \sum_{i \in N} \alpha_i h_i(x), \qquad \sum_{i \in N} \alpha_i = 1, \quad \text{and } \forall i,\ \alpha_i \ge 0. \tag{3.76}$$
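As an illustration of what an output satisfying (3.76) looks like, the sketch below assumes quadratic local costs h_i(x) = (x − a_i)^2 and a constraint set X that is a closed interval [lower, upper]. Both choices, and the function name, are assumptions made for this example only and are not part of the algorithms in this chapter. Under these assumptions the unconstrained minimizer is the weighted mean of the a_i, and clipping it to the interval gives the constrained minimizer; the factor 1/|N| does not affect the argmin.

```python
def constrained_weighted_optimum(centers, weights, lower, upper):
    """Minimize sum_i weights[i] * (x - centers[i])**2 over [lower, upper].

    Hypothetical helper: quadratic costs and an interval constraint are
    assumptions of this sketch, not requirements of problem (3.76).
    """
    if len(centers) != len(weights):
        raise ValueError("centers and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    if lower > upper:
        raise ValueError("the interval [lower, upper] must be nonempty")

    # The unconstrained minimizer of a weighted sum of quadratics
    # is the weighted mean of the centers.
    unconstrained = sum(w * a for w, a in zip(weights, centers))

    # The objective is convex in the scalar x, so clipping the unconstrained
    # minimizer to the interval yields the constrained minimizer.
    return min(max(unconstrained, lower), upper)


if __name__ == "__main__":
    # Weighted mean is 2.5, which lies outside X = [0, 2], so the output is 2.0.
    print(constrained_weighted_optimum([0.0, 2.0, 4.0], [0.25, 0.25, 0.5], 0.0, 2.0))
```

Any weight vector that is non-negative and sums to 1 yields a candidate output; which weight vectors are admissible under faults is the subject of the chapter's main results and is not modeled here.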

The algorithm SBG can be adapted to solve (3.76) with a simple modification of the state update in (3.23): project $\tilde{x}_j[t-1] - \lambda[t-1]\,\tilde{g}_j[t-1]$ onto the set X. This projection guarantees that $x_j[t]$ lies within the constraint set X.

However, compared to the original algorithm, such a projection introduces a projection error at each iteration. Specifically, the update of $x_j$ can be written as follows, where $e_j[t-1]$ denotes the projection error and $\mathrm{Projection}_X$ denotes projection onto X.

$$x_j[t] = \mathrm{Projection}_X\left(\tilde{x}_j[t-1] - \lambda[t-1]\,\tilde{g}_j[t-1]\right) = \tilde{x}_j[t-1] - \lambda[t-1]\,\tilde{g}_j[t-1] + e_j[t-1]. \tag{3.77}$$

The projection error $e_j[t]$ can be shown to approach 0 as t → ∞, and Theorem 17 holds for the modified algorithm as well [64]. A complete description and analysis of the algorithm is presented in [64].
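The projected update and its error term can be sketched as follows. This is a minimal illustration that assumes X is a closed interval [lower, upper] and takes the trimmed state and gradient values as given inputs; how SBG computes those values is not reproduced here, and all function and parameter names are hypothetical.

```python
def project_onto_interval(y, lower, upper):
    """Euclidean projection of a scalar y onto the interval [lower, upper]."""
    return min(max(y, lower), upper)


def projected_update(x_tilde, g_tilde, step, lower, upper):
    """One projected step for a single agent.

    x_tilde, g_tilde: the agent's trimmed state and gradient values
                      (assumed to be computed elsewhere).
    step:             the step size for this iteration.
    Returns the new state and the projection error, so that
    new_state == (x_tilde - step * g_tilde) + error.
    """
    unprojected = x_tilde - step * g_tilde
    new_state = project_onto_interval(unprojected, lower, upper)
    error = new_state - unprojected  # zero whenever the step stays inside X
    return new_state, error


if __name__ == "__main__":
    # The unprojected point 1.25 lies outside X = [0, 1], so it is pulled
    # back to 1.0 with a projection error of -0.25.
    print(projected_update(0.75, -1.0, 0.5, 0.0, 1.0))
```

The error is nonzero only when the unprojected point leaves X, which is consistent with reading the error as a perturbation of the original update.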

When agents crash, we can improve on the $\left(\frac{1}{2(|N|-f)},\ |N|-f\right)$-admissibility achieved in the case of Byzantine faults. In this case, algorithm SBG is modified to perform no trimming at all, since the agents do not tamper with messages. For the modified algorithm, we have shown [63] that all the non-faulty agents (agents in N) produce an output that equals an optimum of a global cost function of the form

$$c \left( \sum_{i \in N} h_i(x) + \sum_{i \in F} \alpha_i h_i(x) \right), \tag{3.78}$$

where F is the set of faulty agents (those that crash at some point during the execution), 0 ≤ α_i ≤ 1 for each i ∈ F, and c is a normalization constant such that $c\left(|N| + \sum_{i \in F} \alpha_i\right) = 1$. Note that in (3.78), all the local functions associated with non-faulty agents have equal weights. A finite-time interpretation of the above results is also of practical interest.
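To make the weighting in (3.78) concrete, the sketch below evaluates the normalization constant c and the resulting optimum for the special case of quadratic local costs h_i(x) = (x − a_i)^2. The quadratic form, the function name, and the parameter names are assumptions of this example; the weights α_i of the crashed agents are supplied as inputs rather than derived from an execution.

```python
def crash_fault_optimum(nonfaulty_centers, faulty_centers, faulty_weights):
    """Normalization constant and minimizer of a cost of the form (3.78).

    Assumes h_i(x) = (x - a_i)**2, where a_i is the agent's "center".
    Non-faulty agents all receive weight c; a crashed agent i receives
    weight c * faulty_weights[i], with 0 <= faulty_weights[i] <= 1.
    """
    if len(faulty_centers) != len(faulty_weights):
        raise ValueError("one weight is required per faulty agent")
    if any(not 0.0 <= w <= 1.0 for w in faulty_weights):
        raise ValueError("each faulty weight must lie in [0, 1]")
    if not nonfaulty_centers:
        raise ValueError("at least one non-faulty agent is required")

    # c is chosen so that c * (|N| + sum of faulty weights) = 1.
    c = 1.0 / (len(nonfaulty_centers) + sum(faulty_weights))

    # Setting the derivative of the weighted sum of quadratics to zero
    # gives the weighted mean of the centers.
    weighted_sum = sum(nonfaulty_centers) + sum(
        w * a for w, a in zip(faulty_weights, faulty_centers)
    )
    return c, c * weighted_sum


if __name__ == "__main__":
    # Two non-faulty agents and one crashed agent that contributes half weight.
    print(crash_fault_optimum([1.0, 3.0], [5.0], [0.5]))
```

With a faulty weight of 0 the optimum reduces to the average of the non-faulty centers, and with a weight of 1 the crashed agent counts as much as a non-faulty one.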

We have only considered synchronous systems so far. In an asynchronous system with up to f Byzantine faults, algorithm SBG can also be modified to achieve fault-tolerant optimization. For instance, algorithm SBG may be combined with the reliable broadcast algorithm in [72]. Alternatively, we can require n > 5f and combine SBG with the simpler asynchronous iterative Byzantine consensus algorithm in [73]. The two approaches trade off communication cost against optimization performance.

Open Problems

Incomplete networks. In this chapter, we assumed that the underlying communication network is completely connected. We have also explored SBG-like algorithms [64] for incomplete networks. However, we do not believe that our present approach is optimal in general for incomplete network topologies. In particular, as seen previously, algorithm SBG achieves optimal fault tolerance while also ensuring that the weights (the α_i's) are bounded below by an adequately large constant (specifically, $\frac{1}{2(|N|-f)}$). Obtaining equally strong results for incomplete networks remains an open problem.

Vector arguments. Algorithm SBG assumes that the domain for the argument of the cost functions is ℝ (or, in the case of constrained optimization in (3.76), with X = ℝ). In general, we would like to solve problem (3.2) for vector (i.e., multidimensional) arguments in ℝ^k for k ≥ 2 as well. In recent work, the problem of Byzantine vector consensus has been solved [74, 75]. However, a solution for Byzantine vector consensus is not by itself adequate to solve the optimization problem of interest here. The difficulty lies in the geometry of the set of optima when the argument is a higher-dimensional vector. In particular, unlike the one-dimensional case, where the set Y defined in (3.25) is convex, this set is not necessarily convex when the argument is higher-dimensional.

Additionally, Theorem 12 can be extended to d-dimensional inputs to show that no more than |N| − df weights can be non-zero.

Non-smooth cost functions. In our work, we assumed continuously differentiable cost functions. In general, the cost functions may be non-smooth, and the optimization algorithm would need to use subgradients instead of gradients. For the failure-free case, distributed subgradient optimization algorithms indeed exist [3, 4]; however, the design and analysis of fault-tolerant optimization algorithms for non-smooth cost functions remain open.
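For readers unfamiliar with subgradients, the following is a minimal failure-free and centralized sketch, not a fault-tolerant or distributed algorithm and not the method of [3, 4]. It assumes absolute-value costs h_i(x) = |x − a_i|, which are not differentiable at a_i, and a diminishing step size of 1/t; both choices and all names are assumptions made for illustration.

```python
def abs_subgradient(x, center):
    """A subgradient of h(x) = |x - center|.

    Returns the sign of (x - center); at the kink x == center, 0 is one
    valid choice from the subdifferential [-1, 1].
    """
    return float((x > center) - (x < center))


def subgradient_descent(centers, x0, iterations):
    """Minimize sum_i |x - centers[i]| with step size 1/t at iteration t."""
    x = x0
    for t in range(1, iterations + 1):
        # A subgradient of a sum is the sum of subgradients of its terms.
        g = sum(abs_subgradient(x, a) for a in centers)
        x -= g / t
    return x


if __name__ == "__main__":
    # The minimizer of |x| + |x - 1| + |x - 5| is the median of the centers, 1.
    print(subgradient_descent([0.0, 1.0, 5.0], x0=4.0, iterations=200))
```

The iterates oscillate around the minimizer with an amplitude that shrinks with the step size, which is typical of subgradient methods and is one reason their analysis differs from the smooth case treated in this chapter.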
