3.2 Model-free Source Seeking
3.2.3 Distributed Multi-robot Source Seeking
Now, we return to the case when a sensor team is available to estimate the complete gradient in (3.26) instead of the directional one. As mentioned earlier, if the sensors are localized in the frame of reference of one of them and if all-to-all communication is available, the algorithm can be implemented as is. However, in many scenarios, all-to-all communication is either infeasible or prone to failures and sensor localization needs to be carried out online. In this section, we describe a distributed implementation of the model-free algorithm (3.26), in which the sensors use relative measurements of their neighbors’ states to estimate the collective state of the formation. In detail, the sensors need to estimate the formation state
$x_t$, the centroid $m_t$, and the stochastic approximation to the signal gradient $W(x_t) z_t$ at each time $t$ using only local information. We introduce a fast time-scale $k = 0, 1, \ldots$, which is used for the estimation procedure at each measurement location, i.e., at each time $t$. During the formation state and gradient estimation at time $t$, the sensors remain stationary, so we drop the index $t$ to simplify the notation.
Let the communication network of the $n$ sensors be represented by an undirected graph $G = (\{1, \ldots, n\}, E)$. Suppose that each sensor $i$ receives a relative measurement of the state of each of its neighbors $j \in \mathcal{N}_i$:
$$s_{ij}(k) = x_j - x_i + \epsilon_{ij}(k), \qquad \epsilon_{ij}(k) \sim \mathcal{N}(0, E_{ij}), \tag{3.33}$$
where $\epsilon_{ij}(k)$ is the measurement noise, which is independent across times on the fast time-scale and across sensor pairs. If each sensor can estimate the states of the whole sensor formation from the measurements $\{s_{ij}(k)\}$, then it can compute the finite-difference weights in (3.24) on its own.
The distributed linear Gaussian estimator (2.25) in Sec. 2.6.1 can be employed to estimate the sensor states $x$. Notice that it is sufficient to estimate $x$ in a local frame because neither the finite-difference computation (3.24) nor the gradient ascent (3.26) requires global state information. Assume that all sensors know that sensor 1 is the origin at every measurement location. Let
$$x^* := \begin{bmatrix} 0^T & (x_2 - x_1)^T & \cdots & (x_n - x_1)^T \end{bmatrix}^T$$
denote the true sensor states in the frame of sensor 1. Let $\hat{x}^i(k)$ denote the estimate that sensor $i$ has of $x^*$ at time $k$ on the fast time-scale. The vector form of the measurement equations (3.33) is:
$$s(k) = (B \otimes I_{d_x})^T x^* + \epsilon(k), \tag{3.34}$$
where $B$ is the incidence matrix of the communication graph $G$. The measurements (3.34) fit the linear Gaussian model in (2.22). Since the first block of $x^*$ (the state of sensor 1 in its own frame) is always $0$, only $(n-1)d_x$ components need to be estimated. As the rank of $B \otimes I_{d_x}$ is also $(n-1)d_x$ for a connected graph, Thm. 2.10 allows us to use the distributed estimator (2.25) to update $\hat{x}^i(k)$.
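The measurement model (3.34) can be illustrated with a small centralized sketch. It is not the distributed estimator (2.25), which is defined elsewhere; it only shows how the incidence matrix maps states to relative measurements and why pinning one sensor at the origin makes the remaining $(n-1)d_x$ components recoverable. The function names, the zero-based sensor indexing (sensor 0 plays the role of sensor 1), the use of one oriented column per undirected edge, and the assumption of equal noise covariances (so that plain least squares on the averaged measurements is appropriate) are all choices made for this illustration.

```python
import numpy as np

def incidence_matrix(n, edges):
    """Oriented incidence matrix B (n x |E|).

    Column e for edge (i, j) has -1 in row i and +1 in row j, so that
    (B^T x)_e = x_j - x_i, matching the form of (3.33).
    """
    B = np.zeros((n, len(edges)))
    for e, (i, j) in enumerate(edges):
        B[i, e] = -1.0
        B[j, e] = 1.0
    return B

def estimate_relative_states(B, s_samples, d):
    """Centralized least-squares estimate of x* from samples of (3.34).

    s_samples has shape (K, |E| * d): K fast-time-scale measurements.
    Sensor 0 is pinned at the origin, so only (n - 1) * d unknowns remain.
    Assumes identical noise covariance on every edge (illustrative only).
    """
    n = B.shape[0]
    H = np.kron(B, np.eye(d)).T          # (B kron I_d)^T, shape (|E| d, n d)
    H_red = H[:, d:]                     # drop the columns of the pinned sensor
    s_bar = np.mean(s_samples, axis=0)   # average out the measurement noise
    x_red, *_ = np.linalg.lstsq(H_red, s_bar, rcond=None)
    return np.concatenate([np.zeros(d), x_red]).reshape(n, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_true = rng.normal(size=(4, 2))                 # hypothetical positions
    B = incidence_matrix(4, [(0, 1), (1, 2), (2, 3)])
    H = np.kron(B, np.eye(2)).T
    noise = 0.05 * rng.normal(size=(200, H.shape[0]))
    s = H @ x_true.reshape(-1) + noise               # samples of (3.34)
    print(estimate_relative_states(B, s, 2))
    print(x_true - x_true[0])                        # true states in frame of sensor 0
```

Because $B^T \mathbf{1} = 0$, the relative measurements are identical whether they are generated from the global states or from $x^*$, which is why the global frame never needs to be known.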
Concurrently with the state estimation, sensor $i$ obtains observations $z_{i,t}(k)$ of the signal field for $k = 0, 1, \ldots$.$^{12}$ In the centralized case (Sec. 3.2.1), each sensor uses the following gradient approximation:
$$g(m_t, y) \approx W(x_t) z_t = \sum_{i=1}^{n} \mathrm{col}_i(W(x_t))\, z_{i,t}, \tag{3.35}$$
where $\mathrm{col}_i(W(x_t))$ denotes the $i$th column of the finite-difference-weight matrix. Since $x_t$ and $z_t$ are not available in the distributed setting, each sensor can use its local measurements $z_{i,t}(k)$ and its estimate $\hat{x}^i_t(k)$ of the sensor states to form its own local estimate of the signal gradient:
$$\hat{g}_{i,t}(k) := \mathrm{col}_i\big(W(\hat{x}^i_t(k))\big)\, \frac{1}{k+1} \sum_{\tau=0}^{k} z_{i,t}(\tau). \tag{3.36}$$
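The sample mean in (3.36) does not require a sensor to store its measurement history; it can be updated recursively. The sketch below shows one way to do this. The class name and interface are hypothetical, and the weight column $\mathrm{col}_i(W(\hat{x}^i_t(k)))$ is taken as a given input because the finite-difference weights (3.24) are defined elsewhere.

```python
import numpy as np

class LocalGradientEstimate:
    """Recursive form of the local gradient estimate (3.36) for one sensor.

    Illustrative helper: keeps only the running mean of the signal
    measurements z_{i,t}(0), ..., z_{i,t}(k) and the sample count.
    """

    def __init__(self):
        self.count = 0      # number of samples seen so far, i.e. k + 1
        self.z_mean = 0.0   # running mean of the signal measurements

    def update(self, z, weight_col):
        """Fold in measurement z and return col_i(W) * mean(z).

        weight_col is the sensor's current column of the weight matrix,
        recomputed by the caller from the latest state estimate.
        """
        self.count += 1
        self.z_mean += (z - self.z_mean) / self.count
        return np.asarray(weight_col, dtype=float) * self.z_mean

if __name__ == "__main__":
    est = LocalGradientEstimate()
    col = np.array([0.5, -1.0])          # hypothetical weight column
    for z in [1.0, 4.0, -2.0, 3.0]:      # hypothetical signal measurements
        g_hat = est.update(z, col)
    print(g_hat)
```

Only the most recent weight column multiplies the mean, which matches (3.36): the weights depend on the current state estimate $\hat{x}^i_t(k)$, while the average runs over all signal measurements up to $k$.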
In order to obtain an approximation to $g(m_t, y)$ as in (3.35) in a distributed manner, we use the high-pass dynamic consensus filter of Spanos et al. (2005) to have the sensors agree on the value of the sum:
$$\hat{g}_t(k) := n \left[ \frac{1}{n} \sum_{i=1}^{n} \hat{g}_{i,t}(k) \right].$$
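Each node maintains a state $q_{i,k}$, receives an input $\mu_{i,k}$, and provides an output $r_{i,k}$ with the following dynamics:
$$\begin{aligned} q_{i,k+1} &= q_{i,k} + \beta \sum_{j \in \mathcal{N}_i} (q_{j,k} - q_{i,k}) + \beta \sum_{j \in \mathcal{N}_i} (\mu_{j,k} - \mu_{i,k}), \\ r_{i,k} &= q_{i,k} + \mu_{i,k}, \end{aligned} \tag{3.37}$$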
where $\beta > 0$ is a step-size. For a connected network, Spanos et al. (2005, Thm. 1) guarantees that $r_{i,k}$ converges to $\frac{1}{n} \sum_i \mu_{i,k}$ as $k \to \infty$. The following result can be shown by letting $\mu_{i,k} := \hat{g}_{i,t}(k)$ and is proved in Appendix D.15.
Theorem 3.7. Suppose that the communication graph $G$ is strongly connected. If the sensor nodes estimate their states $x^*$ from the relative measurements (3.34) using algorithm (2.25), compute the finite-difference weights (3.24) using the state estimates, and run the dynamic consensus filter (3.37) with input $\mu_{i,k} := \hat{g}_{i,t}(k)$, which was defined in (3.36), then the output $r_{i,k}$ of the consensus filter satisfies:
$$n \lim_{k \to \infty} \mathbb{E}[r_{i,k}] = g(m^*, y) + b, \qquad \forall i \in \{1, \ldots, n\},$$
where $g(m^*, y)$ is the true signal gradient at $m^* := \frac{1}{n} \sum_{i=1}^{n} x_i^*$ and $b$ is the error in the finite-difference approximation (3.22).
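The consensus dynamics (3.37) and the scaling by $n$ in Theorem 3.7 can be sketched numerically. The following is a minimal simulation, not the proof's construction: the function name and array layout are assumptions, the filter state is initialized at $q_{i,0} = 0$ (an assumption that keeps the network-wide sum of outputs equal to the sum of inputs at every step), and the step-size is simply chosen small enough for the example graph.

```python
import numpy as np

def dynamic_consensus(adj, inputs, beta):
    """Simulate the dynamic consensus filter (3.37).

    adj    : (n, n) symmetric 0/1 adjacency matrix of the undirected graph.
    inputs : (K, n, d) array with inputs[k, i] = mu_{i,k}.
    beta   : step-size (must be small enough for the given graph).
    Returns the outputs r_{i,K-1} as an (n, d) array.
    Assumes q_{i,0} = 0 for every node (illustrative initialization).
    """
    adj = np.asarray(adj, dtype=float)
    inputs = np.asarray(inputs, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj        # graph Laplacian L
    q = np.zeros_like(inputs[0])
    for k in range(len(inputs) - 1):
        # sum_{j in N_i} (v_j - v_i) equals -(L v)_i, so (3.37) becomes
        # q_{k+1} = q_k - beta * L (q_k + mu_k).
        q = q - beta * lap @ (q + inputs[k])
    return q + inputs[-1]                       # r_k = q_k + mu_k

if __name__ == "__main__":
    # Path graph on four sensors with hypothetical local gradient estimates.
    adj = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]])
    mu = np.array([[1.0, 0.0], [2.0, -1.0], [0.0, 3.0], [5.0, 2.0]])
    r = dynamic_consensus(adj, np.tile(mu, (300, 1, 1)), beta=0.3)
    print(4 * r)            # every row approximates the sum of the inputs
    print(mu.sum(axis=0))
```

With constant inputs the outputs obey $r_{k+1} = (I - \beta L) r_k$, so each node converges to the average of the inputs when $\beta$ is below $2/\lambda_{\max}(L)$; multiplying by $n$ then recovers the sum that approximates the gradient in (3.35).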
$^{12}$ The time-scales of the relative state measurements and the signal measurements might be different, but for simplicity we keep them the same.
After this procedure, the sensors agree on a centroid for the formation and on a gradient estimate, which can be used to compute the next formation centroid according to (3.26). Since the finite-difference weights are recomputed at every $t$, the formation need not be maintained accurately. This allows the sensors to deviate from the nominal formation in order to avoid obstacles and makes the method tolerant of motion uncertainty.