4.2 Constrained Bethe Free Energy Minimization Problem
4.2.4 Connection to the ML Decoding Problem
Based on the continuity properties of the global minimal solution with respect to the temperature proved above, we are now ready to illustrate the connection between the constrained Bethe free energy minimization problem and the ML decoding problem. Briefly, this is done in three steps. The first two steps are devoted to elucidating Fig. 4.3, while the third step provides a complementary view of the constrained Bethe free energy minimization problem in the context of ML decoding.
Connection at T = 0
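With $T = 0$, the constrained Bethe free energy minimization problem in (4.15) reduces to
\[
\hat{\mathbf{b}}_0 = \arg\min_{\mathbf{b} \in \mathbb{R}_+^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) \quad \text{subject to (4.13) and (4.14).} \tag{4.27}
\]
As the Bethe average energy $U_{\mathrm{B}}(\mathbf{b})$ is a linear function of $\mathbf{b}$, just as the constraints in (4.13) and (4.14) are, (4.27) is a linear programming (LP) problem. Following the approach in [120], we obtain the following sufficient condition for $\hat{\mathbf{b}}_0$ to be linked to the ML solution $\hat{\mathbf{e}}$.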
Proposition 4.4. Whenever a global optimal solution of the LP problem (4.27) is an integer-valued vector, i.e., $\hat{\mathbf{b}}_0 \in \{0, 1\}^{2^{N_e+1}+2N_e}$, it has a deterministic relation with the ML solution $\hat{\mathbf{e}}$, i.e.,
\[
\hat{b}_{\alpha,T=0}(\mathbf{e}) = \hat{b}_{\beta,T=0}(\mathbf{e}) =
\begin{cases}
1, & \text{if } \mathbf{e} = \hat{\mathbf{e}} \\
0, & \text{otherwise}
\end{cases}
\qquad
\forall i \quad \hat{b}_{i,T=0}(e_i) =
\begin{cases}
1, & \text{if } e_i = \hat{e}_i \\
0, & \text{otherwise.}
\end{cases}
\tag{4.28}
\]
Proof. Under the assumption that the Bethe average energy is minimized by an element in $\{0, 1\}^{2^{N_e+1}+2N_e}$, we can narrow the set $\mathbb{R}_+^{2^{N_e+1}+2N_e}$ in (4.27) to the set $\{0, 1\}^{2^{N_e+1}+2N_e}$ without loss of optimality. By doing so, the LP problem in (4.27) becomes an integer programming (IP) problem, i.e.,
\[
\hat{\mathbf{b}}_0 = \arg\min_{\mathbf{b} \in \{0,1\}^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) \quad \text{subject to (4.13) and (4.14)} \tag{4.29}
\]
as the entries of $\mathbf{b}$ can only take on the value 0 or 1.
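Considering the normalization constraints in (4.13) within the set $\{0, 1\}^{2^{N_e+1}+2N_e}$, any feasible pmfs $b_\alpha(\mathbf{e})$ and $b_\beta(\mathbf{e})$ must have value 1 at one and only one bit sequence and value 0 at all other bit sequences, i.e.,
\[
b_\alpha(\mathbf{e}) =
\begin{cases}
1, & \text{if } \mathbf{e} = \boldsymbol{\alpha} \\
0, & \text{otherwise}
\end{cases}
\quad \text{and} \quad
b_\beta(\mathbf{e}) =
\begin{cases}
1, & \text{if } \mathbf{e} = \boldsymbol{\beta} \\
0, & \text{otherwise}
\end{cases}
\tag{4.30}
\]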
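where $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are two bit sequences in the set $\{0, 1\}^{N_e}$. Substituting such structured pmfs $b_\alpha(\mathbf{e})$ and $b_\beta(\mathbf{e})$ back into the Bethe average energy, we obtain
\[
U_{\mathrm{B}}(\mathbf{b}) = -\ln \Lambda_\alpha(\boldsymbol{\alpha}) - \ln \Lambda_\beta(\boldsymbol{\beta}) \tag{4.31}
\]
while the marginalization consistency constraints in (4.14) become
\[
\forall i \quad b_i(e_i = 1) = \sum_{\mathbf{e}: e_i = 1} b_\alpha(\mathbf{e}) =
\begin{cases}
1, & \text{if } \alpha_i = 1 \\
0, & \text{otherwise}
\end{cases}
\tag{4.32}
\]
\[
\forall i \quad b_i(e_i = 1) = \sum_{\mathbf{e}: e_i = 1} b_\beta(\mathbf{e}) =
\begin{cases}
1, & \text{if } \beta_i = 1 \\
0, & \text{otherwise.}
\end{cases}
\tag{4.33}
\]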
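We note that satisfying both (4.32) and (4.33) is equivalent to satisfying the condition $\boldsymbol{\alpha} = \boldsymbol{\beta}$. We also note that the minimization of $U_{\mathrm{B}}(\mathbf{b})$ over $\mathbf{b}$ boils down to a minimization over $(\boldsymbol{\alpha}, \boldsymbol{\beta})$. With $\boldsymbol{\alpha} = \boldsymbol{\beta}$, the objective in (4.31) becomes $-\ln[\Lambda_\alpha(\boldsymbol{\alpha})\Lambda_\beta(\boldsymbol{\alpha})]$, so minimizing it is the same as maximizing $\ln[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})]$ over $\mathbf{e}$. Under this identification, the IP problem is effectively identical to the ML decoding problem in (4.7). As such, the integer-valued $\hat{\mathbf{b}}_0$ is provably linked to the ML solution in the form of (4.28).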
Remarks: Generally speaking, the entries of $\hat{\mathbf{b}}_0$ can be integers in $\{0, 1\}$ or fractional numbers in the open interval $(0, 1)$. When $\hat{\mathbf{b}}_0$ is fractional, there is no evidence that the ML solution can be extracted from it. Therefore, the constrained Bethe free energy minimization problem at zero temperature is only an approximation to the ML decoding problem. In Section 4.4, we will specify some circumstances under which the LP problem tends to have an integer-valued global optimal solution.
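The reduction used in the proof of Proposition 4.4 can be checked by brute force on a toy instance. The sketch below is illustrative only: the tables `lam_alpha` and `lam_beta` are hypothetical positive random values standing in for $\Lambda_\alpha$ and $\Lambda_\beta$, and all function names are assumptions rather than part of the source. It enumerates the one-hot pairs $(\boldsymbol{\alpha}, \boldsymbol{\beta})$ of the form (4.30), keeps those with $\boldsymbol{\alpha} = \boldsymbol{\beta}$ as required by (4.32) and (4.33), and compares the minimizer of (4.31) with the maximizer of $\ln[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})]$.

```python
import itertools
import math
import random

def toy_factors(n_bits, seed=0):
    """Hypothetical positive tables standing in for Lambda_alpha and Lambda_beta."""
    rng = random.Random(seed)
    seqs = list(itertools.product((0, 1), repeat=n_bits))
    lam_alpha = {e: rng.uniform(0.1, 1.0) for e in seqs}
    lam_beta = {e: rng.uniform(0.1, 1.0) for e in seqs}
    return seqs, lam_alpha, lam_beta

def ml_solution(seqs, lam_alpha, lam_beta):
    """ML solution: argmax over e of ln[Lambda_alpha(e) * Lambda_beta(e)]."""
    return max(seqs, key=lambda e: math.log(lam_alpha[e] * lam_beta[e]))

def ip_solution(seqs, lam_alpha, lam_beta):
    """Minimize (4.31) over one-hot pairs (alpha, beta) with alpha == beta."""
    best, best_energy = None, math.inf
    for alpha in seqs:
        for beta in seqs:
            if alpha != beta:
                # (4.32) and (4.33) would disagree on at least one bit marginal.
                continue
            energy = -math.log(lam_alpha[alpha]) - math.log(lam_beta[beta])
            if energy < best_energy:
                best, best_energy = alpha, energy
    return best

seqs, lam_alpha, lam_beta = toy_factors(n_bits=3)
e_ml = ml_solution(seqs, lam_alpha, lam_beta)
e_ip = ip_solution(seqs, lam_alpha, lam_beta)
print(e_ml == e_ip)
```

The check only covers the integer-valued case; it says nothing about fractional LP solutions, which is exactly the situation the remarks above warn about.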
Connection at T > 0
In the following, we proceed to show the connection attained at a positive temperature.
Proposition 4.5. Suppose $\hat{\mathbf{b}}_0$ is the unique element in $\Omega(0)$ and it happens to be an integer-valued vector, i.e., $\hat{\mathbf{b}}_0 \in \{0, 1\}^{2^{N_e+1}+2N_e}$. Then, there must exist a positive temperature threshold $T_{\mathrm{thr}}$ such that $\Delta T \in (0, T_{\mathrm{thr}})$ implies the existence of an element in $\Omega(\Delta T)$ that reflects the ML solution in the form of
\[
\forall i \in \{1, 2, \ldots, N_e\} \quad
\ln\left[\frac{\hat{b}_{i,\Delta T}(e_i = 1)}{\hat{b}_{i,\Delta T}(e_i = 0)}\right]
\begin{cases}
> 0, & \text{if } \hat{e}_i = 1 \\
< 0, & \text{if } \hat{e}_i = 0.
\end{cases}
\tag{4.34}
\]
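Proof. When $\hat{\mathbf{b}}_0 \in \Omega(0)$ is an integer-valued vector, there exists a deterministic relation between $\hat{\mathbf{b}}_0$ and the ML solution $\hat{\mathbf{e}}$ (cf. Proposition 4.4). In particular, by computing the log-probability ratio of the bit $e_i$ with respect to the pmf $\hat{b}_{i,0}(e_i)$, the relation given as
\[
\hat{b}_{i,0}(e_i) =
\begin{cases}
1, & \text{if } e_i = \hat{e}_i \\
0, & \text{if } e_i \neq \hat{e}_i
\end{cases}
\tag{4.35}
\]
can be alternatively expressed as
\[
\ln\left[\frac{\hat{b}_{i,0}(e_i = 1)}{\hat{b}_{i,0}(e_i = 0)}\right] =
\begin{cases}
\infty, & \text{if } \hat{e}_i = 1 \\
-\infty, & \text{if } \hat{e}_i = 0.
\end{cases}
\tag{4.36}
\]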
From the above, the ML solution $\hat{\mathbf{e}}$ can be found by detecting the signs of the log-probability ratios. In fact, the ML solution can be obtained from any feasible solution $\mathbf{b}$ that attains the same sign information, i.e.,
\[
\forall i \in \{1, 2, \ldots, N_e\} \quad
\operatorname{sgn}\left(\ln\left[\frac{\hat{b}_{i,0}(e_i = 1)}{\hat{b}_{i,0}(e_i = 0)}\right]\right)
= \operatorname{sgn}\left(\ln\left[\frac{b_i(e_i = 1)}{b_i(e_i = 0)}\right]\right).
\tag{4.37}
\]
Based on this understanding, we draw a hyper-sphere centered at $\hat{\mathbf{b}}_0$. Its radius $\varepsilon$ is chosen to ensure that any feasible solution within the hyper-sphere, i.e., $\|\mathbf{b} - \hat{\mathbf{b}}_0\|_2 \leq \varepsilon$, satisfies (4.37). Under the assumption that $\hat{\mathbf{b}}_0$ is the unique element in $\Omega(0)$, Proposition 4.3 allows us to use the constructed $\varepsilon$ to find a temperature threshold $T_{\mathrm{thr}} > 0$ such that $\Delta T \in (0, T_{\mathrm{thr}})$ implies that an element in $\Omega(\Delta T)$ must reside in the hyper-sphere and thus reflects the ML solution in the form of (4.34).
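The sign-based read-out in (4.34) and (4.37) can be illustrated with a short sketch. Everything here is hypothetical: `e_hat` is an assumed ML solution, and `perturb_integer_beliefs` merely mimics a minimizer at a small positive temperature by moving each one-hot bit pmf a distance `delta` toward the other bit value. It is not the actual minimizer of the constrained Bethe free energy.

```python
import math

def decode_by_sign(bit_beliefs):
    """Hard-decide each bit from the sign of ln[b_i(1) / b_i(0)], as in (4.34).

    bit_beliefs: list of (b_i(0), b_i(1)) pairs with strictly positive entries.
    A ratio of exactly zero is mapped to 0 here, which is an arbitrary choice.
    """
    decisions = []
    for b0, b1 in bit_beliefs:
        llr = math.log(b1 / b0)
        decisions.append(1 if llr > 0 else 0)
    return decisions

def perturb_integer_beliefs(e_hat, delta):
    """Hypothetical stand-in for beliefs near the integer solution of (4.35)."""
    return [(delta, 1.0 - delta) if bit == 1 else (1.0 - delta, delta)
            for bit in e_hat]

e_hat = [1, 0, 0, 1]  # assumed ML solution, for illustration only
beliefs = perturb_integer_beliefs(e_hat, delta=0.2)
print(decode_by_sign(beliefs))
```

As a side observation that is not taken from the source: if every bit pmf sums to one, moving $b_i$ by $\delta$ changes both of its entries by $\delta$ and so contributes $2\delta^2$ to $\|\mathbf{b} - \hat{\mathbf{b}}_0\|_2^2$. Any radius $\varepsilon < 1/\sqrt{2}$ therefore keeps $\delta < 1/2$ and preserves the signs in (4.37), which is consistent with the remark below that $\varepsilon$ can be relatively large.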
Remarks: First, in order to use Proposition 4.3 to support the above proof, it is important that $\hat{\mathbf{b}}_0$ is the unique element in $\Omega(0)$. If $\Omega(0)$ contains not only the integer element but also fractional elements, the unfavorable case may occur in which the elements common to $\Omega(0)$ and $\Omega(\Delta T)$ are the fractional elements rather than the desired integer one. Second, the temperature threshold $T_{\mathrm{thr}}$ depends on the radius $\varepsilon$. More specifically, $T_{\mathrm{thr}}$ increases with $\varepsilon$. As only the sign information of the log-probability ratios needs to be preserved, see (4.37), the radius $\varepsilon$ can be relatively large, and so can its associated temperature threshold $T_{\mathrm{thr}}$. When $T_{\mathrm{thr}}$ is larger than one, we are able to find the ML solution by minimizing the constrained Bethe free energy at unit temperature. Third, the assumption in Proposition 4.5 forms a sufficient condition for being able to find the ML solution by minimizing the constrained Bethe free energy at a positive temperature. It remains possible that $\hat{\mathbf{b}}_0$ is fractional and/or has no relation to the ML solution, while $\hat{\mathbf{b}}_{\Delta T} \in \Omega(\Delta T)$ attained at a positive temperature $\Delta T > 0$ is linked to the ML solution in the form of (4.34).
Complementary View of the Constrained Bethe Free Energy Minimization
The whole concept of Bethe free energy originates from physics. In the following, we provide a complementary view of it in the context of ML decoding.
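Under the normalization constraints in (4.13) and also the following constraints
\[
b_\alpha(\mathbf{e}) = b_\beta(\mathbf{e}) = \prod_{i=1}^{N_e} b_i(e_i) \quad \forall \mathbf{e} \in \{0, 1\}^{N_e} \tag{4.38}
\]
the Bethe average energy is lower bounded by
\[
\begin{aligned}
U_{\mathrm{B}}(\mathbf{b}) &= -\sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln \Lambda_\alpha(\mathbf{e}) - \sum_{\mathbf{e}} b_\beta(\mathbf{e}) \ln \Lambda_\beta(\mathbf{e}) \\
&\overset{(a)}{=} -\sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln \Lambda_\alpha(\mathbf{e}) - \sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln \Lambda_\beta(\mathbf{e}) \\
&= -\sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln\left[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})\right] \\
&\geq -\left\{\max_{\mathbf{e}} \ln\left[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})\right]\right\} \cdot \left[\sum_{\mathbf{e}} b_\alpha(\mathbf{e})\right] \\
&\overset{(b)}{=} -\left\{\max_{\mathbf{e}} \ln\left[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})\right]\right\}
\end{aligned}
\tag{4.39}
\]
where the equality at (a) is because of the constraint (4.38) and the equality at (b) is due to (4.13). By the definition of the ML solution, i.e., $\hat{\mathbf{e}} = \arg\max_{\mathbf{e}} \ln[\Lambda_\alpha(\mathbf{e})\Lambda_\beta(\mathbf{e})]$, the above lower bound is attained at
\[
b_\alpha(\mathbf{e}) = b_\beta(\mathbf{e}) = \prod_{i=1}^{N_e} b_i(e_i) =
\begin{cases}
1, & \text{if } \mathbf{e} = \hat{\mathbf{e}} \\
0, & \text{otherwise}
\end{cases}
\tag{4.40}
\]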
which has a deterministic relation to the ML solution $\hat{\mathbf{e}}$. Based on this identification, an equivalent ML decoding problem can be constructed as follows
\[
\min_{\mathbf{b} \in \mathbb{R}_+^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) \quad \text{subject to (4.13) and (4.38).} \tag{4.41}
\]
Note that the number of constraints given in (4.38) grows exponentially with the bit sequence length $N_e$. For cases of interest, the equivalent ML decoding problem in (4.41) is far too complex. Therefore, approximations are necessary.
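The lower bound in (4.39) can be checked numerically on a small instance. The following is a minimal sketch under stated assumptions: `lam_alpha` and `lam_beta` are hypothetical positive random tables standing in for $\Lambda_\alpha$ and $\Lambda_\beta$, and the beliefs are built directly in the product form (4.38) from randomly drawn bit marginals.

```python
import itertools
import math
import random

def product_belief(bit_probs, e):
    """Product-form pmf of (4.38): prod_i b_i(e_i), with b_i(1) = bit_probs[i]."""
    return math.prod(p if bit else 1.0 - p for p, bit in zip(bit_probs, e))

def bethe_average_energy(b_alpha, b_beta, lam_alpha, lam_beta):
    """U_B(b) = -sum_e b_alpha(e) ln Lam_alpha(e) - sum_e b_beta(e) ln Lam_beta(e)."""
    return -sum(b_alpha[e] * math.log(lam_alpha[e])
                + b_beta[e] * math.log(lam_beta[e]) for e in b_alpha)

rng = random.Random(1)
n_bits = 3
seqs = list(itertools.product((0, 1), repeat=n_bits))
lam_alpha = {e: rng.uniform(0.1, 1.0) for e in seqs}  # hypothetical factor table
lam_beta = {e: rng.uniform(0.1, 1.0) for e in seqs}   # hypothetical factor table

# Right-hand side of (4.39).
bound = -max(math.log(lam_alpha[e] * lam_beta[e]) for e in seqs)

# Random product-form beliefs; b_alpha = b_beta by (4.38).
bit_probs = [rng.random() for _ in range(n_bits)]
b = {e: product_belief(bit_probs, e) for e in seqs}
energy = bethe_average_energy(b, b, lam_alpha, lam_beta)
print(energy >= bound)
```

Replacing `b` by the one-hot pmf at the maximizer of `lam_alpha[e] * lam_beta[e]` makes the energy coincide with `bound` up to rounding, which is the statement of (4.40).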
One approach is constraint relaxation. Any $\mathbf{b}$ that satisfies the constraints given in (4.38) also satisfies the marginalization consistency constraints in (4.14). Since the converse does not hold, the marginalization consistency constraints in (4.14) can be interpreted as a relaxation of the constraints in (4.38). Replacing the constraints in (4.38) by those in (4.14), the equivalent ML decoding problem in (4.41) becomes
\[
\min_{\mathbf{b} \in \mathbb{R}_+^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) \quad \text{subject to (4.13) and (4.14)} \tag{4.42}
\]
which is formally identical to the constrained Bethe free energy minimization problem at zero temperature. This identification implies that the constrained Bethe free energy minimization problem at zero temperature can be viewed as an outcome of approximating the ML decoding problem by means of constraint relaxation.
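That the converse fails can be seen from a two-bit example. The sketch below uses an illustrative pmf, chosen here and not taken from the source, that puts all mass on the sequences 00 and 11. Its bit marginals are computed as in (4.14), so marginalization consistency holds by construction, yet the product form (4.38) is violated.

```python
import itertools
import math

n_bits = 2
seqs = list(itertools.product((0, 1), repeat=n_bits))

# Correlated pmf: mass split evenly between 00 and 11 (illustrative choice).
b_alpha = {e: 0.5 if e in ((0, 0), (1, 1)) else 0.0 for e in seqs}

def bit_marginals(b_seq, n_bits):
    """b_i(e_i = 1) = sum over e with e_i = 1 of b(e), as in (4.14)."""
    return [sum(p for e, p in b_seq.items() if e[i] == 1) for i in range(n_bits)]

def satisfies_product_form(b_seq, marg, tol=1e-12):
    """Check (4.38): b(e) equals prod_i b_i(e_i) for every bit sequence e."""
    for e, p in b_seq.items():
        prod = math.prod(m if bit else 1.0 - m for m, bit in zip(marg, e))
        if abs(p - prod) > tol:
            return False
    return True

marg = bit_marginals(b_alpha, n_bits)  # consistent with (4.14) by construction
print(marg)
print(satisfies_product_form(b_alpha, marg))
```

Taking $b_\beta = b_\alpha$ gives a point that is feasible for (4.42) but infeasible for (4.41), so the feasible set of the relaxed problem is strictly larger.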
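On the basis of such constraint relaxation, we can improve the approximation by adding to the problem in (4.42) a penalty term for the violation of (4.38). The term $-H_{\mathrm{B},1}(\mathbf{b})$ defined as
\[
-H_{\mathrm{B},1}(\mathbf{b}) \triangleq
\underbrace{\sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln\left[\frac{b_\alpha(\mathbf{e})}{\prod_{i=1}^{N_e} b_i(e_i)}\right]}_{A}
+ \underbrace{\sum_{\mathbf{e}} b_\beta(\mathbf{e}) \ln\left[\frac{b_\beta(\mathbf{e})}{\prod_{i=1}^{N_e} b_i(e_i)}\right]}_{B}
\tag{4.43}
\]
can be used as a measure of the violation of the constraints (4.38). Noting that the terms $A$ and $B$ are the Kullback-Leibler divergences between the pmf $b_\alpha(\mathbf{e})$ and $\prod_{i=1}^{N_e} b_i(e_i)$, and between the pmf $b_\beta(\mathbf{e})$ and $\prod_{i=1}^{N_e} b_i(e_i)$, respectively, the minimum of $-H_{\mathrm{B},1}(\mathbf{b})$ subject to (4.13) and (4.14) equals zero and is achieved when the constraints (4.38) are not violated. Using the temperature $T$ as the penalty coefficient, we obtain an improved approximation to the equivalent ML decoding problem (4.41), i.e.,
\[
\min_{\mathbf{b} \in \mathbb{R}_+^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) - T H_{\mathrm{B},1}(\mathbf{b}) \quad \text{subject to (4.13) and (4.14).} \tag{4.44}
\]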
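The behavior of the penalty term in (4.43) can be illustrated on two bits. This is a minimal sketch with hypothetical helper names. It evaluates $-H_{\mathrm{B},1}(\mathbf{b})$ as the sum of two Kullback-Leibler divergences, once for beliefs in the product form (4.38) and once for a correlated pmf that has the same bit marginals.

```python
import itertools
import math

def neg_h_b1(b_alpha, b_beta, marg):
    """-H_{B,1}(b) of (4.43): KL(b_alpha || prod_i b_i) + KL(b_beta || prod_i b_i).

    marg[i] is b_i(e_i = 1). Terms with b(e) = 0 contribute 0 by convention.
    """
    def kl_to_product(b_seq):
        total = 0.0
        for e, p in b_seq.items():
            if p > 0.0:
                prod = math.prod(m if bit else 1.0 - m for m, bit in zip(marg, e))
                total += p * math.log(p / prod)
        return total
    return kl_to_product(b_alpha) + kl_to_product(b_beta)

seqs = list(itertools.product((0, 1), repeat=2))
marg = [0.5, 0.5]  # shared bit marginals b_i(e_i = 1)

# Product form: satisfies (4.38).
independent = {e: 0.25 for e in seqs}
# Same marginals, but all mass on 00 and 11: violates (4.38).
correlated = {e: 0.5 if e[0] == e[1] else 0.0 for e in seqs}

print(neg_h_b1(independent, independent, marg))
print(neg_h_b1(correlated, correlated, marg))
```

For the correlated pmf each divergence equals $\ln 2$, so the penalty is $2\ln 2$, whereas it vanishes for the product-form beliefs. In (4.44) this penalty is scaled by $T$.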
It is worth noting that the global optimal solution of the problem in (4.44) may be located at a vertex of the unit hypercube $[0, 1]^{2^{N_e+1}+2N_e}$. For a vertex to be a global optimal solution, being a stationary point is not a necessary condition. Hence, algorithms targeting stationary points are not applicable for locating such a global optimal solution. Furthermore, due to the large number of vertices, we lack efficient algorithms to search for the global optimal solution among them. To avoid the expensive search, we can add the following term
\[
H_{\mathrm{B},2}(\mathbf{b}) \triangleq \sum_{i=1}^{N_e} \sum_{e_i} b_i(e_i) \ln b_i(e_i) \tag{4.45}
\]
to the objective function, which yields the problem in (4.46), i.e.,
\[
\min_{\mathbf{b} \in \mathbb{R}_+^{2^{N_e+1}+2N_e}} U_{\mathrm{B}}(\mathbf{b}) - T H_{\mathrm{B},1}(\mathbf{b}) + T' H_{\mathrm{B},2}(\mathbf{b}) \quad \text{subject to (4.13) and (4.14)} \tag{4.46}
\]
where $T'$ is a weighting factor. Because the term $H_{\mathrm{B},2}(\mathbf{b})$ subject to (4.13) and (4.14) is the negative of the entropy of the whole bit sequence $\mathbf{e}$, any vertex maximizes it. As a consequence, a sufficiently large $T'$ can keep the global minimizer of the problem in (4.46) away from any vertex. For instance, with $T'$ equal to $T$, the probability of any global minimizer being located at a vertex becomes zero. This is based on Proposition 4.1 after noting the equivalence between $-T H_{\mathrm{B}}(\mathbf{b})$ and $-T H_{\mathrm{B},1}(\mathbf{b}) + T H_{\mathrm{B},2}(\mathbf{b})$ subject to (4.13) and (4.14), see (4.24), and also the equivalence between the constrained Bethe free energy minimization problem and the optimization problem (4.46) with $T' = T$.
Concluding from the above, the constrained Bethe free energy minimization problem at $T > 0$ is an outcome of approximating the equivalent ML decoding problem by means of a penalty method, which also enables efficient ways to solve the problem. In particular, the temperature $T$ has two roles: 1) to weight the penalty term and 2) to keep the global minimizer away from a vertex. Due to its second role, a larger value of $T$ is not always better. According to the remarks on Proposition 4.1, the global minimizer of the constrained Bethe free energy approaches the interior point given in (4.26) as $T \to \infty$. Since the pmf $b_i(e_i)$ then assigns equal probability to the bit values 0 and 1, we become unable to extract the ML solution by examining the signs of the log-probability ratios of such a pmf.
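The equivalence between the Bethe entropy term and the two penalty terms used above can be verified in a few lines. The sketch below rests on an assumption: the Bethe entropy referenced in (4.24) is not reproduced in this section, so it is taken here in the standard form for a graph in which every bit $e_i$ belongs to exactly the two factors $\alpha$ and $\beta$, namely $H_{\mathrm{B}}(\mathbf{b}) = -\sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln b_\alpha(\mathbf{e}) - \sum_{\mathbf{e}} b_\beta(\mathbf{e}) \ln b_\beta(\mathbf{e}) + \sum_{i}\sum_{e_i} b_i(e_i) \ln b_i(e_i)$.

```latex
% Step 1: under the marginalization constraints (4.14), for gamma in {alpha, beta},
\begin{align*}
\sum_{\mathbf{e}} b_\gamma(\mathbf{e}) \ln \prod_{i=1}^{N_e} b_i(e_i)
  &= \sum_{i=1}^{N_e} \sum_{e_i}
     \Big[ \sum_{\mathbf{e}' : e'_i = e_i} b_\gamma(\mathbf{e}') \Big] \ln b_i(e_i)
   = \sum_{i=1}^{N_e} \sum_{e_i} b_i(e_i) \ln b_i(e_i)
   = H_{\mathrm{B},2}(\mathbf{b}).
\end{align*}
% Step 2: insert this into the definition (4.43),
\begin{align*}
-H_{\mathrm{B},1}(\mathbf{b})
  &= \sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln b_\alpha(\mathbf{e})
   + \sum_{\mathbf{e}} b_\beta(\mathbf{e}) \ln b_\beta(\mathbf{e})
   - 2 H_{\mathrm{B},2}(\mathbf{b}).
\end{align*}
% Step 3: add H_{B,2} and compare with the assumed form of H_B,
\begin{align*}
-H_{\mathrm{B},1}(\mathbf{b}) + H_{\mathrm{B},2}(\mathbf{b})
  &= \sum_{\mathbf{e}} b_\alpha(\mathbf{e}) \ln b_\alpha(\mathbf{e})
   + \sum_{\mathbf{e}} b_\beta(\mathbf{e}) \ln b_\beta(\mathbf{e})
   - H_{\mathrm{B},2}(\mathbf{b})
   = -H_{\mathrm{B}}(\mathbf{b}).
\end{align*}
```

Multiplying the last line by $T$ gives $-T H_{\mathrm{B},1}(\mathbf{b}) + T H_{\mathrm{B},2}(\mathbf{b}) = -T H_{\mathrm{B}}(\mathbf{b})$, so the objective of (4.46) with $T' = T$ reads $U_{\mathrm{B}}(\mathbf{b}) - T H_{\mathrm{B}}(\mathbf{b})$, again under the assumed form of $H_{\mathrm{B}}$.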