Solution to a Function Equation and Divergence Measures

doi:10.1155/2011/617564

Research Article

Chuan-Lei Dong and Jin Liang

Department of Mathematics, Shanghai Jiao Tong University, Shanghai 200240, China

Correspondence should be addressed to Jin Liang, [email protected]

Received 1 January 2011; Accepted 11 February 2011

Academic Editor: Toka Diagana

Copyright © 2011 C.-L. Dong and J. Liang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We investigate the solution to the following function equation $f_1(x)g_1(y) + \cdots + f_n(x)g_n(y) = G(x+y)$, which arises from the theory of divergence measures. Moreover, new results on divergence measures are given.

1. Introduction

As early as 1952, Chernoff [1] used the $\alpha$-divergence to evaluate classification errors. Since then, the study of various divergence measures has attracted many researchers. It is now known that the Csiszár $f$-divergence is a unique class of divergences having information monotonicity, from which the dual $\alpha$ geometrical structure with the Fisher metric is derived, and that the Bregman divergence is another class of divergences, which gives a dually flat geometrical structure different from the $\alpha$-structure in general. Divergence measures between two probability distributions or positive measures have proved to be useful tools for solving problems in optimization, signal processing, machine learning, and statistical inference. For more information on the theory of divergence measures, see, for example, [2–5] and references therein.

Motivated by these studies, we investigate in this paper the solution to the following function equation

\[ f_1(x)g_1(y) + \cdots + f_n(x)g_n(y) = G(x+y), \tag{1.1} \]

which arises from the discussion of the theory of divergence measures, and show that for $n > 1$, if $f_i : (a,b) \to \mathbb{R}$, $g_i : (a,b) \to \mathbb{R}$, $i = 1, 2, \dots, n$, and $G : (2a,2b) \to \mathbb{R}$ satisfy

\[ \sum_{i=1}^{n} f_i(x) g_i(y) = G(x+y), \tag{1.2} \]

then $G$ is the solution of a linear homogeneous differential equation with constant coefficients. Moreover, new results on divergence measures are given.

Throughout this paper, we let $\mathbb{R}$ be the set of real numbers and $F \subset \mathbb{R}^n$ be a convex set.

Basic notations: $\mathbb{R}^n_+ := \{x \in \mathbb{R}^n : x_i > 0,\ i = 1, 2, \dots, n\}$; $\varphi : F \to \mathbb{R}$ is strictly convex and twice differentiable; $\pi : \mathbb{R}^n_+ \to F$ is a differentiable injective map; $D_\varphi^\pi$ is the general vector Bregman divergence; $f : (0,\infty) \to [0,\infty)$ is a strictly convex twice-continuously differentiable function satisfying $f(1) = 0$, $f'(1) = 0$; $D_f$ is the vector $f$-divergence.

If for every $p, q \in \mathbb{R}^n_+$,

\[ D_\varphi^\pi(p:q) = D_f(p:q), \tag{1.3} \]

then we say that $D_\varphi^\pi$ or $D_f$ is in the intersection of $f$-divergence and general Bregman divergence.

For more information on some basic concepts of divergence measures, we refer the reader to, for example, [2–5] and references therein.
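As a concrete illustration of (1.3), the following Python sketch checks numerically that one familiar pair lies in this intersection. The choices are assumptions made for the example and are not taken from the paper: the generator $f(t) = t - 1 - \log t$, the potential $\varphi(x) = \sum_i (x_i \log x_i - x_i)$, $\pi$ equal to the identity map, and the convention $D_f(p:q) = \sum_i p_i f(q_i/p_i)$. Under these assumptions both sides reduce to the generalized Kullback-Leibler divergence $\sum_i (p_i \log(p_i/q_i) - p_i + q_i)$.

```python
import math

def f(t):
    """Generator f(t) = t - 1 - log t: strictly convex, f(1) = 0, f'(1) = 0."""
    return t - 1.0 - math.log(t)

def f_divergence(p, q):
    """Vector f-divergence, assuming D_f(p:q) = sum_i p_i f(q_i / p_i)."""
    return sum(p_i * f(q_i / p_i) for p_i, q_i in zip(p, q))

def phi(x):
    """Assumed potential phi(x) = sum_i (x_i log x_i - x_i) on positive vectors."""
    return sum(x_i * math.log(x_i) - x_i for x_i in x)

def grad_phi(x):
    """Gradient of phi: d phi / d x_i = log x_i."""
    return [math.log(x_i) for x_i in x]

def bregman_divergence(p, q):
    """Bregman divergence of phi, with pi taken to be the identity map."""
    inner = sum(g * (p_i - q_i) for g, p_i, q_i in zip(grad_phi(q), p, q))
    return phi(p) - phi(q) - inner

# Two arbitrary positive vectors (hypothetical test data).
p = [0.2, 1.5, 3.0]
q = [0.7, 0.4, 2.5]

d_f = f_divergence(p, q)
d_phi = bregman_divergence(p, q)
print(d_f, d_phi)
```

The two printed values should agree up to rounding error, which is what (1.3) requires for this particular pair.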

2. Main Results

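Theorem 2.1. Assume that there are differentiable functions

\[ f_i : (a,b) \to \mathbb{R}, \quad g_i : (a,b) \to \mathbb{R}, \quad i = 1, 2, \dots, n, \tag{2.1} \]

and $G : (2a,2b) \to \mathbb{R}$ such that

\[ \sum_{i=1}^{n} f_i(x) g_i(y) = G(x+y), \quad \text{for every } x, y \in (a,b). \tag{2.2} \]

Then $G \in C^\infty(2a,2b)$ and

\[ a_n G^{(n)} + a_{n-1} G^{(n-1)} + \cdots + a_1 G' + a_0 G = 0, \tag{2.3} \]

for some $a_n, a_{n-1}, \dots, a_0 \in \mathbb{R}$.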
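To make the statement concrete, here is a small numerical sketch in Python for $n = 2$. The example is hypothetical and not taken from the paper: $f_1(x) = x e^x$, $g_1(y) = e^y$, $f_2(x) = e^x$, $g_2(y) = y e^y$, and $G(t) = t e^t$. It checks (2.2) on a few sample points and then checks that $G'' - 2G' + G = 0$, which is an instance of (2.3) with $a_2 = 1$, $a_1 = -2$, $a_0 = 1$.

```python
import math

# Hypothetical separated functions for n = 2 (an illustration only).
def f1(x):
    return x * math.exp(x)

def g1(y):
    return math.exp(y)

def f2(x):
    return math.exp(x)

def g2(y):
    return y * math.exp(y)

def G(t):
    """G(t) = t e^t."""
    return t * math.exp(t)

def dG(t):
    """First derivative of G, computed by hand: (t + 1) e^t."""
    return (t + 1.0) * math.exp(t)

def d2G(t):
    """Second derivative of G, computed by hand: (t + 2) e^t."""
    return (t + 2.0) * math.exp(t)

def separated_sum(x, y):
    """Left-hand side of (2.2): f1(x) g1(y) + f2(x) g2(y)."""
    return f1(x) * g1(y) + f2(x) * g2(y)

def ode_residual(t):
    """Residual of G'' - 2 G' + G at the point t."""
    return d2G(t) - 2.0 * dG(t) + G(t)

points = [-1.0, -0.3, 0.0, 0.4, 1.2]
max_eq_error = max(abs(separated_sum(x, y) - G(x + y))
                   for x in points for y in points)
max_ode_error = max(abs(ode_residual(t)) for t in points)
print(max_eq_error, max_ode_error)
```

Both printed quantities should be zero up to floating-point rounding.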

Proof. Since $f_i$, $g_i$ are differentiable functions, it is clear that

\[ f_i, g_i \in L^2(a,b), \quad i = 1, 2, \dots, n. \tag{2.4} \]

Let

\[ M = \operatorname{span}\{f_1, f_2, \dots, f_n\}. \tag{2.5} \]

Then $M$ is a finite-dimensional space. So we can find differentiable functions

\[ s_1, s_2, \dots, s_m \tag{2.6} \]

as an orthonormal basis of $M$, where $m \le n$. Observing that

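\[ \sum_{i=1}^{n} f_i(x) g_i(y) = \sum_{i=1}^{n} \Bigl[ \sum_{j=1}^{m} a_{ij} s_j(x) \Bigr] g_i(y) = \sum_{j=1}^{m} s_j(x) \sum_{i=1}^{n} a_{ij} g_i(y) = \sum_{j=1}^{m} s_j(x) t_j(y), \tag{2.7} \]

where

\[ a_{ij} \in \mathbb{R}, \quad t_j(y) = \sum_{i=1}^{n} a_{ij} g_i(y), \quad i = 1, 2, \dots, n, \quad j = 1, 2, \dots, m, \tag{2.8} \]

we have

\[ G(x+y) = \sum_{i=1}^{n} f_i(x) g_i(y) = \sum_{j=1}^{m} s_j(x) t_j(y), \quad \text{for every } x, y \in (a,b). \tag{2.9} \]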

Clearly,

\[ t_j \in L^2(a,b), \quad j = 1, \dots, m. \tag{2.10} \]

Next we prove that

\[ t_j \in M, \quad j = 1, \dots, m. \tag{2.11} \]

It is easy to see that we only need to prove the following fact:

\[ \operatorname{span}\{s_1, s_2, \dots, s_m, t_1, t_2, \dots, t_m\} = M. \tag{2.12} \]

Actually, if this is not true, that is,

\[ \operatorname{span}\{s_1, s_2, \dots, s_m, t_1, t_2, \dots, t_m\} \neq M, \tag{2.13} \]

then there exists $t \neq 0$ such that

Therefore,

Since $C$ is a symmetric matrix, we have

for an orthogonal matrix $Q$ and a diagonal matrix

Without loss of generality, we can assume that

\[ \lambda_1, \lambda_2, \dots, \lambda_m \neq 0. \tag{2.33} \]

By arguments similar to those above, we can prove

\[ \operatorname{span}\{r_1, \dots, r_m, r_1', \dots, r_m'\} = \operatorname{span}\{r_1, \dots, r_m\}. \tag{2.35} \]

So there is a matrix $A$ satisfying

\[ R'(x) = R(x) A. \tag{2.36} \]

Thus,

\[ G'(x+y) = \frac{\partial G(x+y)}{\partial x} = R(x) A \Lambda R(y)^T. \tag{2.37} \]

By mathematical induction we obtain

\[ G^{(i)}(x+y) = R(x) A^i \Lambda R(y)^T, \quad \forall i = 0, 1, \dots \tag{2.38} \]

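So $G \in C^\infty(2a,2b)$. Let

\[ b_0 + b_1 \lambda + \cdots + b_m \lambda^m \tag{2.39} \]

be the annihilating polynomial of $A$. Then

\[ b_0 G(x+y) + b_1 G'(x+y) + \cdots + b_m G^{(m)}(x+y) = \sum_{i=0}^{m} b_i R(x) A^i \Lambda R(y)^T = R(x) \Bigl( \sum_{i=0}^{m} b_i A^i \Bigr) \Lambda R(y)^T = 0. \tag{2.40} \]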

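Since $n \geq m$, differentiating (2.40) $n - m$ times shows that we can find $a_n, a_{n-1}, \dots, a_0 \in \mathbb{R}$ such that

\[ a_n G^{(n)} + a_{n-1} G^{(n-1)} + \cdots + a_1 G' + a_0 G = 0. \tag{2.41} \]

The proof is then complete.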
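The last step of the proof, passing from a matrix $A$ to a differential equation for $G$ through a polynomial that annihilates $A$, can be sketched numerically. The Python snippet below uses an illustrative choice that is an assumption and not taken from the paper: $R(x) = (\cos x, \sin x)$ and $\Lambda = \operatorname{diag}(1, -1)$, so that $R(x) \Lambda R(y)^T = \cos(x+y)$, together with the matrix $A$ for which $R'(x) = R(x)A$. It uses the characteristic polynomial of $A$, which annihilates $A$ by the Cayley-Hamilton theorem.

```python
import math

# Illustrative data (assumed for this sketch, not from the paper).
A = [[0.0, 1.0], [-1.0, 0.0]]      # chosen so that R'(x) = R(x) A
LAM = [[1.0, 0.0], [0.0, -1.0]]    # diagonal matrix Lambda

def R(x):
    """Row vector R(x) = (cos x, sin x)."""
    return [math.cos(x), math.sin(x)]

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_power(X, i):
    """i-th power of a 2x2 matrix (i >= 0)."""
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(i):
        P = matmul(P, X)
    return P

def bilinear(u, M, v):
    """Row vector u times matrix M times column vector v."""
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def G_derivative(i, x, y):
    """Right-hand side of (2.38): R(x) A^i Lambda R(y)^T."""
    return bilinear(R(x), matmul(mat_power(A, i), LAM), R(y))

# Exact derivatives of cos repeat with period 4: cos, -sin, -cos, sin.
exact = [math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t), math.sin]

x, y = 0.3, 1.1
errors = [abs(G_derivative(i, x, y) - exact[i % 4](x + y)) for i in range(6)]

# Characteristic polynomial of a 2x2 matrix: lambda^2 - tr(A) lambda + det(A),
# stored as coefficients b0, b1, b2 of b0 + b1 lambda + b2 lambda^2.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
b = [det, -trace, 1.0]

# Residual of b0 G + b1 G' + b2 G'' at the point x + y, as in (2.40).
residual = sum(b[i] * G_derivative(i, x, y) for i in range(3))
print(max(errors), b, residual)
```

For this choice the coefficients are $b_0 = 1$, $b_1 = 0$, $b_2 = 1$, which recovers the familiar equation $G'' + G = 0$ for $G = \cos$.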

Theorem 2.2. Let the $f$-divergence $D_f$ be in the intersection of $f$-divergence and general Bregman divergence. Then $G(x) = f''(e^x)$ satisfies

\[ \sum_{i=0}^{n} a_i G^{(i)} = 0, \tag{2.42} \]

for some $a_n, \dots, a_0 \in \mathbb{R}$.

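Proof. If $D_f$, $D_\varphi^\pi$ are in the intersection of $f$-divergence and general Bregman divergence, then we have

\[ n x f\Bigl(\frac{y}{x}\Bigr) = \varphi(\pi(X)) - \varphi(\pi(Y)) - \sum_{i=1}^{n} \frac{\partial \varphi(\pi(Y))}{\partial x_i} \bigl( \pi_i(X) - \pi_i(Y) \bigr), \tag{2.43} \]

where

\[ X = (x, x, \dots, x) \in \mathbb{R}^n_+, \quad Y = (y, y, \dots, y) \in \mathbb{R}^n_+. \tag{2.44} \]

Let

\[ \frac{\partial \varphi(\pi(Y))}{\partial x_i} = s_i(y), \quad \pi_i(X) = t_i(x). \tag{2.45} \]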

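Then

\[ \frac{\partial^2 \bigl[ n x f(y/x) \bigr]}{\partial x \partial y} = \frac{\partial^2 \bigl[ \varphi(\pi(X)) - \varphi(\pi(Y)) - \sum_{i=1}^{n} s_i(y) \bigl( t_i(x) - t_i(y) \bigr) \bigr]}{\partial x \partial y}. \tag{2.46} \]

Hence

\[ \frac{n y}{x^2} f''\Bigl(\frac{y}{x}\Bigr) = \sum_{i=1}^{n} s_i'(y) t_i'(x). \tag{2.47} \]

Let

\[ G(x) = f''(e^x), \quad f_i(x) = \frac{s_i'(e^x)}{n e^x}, \quad g_i(x) = t_i'(e^{-x}) e^{-2x}. \tag{2.48} \]

Then, replacing $y$ by $e^x$ and $x$ by $e^{-y}$ in (2.47), we obtain

\[ G(x+y) = \sum_{i=1}^{n} f_i(x) g_i(y). \tag{2.49} \]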

Thus, a modification of Theorem 2.1 implies the conclusion.
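The reduction used in this proof can be checked on a simple case. The Python sketch below is an illustration under stated assumptions and not part of the paper: it takes $f(t) = t - 1 - \log t$, $\varphi(x) = \sum_i (x_i \log x_i - x_i)$, and $\pi$ the identity map, so that $s_i(y) = \log y$ and $t_i(x) = x$. With $G(x) = f''(e^x) = e^{-2x}$, and with the factor $1/n$ placed in $f_i$ as a normalization chosen for this sketch, it verifies that $G(x+y) = \sum_{i=1}^{n} f_i(x) g_i(y)$ and that $G' + 2G = 0$.

```python
import math

n = 3  # dimension used in this illustration; any positive integer works

def f_second(t):
    """Second derivative of the assumed generator f(t) = t - 1 - log t."""
    return 1.0 / t ** 2

def G(x):
    """G(x) = f''(e^x), which equals e^{-2x} for this generator."""
    return f_second(math.exp(x))

def ds(y):
    """s_i'(y) for s_i(y) = log y (gradient of phi along the diagonal)."""
    return 1.0 / y

def dt(x):
    """t_i'(x) for t_i(x) = x (pi is the identity map)."""
    return 1.0

def f_i(x):
    """f_i(x) = s_i'(e^x) / (n e^x); the 1/n is this sketch's normalization."""
    return ds(math.exp(x)) / (n * math.exp(x))

def g_i(x):
    """g_i(x) = t_i'(e^{-x}) e^{-2x}."""
    return dt(math.exp(-x)) * math.exp(-2.0 * x)

def separated_sum(x, y):
    """sum_{i=1}^n f_i(x) g_i(y); all n terms coincide in this example."""
    return sum(f_i(x) * g_i(y) for _ in range(n))

def ode_residual(x):
    """G'(x) + 2 G(x), using the hand-computed G'(x) = -2 e^{-2x}."""
    return -2.0 * math.exp(-2.0 * x) + 2.0 * G(x)

points = [-0.8, 0.0, 0.5, 1.3]
max_eq_error = max(abs(separated_sum(x, y) - G(x + y))
                   for x in points for y in points)
max_ode_error = max(abs(ode_residual(x)) for x in points)
print(max_eq_error, max_ode_error)
```

Here $x$ and $y$ range over the whole real line rather than a bounded interval, which is one reason a modification of Theorem 2.1, and not the theorem verbatim, is what applies.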

Moreover, it is not hard to deduce the following theorem.

Theorem 2.3. Let a vector $f$-divergence be in the intersection of vector $f$-divergence and general Bregman divergence, and let $\pi$ satisfy

\[ \pi(x) = (\pi_1(x_1), \dots, \pi_n(x_n)), \quad \forall x \in \mathbb{R}^n_+, \tag{2.50} \]

where $\pi_1, \dots, \pi_n$ are strictly monotone twice-continuously differentiable functions. Then the $f$-divergence is the $\alpha$-divergence or the vector $\alpha$-divergence times a positive constant $c$.
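As a sanity check on how the $\alpha$-divergence fits with the differential equations above, the following Python sketch uses one common parametrization of its generator for $\alpha \neq \pm 1$, namely $f_\alpha(t) = \frac{4}{1-\alpha^2}\bigl(\frac{1-\alpha}{2} + \frac{1+\alpha}{2} t - t^{(1+\alpha)/2}\bigr)$. This formula is an assumption here, since this section of the paper does not spell out its convention. Under it, $f_\alpha''(t) = t^{(\alpha-3)/2}$, so $x \mapsto f_\alpha''(e^x)$ is a pure exponential and solves a first-order linear homogeneous equation with constant coefficients.

```python
import math

def f_alpha(t, alpha):
    """Assumed alpha-divergence generator (alpha != +1, -1)."""
    c = 4.0 / (1.0 - alpha ** 2)
    return c * ((1.0 - alpha) / 2.0 + (1.0 + alpha) / 2.0 * t
                - t ** ((1.0 + alpha) / 2.0))

def f_alpha_second(t, alpha):
    """Closed form of the second derivative: t^((alpha - 3) / 2)."""
    return t ** ((alpha - 3.0) / 2.0)

def second_difference(func, t, h=1e-4):
    """Central finite-difference approximation of func''(t)."""
    return (func(t + h) - 2.0 * func(t) + func(t - h)) / h ** 2

def first_difference(func, x, h=1e-5):
    """Central finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def G(x, alpha):
    """G(x) = f_alpha''(e^x) = exp((alpha - 3) x / 2)."""
    return f_alpha_second(math.exp(x), alpha)

alpha = 0.5                # sample parameter value (hypothetical)
k = (alpha - 3.0) / 2.0    # G is expected to solve G' - k G = 0

# Compare the numerical second derivative of f_alpha with the closed form.
fd_error = abs(second_difference(lambda t: f_alpha(t, alpha), 1.7)
               - f_alpha_second(1.7, alpha))

# Residual of the first-order equation G' - k G = 0 at a sample point.
ode_residual = first_difference(lambda x: G(x, alpha), 0.4) - k * G(0.4, alpha)
print(fd_error, ode_residual)
```

Both printed values should be small, limited only by finite-difference error.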

Acknowledgments

References

[1] H. Chernoff, “A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations,” Annals of Mathematical Statistics, vol. 23, pp. 493–507, 1952.

[2] S. Amari, “Information geometry and its applications: convex function and dually flat manifold,” in Emerging Trends in Visual Computing, F. Nielsen, Ed., vol. 5416 of Lecture Notes in Computer Science, pp. 75–102, Springer, Berlin, Germany, 2009.

[3] L. M. Brègman, “The relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming,” Computational Mathematics and Mathematical Physics, vol. 7, pp. 200–217, 1967.

[4] A. Cichocki, R. Zdunek, A. H. Phan, and S. Amari, Non-Negative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation, Wiley, New York, NY, USA, 2009.