• No results found

A measure of mutual divergence among a number of probability distributions

N/A
N/A
Protected

Academic year: 2020

Share "A measure of mutual divergence among a number of probability distributions"

Copied!
11
0
0

Loading.... (view fulltext now)

Full text

(1)

Vol. i0, No.3 (1987) 597-608

597

A MEASURE

OF

MUTUAL

DIVERGENCE AMONG

A

NUMBER

OF

PROBABILITY DISTRIBUTIONS

J.N. KAPUR DepartmentofMathematics Indian Institute ofTechnology

New Delhi,India 110016

VINOD KUMARandUMA KUMAR Schoolof Business CarletonUniversity

Ottawa,

CanadaK1S 5B6

(Received

September 11,1985andinrevisedform September 22,

1986}

ABSTRACT.The principle of optimalityofdynamicprogramming isusedto provethree major inequalities duetoShannon,Renyi and Holder. The inequalitiesarethenused to obtain someusefulresults in informationtheory. Inparticularmeasures areobtained to measurethemutual divergence among twoormoreprobabilitydistributions.

KEYWORDSANDPHRASES. Divergence measures, dynamic programming,Shannon,

Renyi andHolder inequalities.

1980AMSSUBJECT CLASSIFICATIONCODE. 94A17, 94A15,90C50,90C39

1. INTRODUCTION

An

importantproblem inbusiness, economic andsocial sciences is concerned with measuring differences in specific characteristics ofbusiness, economic or social groups. For example,wemay like tocompare the balancesheets ofmbusinessfirms,oreconomic developmentof,nnationsorthe difference in opinions of m directors ofacompanyorthe population compositions of,ncitiesandso on.

Let

qo"representtheshareofthe jth item from the ith group.Thus

qq

mayrepresent proportionate contributionsof thejthitemin the balance sheet of the ith companyorit mayrepresent thepopulationof persons in the jth income groupofthe ith nationorthe proportionateallotmenttothe jthitemof the budget proposedby the ith directororthe proportionatepopulation ofthe jth social group in the ith city andso on. Let
(2)

where

E

qi=l, qiy>_O

Given

Qt, Q,"" ,Q,,,

ourproblemistofindhdren

z,

Q,..-,Q

e. We

cregd

Qz, Q:,...,

Qm mprobdfibugiomd thourproblem tofind

msmureof muufl

vsrgne

bn

t,

z,’",

.

mmureshould hav thfoHongpropri:

(i)

Ishoulddepsndon

Qz,Q,...,

(ii)

I

should b

congruous

feionof

qi’s

Off)

I

shodhogchgwhenqit, qiz,"’, qin pud ong

thslves,

pro-videdthseprmuion

(iv)

Ishould bs

Mways

0

(v)

Ishould vmh when

(vi)

Ishould vh only when

(vii)

Ishould poibly besconvex ncionof

Qt,Q,.-.,

For

o

distributions,

(m

2)

wehve number ofmemurofd divergence due

o

Kullb-Leibler, HavrdCh,nyi,

Kpur

d othe

[5].

However,

inreM fewe econcernedwith

m(

2)

dis&buio. Wecofcoupefind dddivergence between every

p

ofm d&ribu&io d then find me

r

of average of he directed divergenc.

However,

we sh prefer to hve unified memure depending decly on the m dribuions. We develop

su

memu the

presen

paper.

Sceoneimpor condition for thememure

expressed inqufligy, imporr role playedin thedevelopmenofour new mea-sureby the spifl inugi due Shnon,Renyi, Holder d their generMfions.

Wegive

ternate

proo

of th inquti

usg

dync

progrng

d then

usethee

inuti

todevelopournewmere.

2. DYNAMIC

PROGRAMMING AND

INEQUALIT1F

Kapur

[4]

used dynamic programming technique of Bellman

[2]

to show that the maximumvalueof

[i=tPi

In

Pi subject toPt

>

0,P2

>

0,0, ,Pn

>

0,

i=1

Pl cis -c

In(c/n)

and itariseswhen Pl P2 p

c/n.

In

particular, whenc 1,we findthat the maximum value of the Shannon’s

[7]

entropymeasure

[i__

pi

In

psubject

topl

>

0,p

>

0,...,p,

>

0, pi 1 is

In

nand arise when all the probabilitiesare

equal. Theresultisequivalent to showing that

(3)

DISTRIBUTIONS

Thesametechniquecanbe usedtoestablishinequalities of the form

f(p,,q,)

<_

Aor

f(p,,q,) >_

B

(2.2)

In

the first case,weshow that the mmcimumvalue of

,--x

f(P,

q)

isAand in the second case,weshow that theminimumvalue of

,__x

f(p, q)

is B.

Supposewehavetomaximize

,

y(p,

q)

subjectto

p

>_

0,p2

_

0,..-,pn

_>

0; q

_

0, q

>_

0,...,qn

_>

0,

p, a,q,

b

(2.3)

Letthe maximum value be

Wemaykeepp,p2,...,p, fixed’and vary only qx,q,...,q, subjectto

=

q b, then the principle of optimality gives

max

g(a,b)

0

<_

qx

<_

b

[f(P’q)

/

g,,_(a

px,b

q)]

(2.4)

Sincepl isfixed,wehave to maximizeafunction ofonevariable q only. Equation

(2.4)

give a recurrence relation which along with the knowledge of

f(a,b)

can enable us to obtain

g,(a,

b)

for all valuesof n,by proceedingstep by step.

Thesameprocedurecanbeused to find theminimumvalue.

599

3. SHANNON’S

INEQUALITY

Here

f(p,q,)

p,lr,

P--

(3.1)

q

This is a convex functionof q andweektodetermine theminimumvalueof

f

(p, q).

Weget

Using

(2.4)

Assuming

gx

(a, b)

aIn

a_

(3.2)

b

maxql

<

b

[P

Pl__

a-Pl]b-

ql

(3.3)

g,(a,b)

aln

,

{3.4)

(2.4)

gives

9n+l(a,

b)

O

<_

q

<_

b

-

b- qx alna

b Henceby mathematical inductionweget

(4)

or

p,

ln

>

aln

.= q. b

The minimumvalueis obtainedwhen

q q q b

Ifa b

I,

thenweget Shannon’s inequality

pi

In

p

>

0and pi

In

p

0iffpi qifor each

i=t qi i=x qi

Ifb< a,

In

P

>

alna

i=t qi

g

>0 Alternativeproofsaregiven in Aczel and

Daroczy

[I].

(3.10)

4. RENYI’S

INEQUALITY

Here

I

q-a

.f

(Pi,

qi)

a

1

(p

(,,)

.1

(._

)

1

f(Pi,

qi)

aconvexfction ofqidwe

t

todthe

nql

b

[

1 1

(b-

).

--1

(4.2)

By

using mathematicalinduction,wecanshow that

or-1 Piqi

E

Pi

>-i=1 i1

Ifa b 1,weget Renyi’s inequality

The inequality

(4.5)

willhold whenevera b,evenifthecommonvalueis notunity. AlternativeproofsofthisinequalityareavailableinAczel andDaroczy

[1]

andKapur

[6].

5.

HOLDER’S INEQUALITY

(5)

MUTUAL DIVERGENCE OF PROBABILITY DISTRIBUTIONS 601

where

zy’s

and

yy’s

are positive real

numbers,

zi’s

are fixed and

yi’s

vary subject to

zy

M,

then

Now

O<_zi

< M

C5.2)

C5.3)

g(Zl)

’1

+

:

g’(Zl)

q

If

q(q

1)

>0, the minimum value of

g(z)

occurswhen

nd

z

M-

zl

M

[.q(Zl)]m,,t

Mq/[=I/(q-I)

.=

/(q-1)]

q-1

M’I

[

+

]’-’

then

1 1

-+-=1 or q-

1=q/p,

P q

/:(M)

Cz

+

z)

q/)) Mq

M

q

Ifweassumethat

f,(M)

M

q,

then theprincipleofoptimalitygives

)/

f,+l(M)

x

i=l

Mq

O<_z <_M

’-"

Zl)q

(+

+

+,)/.

(5.10)

thus if

q(q-

1)

>O,

(5.11)

Ifq> 1,thisgives

(5.12)

Using

(5.10),

wefindthatthe inequality

(5.12)

holds ifq> l,p> l,p-i

+

q-i

I.
(6)

"= 2"=1 ’=1

(5.13)

6. A GENERALIZATION OF HOLDER’S

INEQUALITY

Nowweconsider the nnJmiationof

where

z.’s, H’s, z/s

areall positive realnumbers,

z.’s

and

y.’s

are kept fixed and

z’s

varysubject to

If

k.(M)

istheminimumvalue of

U,

then

,(M)

M"

providedr> 1,sothat in thiscase

min u

M-

u

0<

uS M

+

Zy

(6.4)

Similarlywhenr

>

1

Now

weconsiderthecasewhen p

>

1,q

>

1,r

>

1,p-*

+

q-*

+

r-*

1,so’that

1 1 r-1 r 1 r 1

-+

or

+

--t

p q r

-1-

r-iq

or

1 1

p,

r-1

q,

r-1

+

7

1;

p----,

q

Again by Holder’sinequalityprovedin the last section when

p’

>1,

q’

> 1
(7)

MEASURE OF PROBABILITY

or or

or

From

(6.6)

and

(6.8),

V

> M

or

r-1

whenever

(6.7)

issatisfied.

The method ofproofisquitegeneraland

(6.9)

isemily generalizedtogive

whenever

qx

>

1,qffi>1,-.-,q,,, >1;

q-X + qX

+...

+

q,,x

1

(e.8)

(6.9)

(6.10)

(6.11)

Inequality

(6.10)

canalso be writtenas

$= if $=

whenever

0<x

<

1,0

<

t

<

2,-..,0

<

,

<

1; t

+ct

+.-.+=

1.

(6.13)

If

p-’s

represent probabilities,weget

(6.14)

In

particular form 2,

(6.14)

gives

x-=<1 whenO<a< I Piqj

Piqi -1 >0

(6.15)

Alternativeproofs of Holder’s inqualityaregiven in Hardy, Littlew00d andPolya

[3]

and Marshalland Olkin

[].

(8)

Holder’sgeneralizedinequality

(6.I0)

holdswheneverql,q2"",qn satisfy

(6.Ii).

How-ever, if

(6.11)

arenotsatisfied,the inequality sign in

(6.10)

may be

reversed

or noinequality

mayhold.

Thus for the specialcasem 3 if 0<r <1, the inequality sign in

(6.5)

isreversed. Ifoneofp or q isnegative, then inHolder’s inequality also the sign is reversedsothat if 0<r< 1, oneofpand q isnegative and

p-1

/

q-1

+

r-i 1,then

’=

.=

.=

.=i

while ifr > l,p > l,q

>

I and

p-

/

-I

/

-

I,

then inequality

(.g)

holds, ff

p-

+

-

+

-

I,

but neithe of the

wo

setsof conditionsis satisfied, thenneither

if

,

,

{o)

holds

Similazly

ziven

,q2,""

,

we cnfredwhether

.I0)

holdsorwhether

{.I0)

with reversed

sizn

ofinequality holdsorwhetheneitherofthese twoinequalities holds.

8.

APPLICATIONS TO

INFORMATION

THEORY

From

Renyi’sorHolder’s inequality,itfollows that

Do(P.

q)

_

i

(8.)

where

P

(P1,p:,’",pn),

Q

(qx,q,-..,qn)

aretwoprobability distributions. Also

D,,(P’Q)

0ilpi qiforeachi.

In

the limitingcasewhen a I,this gives
(9)

PROBABILITY DISTRIBUTIONS Generalizingweget themeasure

where0<c, <1,0<c,

<

1,...,0 <

,,

<

1,

**

+

c +...

c, 1. Byputtingomeof a,, c,...,

,,

equaltozero,it can be uedt measureall divergenceamongsubsets of

P,,

P,,

P,,,.

for threedistributions,Theil

[9]

had proposedthe measure

i=1

(8.4)

as a measure of information improvement.

However,

this can be both negative and positive. Ontheother handourmeasure

(8.3)

isalways

_

0andcanbeamoreeffective measure ofimprovement.

Themeasure

(8.3)

and

(8.4)

arealsorecursivesince

D(,,,.(P’Q"

R)

a’+

/’-

1

pqrl

---

I or

or

+

(a’

+

9’

1)D.,#,.-,(p,,p,,ps,’-"

,p.; q,

+

q,@s,...,.; r,

+

r,rs,-..,r.)

(8.5)

(8.)

9. MEASURE OFSYMMETRIC

MUTUAL

DIVERGENCE

605

(8.7)

Themeasure

(8.3)

isnot symmetric in thesensethat

D,,,(P’Q" R),D=,(Q"

P"

R),D(,,(P"

R’Q)

etc. arenotequal unlessa

=/

1/3.

Themeasure

D],](P’Q

R)

.

1-

2p/"

i ri

i=1

issymmetric.Another symmetricmeasureof mutal divergenceisgivenby

D,,(P.Q

R)

+

D.,$(P"

R’Q)

+

D,,,(Q

P"

R)

+

D,,,(Q"

R’P)

+

D..,(R’Q"

P)

(9.1)

(10)

The corresponding symmetric measure formdistributions willbe thesumof

m

mutual divergences.

Thisisatwo-parameterfamily of measures of mutal symmetric divergences. Themeasure

(9.2)

canbecomparedwith the measure

D,,,(P )

+

D,( P)

+

D,(P R)

+

D(R P)

+

D,,,( R)

+

D,,,(R )

which is alsoameasureof symmetric mutal

vergence.

10. CONCLUDING

REMARKS

By

using dynamic programming, itcanbeeasilyshownthatthe maximum value of Xl,Z...z,, subject to Zl

+

z=

+--.

+

z, c, Zl

_

0,..-,x,

_

0 is

(c/n)’*

and the minimumvalueofzl

+z=+.-.+,,

subject to1’",, 1,zl

>

0,zn

>

0,...,x,>0 is

n’r.

Fromeitherofthese

results,

the Arithmetic-Geometric-Mean Inequality viz.

(10.1)

canbededuced. Fromthisinequality,one canobtainShannon’s and Holder’s inequalities. OnecanalsoproveRenyi’sentropyfirstbyusing dynamicoroeTammingand thendeduce Shannon’s inequality from it. Onecanalso prove first Holder’s inequality, by using

dy-namic programming and then deduce Renyi’s Shannon’s andAGM inequalities fromit.

Inasenseall these inequalitiesareequivalent.

Holder’s,

Renyi’sandShannon’sinequality enableustoget anumber ofmeasureof directd divergence between two given distributions and with the helpofthese measures of directed divergence. Wecanobtainanumber of measures of entropy and inaccuracy.

The generalized Holder’s inequality enableus togetanumber of measures of mutal divergence between

rn(> 2)

probability distributions.

ACKOWLEDQEIENT

Thisresearchwassupportedin partby the NaturalSciencesand

Engineering

ResearchCouncil ofCanada.

REFERENCES

I.

ACZEL,

G.and

DAROCZY,

Z.

On

Measures

of Information and Their

Characteri-zatio__..._n,

Academic

Press,

New York,

1975.

2.

BELLMAN,

R.and

DREYFUS,

S.E.,

Applied DynamicProgramming, 1962. 3.

HARDY,

G.H., LITTLEWOOD

and

POLYA,

G.Inequalities,CambridgeUniversity
(11)

4.

KAPUR,

J.N. On Some Applications of Dynamic ProgrammingtoInformation The-ory,Proc. Ind. Acad. Sci.

68A (1968)

1-11.

5.

KAPUR,

J.N. AComparative Assessmentof

Measure

ofDirected Divergence, Ad___v.

Manag.

Studies 5

(1984)

1-16.

6.

KAPUR,

J.N. Convex Semi-Metric Spaces and Generalized Shannon’s Inequalities. Ind. Journ. PureAppl. Math.

(to

appear),

1986.

7.

MARSHALL, A.W.,

and

OLKIN,

I.Inequalities Theoryof Majorisation andits Ap-plications, Academic

Press,

New York, 1979.

8.

SHANNON,

C.E. AMathematicalTheoryofCommunication,BellSystem_Technical

J.,

2_7 (194s),

379-423,623-656.

References

Related documents

Control area Selection of the control variable (angular position, speed, torque) leaves basic control structures unchanged. The system is proven to be robust to

Abstract:- Different isotopes of medium and heavy beams from neutron poor to neutron rich in an energy range 30-100MeV cross section at 0 o is very

8: Comparison of power sharing by micro sources (MS-1 and MS-2) between proposed model and base

These cracks distributed from microscopic to macroscopicIn order to realize the material,multi-resolution damage method need an equivalent transformation method, makes

Our quantitative analysis differs from that study in three important respects–first we rely on actual data from the United States on accidents related to cellular phone use to

- Comparative qualitative analysis of essential oils in species Satureja subspicata showed similarities with other species from Lamiaceae family such as Th ymus L. In fact,

Experiments were designed with different ecological conditions like prey density, volume of water, container shape, presence of vegetation, predator density and time of

The results in fig.6 and fig.7 shows that the proba- bility of forced termination for the fast users increases slightly than that obtained from the guard channel scheme (fig.1),