Dempster-Shafer Models - Relation to other Imprecise Probability Models

4.3 Relation to other Imprecise Probability Models

4.3.2 Dempster-Shafer Models

The text A Mathematical Theory of Evidence by Shafer [1976] develops a theory of monotonic, super-additive probabilities it calls belief functions. These have been modified appropriately in this chapter to describe a model of imprecise probabilities we call Dempster-Shafer Belief Models. The key component of a Dempster-Shafer Belief Model is that the imprecise probability can be generated from a notion of ‘evidence’. For any non-empty event there may be (direct) evidence for that event. The evidence for some event $E \in 2^\Omega$ is the mass of $E$, denoted $m(E)$. In standard probability modeling, the likelihood for an event is simply the sum of the evidence for each state in that event, that is, $P_*E = \sum_{\omega \in E} m(\omega)$. Dempster-Shafer Belief Models go beyond this restriction to allow there to be evidence, or mass, on non-singleton events, that is, $P_*E = \sum_{F \subset E} m(F)$.

Suppose you are trying to determine the likelihood that Ann committed a murder. If she has motive, then there is at least a 20% chance she committed the murder. If she has means, there is at least a 30% chance she did it. But if she has both motive and means, then there is at least a 70% chance it was her. Notably, the likelihood estimate is not additive: knowing that she has both motive and means gives a greater likelihood than the sum of the two separate estimates. In Dempster-Shafer Belief Models there may be evidence from having motive, evidence from having means, and additional evidence from having both motive and means.

The Dempster-Shafer Belief Function is extended to an imprecise probability by treating the belief function as a lower probability, and taking the upper probability to be one minus the belief that the event does not occur. This mirrors the work in Shafer [1976], and is formalized in Definition 4.13.

Definition 4.13. A Dempster-Shafer Belief Model is a tuple $(\Omega, m, P)$ where $\Omega$ is a finite non-empty state space, $m : 2^\Omega \to [0,1]$ is a mass function such that $m(\emptyset) = 0$ and $\sum_{E \in 2^\Omega} m(E) = 1$, and $P : 2^\Omega \to \mathcal{I}$ is an imprecise probability such that for all events $E \in 2^\Omega$

$$P E = \left[ \sum_{F \in 2^\Omega \mid F \subset E} m(F),\ \sum_{F \in 2^\Omega \mid F \cap E \neq \emptyset} m(F) \right] \qquad (4.5)$$

A function $P : 2^\Omega \to \mathcal{I}$ is a Dempster-Shafer Imprecise Probability if it is the imprecise probability for some Dempster-Shafer Belief Model. The lower limit $P_*$ of a Dempster-Shafer Imprecise Probability is a Dempster-Shafer Belief Function.
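As a minimal sketch of Equation 4.5, the following Python computes the lower and upper limits of the imprecise probability from a mass function. The function names are illustrative and not from the text. The mass function is borrowed from Example 4.6, and the relation $F \subset E$ is read as allowing $F = E$, which the lower probabilities listed in that example require.

```python
from fractions import Fraction
from itertools import combinations


def powerset(states):
    """Return every subset of `states` as a frozenset."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def imprecise_probability(mass, event):
    """Return the (lower, upper) pair of Equation 4.5 for `event`."""
    event = frozenset(event)
    # Lower limit: mass on focal sets contained in the event.
    lower = sum((w for f, w in mass.items() if f <= event), Fraction(0))
    # Upper limit: mass on focal sets that intersect the event.
    upper = sum((w for f, w in mass.items() if f & event), Fraction(0))
    return lower, upper


OMEGA = frozenset({1, 2, 3})
# Mass function of Example 4.6; events not listed carry zero mass.
MASS = {
    frozenset({1}): Fraction(1, 7),
    frozenset({1, 2}): Fraction(2, 7),
    frozenset({2, 3}): Fraction(4, 7),
}

for event in powerset(OMEGA):
    lower, upper = imprecise_probability(MASS, event)
    print(sorted(event), lower, upper)
```

Exact fractions are used so that the output can be compared directly with the values listed in Example 4.6. For every event the upper limit equals one minus the lower limit of the complementary event.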

Proposition 4.10 shows that Dempster-Shafer Imprecise Probabilities are a subclass of Multiple Prior Imprecise Probabilities. The argument is constructive, with a particular Multiple Prior Model constructed from a Dempster-Shafer Belief Model. However, for any Dempster-Shafer Belief Model there are typically many Multiple Prior Models with the same imprecise probability.

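Proposition 4.10. Let $(\Omega, m, P)$ be a Dempster-Shafer Belief Model. Then, there exists a Multiple Prior Model $(\Omega, I, \{\mu_i\}_{i \in I}, P_{MP})$ such that $P = P_{MP}$. Moreover, one such model is $(\Omega, 2^\Omega, \{\mu_E\}_{E \in 2^\Omega}, P_{MP})$ where

$$\mu_E(\omega) = \begin{cases} \displaystyle\sum_{F \in 2^\Omega \mid \omega \in F,\ F \subset E} \frac{m(F)}{|F|} & \text{if } \omega \in E \\[2ex] \displaystyle\sum_{F \in 2^\Omega \mid \omega \in F,\ F \not\subset E} \frac{m(F)}{|F \cap (\neg E)|} & \text{if } \omega \notin E \end{cases}$$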
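The construction in Proposition 4.10 can be checked numerically. The sketch below builds one prior $\mu_E$ per event and compares the range of probabilities across those priors with the interval of Equation 4.5. It assumes that the imprecise probability of a Multiple Prior Model is the interval from the smallest to the largest prior probability of an event; that definition is not restated in this subsection. The mass function is taken from Example 4.6, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def powerset(states):
    """Return every subset of `states` as a frozenset."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def prior_for(mass, omega, event):
    """Build the prior mu_E of Proposition 4.10 as {state: probability}."""
    outside = omega - event
    prior = {}
    for state in omega:
        total = Fraction(0)
        for focal, weight in mass.items():
            if state not in focal:
                continue
            if state in event and focal <= event:
                # Focal set inside E: share its mass equally over its states.
                total += weight / len(focal)
            elif state not in event and not focal <= event:
                # Focal set not inside E: push its mass onto states outside E.
                total += weight / len(focal & outside)
        prior[state] = total
    return prior


def envelope(priors, event):
    """Assumed multiple-prior interval: (min, max) of mu(event) over priors."""
    values = [sum((p[s] for s in event), Fraction(0)) for p in priors]
    return min(values), max(values)


def belief_interval(mass, event):
    """Lower and upper limits of Equation 4.5."""
    lower = sum((w for f, w in mass.items() if f <= event), Fraction(0))
    upper = sum((w for f, w in mass.items() if f & event), Fraction(0))
    return lower, upper


OMEGA = frozenset({1, 2, 3})
MASS = {
    frozenset({1}): Fraction(1, 7),
    frozenset({1, 2}): Fraction(2, 7),
    frozenset({2, 3}): Fraction(4, 7),
}

PRIORS = [prior_for(MASS, OMEGA, e) for e in powerset(OMEGA)]
AGREES = all(envelope(PRIORS, e) == belief_interval(MASS, e)
             for e in powerset(OMEGA))
print(AGREES)
```

The prior $\mu_E$ attains the lower limit at $E$, and the prior built for the complement of $E$ attains the upper limit, which is why one prior per event is enough in this sketch.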

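Dempster-Shafer Imprecise Probabilities can be characterized by the property of being totally monotone capacities. Definition 4.14 describes total monotonicity.

Definition 4.14. Let $P : 2^\Omega \to \mathcal{I}$ be a capacity. The function $P$ is a totally monotone capacity if, for every positive integer $n$ and all events $E_1, \dots, E_n \in 2^\Omega$,

$$P_*\left( \bigcup_{i=1}^{n} E_i \right) \geq \sum_{i=1}^{n} \sum_{I \subset \{1,\dots,n\} : |I| = i} (-1)^{i+1} P_*\left( \bigcap_{j \in I} E_j \right) \qquad (4.6)$$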

The definition of total monotonicity might appear somewhat convoluted at first. However, if the inequality in Equation 4.6 is replaced with an equality this becomes the usual Inclusion-Exclusion Principle. As such, the idea of total monotonicity is that the probability of unions of events must be at least as large as that required by the Inclusion-Exclusion Principle.
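To make the inequality concrete, the sketch below evaluates Equation 4.6 by brute force for every tuple of up to three events on a three-state space. It is a finite check under stated assumptions, not a proof. `BELIEF` is the lower probability generated by the mass function of Example 4.6. `HYPOTHETICAL` is a capacity invented here for contrast, not taken from the text: it assigns 0 to singletons, 1/2 to two-element events and 1 to $\Omega$.

```python
from fractions import Fraction
from itertools import combinations, product


def powerset(states):
    """Return every subset of `states` as a frozenset."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_4_6(lower, events):
    """Check the inequality of Equation 4.6 for one tuple of events."""
    n = len(events)
    union = frozenset().union(*events)
    rhs = Fraction(0)
    for size in range(1, n + 1):
        for index_set in combinations(range(n), size):
            meet = frozenset.intersection(*(events[j] for j in index_set))
            rhs += (-1) ** (size + 1) * lower[meet]
    return lower[union] >= rhs


def passes_up_to(lower, omega, max_n):
    """Check Equation 4.6 for every tuple of 1..max_n events."""
    subsets = powerset(omega)
    return all(satisfies_4_6(lower, events)
               for n in range(1, max_n + 1)
               for events in product(subsets, repeat=n))


OMEGA = frozenset({1, 2, 3})
MASS = {
    frozenset({1}): Fraction(1, 7),
    frozenset({1, 2}): Fraction(2, 7),
    frozenset({2, 3}): Fraction(4, 7),
}
# Lower probability generated by MASS (sum of mass on contained focal sets).
BELIEF = {e: sum((w for f, w in MASS.items() if f <= e), Fraction(0))
          for e in powerset(OMEGA)}
# Hypothetical capacity, not from the text, used only as a contrast.
BY_SIZE = {0: Fraction(0), 1: Fraction(0), 2: Fraction(1, 2), 3: Fraction(1)}
HYPOTHETICAL = {e: BY_SIZE[len(e)] for e in powerset(OMEGA)}

print(passes_up_to(BELIEF, OMEGA, 3))
print(passes_up_to(HYPOTHETICAL, OMEGA, 2))
print(passes_up_to(HYPOTHETICAL, OMEGA, 3))
```

The hypothetical capacity satisfies the inequality for pairs of events but fails it for the three two-element events taken together, where the right-hand side of Equation 4.6 is 3/2 while the left-hand side is 1.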

Proposition 4.11 states that an imprecise probability is a Dempster-Shafer Imprecise Probability precisely when it is totally monotone. Proposition 4.11 is a restatement of Theorem 2.1 from Shafer [1976]. The proof of this statement can be found there.

Proposition 4.11 (Theorem 2.1 of Shafer [1976]). Let $\Omega$ be a state space and $m : 2^\Omega \to [0,1]$ such that $m(\emptyset) = 0$ and $\sum_{E \in 2^\Omega} m(E) = 1$. Then, there exists a unique Dempster-Shafer Belief Model $(\Omega, m, P)$, and in this model $P$ is a totally monotone capacity. Similarly, let $\Omega$ be a state space, and $P$ a totally monotone capacity. Then, there exists a unique Dempster-Shafer Belief Model $(\Omega, m, P)$.
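The second half of the proposition says that the lower probability pins down the mass function. One standard way to recover it, which is not spelled out in this section, is Möbius inversion: $m(E) = \sum_{F \subset E} (-1)^{|E \setminus F|} P_*F$. The sketch below applies that formula to the lower probabilities listed in Example 4.6; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def powerset(states):
    """Return every subset of `states` as a frozenset."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mobius_inverse(lower, omega):
    """Recover the non-zero masses from a lower probability on all events."""
    mass = {}
    for event in powerset(omega):
        total = Fraction(0)
        for sub in powerset(event):
            sign = (-1) ** (len(event) - len(sub))
            total += sign * lower[sub]
        if total != 0:
            mass[event] = total
    return mass


def s(*states):
    """Shorthand for building an event."""
    return frozenset(states)


OMEGA = s(1, 2, 3)
# Lower probabilities as listed in Example 4.6.
LOWER = {
    s(): Fraction(0), s(1, 2, 3): Fraction(1),
    s(1): Fraction(1, 7), s(2): Fraction(0), s(3): Fraction(0),
    s(1, 2): Fraction(3, 7), s(1, 3): Fraction(1, 7), s(2, 3): Fraction(4, 7),
}

MASS = mobius_inverse(LOWER, OMEGA)
for event, weight in MASS.items():
    print(sorted(event), weight)
```

For these inputs the recovered masses are 1/7 on {1}, 2/7 on {1,2} and 4/7 on {2,3}, which is the mass function that Example 4.6 starts from. A negative recovered mass would indicate that the lower probability is not a Dempster-Shafer Belief Function.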

We now wish to investigate the relationship between Dempster-Shafer Belief Models and Behavioral Imprecise Probability Models. Corollary 4.2 contains the main result, and states that $(\Omega, m, P)$ is a Dempster-Shafer Belief Model with $|\mathrm{Supp}(m)| \leq |\Omega|$ if and only if there exists a Behavioral Imprecise Probability Model where the knowledge operator is a correspondence operator. This is a clear and systematic link between Dempster-Shafer Belief Models where the mass function has relatively small support, and Behavioral Imprecise Probability Models.

However, there is no uniqueness result on the Behavioral Imprecise Probability Models which match each Dempster-Shafer Belief Model. Moreover, while there is a structural link between these frameworks when the support of the mass function is relatively small, it is still possible that a Dempster-Shafer Belief Model can be represented as a Behavioral Imprecise Probability Model even when the mass function has large support. Similarly, it is possible that a Behavioral Imprecise Probability Model can be represented as a Dempster-Shafer Belief Model even when the knowledge operator is not a correspondence operator. These issues are introduced in Examples 4.7 and 4.8, and explored more fully in Subsection 4.3.3.

Proposition 4.12 states that, for a Behavioral Imprecise Probability Model where the knowledge operator is a correspondence operator, the Behavioral Imprecise Probability is also a Dempster-Shafer Imprecise Probability. Moreover, there is a unique Dempster-Shafer Belief Model which generates this imprecise probability, and the mass function for this Dempster-Shafer Belief Model has relatively small support. The proof contains an explicit construction of the Dempster-Shafer Belief Model for each Behavioral Imprecise Probability Model.

Proposition 4.12. Let $(\Omega, K, \mu, P)$ be a Behavioral Imprecise Probability Model such that $K$ is a correspondence operator. Then, there exists a unique Dempster-Shafer Belief Model $(\Omega, m, P_{DS})$ such that $P_{DS} = P$. Moreover, $|\mathrm{Supp}(m)| \leq |\Omega|$.

Unfortunately, we cannot achieve converse statements for this kind of proposition, as sometimes it is possible to have a ‘terrible’ knowledge operator $K$ which just happens to end up with a very nice imprecise probability $P$, as in Example 4.5. This phenomenon was also seen in the Multiple Prior framework, and examined there in Example 4.4.

Example 4.5. Consider the roll of a single fair die. Let $\Omega = \{1,2,3,4,5,6\}$ and $\mu(\omega) = 1/6$ for all $\omega \in \Omega$. Agent A is fully informed, $K_A E = E$ for all events $E \subset \Omega$. The imprecise probability of each event is the true probability of that event, $P_A E = \{|E|/6\}$. This agent’s imprecise probability $P_A$ can be modeled by a Dempster-Shafer Belief Model, or by an inner and outer measure, or indeed by a probability function.

By contrast, suppose Agent B is foolish. He knows that each face of the die is equally likely, and can count the number of faces in each event, but is totally inept at matching the faces in the event to the faces of the die. Then, $|K_B E| = |E|$, and $\mu(E) = |E|/6$ for all $E \in 2^\Omega$. For example, $K_B\{1\} = \{2\}$ and $K_B\{1,2,4\} = \{1,3,5\}$, with the remaining values defined similarly at random. Nevertheless, this agent’s imprecise probability will still have $P_B E = \{|E|/6\}$. Hence $P_A = P_B$.

This means the agent’s imprecise probability can still be modeled nicely, even though it was derived from a terrible knowledge operator, violating many of the nice properties including monotonicity, and possibly Axiom D.

Proposition 4.13 provides a partial converse to Proposition 4.12. For every Dempster-Shafer Belief Model with relatively small support, the Dempster-Shafer Imprecise Probability is also a Behavioral Imprecise Probability. Unlike in Proposition 4.12, the imprecise probability can be generated by many different Behavioral Imprecise Probability Models. The rationale for this is explored in Example 4.6; in essence, if $P_{DS} = P(K, \mu)$, then $P_{DS} = P(\sigma \circ K, \mu \circ \sigma)$ for any bijection $\sigma : \Omega \to \Omega$.

Proposition 4.13. Let $(\Omega, m, P_{DS})$ be a Dempster-Shafer Belief Model where $|\mathrm{Supp}(m)| \leq |\Omega|$. Then, there exists a Behavioral Imprecise Probability Model $(\Omega, K, \mu, P)$ such that $P = P_{DS}$. If $|\Omega| \neq 1$, then this Behavioral Imprecise Probability Model is not unique.

Example 4.6. Let $\Omega = \{1,2,3\}$ and imprecise probabilities be generated by the DS mass function

$m\{1\} = 1/7$, $m\{1,2\} = 2/7$, $m\{2,3\} = 4/7$, and $m(E) = 0$ otherwise.

This generates lower probabilities of

$P_*\emptyset = 0$, $P_*\Omega = 1$

$P_*\{1\} = 1/7$, $P_*\{2\} = 0$, $P_*\{3\} = 0$

$P_*\{1,2\} = 3/7$, $P_*\{1,3\} = 1/7$, $P_*\{2,3\} = 4/7$

As illustrated in Example 4.5, this lower probability function could be generated from very ugly knowledge structures. However, it can also be generated from knowledge operators which are correspondence operators. For example, the pair $(K, \mu)$ with

$\mu(1) = 1/7$, $\mu(2) = 4/7$, $\mu(3) = 2/7$

$K\emptyset = \emptyset$, $K\Omega = \Omega$

$K\{1\} = \{1\}$, $K\{2\} = \emptyset$, $K\{3\} = \emptyset$

has a correspondence operator $K$ and generates the same $P_*$. This example $(K, \mu)$ is not unique. For every bijection $\sigma : \Omega \to \Omega$, the pair $(\sigma \circ K, \mu \circ \sigma)$ will also have a correspondence operator, and generate $P_*$.
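The pair $(K, \mu)$ in Example 4.6 can be reproduced from a correspondence, under assumptions that are not stated in this subsection and are labeled as such here. The sketch assumes a hypothetical correspondence `GAMMA` from states to non-empty events, the rule $KE = \{\omega : \Gamma(\omega) \subset E\}$ for the induced knowledge operator, and the rule $P_*E = \mu(KE)$ for the lower probability. The particular `GAMMA` below was chosen so that $K$ matches the values the example lists for $\emptyset$, $\Omega$ and the singletons.

```python
from fractions import Fraction
from itertools import combinations


def powerset(states):
    """Return every subset of `states` as a frozenset."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


OMEGA = frozenset({1, 2, 3})
# Hypothetical correspondence (an assumption of this sketch, not from the text).
GAMMA = {1: frozenset({1}), 2: frozenset({2, 3}), 3: frozenset({1, 2})}
# Prior from Example 4.6.
MU = {1: Fraction(1, 7), 2: Fraction(4, 7), 3: Fraction(2, 7)}
# Mass function from Example 4.6.
MASS = {
    frozenset({1}): Fraction(1, 7),
    frozenset({1, 2}): Fraction(2, 7),
    frozenset({2, 3}): Fraction(4, 7),
}


def knowledge(gamma, event):
    """Assumed rule: the states whose image under gamma lies inside the event."""
    return frozenset(s for s, image in gamma.items() if image <= event)


def lower_from_knowledge(gamma, mu, event):
    """Assumed rule: the lower probability is the prior mass of K(event)."""
    return sum((mu[s] for s in knowledge(gamma, event)), Fraction(0))


def belief(mass, event):
    """Lower limit of Equation 4.5."""
    return sum((w for f, w in mass.items() if f <= event), Fraction(0))


for event in powerset(OMEGA):
    print(sorted(event), sorted(knowledge(GAMMA, event)),
          lower_from_knowledge(GAMMA, MU, event), belief(MASS, event))
```

Under these assumptions each state carries the mass of its image, so the induced mass function has at most $|\Omega|$ focal sets, which is the support bound that appears in Propositions 4.12 and 4.13.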

Propositions 4.12 and 4.13 can be combined to give Corollary 4.2.

Corollary 4.2. Let $P$ be an imprecise probability over $\Omega$. Then, $(\Omega, m, P)$ is a Dempster-Shafer Belief Model with $|\mathrm{Supp}(m)| \leq |\Omega|$ if and only if there exists a Behavioral Imprecise Probability Model $(\Omega, K, \mu, P)$ such that $K$ is a correspondence operator.

Proof. Follows immediately from Propositions 4.12 and 4.13.

Mukerji [1997] has a similar result which could be used to link Dempster-Shafer Belief Models and Behavioral Imprecise Probability Models when the underlying knowledge structure is a correspondence operator. However, as noted in Section 4.2, the idea of a correspondence information structure in Mukerji is different from the one used here. In Mukerji, the correspondence information function $\Gamma$ is a mapping from a signal space $S$ to non-empty events $2^\Omega \setminus \{\emptyset\}$, that is, $\Gamma : S \to 2^\Omega \setminus \{\emptyset\}$. In particular, there may be no relationship between the signal space and the event space. The restriction in Corollary 4.2 that $|\mathrm{Supp}(m)| \leq |\Omega|$ comes about precisely because the support of the mass function must be no larger than the size of the signal space. In our model the signal space for the correspondence is fixed as the state space $\Omega$, while in Mukerji [1997] the signal space may be much larger. This is why Mukerji finds a total equivalence between imprecise probabilities from correspondences and Dempster-Shafer Belief Models, while we do not.

Once the focus moves to operators which are not correspondence operators, or to a Dempster-Shafer Belief Model where the mass function has relatively large support, it becomes more difficult to identify obvious links. We are faced with the same problem as in Example 4.5 where there may happen to be nice representations, but there may not.

Example 4.7 describes a Dempster-Shafer Belief Model whose mass function has relatively large support, but for which there still exists a Behavioral Imprecise Probability Model with the same imprecise probability. In Example 4.7 the operator $K$ is not a correspondence operator.

Example 4.7. Let $(\Omega, m, P)$ be a Dempster-Shafer Belief Model where $\Omega = \{1,2,3\}$ and

$m\{1\} = m\{2\} = m\{3\} = m\{1,2\} = 1/4$, and $m(E) = 0$ otherwise.

This generates lower probabilities according to Equation 4.5 of

$P_*\emptyset = 0$, $P_*\Omega = 1$

$P_*\{1\} = 1/4$, $P_*\{2\} = 1/4$, $P_*\{3\} = 1/4$

$P_*\{1,2\} = 3/4$, $P_*\{1,3\} = 1/2$, $P_*\{2,3\} = 1/2$

Even though $|\mathrm{Supp}(m)| > |\Omega|$, $P_*$ can nonetheless be represented by a Behavioral Imprecise Probability Model $(\Omega, K, \mu, P)$. For example, consider

$\mu(1) = 1/4$, $\mu(2) = 1/2$, $\mu(3) = 1/4$

$K\emptyset = \emptyset$, $K\Omega = \Omega$

$K\{1\} = \{1\}$, $K\{2\} = \{1\}$, $K\{3\} = \{3\}$

$K\{1,2\} = \{1,2\}$, $K\{1,3\} = \{2\}$, $K\{2,3\} = \{2\}$

This knowledge operator $K$ is not a correspondence operator, but does satisfy $K\Omega = \Omega$, and Axiom D.

By contrast, Example 4.8 has a Behavioral Imprecise Probability Model which does not admit a Dempster-Shafer representation.

Example 4.8. Let $(\Omega, K, \mu, P)$ be a Behavioral Imprecise Probability Model where $\Omega = \{1,2,3\}$ and

$\mu(1) = 3/4$, $\mu(2) = 1/8$, $\mu(3) = 1/8$

$K\emptyset = \emptyset$, $K\Omega = \Omega$

$K\{1\} = \{1\}$, $K\{2\} = \{1\}$, $K\{3\} = \{3\}$

$K\{1,2\} = \{1,2\}$, $K\{1,3\} = \{2\}$, $K\{2,3\} = \{2\}$

The knowledge operator $K$ is the same as in Example 4.7. Only $\mu$ is different. This generates lower probabilities of

$P_*\emptyset = 0$, $P_*\Omega = 1$

$P_*\{1\} = 3/4$, $P_*\{2\} = 3/4$, $P_*\{3\} = 1/8$

Suppose, by way of contradiction, that $(\Omega, m, P)$ is a Dempster-Shafer Belief Model. Then

$m\{1\} = P_*\{1\} = 3/4$, $m\{2\} = P_*\{2\} = 3/4$, $m\{3\} = P_*\{3\} = 1/8$

But in this case

$$\sum_{E \subset \Omega} m(E) \geq m\{1\} + m\{2\} + m\{3\} > 1$$

which is a contradiction. The model $(\Omega, K, \mu, P)$ cannot be represented by any Dempster-Shafer Belief Model.

Even though the knowledge operators used in these examples are the same, nevertheless in Example 4.7 there is a Dempster-Shafer representation, but in Example 4.8 there is not.
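The contrast between Examples 4.7 and 4.8 can be replayed numerically. The sketch below assumes that the lower probability of a Behavioral Imprecise Probability Model is $P_*E = \mu(KE)$; that rule is read off from the numbers in the two examples rather than quoted from a definition in this subsection. All names are illustrative.

```python
from fractions import Fraction


def s(*states):
    """Shorthand for building an event."""
    return frozenset(states)


OMEGA = s(1, 2, 3)
# Knowledge operator shared by Examples 4.7 and 4.8.
K = {
    s(): s(), s(1, 2, 3): s(1, 2, 3),
    s(1): s(1), s(2): s(1), s(3): s(3),
    s(1, 2): s(1, 2), s(1, 3): s(2), s(2, 3): s(2),
}
MU_47 = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
MU_48 = {1: Fraction(3, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
# Mass function of Example 4.7.
MASS_47 = {s(1): Fraction(1, 4), s(2): Fraction(1, 4),
           s(3): Fraction(1, 4), s(1, 2): Fraction(1, 4)}


def lower_from_knowledge(k, mu):
    """Assumed rule: the lower probability of E is the prior mass of K(E)."""
    return {event: sum((mu[x] for x in known), Fraction(0))
            for event, known in k.items()}


def belief(mass, event):
    """Lower limit of Equation 4.5."""
    return sum((w for f, w in mass.items() if f <= event), Fraction(0))


LOWER_47 = lower_from_knowledge(K, MU_47)
LOWER_48 = lower_from_knowledge(K, MU_48)

# Example 4.7: the behavioral lower probability equals the belief function.
MATCHES_47 = all(LOWER_47[e] == belief(MASS_47, e) for e in K)

# Example 4.8: singleton masses would have to equal singleton lower
# probabilities, so their total bounds the total mass from below.
SINGLETON_TOTAL_48 = sum((LOWER_48[s(x)] for x in OMEGA), Fraction(0))

print(MATCHES_47)
print(SINGLETON_TOTAL_48, SINGLETON_TOTAL_48 > 1)
```

With the prior of Example 4.8 the singleton lower probabilities total 13/8, so no mass function summing to one can reproduce them, which is the contradiction used in the example.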

In document Topics in Information Structures (Page 133-140)