
Some Probabilistic Concepts and Results

2.1 Probability Functions and Some Basic Properties and Results

Intuitively, by an experiment one pictures a procedure being carried out under a certain set of conditions, whereby the procedure can be repeated any number of times under the same set of conditions, and upon completion of the procedure certain results are observed. An experiment is a deterministic experiment if, given the conditions under which the experiment is carried out, the outcome is completely determined. If, for example, a container of pure water is brought to a temperature of 100°C at 760 mmHg of atmospheric pressure, the outcome is that the water will boil. Also, a certificate of deposit of 1,000 dollars at the annual rate of 5% will yield 1,050 dollars after one year, and (1.05)^n × 1,000 dollars after n years when the (constant) interest rate is compounded.

An experiment for which the outcome cannot be determined, except that it is known to be one of a set of possible outcomes, is called a random experiment. Only random experiments will be considered in this book. Examples of random experiments are tossing a coin, rolling a die, drawing a card from a standard deck of playing cards, recording the number of telephone calls which arrive at a telephone exchange within a specified period of time, counting the number of defective items produced by a certain manufacturing process within a certain period of time, recording the heights of individuals in a certain class, etc.

The set of all possible outcomes of a random experiment is called a sample space and is denoted by S. The elements s of S are called sample points. Certain subsets of S are called events. Events of the form {s} are called simple events, while an event containing at least two sample points is called a composite event. S and ∅ are always events, and are called the sure or certain event and the impossible event, respectively. The class of all events must be sufficiently rich in order to be meaningful. Accordingly, we require that, if A is an event, then so is its complement A^c. Also, if A_j, j = 1, 2, . . . , are events, then so is their union ∪_j A_j.

(In the terminology of Section 1.2, we require that the events associated with a sample space form a σ-field of subsets in that space.) It follows then that ∩_j A_j is also an event, and so is A1 − A2, etc. If the random experiment results in s and s ∈ A, we say that the event A occurs or happens. The event ∪_j A_j occurs if at least one of the A_j occurs, ∩_j A_j occurs if all A_j occur, A1 − A2 occurs if A1 occurs but A2 does not, etc.

The next basic quantity to be introduced here is that of a probability function (or of a probability measure).

DEFINITION 1 A probability function denoted by P is a (set) function which assigns to each event A a number denoted by P(A), called the probability of A, and satisfies the following requirements:

(P1) P is non-negative; that is, P(A)≥ 0, for every event A.

(P2) P is normed; that is, P(S) = 1.

(P3) P is σ-additive; that is, for every collection of pairwise (or mutually) disjoint events Aj, j= 1, 2, . . . , we have P(ΣjAj)= ΣjP(Aj).

This is the axiomatic (Kolmogorov) definition of probability. The triple (S, class of events, P) (or (S, A, P)) is known as a probability space.
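As a minimal sketch of how the requirements can be checked by brute force, the following Python fragment builds a small finite probability space and tests non-negativity, normedness, and additivity over pairs of disjoint events. The sample space, the weights, and the helper names `all_events`, `P`, and `satisfies_axioms` are hypothetical choices made for illustration only; with finitely many events, only additivity over finitely many disjoint events can be tested this way.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical example: a loaded spinner with three outcomes a, b, c.
S = frozenset({"a", "b", "c"})
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}


def all_events(sample_space):
    """Return every subset of a finite sample space as a frozenset."""
    points = sorted(sample_space)
    subsets = chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1)
    )
    return [frozenset(c) for c in subsets]


def P(event):
    """Probability of an event: the sum of the weights of its sample points."""
    return sum((weights[s] for s in event), Fraction(0))


def satisfies_axioms(sample_space):
    """Check (P1), (P2), and additivity for every pair of disjoint events."""
    events = all_events(sample_space)
    non_negative = all(P(A) >= 0 for A in events)
    normed = P(sample_space) == 1
    additive = all(
        P(A | B) == P(A) + P(B)
        for A in events
        for B in events
        if not (A & B)
    )
    return non_negative and normed and additive


print(satisfies_axioms(S))  # True
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests are not disturbed by floating-point rounding.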

REMARK 1 If S is finite, then every subset of S is an event (that is, A is taken to be the discrete σ-field). In such a case, there are only finitely many events and hence, in particular, finitely many pairwise disjoint events. Then (P3) is reduced to:

(P3′) P is finitely additive; that is, for every collection of pairwise disjoint events Aj, j = 1, 2, . . . , n, we have

$$P\Big(\sum_{j=1}^{n} A_j\Big) = \sum_{j=1}^{n} P(A_j).$$

Actually, in such a case it is sufficient to assume that (P3′) holds for any two disjoint events; (P3′) follows then from this assumption by induction.
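The induction step can be written out as follows. This is only a sketch: the index k is notation introduced here, and it is assumed that additivity holds for two disjoint events and, as the induction hypothesis, for k pairwise disjoint events. The events A_1 + ⋯ + A_k and A_{k+1} are disjoint, so the two-event case applies to them:

```latex
\begin{aligned}
P\Big(\sum_{j=1}^{k+1} A_j\Big)
  &= P\Big(\Big(\sum_{j=1}^{k} A_j\Big) + A_{k+1}\Big) \\
  &= P\Big(\sum_{j=1}^{k} A_j\Big) + P(A_{k+1}) \\
  &= \sum_{j=1}^{k} P(A_j) + P(A_{k+1})
   = \sum_{j=1}^{k+1} P(A_j).
\end{aligned}
```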

2.1.1 Consequences of Definition 1

(C1) P(∅) = 0. In fact, S = S + ∅ + ∅ + ⋯, so that

$$P(S) = P(S + \varnothing + \cdots) = P(S) + P(\varnothing) + \cdots,$$

or

$$1 = 1 + P(\varnothing) + \cdots \quad\text{and}\quad P(\varnothing) = 0,$$

since P(∅) ≥ 0. (So P(∅) = 0. Any event, possibly ≠ ∅, with probability 0 is called a null event.)

(C2) P is finitely additive; that is, for any events Aj, j = 1, 2, . . . , n, such that Ai ∩ Aj = ∅, i ≠ j,

$$P\Big(\sum_{j=1}^{n} A_j\Big) = \sum_{j=1}^{n} P(A_j).$$

Indeed, for Aj = ∅, j ≥ n + 1,

$$P\Big(\sum_{j=1}^{n} A_j\Big) = P\Big(\sum_{j=1}^{\infty} A_j\Big) = \sum_{j=1}^{\infty} P(A_j) = \sum_{j=1}^{n} P(A_j).$$

(C3) For every event A, P(A^c) = 1 − P(A). In fact, since A + A^c = S,

$$P(A + A^c) = P(S), \quad\text{or}\quad P(A) + P(A^c) = 1,$$

so that P(A^c) = 1 − P(A).

(C4) P is a non-decreasing function; that is A1⊆ A2 implies P(A1)≤ P(A2).

In fact,

$$A_2 = A_1 + (A_2 - A_1),$$

hence

$$P(A_2) = P(A_1) + P(A_2 - A_1),$$

and therefore P(A2) ≥ P(A1).

REMARK 2 If A1 ⊆ A2, then P(A2 − A1) = P(A2) − P(A1), but this is not true, in general, when A1 is not a subset of A2.

(C5) 0 ≤ P(A) ≤ 1 for every event A. This follows from (C1), (P2), and (C4).

(C6) For any events A1, A2, P(A1∪ A2)= P(A1)+ P(A2)− P(A1∩ A2).

In fact,

$$A_1 \cup A_2 = A_1 + (A_2 - A_1 \cap A_2).$$

Hence

$$P(A_1 \cup A_2) = P(A_1) + P(A_2 - A_1 \cap A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2),$$

since A1 ∩ A2 ⊆ A2 implies

$$P(A_2 - A_1 \cap A_2) = P(A_2) - P(A_1 \cap A_2).$$
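Consequences (C3), (C4), and (C6), together with Remark 2, can be checked numerically on a concrete space. The sketch below assumes two fair dice with the uniform probability on the 36 ordered pairs; the events `A1`, `A2`, and `B` are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, uniform probability on the 36 ordered pairs.
S = frozenset(product(range(1, 7), repeat=2))


def P(event):
    """Uniform probability: number of points in the event over 36."""
    return Fraction(len(event), len(S))


A1 = frozenset(s for s in S if s[0] == 6)     # first die shows 6
A2 = frozenset(s for s in S if s[0] >= 5)     # first die shows 5 or 6, so A1 is a subset of A2
B = frozenset(s for s in S if sum(s) >= 10)   # total is at least 10; A1 is not a subset of B

checks = {
    "C3": P(S - A1) == 1 - P(A1),
    "C4": A1 <= A2 and P(A1) <= P(A2),
    "Remark 2 (A1 inside A2)": P(A2 - A1) == P(A2) - P(A1),
    "Remark 2 fails without inclusion": P(B - A1) != P(B) - P(A1),
    "C6": P(A1 | B) == P(A1) + P(B) - P(A1 & B),
}
print(checks)
```

Every entry of `checks` evaluates to `True` for these events; the fourth entry shows that the difference formula of Remark 2 can fail when the inclusion is dropped.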

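(C7) P is subadditive; that is,

$$P\Big(\bigcup_{j=1}^{\infty} A_j\Big) \le \sum_{j=1}^{\infty} P(A_j)$$

and also

$$P\Big(\bigcup_{j=1}^{n} A_j\Big) \le \sum_{j=1}^{n} P(A_j).$$

This follows from the identities

$$\bigcup_{j=1}^{\infty} A_j = A_1 + (A_1^c \cap A_2) + \cdots + (A_1^c \cap \cdots \cap A_{n-1}^c \cap A_n) + \cdots,$$

$$\bigcup_{j=1}^{n} A_j = A_1 + (A_1^c \cap A_2) + \cdots + (A_1^c \cap \cdots \cap A_{n-1}^c \cap A_n),$$

(P3) and (C2), respectively, and (C4).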
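The idea behind (C7), writing a union as a sum of pairwise disjoint pieces and then bounding each piece by the event that contains it, can be sketched in Python. The helper `disjointify`, the ten-point sample space, and the three overlapping events are assumptions made for illustration.

```python
from fractions import Fraction


def disjointify(events):
    """Split a list of events into pairwise disjoint pieces with the same union.

    The j-th piece holds the points of the j-th event that lie in none of
    the earlier events.
    """
    pieces, seen = [], set()
    for A in events:
        pieces.append(set(A) - seen)
        seen |= set(A)
    return pieces


# Hypothetical overlapping events in S = {1, ..., 10} with the uniform probability.
n_points = 10
events = [{1, 2, 3, 4}, {3, 4, 5, 6}, {1, 6, 7}]
pieces = disjointify(events)

union = set().union(*events)
p_union = Fraction(len(union), n_points)
sum_pieces = sum(Fraction(len(B), n_points) for B in pieces)
sum_events = sum(Fraction(len(A), n_points) for A in events)

print(pieces)                  # [{1, 2, 3, 4}, {5, 6}, {7}]
print(p_union == sum_pieces)   # True: additivity over the disjoint pieces
print(p_union <= sum_events)   # True: subadditivity
```

Here the union has probability 7/10, while the probabilities of the three events add up to 11/10, so the inequality is strict.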

A special case of a probability space is the following: Let S = {s1, s2, . . . , sn}, let the class of events be the class of all subsets of S, and define P as P({sj})= 1/n, j= 1, 2, . . . , n. With this definition, P clearly satisfies (P1)–(P3′) and this is the classical definition of probability. Such a probability function is called a uniform probability function. This definition is adequate as long as S is finite and the simple events {sj}, j = 1, 2, . . . , n, may be assumed to be “equally likely,” but it breaks down if either of these two conditions is not satisfied.
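A uniform probability function is easy to compute directly, since the probability of an event is the number of its sample points divided by n. The following sketch assumes a standard 52-card deck as the sample space; the names `DECK`, `uniform_probability`, `hearts`, and `faces` are hypothetical.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = frozenset(product(RANKS, SUITS))  # sample space S with n = 52 points


def uniform_probability(event, sample_space=DECK):
    """Classical definition: number of points in the event divided by n."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


hearts = {card for card in DECK if card[1] == "hearts"}
faces = {card for card in DECK if card[0] in {"J", "Q", "K"}}

print(uniform_probability(hearts))          # 1/4
print(uniform_probability(hearts | faces))  # 11/26
```

The second value agrees with (C6): 13/52 + 12/52 − 3/52 = 22/52 = 11/26.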

However, this classical definition together with the following relative frequency (or statistical) definition of probability served as a motivation for arriving at the axioms (P1)–(P3) in the Kolmogorov definition of probability. The relative frequency definition of probability is this: Let S be any sample space, finite or not, supplied with a class of events A. A random experiment associated with the sample space S is carried out n times. Let n(A) be the number of times that the event A occurs. If, as n→ ∞, lim[n(A)/n] exists, it is called the probability of A, and is denoted by P(A). Clearly, this definition satisfies (P1), (P2) and (P3′).
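The relative frequency n(A)/n can be illustrated by simulation. The sketch below assumes a fair die simulated with Python's pseudo-random generator and takes A to be the event "an even number shows"; the function name `relative_frequency` and the fixed seed are choices made here, not part of the definition.

```python
import random


def relative_frequency(event, n, seed=0):
    """Return n(A)/n for n simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    hits = sum(1 for _ in range(n) if rng.randint(1, 6) in event)
    return hits / n


even = {2, 4, 6}
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(even, n))
```

As n grows, the printed ratios tend to settle near 1/2, the value given by the classical definition for this event. A simulation can only suggest, not prove, that the limit exists.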

Neither the classical definition nor the relative frequency definition of probability is adequate for a deep study of probability theory. The relative frequency definition of probability provides, however, an intuitively satisfactory interpretation of the concept of probability.

We now state and prove some general theorems about probability functions.

THEOREM 1 (Additive Theorem) For any finite number of events, we have

$$P\Big(\bigcup_{j=1}^{n} A_j\Big) = \sum_{j=1}^{n} P(A_j) - \sum_{1 \le j_1 < j_2 \le n} P(A_{j_1} \cap A_{j_2}) + \sum_{1 \le j_1 < j_2 < j_3 \le n} P(A_{j_1} \cap A_{j_2} \cap A_{j_3}) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n).$$

PROOF (By induction on n). For n = 1, the statement is trivial, and we have proven the case n = 2 as consequence (C6) of the definition of probability functions. Now assume the result to be true for n = k, and prove it for n = k + 1.

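We have

$$P\Big(\bigcup_{j=1}^{k+1} A_j\Big) = P\Big(\Big(\bigcup_{j=1}^{k} A_j\Big) \cup A_{k+1}\Big) = P\Big(\bigcup_{j=1}^{k} A_j\Big) + P(A_{k+1}) - P\Big(\Big(\bigcup_{j=1}^{k} A_j\Big) \cap A_{k+1}\Big). \tag{1}$$

But, by the induction hypothesis applied to the k events Aj ∩ Ak+1, j = 1, . . . , k,

$$P\Big(\Big(\bigcup_{j=1}^{k} A_j\Big) \cap A_{k+1}\Big) = P\Big(\bigcup_{j=1}^{k} (A_j \cap A_{k+1})\Big) = \sum_{j=1}^{k} P(A_j \cap A_{k+1}) - \sum_{1 \le j_1 < j_2 \le k} P(A_{j_1} \cap A_{j_2} \cap A_{k+1}) + \cdots + (-1)^{k+1} P(A_1 \cap \cdots \cap A_k \cap A_{k+1}).$$

Replacing this in (1), and using the induction hypothesis once more for the union of the first k events, we get

$$P\Big(\bigcup_{j=1}^{k+1} A_j\Big) = \sum_{j=1}^{k+1} P(A_j) - \sum_{1 \le j_1 < j_2 \le k+1} P(A_{j_1} \cap A_{j_2}) + \cdots + (-1)^{k+2} P(A_1 \cap \cdots \cap A_{k+1}). \ ▲$$

THEOREM 2 Let {An} be a sequence of events such that, as n → ∞, An ↑ or An ↓. Then

$$P\Big(\lim_{n\to\infty} A_n\Big) = \lim_{n\to\infty} P(A_n).$$

PROOF Let us first assume that An ↑. Then

$$\lim_{n\to\infty} A_n = \bigcup_{j=1}^{\infty} A_j = A_1 + (A_2 - A_1) + (A_3 - A_2) + \cdots,$$

by the assumption that An ↑. Hence

$$P\Big(\lim_{n\to\infty} A_n\Big) = P(A_1) + P(A_2 - A_1) + \cdots = \lim_{n\to\infty} \big[P(A_1) + P(A_2) - P(A_1) + \cdots + P(A_n) - P(A_{n-1})\big] = \lim_{n\to\infty} P(A_n).$$

Now let An ↓. Then An^c ↑, so that, by what has just been proved,

$$P\Big(\bigcup_{j=1}^{\infty} A_j^c\Big) = \lim_{n\to\infty} P(A_n^c), \quad\text{or}\quad 1 - P\Big(\bigcap_{j=1}^{\infty} A_j\Big) = 1 - \lim_{n\to\infty} P(A_n).$$

Hence

$$P\Big(\lim_{n\to\infty} A_n\Big) = P\Big(\bigcap_{j=1}^{\infty} A_j\Big) = \lim_{n\to\infty} P(A_n),$$

and the theorem is established. ▲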

This theorem will prove very useful in many parts of this book.
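The Additive Theorem (Theorem 1), commonly known as the inclusion-exclusion formula, can also be checked numerically. The sketch below assumes the uniform probability on S = {1, . . . , 30} and takes the three events "divisible by 2", "divisible by 3", and "divisible by 5"; the function name `inclusion_exclusion` and the example itself are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_exclusion(events, P):
    """Alternating sum over all intersections of 1, 2, ..., n of the events."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r + 1)
        for group in combinations(events, r):
            total += sign * P(frozenset.intersection(*group))
    return total


# Assumed example: uniform probability on S = {1, ..., 30}.
S = frozenset(range(1, 31))


def P(event):
    return Fraction(len(event), len(S))


events = [frozenset(s for s in S if s % d == 0) for d in (2, 3, 5)]
lhs = P(frozenset.union(*events))      # probability of the union, computed directly
rhs = inclusion_exclusion(events, P)   # alternating sum from the theorem
print(lhs, rhs)  # 11/15 11/15
```

The alternating sum here is (15 + 10 + 6 − 5 − 3 − 2 + 1)/30 = 22/30, which matches the direct count of the integers up to 30 divisible by 2, 3, or 5.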

Exercises

2.1.1 If the events Aj, j = 1, 2, 3 are such that A1 ⊂ A2 ⊂ A3 and P(A1) = 1/4, P(A2) = 5/12, P(A3) = 7/12, compute the probability of the following events:

A1^c ∩ A2, A1^c ∩ A3, A2^c ∩ A3, A1 ∩ A2^c ∩ A3^c, A1^c ∩ A2^c ∩ A3^c.

2.1.2 If two fair dice are rolled once, what is the probability that the total number of spots shown is

i) Equal to 5?

ii) Divisible by 3?

2.1.3 Twenty balls numbered from 1 to 20 are mixed in an urn and two balls are drawn successively and without replacement. If x1 and x2 are the numbers written on the first and second ball drawn, respectively, what is the probability that:


Compute P(A), P(B), P(C), where P is the equally likely probability function on the events of S.

2.1.5 Let S be the set of all outcomes when flipping a fair coin four times and let P be the uniform probability function on the events of S. Define the events A, B as follows:

$$A = \{s \in S;\ s \text{ contains more } T\text{'s than } H\text{'s}\},$$

$$B = \{s \in S;\ \text{any } T \text{ in } s \text{ precedes every } H \text{ in } s\}.$$

Compute the probabilities P(A), P(B).

2.1.6 Suppose that the events Aj, j = 1, 2, . . . are such that

$$\sum_{j=1}^{\infty} P(A_j) < \infty.$$

Use Definition 1 in Chapter 1 and Theorem 2 in this chapter in order to show that P(Ā) = 0.

2.1.7 Consider the events Aj, j = 1, 2, . . . and use Definition 1 in Chapter 1 and Theorem 2 herein in order to show that

$$P(\underline{A}) \le \liminf_{n\to\infty} P(A_n) \le \limsup_{n\to\infty} P(A_n) \le P(\bar{A}).$$