Experimental Uncertainty and Probability

(1)

Road Map

The meaning of experimental uncertainty

The fundamental concepts of probability

PHY310: Lecture 03

(2)

Experimental Uncertainty

Statements you might have heard

The temperature is

22 ± 1 °C

The electron charge is

(1.602176462 ± 0.000000063) × 10⁻¹⁹ C

The solar neutrino flux is

(2.35 ± 0.02 (stat) ± 0.08 (sys)) × 10⁶ cm⁻² s⁻¹

(3)

What Does the Uncertainty Mean?

The electron charge is (1.602176462 ± 0.000000063) × 10⁻¹⁹ C

We intuitively know what this means

The “best estimate” of the value is 1.602176462 × 10⁻¹⁹ C

There is some probability (usually 68%) that the true charge is between 1.602176399 × 10⁻¹⁹ C and 1.602176525 × 10⁻¹⁹ C.

This is a statement about our knowledge. Or is it?

Depends on the definition of probability

The True Value cannot be found statistically

This is fundamental:

There is a single true value.
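The bounds of an interval like this are simply the best estimate minus and plus the quoted uncertainty. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and not part of the lecture, and exact decimal arithmetic is used so the last digits are not disturbed by floating-point rounding.

```python
from decimal import Decimal

# Values in units of 1e-19 C, as quoted for the electron charge.
best_estimate = Decimal("1.602176462")
uncertainty = Decimal("0.000000063")

# The interval is [best estimate - uncertainty, best estimate + uncertainty].
lower = best_estimate - uncertainty
upper = best_estimate + uncertainty

print(f"interval: [{lower}, {upper}] x 1e-19 C")
```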

(4)

What Does Uncertainty Mean?

A thermometer with a random uncertainty of ±5 K is used to measure a true temperature T = 0.1 K

About half the measurements are like T = 1 ± 5 K

About half the measurements are like T = -1 ± 5 K

Is this reasonable?

You KNOW the temperature is greater than zero (2nd law of thermodynamics)

But, the thermometer does measure negative values.

Time to define a confidence interval, but it depends on the definition of probability

(5)

Classical Probability

Definition: Probability is the relative frequency of an event as the number of trials tends towards infinity.

This is an objective definition

There are some strange consequences

How do you assign uncertainty to an approximation?

What about systematic uncertainty? (Discussion later this semester)
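The classical definition can be illustrated by counting how often an event occurs as the number of trials grows. The sketch below is only an illustration under stated assumptions: the event is a simulated fair coin toss (probability 0.5), and the function name, the seed, and the use of Python's pseudo-random generator are choices made here, not part of the lecture.

```python
import random

def relative_frequency(n_trials, p_event=0.5, seed=310):
    """Fraction of n_trials in which an event of probability p_event occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_event)
    return hits / n_trials

# The relative frequency settles towards p_event as the number of trials grows.
for n in (10, 100, 10_000, 100_000):
    print(f"N = {n:>7,d}  relative frequency = {relative_frequency(n):.4f}")
```

Any finite run only approximates the limit, which is one reason the definition is awkward to apply to a single measured quantity.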

(6)

Confidence Interval

There is a single true value.

There is no such thing as a “classical probability” of a true value.

We don't (can't!) know that value.

Must construct a meaningful distribution to represent uncertainty as a probability:

A confidence interval is a member of a set of intervals which contain the true value with a given frequency

This means that T = -1 ± 5 K is a member of a set of intervals that contains the true value 68% of the time

In the usual sense, there is not a 68% chance that the true temperature is in the confidence interval

Consider T = -6 ± 5 K

The true temperature must be outside of this interval

You can't say that there is a 50% chance that T < -6 K

That is OK!
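The statement that a confidence interval belongs to a set of intervals, 68% of which contain the true value, can be checked by simulation. The sketch below assumes the thermometer's random error is Gaussian with a width of 5 K (the lecture does not state the distribution) and uses the true temperature T = 0.1 K from the earlier example; the function and variable names are illustrative.

```python
import random

def coverage(true_value=0.1, sigma=5.0, n_experiments=100_000, seed=310):
    """Simulate repeated measurements and report interval statistics.

    Each measurement m is drawn from a Gaussian centred on true_value with
    width sigma (an assumption) and reported as [m - sigma, m + sigma].
    Returns (fraction of intervals containing true_value,
             fraction of measurements below zero).
    """
    rng = random.Random(seed)
    n_covering = 0
    n_negative = 0
    for _ in range(n_experiments):
        m = rng.gauss(true_value, sigma)
        if m - sigma <= true_value <= m + sigma:
            n_covering += 1
        if m < 0.0:
            n_negative += 1
    return n_covering / n_experiments, n_negative / n_experiments

covered, negative = coverage()
print(f"intervals containing the true value: {covered:.3f}")
print(f"measurements below 0 K:              {negative:.3f}")
```

Roughly 68% of the simulated intervals contain the true value even though about half of the individual measurements are unphysical. The coverage is a property of the whole set, not of any single interval such as T = -6 ± 5 K.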

(7)

Confidence Intervals Reiterated

A 68% confidence interval is a member of a set of intervals of which 68% contain the true value.

A false statement: “There is a 68% chance that the true value is in the confidence interval”

A single confidence interval (one member of the set) doesn't necessarily tell you much about the rest of the set

If you are near a physical boundary, report the expected sensitivity of your experiment (e.g. T_sens = 5 K)

(8)

Confusing the Issue: Subjective/Bayesian/Modern Probability

Definition: Probability is a measure of the degree of belief that an event will occur.

More general than the classical definition of probability

In fact, the classical definition is a special case of the subjective definition

Matches the colloquial meaning of probability

More importantly, it matches our ideas about theoretical and systematic uncertainty

Examples:

T = 22 ± 1 °C means there is a 50% chance that T < 22 °C

If your thermometer measures T = -6 K there is a 50% chance that T < ~4 K

BUT

This introduces a subjective degree of belief

Considered EVIL by some physicists

(9)

Sorting it out

When you report an experimental result

Use a “Classical” confidence interval to report your measurements

Understand that your systematic uncertainty is “Subjective”

Understand that your theoretical uncertainty is “Subjective”

Keep the classical and subjective uncertainties separate

The solar neutrino flux is (2.35 ± 0.02 (stat) ± 0.08 (sys)) × 10⁶ cm⁻² s⁻¹

When you need to make a decision use a Subjective confidence interval

When you write a paper:

Include enough information so that the reader can calculate both types of confidence intervals

(10)

Notation for Probability

P(A): The probability of A

This is the probability that an event A will occur

P is not a function, it's just notation

When A is discrete, this is a single number (e.g. the probability of a coin toss)

When A is continuous, this is represented by a function

The probability of A satisfies

0 ≤ P(A) ≤ 1

P(not A) = 1 - P(A)

P(A|B): The conditional probability of A given B

This is the probability that an event A will occur given that B has occurred

P is not a function, it's just notation (&c)

(11)

Random Variables

When we talk about P(A), we implicitly assume that A is being drawn from a set of possible values (the “Parent Distribution”)

The value of P(A) is the ratio of the number of times A occurs in the parent distribution to the total number of elements in the parent distribution

Example:

Parent Distribution: {AAABBCCCCD}

P(A) = 3/10, P(B) = 1/5, P(C) = 2/5, P(D) = 1/10

Sum Rule: P(A) + P(B) + P(C) + P(D) = 1

This is the “Law of Total Probability”

If the set is continuous it's got an infinite number of elements

A variable with values drawn from a parent distribution is called a “Random Variable”

PA=Number of times A is in the set

Number of elements in the set

∑

i
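The counting definition can be applied directly to the example parent distribution {AAABBCCCCD}. The following is a minimal sketch; the variable names are illustrative, and exact fractions are used so the results match the values quoted in the example.

```python
from collections import Counter
from fractions import Fraction

parent = "AAABBCCCCD"  # the parent distribution from the example

# P(value) = (number of times value is in the set) / (number of elements)
counts = Counter(parent)
probabilities = {value: Fraction(n, len(parent)) for value, n in counts.items()}

for value, p in sorted(probabilities.items()):
    print(f"P({value}) = {p}")

# Sum rule: the probabilities of all possible values add up to one.
total = sum(probabilities.values())
print(f"sum = {total}")
```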

(12)

Describing Continuous Parent Distributions

Think about a random variable x drawn from a continuous parent distribution

We want to describe the probability that x₀ is between x and x+dx

P(x<x₀<x+dx) is the probability that x₀ is in the interval [x,x+dx]

This can be described using a “Probability Density Function”

P(x<x₀<x+dx) = f(x)dx

f(x) is called the “probability density function” or p.d.f.

The law of total probability gives the normalization

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

Sometimes it's useful to deal with the “Cumulative Distribution”

$$F(x) = \int_{-\infty}^{x} f(x')\,dx'$$

(13)

Multi-Dimensional P.D.F.s

A p.d.f. can depend on several parameters, for instance f(x,y)

The probability that a measurement is in both intervals [x,x+dx] and [y,y+dy] is P([x,x+dx],[y,y+dy]) = f(x,y)dxdy

Normalization

$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, f(x,y) = 1$$

Sometimes you need to know the probability that x is in the interval [x,x+dx], but y can be any value:

Marginalization

$$P([x,x+dx]) = f_x(x)\,dx = dx \int_{-\infty}^{\infty} dy\, f(x,y)$$

Conditional Probability of x given y

$$P([x,x+dx] \mid y) = f(x;y)\,dx = \frac{f(x,y)\,dx}{\int dx'\, f(x',y)} = \frac{f(x,y)\,dx}{f_y(y)}$$

[Figure: the (x, y) plane with the intervals [x, x+dx] and [y, y+dy] marked]

(14)

Bayes Theorem

This is the jackknife of probability theory

P(A) is the “prior probability” for A

P(B) is the probability that B will occur (the “normalization”)

P(A|B) is the “posterior probability” for A

You have to be extremely careful with Bayes Theorem if you are trying to have a classical probability

With subjective probability:

the prior probability is what we know about A before we start

the posterior probability is what we learned about A from our measurement

This is used a lot in information theory, robotics, control theory (engineering), artificial intelligence, &c

PA∣B=PB∣APA

(15)

Bayes Theorem Example

You go for a medical test and the result comes back positive

What you know

Only 0.1% of the population has the disease

So you can assume that P(disease) = 0.001 and P(no disease) = 0.999

Notice this is subjective: if you are at high risk, P(disease) might be higher!

The test is 98% efficient at detecting the disease

P(+|disease) = 0.98 and P(-|disease) = 0.02

The test has 3% false positives:

P(+|no disease) = 0.03 and P(-|no disease) = 0.97

What you want to know: What is the probability that you have the disease?

Pdisease∣+=P+∣diseasePdisease P+

Pdisease∣+= P+∣diseasePdisease