Road Map
The meaning of experimental uncertainty
The fundamental concepts of probability
PHY310: Lecture 03
Experimental Uncertainty and Probability
Experimental Uncertainty
Statements you might have heard
The temperature is
22 ± 1 °C
The electron charge is
(1.602176462 ± 0.000000063) × 10⁻¹⁹ C
The solar neutrino flux is
(2.35 ± 0.02 (stat) ± 0.08 (sys)) × 10⁶ cm⁻² s⁻¹
What Does the Uncertainty Mean?
The electron charge is (1.602176462 ± 0.000000063) × 10⁻¹⁹ C
We intuitively know what this means
The “best estimate” of the value is 1.602176462 × 10⁻¹⁹ C
There is some probability (usually 68%) that the true charge is between 1.602176399 × 10⁻¹⁹ C and 1.602176525 × 10⁻¹⁹ C.
This is a statement about our knowledge. Or is it?
Depends on the definition of probability
The True Value cannot be found statistically
This is fundamental:
There is a single true value.
What Does Uncertainty Mean?
A thermometer with a random uncertainty of ±5 K is used to measure a true temperature T = 0.1 K
About half the measurements are like T = 1 ± 5 K
About half the measurements are like T = -1 ± 5 K
Is this reasonable?
You KNOW the temperature is greater than zero (2nd law of thermodynamics)
But, the thermometer does measure negative values.
Time to define a confidence interval, but it depends on the definition of probability
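The thermometer example can be checked with a quick simulation. The sketch below assumes the random uncertainty is Gaussian with a standard deviation of 5 K (the slide does not specify the distribution) and counts how often a reading comes out negative. The variable names, the seed, and the number of trials are illustrative choices.

```python
import random

TRUE_T = 0.1        # true temperature in K (from the example)
SIGMA = 5.0         # random uncertainty in K, assumed to be Gaussian
N_TRIALS = 100_000  # number of simulated readings (arbitrary choice)

rng = random.Random(310)  # fixed seed so the run is repeatable
readings = [rng.gauss(TRUE_T, SIGMA) for _ in range(N_TRIALS)]

# Fraction of simulated readings that come out below zero
frac_negative = sum(r < 0 for r in readings) / N_TRIALS
print(f"fraction of negative readings: {frac_negative:.3f}")
```

Under these assumptions the fraction comes out close to one half, even though the true temperature is positive.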
Classical Probability
Definition: Probability is the relative frequency of an event as the number of trials tends towards infinity.
This is an objective definition
There are some strange consequences
How do you assign uncertainty to an approximation?
What about systematic uncertainty? (Discussion later this semester)
Confidence Interval
There is a single true value.
There is no such thing as a “classical probability” of a true value.
We don't (can't!) know that value.
Must construct a meaningful distribution to represent uncertainty as a probability:
A confidence interval is a member of a set of intervals which contain the true value with a given frequency
This means that T = -1 ± 5 K is a member of a set of intervals that contains the true value 68% of the time
In the usual sense, there is not a 68% chance that the true temperature is in the confidence interval
Consider T = -6 ± 5 K
The true temperature must be outside of this interval
You can't say that there is a 50% chance that T < -6 K
That is OK!
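The frequency statement behind a confidence interval can be illustrated by simulation. The sketch below again assumes Gaussian thermometer errors with a standard deviation of 5 K and a true temperature of 0.1 K, builds the interval "measured ± 5 K" for each simulated reading, and counts how many intervals contain the true value. It also counts intervals that lie entirely below zero, like T = -6 ± 5 K. All names and the sample size are illustrative.

```python
import random

TRUE_T = 0.1        # true temperature in K (from the thermometer example)
SIGMA = 5.0         # random uncertainty in K, assumed to be Gaussian
N_TRIALS = 100_000  # number of simulated measurements (arbitrary choice)

rng = random.Random(310)  # fixed seed so the run is repeatable
n_covered = 0      # intervals that contain the true value
n_unphysical = 0   # intervals that lie entirely below T = 0

for _ in range(N_TRIALS):
    measured = rng.gauss(TRUE_T, SIGMA)
    low, high = measured - SIGMA, measured + SIGMA
    if low <= TRUE_T <= high:
        n_covered += 1
    if high < 0:
        n_unphysical += 1

coverage = n_covered / N_TRIALS
frac_unphysical = n_unphysical / N_TRIALS
print(f"intervals containing the true value: {coverage:.3f}")
print(f"intervals entirely below zero:       {frac_unphysical:.3f}")
```

In this sketch roughly 68% of the intervals contain the true value, while a noticeable minority cover only negative temperatures. Each of those is still a valid member of the set.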
Confidence Intervals Reiterated
A 68% confidence interval is a member of a set of intervals of which 68% contain the true value.
A false statement: “There is a 68% chance that the true value is in the confidence interval”
A single confidence interval (one member of the set) doesn't necessarily tell you much about the rest of the set
If you are near a physical boundary, report the expected sensitivity of your experiment (e.g. Tsens = 5 K)
Confusing the Issue:
Subjective/Bayesian/Modern Probability
Definition: Probability is a measure of the degree of belief that an event will occur.
More general than the classical definition of probability
In fact, the classical definition is a special case of the subjective definition
Matches the colloquial meaning of probability
More importantly matches our ideas about theoretical and systematic uncertainty
Examples:
T = 22 ± 1 °C means there is a 50% chance that T < 22 °C
If your thermometer measures T = -6 K there is a 50% chance that T < ~4 K
BUT
This introduces a subjective degree of belief
Considered EVIL by some physicists
Sorting it out
When you report an experimental result
Use a “Classical” confidence interval to report your measurements
Understand that your systematic uncertainty is “Subjective”
Understand that your theoretical uncertainty is “Subjective”
Keep the classical and subjective uncertainties separate
The solar neutrino flux is (2.35 ± 0.02 (stat) ± 0.08 (sys)) × 10⁶ cm⁻² s⁻¹
When you need to make a decision use a Subjective confidence interval
When you write a paper:
Include enough information so that the reader can calculate both types of confidence intervals
Notation for Probability
P(A): The probability of A
This is the probability that an event A will occur
P is not a function, it's just notation
When A is discrete, this is a single number (e.g. the probability of a coin toss)
When A is continuous, this is represented by a function
The probability of A satisfies
0 ≤ P(A) ≤ 1
P(not A) = 1- P(A)
P(A|B): The conditional probability of A given B
This is the probability that an event A will occur given that B has occurred
P is not a function, it's just notation (&c)
Random Variables
When we talk about P(A), we implicitly assume that A is being drawn from a set of possible values (the “Parent Distribution”)
The value of P(A) is the ratio of the number of times A occurs in the parent distribution to the total number of elements in the parent distribution
Example:
Parent Distribution: {AAABBCCCCD}
P(A) = 3/10, P(B) = 1/5, P(C) = 2/5, P(D) = 1/10
Sum Rule: P(A) + P(B) + P(C) + P(D) = 1
This is the “Law of Total Probability”
If the set is continuous it's got an infinite number of elements
A variable with values drawn from a parent distribution is called a “Random Variable”
$$P(A) = \frac{\text{Number of times } A \text{ is in the set}}{\text{Number of elements in the set}}$$
$$\sum_i P(A_i) = 1$$
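The counting definition can be worked through for the example parent distribution {AAABBCCCCD}. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Parent distribution from the example, one character per element
parent = "AAABBCCCCD"

# P(value) = (times the value appears) / (number of elements in the set)
counts = Counter(parent)
probs = {value: Fraction(n, len(parent)) for value, n in counts.items()}

for value in sorted(probs):
    print(f"P({value}) = {probs[value]}")

# Sum rule: the probabilities of all possible values add up to 1
total = sum(probs.values())
print(f"sum = {total}")

# Complement rule: P(not A) = 1 - P(A)
p_not_a = 1 - probs["A"]
print(f"P(not A) = {p_not_a}")
```

Using `Fraction` keeps the results exact, so the printed values match the fractions quoted in the example.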
Describing Continuous Parent Distributions
Think about a random variable x drawn from a continuous parent distribution
We want to describe the probability that x0 is between x and x+dx
P(x<x0<x+dx) is the probability that x0 is in the interval [x,x+dx]
This can be described using a “Probability Density Function”
P(x<x0<x+dx) = f(x)dx
f(x) is called the “probability density function” or p.d.f.
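The relation P(x<x0<x+dx) = f(x)dx can be illustrated numerically. The sketch below assumes a standard Gaussian as the parent distribution (an arbitrary choice for illustration), draws many values of x0, and compares the fraction that land in [x, x+dx] with f(x)dx. The interval and sample size are also illustrative.

```python
import math
import random

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a Gaussian parent distribution (assumed example)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

X, DX = 0.5, 0.1    # interval [x, x+dx], chosen for illustration
N_DRAWS = 200_000   # number of random draws (arbitrary choice)

rng = random.Random(310)  # fixed seed so the run is repeatable
draws = (rng.gauss(0.0, 1.0) for _ in range(N_DRAWS))

# Fraction of draws x0 that fall inside the interval
frac_in_interval = sum(X < x0 < X + DX for x0 in draws) / N_DRAWS

# Prediction from the probability density function
pdf_estimate = gaussian_pdf(X) * DX

print(f"fraction of draws in interval: {frac_in_interval:.4f}")
print(f"f(x) dx:                       {pdf_estimate:.4f}")
```

The two numbers agree only approximately, both because the sample is finite and because f(x)dx is exact only in the limit of small dx.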
The law of total probability gives the normalization
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
Sometimes it's useful to deal with the “Cumulative Distribution”
$$F(x) = \int_{-\infty}^{x} f(x')\,dx'$$
Multi-Dimensional P.D.F.s
A p.d.f. can depend on several parameters, for instance
f(x,y)
The probability that a measurement is in both intervals [x,x+dx] and [y,y+dy] is P([x,x+dx],[y,y+dy]) = f(x,y)dxdy
Sometimes you need to know the probability that x is in the interval [x,x+dx], but y can be any value:
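Normalization
$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, f(x,y) = 1$$
Marginalization
$$P([x,x+dx]) = f_x(x)\,dx = dx \int_{-\infty}^{\infty} dy\, f(x,y)$$
Conditional Probability of x given y
$$P([x,x+dx] \mid y) = f(x;y)\,dx = \frac{f(x,y)\,dx}{\int dx'\, f(x',y)} = \frac{f(x,y)\,dx}{f_y(y)}$$
[Figure: x and y axes with the intervals [x, x+dx] and [y, y+dy] marked]
Bayes Theorem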
This is the jackknife of probability theory
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
P(A) is the “prior probability” for A
P(B) is the probability that B will occur (the “normalization”)
P(A|B) is the “posterior probability” for A
You have to be extremely careful with Bayes Theorem if you are trying to have a classical probability
With subjective probability:
the prior probability is what we know about A before we start
the posterior probability is what we learned about A from our measurement
This is used a lot in information theory, robotics, control theory (engineering), artificial intelligence, &c
Bayes Theorem Example
You go for a medical test and the result comes back positive
What you know
Only 0.1% of the population has the disease
So you can assume that P(disease) = 0.001 and P(no disease) = 0.999
Notice this is subjective: if you are at high risk, P(disease) might be higher!
The test is 98% efficient to detect the disease
P(+|disease) = 0.98 and P(- |disease) = 0.02
The test has 3% false positives:
P(+|no disease) = 0.03 and P(- |no disease) = 0.97
What you want to know: What is the probability that you have the disease?
$$P(\text{disease} \mid +) = \frac{P(+ \mid \text{disease})\,P(\text{disease})}{P(+)}$$
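$$P(\text{disease} \mid +) = \frac{P(+ \mid \text{disease})\,P(\text{disease})}{P(+ \mid \text{disease})\,P(\text{disease}) + P(+ \mid \text{no disease})\,P(\text{no disease})}$$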
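The numbers in this example can be put through Bayes Theorem directly. The sketch below assumes P(+) is obtained from the law of total probability, summing over the "disease" and "no disease" cases. The variable names are illustrative.

```python
# Inputs quoted in the example
p_disease = 0.001               # prior: fraction of the population with the disease
p_no_disease = 1.0 - p_disease  # prior: fraction without the disease
p_pos_given_disease = 0.98      # test efficiency
p_pos_given_no_disease = 0.03   # false positive rate

# Normalization P(+), assuming the law of total probability over both cases
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_no_disease * p_no_disease)

# Bayes Theorem: posterior probability of disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(+)         = {p_pos:.5f}")
print(f"P(disease|+) = {p_disease_given_pos:.4f}")
```

With these inputs the posterior comes out at about 3%: the small prior probability outweighs the high efficiency of the test. A higher prior, for example for someone at high risk, would raise the result.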