Part A: Probability, Statistics, Stochastic Signal
2.3. Sampling and Estimation
This section deals with sampling and estimation, a subject essential to the simulation and measurement of jitter, noise, and signal integrity. An inappropriate sampling or estimation method causes inaccuracy in the subsequent measurement or analysis and can lead to inaccurate or even wrong conclusions. It is necessary to understand the relevant basic math before we get into the detailed applications.
2.3.1. Sample Estimators and Convergence
In our discussion of the probability distribution, we either explicitly or implicitly assumed that we have knowledge of its distribution function, and the estimations are drawn or derived from those known distribution functions. This is also called population distribution-based statistics. In practical applications, however, what we face is different. Often what we have are values of a statistical distribution or process obtained through a number of experiments. We try to learn or derive the distribution function and estimators of the underlying statistical distribution through these sampling experiments. In other words, we are trying to learn or draw conclusions about the population statistics from the sampling statistics. This subsection first introduces the estimation methods for mean and variance based on statistical samples, rather than on a known probability distribution. Then we will introduce the law of large numbers and answer the question of when an estimation based on samples approaches that of the distribution. After that we will introduce the central limit theorem, which deals with the sampling distribution when the samples are drawn from various statistically independent processes.
2.3.1.1. Mean, Standard Deviation, and Peak-to-Peak Estimations
Suppose that x1, x2, ... xN are experimental values for a random variable x in N independent and identical experiments. With those values, we can define the statistical estimators. The mean that describes a central tendency for xi is given by
Equation 2.71
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
The sample mean gives an estimate of the sample's "clustering" value but offers little information on the deviation from this central tendency. Another statistical estimator, which offers information on the deviation from the central tendency, is called the sample variance. It is defined as follows, where x̄ denotes the sample mean of equation 2.71:
Equation 2.72
$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$
The standard deviation is simply the square root of the variance:
Equation 2.73
$$s = \sqrt{s^2}$$
Sometimes a sample estimator called range or peak-to-peak is also used. Caution must be exercised when a sample range is used. It can easily give an unstable and inaccurate estimation because it uses only the two extreme values from the N experiments and is therefore the least constrained of the three estimators introduced here.
The statistical range or peak-to-peak is defined as follows:
Equation 2.74
$$\mathrm{Range} = \mathrm{Max}(x_i) - \mathrm{Min}(x_i)$$
where Max(xi) gives the maximum value and Min(xi) gives the minimum value among those N samples. It is clear that whether the estimator can converge to the value for the population statistical estimator depends on both the sample size and the nature of the population distribution. For example, if the population distribution is unbounded, such as a Gaussian, both the mean and standard deviation converge for a large sample size, but the peak-to-peak value does not.
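The three estimators above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the data set is made up for the example, and the variance uses the N − 1 normalization, which is an assumption (some texts divide by N instead).

```python
import math

def sample_mean(xs):
    """Sample mean: the sum of the values divided by N."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance, assuming the N - 1 (unbiased) normalization."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

def peak_to_peak(xs):
    """Range (peak-to-peak): maximum minus minimum of the samples."""
    return max(xs) - min(xs)

# Made-up example data, used only for illustration.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(samples))      # 5.0
print(sample_variance(samples))  # 32/7, about 4.571
print(sample_std(samples))
print(peak_to_peak(samples))     # 7.0
```

Note that the peak-to-peak value depends only on the two extreme samples (2.0 and 9.0 here), which is why it is the least stable of the three.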
2.3.1.2. Law of Large Numbers
Section 2.2.4 discussed the Chebyshev Inequality. We will use it to derive another useful theorem called the law of large numbers, which is helpful in answering the question of when the sampling mean will converge to the population mean.
Before deriving the law of large numbers, we would like to define convergence. Suppose y1, y2, ... yN is a sequence of random variables. If for any positive value of ε we always have
Equation 2.75
$$\lim_{N\to\infty} P\left(\left|y_N - c\right| < \varepsilon\right) = 1$$
we say that yN converges in probability to a constant c.
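Recall the Chebyshev Inequality of equation 2.57. In the current context, the random variable becomes the sample mean
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
With µ and σ² denoting the population mean and variance of x, and using the independence of the xi, the statistical expectation and variance of the sample mean are calculated as
Equation 2.76
$$E[\bar{x}] = \frac{1}{N}\sum_{i=1}^{N} E[x_i] = \mu$$
and
Equation 2.77
$$\mathrm{Var}[\bar{x}] = \frac{1}{N^2}\sum_{i=1}^{N} \mathrm{Var}[x_i] = \frac{\sigma^2}{N}$$
Substituting equations 2.76 and 2.77 into equation 2.57 gives us
Equation 2.78
$$P\left(\left|\bar{x} - \mu\right| < \varepsilon\right) \ge 1 - \frac{\sigma^2}{N\varepsilon^2}$$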
If we let N → ∞, the preceding probability goes to 1 because it cannot be larger than 1:
Equation 2.79
$$\lim_{N\to\infty} P\left(\left|\bar{x} - \mu\right| < \varepsilon\right) = 1$$
Equation 2.79 suggests that as the sample size keeps increasing, the sampling mean approaches the population mean. As the sample size becomes very large, the population mean can be well represented by the sample mean.
The law of large numbers has a direct implication for daily practice. The sample mean converges to the population mean only as the sample size becomes very large. In other words, if the sample size is not large enough, an error exists between the sample mean and the population mean. In practice, a sufficient sample size can be identified as the threshold beyond which the sample mean stops changing appreciably as more samples are added.
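The convergence described here can be illustrated with a small simulation. The sketch below assumes a Gaussian population with µ = 0 and σ = 1 and a tolerance ε = 0.2 (all chosen only for illustration), and compares the Chebyshev-based lower bound 1 − σ²/(Nε²) with the fraction of simulated trials in which the sample mean actually falls within ε of µ. The function names are hypothetical.

```python
import random

def chebyshev_lower_bound(n, sigma, eps):
    """Chebyshev lower bound on P(|sample mean - mu| < eps), clipped at 0."""
    return max(0.0, 1.0 - sigma ** 2 / (n * eps ** 2))

def empirical_probability(n, mu=0.0, sigma=1.0, eps=0.2, trials=200, seed=1):
    """Fraction of trials whose n-sample Gaussian mean lies within eps of mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(mean - mu) < eps:
            hits += 1
    return hits / trials

# Columns: sample size, Chebyshev lower bound, simulated probability.
for n in (10, 100, 1000):
    print(n, chebyshev_lower_bound(n, 1.0, 0.2), empirical_probability(n))
```

The bound is loose for small N (it is clipped to 0 at N = 10 in this setup), but both columns move toward 1 as N grows, which is the behavior the law of large numbers describes.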
2.3.1.3. Convergence of the Estimator
We have shown that as the sample size becomes very large, the sample mean approaches the population mean with a probability that tends to 1.
For the sample standard deviation and peak-to-peak, a similar theorem can be drawn. However, the speed of convergence is much slower than that of the mean estimator, particularly for peak-to-peak when the random variable distribution is unbounded.
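The different convergence behavior of the estimators can be seen in a quick experiment. The sketch below assumes a standard Gaussian population (µ = 0, σ = 1), draws samples of increasing size, and reports the sample mean, the sample standard deviation, and the peak-to-peak value. The function name and the chosen sample sizes are illustrative assumptions.

```python
import random
import statistics

def estimators_vs_sample_size(sizes, seed=7):
    """Return (N, mean, std, peak-to-peak) for Gaussian samples of each size."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rows = []
    for n in sizes:
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rows.append((n, statistics.fmean(xs), statistics.stdev(xs),
                     max(xs) - min(xs)))
    return rows

for n, mean, std, pkpk in estimators_vs_sample_size((100, 10_000, 100_000)):
    print(f"N={n:>7}  mean={mean:+.4f}  std={std:.4f}  pk-pk={pkpk:.3f}")
```

With an unbounded population such as this one, the mean and standard deviation settle near 0 and 1, whereas the peak-to-peak value keeps growing as more samples are collected.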
2.3.2. Central Limit Theorem
In practical applications, we are often confronted with the sum of many independent random variables and need to know what its distribution function will be.
Suppose that x1, x2, ..., xN is a sequence of independent random variables that share the same probability distribution, so that each has the same mean µ and standard deviation σ. The central limit theorem says that the distribution of the sum variable SN = x1 + x2 + ... + xN approaches a Gaussian or normal distribution as N becomes very large, regardless of the original distribution of the random variables, as long as it has a finite mean and a finite standard deviation (or variance).
The proof of the central limit theorem can be fairly straightforward if we introduce the characteristic function for a distribution function.
For a distribution function of p(x), its corresponding characteristic function is defined as follows:
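Equation 2.80
$$\Phi(\omega) = E\left[e^{j\omega x}\right] = \int_{-\infty}^{\infty} p(x)\, e^{j\omega x}\, dx$$
where j is the imaginary unit.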
This transformation from the PDF p(x) to the characteristic function Φ(ω) is in fact a Fourier transformation, so all the properties of a Fourier pair apply here. It can be shown that if p(x) = exp(−x²/2)/√(2π), we will have
Equation 2.81
$$\Phi(\omega) = \exp\left(-\frac{\omega^2}{2}\right)$$
This implies that the characteristic function for a Gaussian (or normal) is a Gaussian (or normal) distribution in a dual domain.
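The Gaussian-to-Gaussian property can be checked numerically. The sketch below assumes the convention Φ(ω) = ∫ p(x) exp(jωx) dx with a normalized standard Gaussian PDF, and approximates the integral with a simple midpoint rule over a truncated interval. The function names, integration limits, and step count are assumptions made for this illustration.

```python
import cmath
import math

def gaussian_pdf(x):
    """Standard normal PDF (zero mean, unit variance)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def characteristic_function(pdf, omega, lo=-12.0, hi=12.0, steps=4000):
    """Midpoint-rule approximation of the integral of pdf(x) * exp(j*omega*x)."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += pdf(x) * cmath.exp(1j * omega * x)
    return total * dx

# Compare the numerical result with exp(-omega^2 / 2).
for omega in (0.0, 0.5, 1.0, 2.0):
    numeric = characteristic_function(gaussian_pdf, omega)
    print(omega, numeric.real, math.exp(-omega ** 2 / 2.0))
```

The imaginary part is negligible because the PDF is symmetric about zero, and the real part tracks exp(−ω²/2).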
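Let us consider the normalized SN variable
Equation 2.82
$$z_N = \frac{S_N - N\mu}{\sigma\sqrt{N}} = \frac{1}{\sqrt{N}}\sum_{i=1}^{N}\frac{x_i - \mu}{\sigma}$$
Its characteristic function, with j denoting the imaginary unit, is as follows:
Equation 2.83
$$\Phi_{z_N}(\omega) = E\left[e^{j\omega z_N}\right] = E\left[\prod_{i=1}^{N}\exp\left(\frac{j\omega (x_i - \mu)}{\sigma\sqrt{N}}\right)\right] = \prod_{i=1}^{N} E\left[\exp\left(\frac{j\omega (x_i - \mu)}{\sigma\sqrt{N}}\right)\right] = \left\{E\left[\exp\left(\frac{j\omega (x - \mu)}{\sigma\sqrt{N}}\right)\right]\right\}^{N}$$
We have used the independence and identical-distribution properties in the last two steps of equation 2.83. The exponential term inside the expectation brackets can be calculated by using the following Taylor expansion:
Equation 2.84
$$\exp\left(\frac{j\omega (x - \mu)}{\sigma\sqrt{N}}\right) = 1 + \frac{j\omega (x - \mu)}{\sigma\sqrt{N}} - \frac{\omega^2 (x - \mu)^2}{2\sigma^2 N} + \cdots$$
Taking the expectation term by term, with E[x − µ] = 0 and E[(x − µ)²] = σ², equations 2.83 and 2.84 give the characteristic function for the normalized sum as follows:
Equation 2.85
$$\Phi_{z_N}(\omega) = \left[1 - \frac{\omega^2}{2N} + o\left(\frac{1}{N}\right)\right]^{N} \to \exp\left(-\frac{\omega^2}{2}\right) \quad \text{as } N \to \infty$$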
Comparing equation 2.81 to equation 2.85 indicates that, in the limit of large N, the normalized sum has the same characteristic function as that of a Gaussian or normal; therefore, its probability distribution function must also approach a Gaussian or normal.
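The limit behind this comparison, [1 − ω²/(2N)]^N → exp(−ω²/2) as N grows, can be checked with a few lines of Python. This sketch keeps only the leading-order term of the expansion and drops the higher-order corrections, which is an assumption of the illustration. The function name is hypothetical.

```python
import math

def finite_n_characteristic(omega, n):
    """Leading-order form [1 - omega^2 / (2n)]^n, higher-order terms dropped."""
    return (1.0 - omega ** 2 / (2.0 * n)) ** n

omega = 1.0
# Columns: N, finite-N value, Gaussian limit exp(-omega^2 / 2).
for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, finite_n_characteristic(omega, n), math.exp(-omega ** 2 / 2.0))
```

The gap between the two columns shrinks roughly in proportion to 1/N.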
The central limit theorem has a wide range of applications in solving practical problems. For example, the random noise observed in electronics is Gaussian because it is composed of many independent random noise events, so that at a macroscopic level it always appears Gaussian. The same is true of random jitter, because it can be caused by many independent jitter events, with forming mechanisms such as random noise-to-jitter conversion through the finite slew rate of the edge transition, amplitude-to-phase conversion, and random frequency or phase modulation.
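The theorem can be illustrated with a short simulation. The sketch below assumes each contribution is uniform on (0, 1), a clearly non-Gaussian choice made only for illustration, normalizes the sum of n such variables, and measures the fraction of outcomes within one standard deviation of the mean. For a Gaussian this fraction is about 0.683, whereas for a single uniform variable it is about 0.577. The function names and trial counts are hypothetical.

```python
import math
import random

def normalized_sum(n, rng):
    """Normalized sum (S_n - n*mu) / (sigma*sqrt(n)) of n uniform(0, 1) draws."""
    mu = 0.5
    sigma = math.sqrt(1.0 / 12.0)  # standard deviation of uniform(0, 1)
    s = sum(rng.random() for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

def fraction_within_one_sigma(n, trials=20000, seed=3):
    """Fraction of normalized sums with absolute value below 1."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if abs(normalized_sum(n, rng)) < 1.0)
    return hits / trials

for n in (1, 2, 5, 30):
    print(n, fraction_within_one_sigma(n))
```

As n increases, the printed fraction moves from the uniform value toward the Gaussian value, even though no individual contribution is Gaussian.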
Book: Jitter, Noise, and Signal Integrity at High-Speed
Section: Chapter 2. Statistical Signal and Linear Theory for Jitter, Noise, and Signal Integrity
Information Theory Computer Science Mike Peng Li Prentice Hall Jitter, Noise, and Signal Integrity at High-Speed