SAMPLING THEORY AND FINITE DATA CONSIDERATIONS

Basic Concepts


 0.5020  0.0000  0.0000  0.0000  0.0000 -0.4474
 0.0000  2.0078 -0.0000 -0.0000  1.9172 -0.0137
 0.0000 -0.0000  1.1294  0.0000 -0.0000  0.9586
 0.0000 -0.0000  0.0000  4.5176 -2.0545 -0.0206
 0.0000  1.9172 -0.0000 -2.0545  2.8548  0.0036
-0.4474 -0.0137  0.9586 -0.0206  0.0036  1.5372

In the covariance matrix, the diagonals, which give the variance of the six signals, vary since the amplitudes of the signals are different. The covariance between the first four signals is zero, demonstrating the orthogonality of these signals. The correlation between the 5th and 6th signals and the other sinusoids can be best observed from the correlation matrix:

Rxx =

 1.0000  0.0000  0.0000  0.0000  0.0000 -0.5093
 0.0000  1.0000 -0.0000 -0.0000  0.8008 -0.0078
 0.0000 -0.0000  1.0000  0.0000 -0.0000  0.7275
 0.0000 -0.0000  0.0000  1.0000 -0.5721 -0.0078
 0.0000  0.8008 -0.0000 -0.5721  1.0000  0.0017
-0.5093 -0.0078  0.7275 -0.0078  0.0017  1.0000

In the correlation matrix, the correlation of each signal with itself is, of course, 1.0. The 1.5 Hz sine (the 5th column of the data matrix) shows good correlation with the 1.0 and 2.0 Hz cosines (2nd and 4th rows) but not with the other sinusoids, while the 1.5 Hz cosine (the 6th column) shows the opposite. Hence, sinusoids that are not harmonically related are not orthogonal and do show some correlation.
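The relationship between the two matrices can be checked numerically. The following is a minimal Python sketch (not the text's own MATLAB code) that normalizes each covariance by the two standard deviations involved; the function name covariance_to_correlation is an illustrative assumption, and the matrix values are copied from the covariance matrix printed above.

```python
import math

# Covariance matrix of the six signals, copied from the text.
S = [
    [0.5020, 0.0000, 0.0000, 0.0000, 0.0000, -0.4474],
    [0.0000, 2.0078, -0.0000, -0.0000, 1.9172, -0.0137],
    [0.0000, -0.0000, 1.1294, 0.0000, -0.0000, 0.9586],
    [0.0000, -0.0000, 0.0000, 4.5176, -2.0545, -0.0206],
    [0.0000, 1.9172, -0.0000, -2.0545, 2.8548, 0.0036],
    [-0.4474, -0.0137, 0.9586, -0.0206, 0.0036, 1.5372],
]


def covariance_to_correlation(cov):
    """Divide each covariance by the product of the two standard deviations.

    Hypothetical helper name, used here for illustration only.
    """
    size = len(cov)
    std = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(size)]
            for i in range(size)]


Rxx = covariance_to_correlation(S)
for row in Rxx:
    print(" ".join(f"{value:8.4f}" for value in row))
```

Because the normalization removes the effect of signal amplitude, the diagonal becomes 1.0 and the off-diagonal entries can be compared directly across signals.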

SAMPLING THEORY AND FINITE DATA CONSIDERATIONS

To convert an analog waveform into a digitized version residing in memory requires two operations: sampling the waveform at discrete points in time,* and, if the waveform is longer than the computer memory, isolating a segment of the analog waveform for the conversion. The waveform segmentation operation is windowing as mentioned previously, and the consequences of this operation are discussed in the next chapter. If the purpose of sampling is to produce a digitized copy of the original waveform, then the critical issue is how well this copy represents the original. Stated another way, can the original be reconstructed from the digitized copy? If so, then the copy is clearly adequate. The answer to this question depends on the frequency at which the analog waveform is sampled relative to the frequencies that it contains.

*As described in Chapter 1, this operation involves both time slicing, termed sampling, and amplitude slicing, termed quantization.

The question of what sampling frequency should be used can best be addressed by assuming a simple waveform, a single sinusoid.* All finite, continuous waveforms can be represented by a series of sinusoids (possibly an infinite series), so if we can determine the appropriate sampling frequency for a single sinusoid, we have also solved the more general problem. The “Shannon Sampling Theorem” states that any sinusoidal waveform can be uniquely reconstructed provided it is sampled more than twice in one period. (Equally spaced samples are assumed.) That is, the sampling frequency, fs, must be > 2fsinusoid. In other words, just over two equally spaced samples per cycle are enough to uniquely specify a sinusoid, and these can be taken anywhere over the cycle. (At exactly two samples per cycle the result depends on where the samples fall: if they land on the zero crossings, every sample is zero.) Extending this to a general analog waveform, Shannon’s Sampling Theorem states that a continuous waveform can be reconstructed without loss of information provided the sampling frequency is greater than twice the highest frequency in the analog waveform:

fs > 2fmax    (22)

As mentioned in Chapter 1, in practical situations, fmax is usually taken as the highest frequency in the analog waveform at which more than a negligible amount of energy exists.

The sampling process is equivalent to multiplying the analog waveform by a repeating series of short pulses. This repeating series of short pulses is sometimes referred to as the sampling function. Recall that the ideal short pulse is called the impulse function, δ(t). In theory, the impulse function is infinitely short, but is also infinitely tall, so that its total area equals 1. (This must be justified using limits, but any pulse that is very short compared to the dynamics of the sampled waveform will do. Recall that the width of the sampling pulse produced in most modern analog-to-digital converters, termed the aperture time, is typically less than 100 nsec.) The sampling function can be stated mathematically using the impulse function:

Samp(n) = \sum_{k=-\infty}^{\infty} \delta(n - kT_s)    (23)

where Ts is the sample interval and equals 1/fs.

For an analog waveform, x(t), the sampled version, x(n), is given by multiplying x(t) by the sampling function in Eq. (23):

x(n) = \sum_{k=-\infty}^{\infty} x(nT_s)\,\delta(n - kT_s)    (24)

*A sinusoid has a straightforward frequency domain representation: only a single complex point at the frequency of the sinusoid. Classical methods of frequency analysis described in the next chapter make use of this fact.
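Sampling as multiplication by a pulse train can be imitated numerically. The sketch below is in Python rather than MATLAB and rests on stated assumptions: a dense 1000-point-per-second grid stands in for continuous time, a 2 Hz sinusoid stands in for x(t), and a train of unit-height pulses stands in for the impulse train. The names analog, fs_dense, and samp are illustrative only.

```python
import math


def analog(t):
    """Stand-in for the analog waveform x(t): a 2 Hz sinusoid (assumed)."""
    return math.sin(2 * math.pi * 2.0 * t)


fs_dense = 1000          # dense grid playing the role of continuous time (Hz)
fs = 50                  # sampling frequency (Hz), so Ts = 1/fs = 0.02 sec
step = fs_dense // fs    # dense-grid points per sample interval Ts

# One second of the "analog" waveform on the dense grid.
x_dense = [analog(i / fs_dense) for i in range(fs_dense)]

# Sampling function: 1 at multiples of Ts and 0 elsewhere.
samp = [1 if i % step == 0 else 0 for i in range(fs_dense)]

# Multiply the waveform by the sampling function, point by point.
product = [x * s for x, s in zip(x_dense, samp)]

# The sampled sequence x(n) is the product read at the pulse positions.
x_n = [product[i] for i in range(0, fs_dense, step)]

print(len(x_n), x_n[:4])
```

Everything between the pulses is multiplied by zero, so only the values x(nTs) survive, which is what the sum over shifted impulses expresses.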

The frequency spectrum of the sampling process represented by Eq. (24) can be determined by taking advantage of the fact that multiplication in the time domain is equivalent to convolution in the frequency domain (and vice versa). Hence, the frequency characteristic of a sampled waveform is just the convolution of the analog waveform spectrum with the sampling function spectrum.

Figure 2.9A shows the spectrum of a sampling function having a sample interval of Ts (a repetition rate of fs = 1/Ts), and Figure 2.9B shows the spectrum of a hypothetical signal that has a well-defined maximum frequency, fmax. Figure 2.9C shows the spectrum of the sampled waveform assuming fs = 1/Ts ≥ 2fmax. Note that the frequency characteristic of the sampled waveform is the same as the original for the lower frequencies, but the sampled spectrum now has a repetition of the original spectrum reflected on either side of fs and at multiples of fs. Nonetheless, it would be possible to recover the original spectrum simply by filtering the sampled data with an ideal lowpass filter with a bandwidth > fmax (but below fs − fmax, where the first reflection begins) as shown in Figure 2.9E.

Figure 2.9D shows the spectrum that results if the waveform were sampled at fs < 2fmax, in this case fs = 1.5fmax. Note that the reflected portion of the spectrum has become intermixed with the original spectrum, and no filter can unmix them.* When fs < 2fmax, the sampled data suffer from spectral overlap, better known as aliasing. The sampled data no longer provide a unique representation of the analog waveform, and recovery is not possible.
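The loss of uniqueness under aliasing is easy to confirm for a single sinusoid. The following Python sketch assumes a 4 Hz cosine sampled at fs = 6 Hz (that is, fs = 1.5 times the signal frequency, as in Figure 2.9D) and compares it with a 2 Hz cosine sampled at the same rate; the function name sample_cosine and the specific frequencies are illustrative choices.

```python
import math


def sample_cosine(freq_hz, fs_hz, n_samples):
    """Return n_samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz)
            for n in range(n_samples)]


fs = 6.0                               # sampling frequency (Hz)
high = sample_cosine(4.0, fs, 24)      # 4 Hz: above fs/2 = 3 Hz, so aliased
low = sample_cosine(fs - 4.0, fs, 24)  # 2 Hz: the reflected (alias) frequency

# Largest sample-by-sample difference between the two sequences.
max_diff = max(abs(a - b) for a, b in zip(high, low))
print(max_diff)
```

The two sample sequences agree to within rounding error, so once the data are sampled there is no way to tell whether the original waveform was at 4 Hz or at 2 Hz.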

When correctly sampled, the original spectrum can be recovered by applying an ideal lowpass filter (digital filter) to the digitized data. In Chapter 4, we show that an ideal lowpass filter has an impulse response given by:

h(n) = \frac{\sin(2\pi f_c T_s n)}{\pi n}    (25)

where Ts is the sample interval and fc is the filter’s cutoff frequency.

Unfortunately, in order for this impulse response to produce an ideal filter, it must be infinitely long. As demonstrated in Chapter 4, truncating h(n) results in a filter that is less than ideal. However, if fs >> fmax, as is often the case, then any reasonable lowpass filter would suffice to recover the original waveform (Figure 2.9F). In fact, using sampling frequencies much greater than required is the norm, and often the lowpass filter is provided only by the response characteristics of the output or display device, which is sufficient to reconstruct an adequate-looking signal.
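The effect of truncating Eq. (25) can be seen in a short calculation. This Python sketch evaluates h(n) over a finite range of n and sums the coefficients to obtain the gain at zero frequency; the function name ideal_lowpass, the 100 Hz sampling frequency, the 10 Hz cutoff, and the 101-point length are all assumptions made for illustration.

```python
import math


def ideal_lowpass(fc, Ts, half_length):
    """Truncated Eq. (25): h(n) = sin(2*pi*fc*Ts*n) / (pi*n), n = -L..L."""
    h = []
    for n in range(-half_length, half_length + 1):
        if n == 0:
            # Limit of Eq. (25) as n approaches 0.
            h.append(2 * fc * Ts)
        else:
            h.append(math.sin(2 * math.pi * fc * Ts * n) / (math.pi * n))
    return h


fs = 100.0                 # sampling frequency (Hz), assumed
fc = 10.0                  # cutoff frequency (Hz), assumed
h = ideal_lowpass(fc, 1.0 / fs, 50)

# Gain at zero frequency is the sum of the filter coefficients.
dc_gain = sum(h)
print(len(h), h[50], dc_gain)
```

An infinitely long h(n) would give a zero-frequency gain of exactly 1; the truncated version comes close but not exactly, which is one symptom of the less-than-ideal behavior discussed in Chapter 4.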

*You might argue that you could recover the original spectrum if you knew exactly the spectrum of the original analog waveform, but with this much information, why bother to sample the waveform in the first place!


FIGURE 2.9 Consequences of sampling expressed in the frequency domain. (A) Frequency spectrum of a repetitive impulse function sampling at 6 Hz. (B) Frequency spectrum of a hypothetical time signal that has a maximum frequency, fmax, around 2 Hz. (Note negative frequencies occur with complex representation.) (C) Frequency spectrum of sampled waveform when the sampling frequency was greater than twice the highest frequency component in the sampled waveform. (D) Frequency spectrum of sampled waveform when the sampling frequency was less than twice the highest frequency component in the sampled waveform. Note the overlap. (E) Recovery of correctly sampled waveform using an ideal lowpass filter (dotted line). (F) Recovery of a waveform when the sampling frequency is much greater than twice the highest frequency in the sampled waveform (fs = 10fmax). In this case, the lowpass filter (dotted line) need not have as sharp a cutoff.


Edge Effects

An advantage of dealing with infinite data is that one need not be concerned with the end points since there are no end points. However, finite data consist of numerical sequences having a fixed length with fixed end points at the beginning and end of the sequence. Some operations, such as convolution, may produce additional data points while some operations will require additional data points to complete their operation on the data set. The question then becomes how to add or eliminate data points, and there are a number of popular strategies for dealing with these edge effects.

There are three common strategies for extending a data set when additional points are needed: extending with zeros (or a constant), termed zero padding; extending using periodicity or wraparound; and extending by reflection, also known as symmetric extension. These options are illustrated in Figure 2.10.

In the zero padding approach, zeros are added to the end or beginning of the data sequence (Figure 2.10A). This approach is frequently used in spectral analysis and is justified by the implicit assumption that the waveform is zero outside of the sample period anyway. A variant of zero padding is constant padding, where the data sequence is extended using a constant value, often the last (or first) value in the sequence. If the waveform can be reasonably thought of as one cycle of a periodic function, then the wraparound approach is clearly justified (Figure 2.10B). Here the data are extended by tacking on the initial data sequence to the end of the data set and vice versa. This is quite easy to implement numerically: simply make all operations involving the data sequence index modulo N, where N is the initial length of the data set. These two approaches will, in general, produce a discontinuity at the beginning or end of the data set, which can lead to artifacts in certain situations. The symmetric reflection approach eliminates this discontinuity by tacking on the end points in reverse order (or beginning points if extending the beginning of the data sequence) (Figure 2.10C).*
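The three extension strategies can be compared side by side on a short sequence. The function below is a minimal Python sketch, not a library routine: the name extend, the mode strings, and the choice to add the same number of points at both ends are assumptions, and the symmetric mode shown is the variant that repeats the edge point.

```python
def extend(data, pad, mode):
    """Extend data by pad points at each end using the named strategy.

    Hypothetical helper: mode is "zero", "periodic", or "symmetric".
    """
    n = len(data)
    if not 0 <= pad <= n:
        raise ValueError("pad must be between 0 and len(data)")
    data = list(data)
    if mode == "zero":
        # Zero padding: assume the waveform is zero outside the record.
        return [0] * pad + data + [0] * pad
    if mode == "periodic":
        # Wraparound: take every index modulo N.
        return [data[i % n] for i in range(-pad, n + pad)]
    if mode == "symmetric":
        # Reflection: tack on the end points in reverse order,
        # repeating the edge point itself.
        head = data[:pad][::-1]
        tail = data[n - pad:][::-1]
        return head + data + tail
    raise ValueError("unknown mode: " + mode)


x = [1, 2, 3, 4]
for mode in ("zero", "periodic", "symmetric"):
    print(mode, extend(x, 2, mode))
```

For the ramp used here, the zero and periodic extensions both jump abruptly at the original end points, while the symmetric extension continues smoothly.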

*When using this extension, there is a question as to whether or not to repeat the last point in the extension; either strategy will produce a smooth extension. The answer to this question will depend on the type of operation being performed and the number of data points involved, and determining the best approach may require empirical evaluation.

FIGURE 2.10 Three strategies for extending the length of a finite data set. (A) Zero padding: Zeros are added at the ends of the data set. (B) Periodic or wraparound: The waveform is assumed periodic so the end points are added at the beginning, and beginning points are added at the end. (C) Symmetric: Points are added to the ends in reverse order. Using this strategy the edge points may be repeated as was done at the beginning of the data set, or not repeated as at the end of the set.

To reduce the number of points in cases where an operation has generated additional data, two strategies are common: simply eliminate the additional points at the end of the data set, or eliminate data from both ends of the data set, usually symmetrically. The latter is used when the data are considered periodic and it is desired to retain the same period or when other similar concerns are involved. An example of this is circular or periodic convolution. In this case, the original data set is extended using the wraparound strategy, convolution is performed on the extended data set, then the additional points are removed symmetrically. The goal is to preserve the relative phase between waveforms pre- and post-convolution. Periodic convolution is often used in wavelet analysis where a data set may be operated on sequentially a number of times, and examples are found in Chapter 7.
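Periodic convolution can also be written directly with modulo-N indexing, which avoids building the extended data set explicitly. The Python sketch below takes that route; the function name circular_convolve is an assumption. Note that this version does not apply the symmetric trimming described above, so the output is not re-centered in the way that preserves relative phase for a symmetric filter.

```python
def circular_convolve(x, h):
    """Periodic convolution of x with h; the output has the same length as x.

    Hypothetical helper: indices into x wrap around modulo N = len(x).
    """
    n = len(x)
    y = []
    for i in range(n):
        acc = 0.0
        for k, hk in enumerate(h):
            # x is treated as one cycle of a periodic waveform.
            acc += hk * x[(i - k) % n]
        y.append(acc)
    return y


print(circular_convolve([1, 2, 3, 4], [1, 1]))
```

The first output point combines the first and last data points, which is exactly the wraparound behavior, and no additional points are generated that would later need to be removed.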

PROBLEMS

1. Load the data in ensemble_data.mat found on the CD. This file contains a data matrix labeled data. The data matrix contains 100 responses of a second-order system buried in noise. In this matrix each row is a separate response. Plot several randomly selected samples of these responses. Is it possible to evaluate the second-order response from any single record? Construct and plot the ensemble average for this data. Also construct and plot the ensemble standard deviation.

2. Use the MATLAB autocorrelation and random number routines to plot the autocorrelation sequence of white noise. Use arrays of 2048 and 256 points to show the effect of data length on this operation. Repeat for both uniform and Gaussian (normal) noise. (Use the MATLAB routines rand and randn, respectively.)

3. Construct a 512-point noise array, then filter it by averaging the points three at a time. That is, construct a new array in which every point is the average of the preceding three points in the noise array: y(n) = 1/3 x(n) + 1/3 x(n − 1) + 1/3 x(n − 2). Note that the new array will be two points shorter than the original noise array. Construct and plot the autocorrelation of this filtered array. You may want to save the output, or the code that generates it, for use in a spectral analysis problem at the end of Chapter 3. (See Problem 2, Chapter 3.)

4. Repeat the operation of Problem 3 to find the autocorrelation, but use convolution to implement the filter. That is, construct a filter function consisting of 3 equal coefficients of 1/3: (h(n) = [1/3 1/3 1/3]). Then convolve this weighting function with the random array using conv.

5. Repeat the process in Problem 4 using a 10-weight averaging filter. (Note that it is much easier to implement such a running average filter with this many weights using convolution.)

6. Construct an array containing the impulse response of a first-order process. The impulse response of a first-order process is given by the equation: y(t) = e^(−t/τ) (scaled for unit amplitude). Assume a sampling frequency of 200 Hz and a time constant, τ, of 1 sec. Make sure the array is at least 5 time constants long. Plot this impulse response to verify its exponential shape. Convolve this impulse response with a 512-point noise array and construct and plot the autocorrelation function of this array. Repeat this analysis for an impulse response with a time constant of 0.2 sec. Save the outputs for use in a spectral analysis problem at the end of Chapter 3. (See Problems 4 and 5, Chapter 3.)

7. Repeat Problem 6 above using the impulse response of a second-order underdamped process. The impulse response of a second-order underdamped system is given by:

y(t) = \frac{\delta}{\delta - 1}\, e^{-\delta 2\pi f_n t} \sin\left(2\pi f_n \sqrt{1 - \delta^2}\, t\right)

Use a sampling rate of 500 Hz and set the damping factor, δ, to 0.1 and the frequency, fn (termed the undamped natural frequency), to 10 Hz. The array should be the equivalent of at least 2.0 seconds of data. Plot the impulse response to check its shape. Again, convolve this impulse response with a 512-point noise array and construct and plot the autocorrelation function of this array. Save the outputs for use in a spectral analysis problem at the end of Chapter 3. (See Problem 6, Chapter 3.)

8. Construct 4 damped sinusoids similar to the signal, y(t), in Problem 7. Use a damping factor of 0.04 and generate two seconds of data assuming a sampling frequency of 500 Hz. Two of the 4 signals should have an fn of 10 Hz and the other two an fn of 20 Hz. The two signals at the same frequency should be 90 degrees out of phase (replace the sin with a cos). Are any of these four signals orthogonal?


In document Biosignal and Biomedical Image Processing - Matlab-Based Applications (Page 72-80)