Linear systems and filtering theory - Correlation Pattern Recognition

Correlation involves two signals or images. A reference image is correlated with a test image (also called a scene) to detect and locate the reference image in the scene. Thus the correlator can be considered as a system with an input (the scene), a stored template or filter (derived from the reference image), and an output (correlation). As we will see in this chapter, such a system is linear in the sense that a new input that is a weighted sum of original inputs results in an output that is an identically weighted sum of the original outputs. Thus a correlator can take advantage of the many properties of linear systems. The most important property is that a linear, time-invariant system can be characterized in terms of its frequency response. We use this and other related properties for the synthesis and use of correlation filters with attractive features such as distortion-tolerance and discrimination. In this chapter, we provide a review of some of the useful properties of signals and linear systems.

3.1 Basic systems

Strictly speaking, the signal is denoted s(), and s(x) is the value of s() when the argument value is x. We will occasionally require the strict notation, but usually there is no confusion from writing s(x) to mean "s() with x being used as a general value for the argument." Figure 3.1 is a simple block diagram of a system. A system can be characterized as producing an output signal o(x) in response to an input signal i(x). We are using the space variable x (as opposed to the more commonly used time variable t) to emphasize our interest in images, which are intensities that are functions of two space variables, x and y. A signal can be thought of as the variation of a dependent variable (e.g., voltage, gray level of an image) as a function of an independent variable (e.g., time, spatial coordinates in an image). While signals are usually thought of as one-dimensional (1-D) functions, we can use the theory presented in this chapter with higher-dimensional signals such as images, which can be thought of as 2-D signals (e.g., gray scale as a function of spatial coordinates x and y). In that sense, the input to the system is i(x, y) and the output is o(x, y). For notational brevity, we will refer to these as i(x) and o(x) from now on and show explicitly the two independent variables only where needed. An important sub-class of systems known as linear, shift-invariant (LSI) systems can be completely characterized by the system's output for just one particular input, namely a point input at the origin. The resulting output is known as the point spread function (PSF) in 2-D systems, and the impulse response in 1-D systems.

The independent variable used with a signal or an image can be either continuous (e.g., time) or discrete (e.g., pixel number). When the independent variable is continuously varying, we will loosely refer to the signal as a continuous-time (CT) signal, and as a discrete-time (DT) signal if the independent variable is discrete. Thus, a CT signal i(x) is defined for all possible values of its continuous argument x. On the other hand, a DT signal i[n] is defined only for discrete values n of the independent variable. A DT example is the signal representing the Dow Jones daily closing index. This sequence of numbers is defined for only one instant every day and there is no meaning for the closing index for any other time of the day. Similarly, the music signal stored on an audio CD is obtained by taking 44 100 samples for every second of the music signal and only these samples are stored on a CD. Thus, a CD contains a DT signal. The process of converting a CT signal to a DT signal is known as analog-to-digital conversion (ADC), or sampling. We will discuss sampling theory in some detail later in this chapter. Most input devices in optical correlators are pixelated and employ sampling. The DT signal stored on the CD is converted to a CT music signal before it is played through the speakers. This process of converting DT signals to CT signals is known as digital-to-analog conversion (DAC).
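As a minimal sketch of the CT-to-DT conversion described above, the following Python snippet samples a CT signal at the CD rate of 44 100 samples per second. The helper name `sample` and the 440 Hz test tone are assumptions made for illustration; quantization, which a real ADC also performs, is ignored here.

```python
import numpy as np

def sample(ct_signal, rate_hz, duration_s):
    """Return the sample indices n and the DT sequence i[n] = i(n / rate_hz)."""
    n = np.arange(int(round(rate_hz * duration_s)))
    return n, ct_signal(n / rate_hz)

# Hypothetical CT signal: a 440 Hz cosine (assumed for illustration only).
def tone(x):
    return np.cos(2 * np.pi * 440.0 * x)

# One second of signal at the CD sampling rate quoted in the text.
n, i_dt = sample(tone, rate_hz=44_100, duration_s=1.0)
```

The DT sequence `i_dt` is defined only at the integer indices `n`; nothing is stored about the tone between sampling instants.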

In the next section, we will establish the notation for some special signals we will be encountering. This will be followed by Section 3.3, which reviews the basics of LSI systems and discusses the convolution operation that allows us to determine the output of an LSI system for any arbitrary input. Section 3.4 reviews the important concept of Fourier analysis of CT signals, and this is followed by Section 3.5, which reviews the sampling theory. Sampling theory is important to understand well since, although most signals are CT to begin with, anytime we use a digital computer to process them, we have to convert them to DT signals via sampling. Fourier analysis of these DT signals is reviewed in Section 3.6. Finally, Section 3.7 provides a brief review of how to characterize random signals and what happens to them as they pass through linear systems. Knowing what happens to random signals through linear systems enables us to analyze and design correlation filters with the required noise tolerance.

[Figure 3.1: block diagram of a system: Input i(x) → System → Output o(x)]

3.2 Signal representation

Physical inputs to physical systems, the systems themselves, and the physical outputs are all decidedly real. However, the mathematics of the LSI system and the signals, particularly for sinusoidal or nearly sinusoidal signals, is often very conveniently shortened with complex notation. We will often use $A\exp[j(2\pi f x + \phi)]$ to represent $A\cos(2\pi f x + \phi)$. The complex exponential is the phasor representing the signal.
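The relationship between the phasor and the real signal can be checked numerically: the real sinusoid is the real part of the complex exponential, and the phasor has constant magnitude A. The sketch below uses assumed example values for the amplitude, frequency, and phase; the variable names are illustrative only.

```python
import numpy as np

A, f, phi = 2.0, 3.0, np.pi / 4      # assumed amplitude, frequency, and phase
x = np.linspace(0.0, 1.0, 501)

# Complex (phasor) representation and the real sinusoid it stands for.
phasor = A * np.exp(1j * (2 * np.pi * f * x + phi))
real_signal = A * np.cos(2 * np.pi * f * x + phi)

# The real signal is recovered as the real part of the phasor.
recovered = phasor.real
```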

We will use x to represent continuous time and n for discrete time. Thus, CT system signals are i(x) and o(x), whereas DT system signals are denoted by i[n] and o[n]. Note the notational difference between parentheses and square brackets used for CT and DT signals, although it should normally be obvious from the context whether we are dealing with CT or DT signals. Several basic signal operations are defined in Table 3.1 in terms of a 1-D CT signal i(x). The focus of this book being image correlation, we need to deal with 2-D signals i(x, y). DT images are denoted by i[n, m]. Basic signal operations such as shift, scaling, and reflection are applied to 2-D signals in the same way as 1-D signals, except that both x and y must be taken into consideration. Some commonly encountered 1-D CT and DT signals are summarized in Table 3.2 and Table 3.3, respectively. However, it is worth highlighting some special 2-D signals.

Separable signals Often, a 2-D signal can be written as the product of two 1-D signals as below. Such 2-D signals are known as product-separable, or commonly, just separable, signals.

$$ i(x, y) = i_x(x)\, i_y(y), \qquad i[n, m] = i_n[n]\, i_m[m] \tag{3.1} $$

Separable signals are easier to handle than non-separable ones, as they require only two 1-D signals or two vectors, instead of one 2-D signal or one matrix. Real images are rarely separable and, even worse, rarely allow compact analytical representations. As a result, we will mostly denote an image as i(x, y) or i[n, m] without any further simplification.
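For DT images, Eq. (3.1) says that a separable image is the outer product of two vectors, which is the same as saying the image matrix has rank at most 1. The following sketch builds a separable image and tests separability through the singular values; the function name `is_separable`, the tolerance, and the sample vectors are assumptions for illustration.

```python
import numpy as np

i_n = np.array([1.0, 2.0, 3.0])          # 1-D factor along n (assumed values)
i_m = np.array([0.5, -1.0, 4.0, 2.0])    # 1-D factor along m (assumed values)

# Eq. (3.1) for DT signals: image[n, m] = i_n[n] * i_m[m]
image = np.outer(i_n, i_m)

def is_separable(img, tol=1e-10):
    """A 2-D array is product-separable iff its matrix rank is at most 1."""
    s = np.linalg.svd(img, compute_uv=False)   # singular values, largest first
    return bool(np.all(s[1:] <= tol * max(s[0], 1.0)))
```

Storing `i_n` and `i_m` takes 3 + 4 numbers instead of the 12 needed for the full matrix, which is the saving the text refers to.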

Coordinate transformation An image i(x, y) can be mapped to another image $\hat{i}(x, y)$ by transforming the coordinate system. A particular coordinate transform of interest is the polar transform (PT) from Cartesian coordinates x and y to polar coordinates, namely radius r and angle $\theta$.

$$ i(x, y) \to \hat{i}(r, \theta) \quad \text{where } r = \sqrt{x^2 + y^2} \text{ and } \theta = \tan^{-1}\!\left(\frac{y}{x}\right) $$

$$ i(x, y) \to \hat{i}(r, \theta) \quad \text{where } x = r\cos\theta \text{ and } y = r\sin\theta \tag{3.2} $$

If the PT of an image is independent of angle $\theta$, then that image is circularly symmetric. If the PT is independent of radius r, it is a radially constant image. Figure 3.2(a) shows a circularly symmetric image and Figure 3.2(b) indicates its PT. The PT in Figure 3.2(d) is of the radially constant image in Figure 3.2(c). While we have only discussed the polar transform here, there exist other useful coordinate transformations such as the Mellin transform [23] and the log-polar transform [24].
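A minimal sketch of the polar transform follows. It assumes the image is available as a function of (x, y), so that it can be evaluated directly at x = r cos θ, y = r sin θ without interpolation. The function name `polar_transform` and the two test images are hypothetical; they are not the images of Figure 3.2.

```python
import numpy as np

def polar_transform(image_fn, radii, angles):
    """Sample i(x, y) on a polar grid: i_hat[r, theta] = i(r cos(theta), r sin(theta))."""
    r, theta = np.meshgrid(radii, angles, indexing="ij")
    return image_fn(r * np.cos(theta), r * np.sin(theta))

radii = np.linspace(0.1, 2.0, 20)
angles = np.linspace(-np.pi, np.pi, 36, endpoint=False)

# Hypothetical test images (assumed for illustration).
def circular(x, y):
    return np.exp(-(x**2 + y**2))            # depends on r only

def radially_constant(x, y):
    return np.cos(3 * np.arctan2(y, x))      # depends on theta only

pt_circular = polar_transform(circular, radii, angles)
pt_radial = polar_transform(radially_constant, radii, angles)
```

Each row of `pt_circular` is constant (no dependence on angle), and each column of `pt_radial` is constant (no dependence on radius). The four-quadrant `np.arctan2` is used in place of a plain inverse tangent of y/x so that the angle is unambiguous over the full circle.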

Table 3.1. Basic signal operations

| Operation | Description |
|---|---|
| Time shift | Signal $i(x - x_0)$ is $i(x)$ shifted by $x_0$. If $x_0$ is positive, then the shift is to the right, and if it is negative, the shift is to the left. For a DT signal i[n], the shift must always be an integer. A signal is periodic with period T if $i(x + nT) = i(x)$ for any integer value of n. |
| Time scaling | Signal $i(ax)$ denotes the original signal $i(x)$ scaled by a factor a. If $a > 1$, the signal is compressed, whereas if $0 < a < 1$, the signal is dilated. |
| Reflection | Signal $i(-x)$ denotes a time-reversed or reflected signal. A signal is considered to be an even signal when its reflection equals itself, i.e., $i(-x) = i(x)$. A signal is considered to be an odd signal if its reflection equals the negative of itself, i.e., $i(-x) = -i(x)$. Not every signal has to be either even or odd, though every signal can be expressed as the sum of unique even and odd parts. |
| Even–odd parts | An arbitrary signal $i(x)$ can be decomposed as the sum of an even part $i_e(x)$ and an odd part $i_o(x)$, as $i(x) = i_e(x) + i_o(x)$. The component $i_e(x) = \frac{i(x) + i(-x)}{2}$ is even, and $i_o(x) = \frac{i(x) - i(-x)}{2}$ is odd. |
| Energy | The energy of a signal is defined as $E = \int_{-\infty}^{\infty} \lvert i(x) \rvert^2 \, dx$. For periodic signals $i(x)$ with period T, we define an average energy $E_p = \frac{1}{T} \int_{-T/2}^{T/2} \lvert i(x) \rvert^2 \, dx$. Similar energy and average energy definitions exist for DT signals. |
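The even–odd decomposition and the energy definition from Table 3.1 can be illustrated on a sampled signal. The sketch below assumes a grid that is symmetric about x = 0, so that reversing the array corresponds to replacing x by −x; the shifted Gaussian test signal is an assumed example, and the energy integral is approximated by a Riemann sum.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 1001)     # symmetric grid: reversing the array maps x -> -x
i_x = np.exp(-(x - 1.0) ** 2)        # assumed example signal, neither even nor odd

i_reflected = i_x[::-1]              # samples of i(-x) on the same grid
i_even = 0.5 * (i_x + i_reflected)   # even part
i_odd = 0.5 * (i_x - i_reflected)    # odd part

dx = x[1] - x[0]
energy = np.sum(np.abs(i_x) ** 2) * dx   # Riemann-sum estimate of the energy E
```

For this particular signal the energy integral has the closed form sqrt(π/2), which the numerical estimate reproduces closely because the signal is negligible at the ends of the grid.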

Table 3.2. Special one-dimensional CT signals

| Signal | Definition | Comments |
|---|---|---|
| CT impulse (delta function) | $\delta(x)$ | Loosely speaking, this function is zero everywhere except at the origin, where it is infinitely large. Multiplying a smooth function by a delta function forces the product to become zero everywhere except at the location of the delta function, i.e., $\int_{-\infty}^{\infty} i(x)\,\delta(x - x_0)\,dx = i(x_0)$, provided $i(x)$ is continuous at $x = x_0$. This is known as the sifting property since it picks out the value of i() at $x_0$. |
| Unit step | $u(x) = \begin{cases} 1 & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$ | The unit step is the integral of a delta function, i.e., $u(x) = \int_{-\infty}^{x} \delta(\tau)\,d\tau$. It can be used for representing switching systems. |
| Comb function | $\mathrm{comb}_T(x) = \sum_{k=-\infty}^{\infty} \delta(x - kT)$ | The comb function is an infinite train of delta functions spaced at uniform intervals of T. Multiplying any signal $i(x)$ by the comb function $\mathrm{comb}_T(x)$ results in a sampled signal that is non-zero only at the sampling instants. |
| Rect function | $r(x) = u(x + 1/2) - u(x - 1/2) = \begin{cases} 1 & \text{for } \lvert x \rvert \le 1/2 \\ 0 & \text{otherwise} \end{cases}$ | The rectangle function r(x), also known as the box function, equals 1 in the interval [−1/2, 1/2] and zero outside. It is easy to verify that the unit rectangle has energy 1. |
| Sinusoids | $i(x) = A\cos(2\pi f x + \phi)$ | A is the amplitude, f is the frequency, and $\phi$ is the phase (indicates the relative position of the signal with respect to the origin) of the sinusoid. The period T is related to the frequency as f = 1/T. A sinusoid of a particular frequency input to a linear, shift-invariant (LSI) system must lead to an output sinusoid of the same frequency. Thus, sinusoids are eigenfunctions of LSI systems. The amplitude and phase, but not the frequency, of an input sinusoid are altered by an LSI system. |
| Complex exponentials | $i(x) = A\exp(j 2\pi f x) = A[\cos(2\pi f x) + j\sin(2\pi f x)]$ | Because of their close connection with sinusoids, complex exponentials are eigenfunctions of LSI systems. Complex exponentials are periodic signals. |
| Unit Gaussian | $\mathrm{Gaus}(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right)$ | Often used to describe smoothly tapering apertures and windows. This is the same shape as that of a Gaussian PDF, with zero mean and unit variance. |

Table 3.3. Special one-dimensional DT signals

| Signal | Definition | Comments |
|---|---|---|
| Unit DT step | $u[n] = \begin{cases} 1 & \text{for } n \ge 0 \\ 0 & \text{for } n < 0 \end{cases}$ | Takes on a value of 1 at the origin and at all positive integer values of n, and a value of 0 at all negative integer values of n. |
| Unit DT impulse | $\delta[n] = \begin{cases} 1 & \text{for } n = 0 \\ 0 & \text{for } n \ne 0 \end{cases} = u[n] - u[n - 1]$ | Also known as the DT delta function, or Kronecker delta function. The DT delta function is 1 when n = 0, but 0 everywhere else. A DT delta function is the difference between the DT unit step and the DT unit step shifted by 1 to the right. |
| Sinusoids | $i[n] = A\cos(2\pi f n + \phi)$ | Unlike the CT sinusoid, the DT sinusoid is not always periodic. In fact, the DT sinusoid is periodic if and only if $2\pi f N$ is an integer multiple of $2\pi$, i.e., $2\pi f N = 2\pi m$ for some integers m and N (with N nonzero), which means that f must be a ratio of integers (e.g., if f is 1/2, 3/8, 5/2, etc., the DT sinusoid is periodic; on the other hand, if f is $\sqrt{2}$, it is not periodic). |
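The periodicity condition for DT sinusoids in Table 3.3 can be made concrete: when f = m/N in lowest terms, the smallest period is N samples. The sketch below is illustrative only; the helper name `dt_period` is hypothetical, and it assumes f is supplied as an exact `Fraction` so that the denominator is meaningful. It also checks the identity δ[n] = u[n] − u[n − 1].

```python
from fractions import Fraction
import numpy as np

def dt_period(f):
    """Smallest period N of cos(2*pi*f*n) for rational f, given as an exact Fraction."""
    return Fraction(f).denominator       # f = m/N in lowest terms

n = np.arange(64)
f = Fraction(3, 8)
N = dt_period(f)                         # period of the f = 3/8 sinusoid
sig = np.cos(2 * np.pi * float(f) * n)

# Unit step and unit impulse on a small index range around the origin.
idx = np.arange(-4, 5)
u = (idx >= 0).astype(int)               # u[n]
u_shifted = (idx - 1 >= 0).astype(int)   # u[n - 1]
delta = u - u_shifted                    # delta[n] = u[n] - u[n - 1]
```

An irrational frequency such as √2 has no exact `Fraction` representation, which mirrors the statement that the corresponding DT sinusoid never repeats.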

Rectangle function The 2-D rectangle function is 1 inside a rectangular region centered at the origin and 0 outside. It is useful in truncating images and in describing the region of support of images. It is a separable function as shown below:

$$ \mathrm{rect}(x, y) = r(x)\, r(y) = \begin{cases} 1 & \text{if } \lvert x \rvert \le 1/2 \text{ and } \lvert y \rvert \le 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{3.3} $$

Circ function For 2-D signals, circular apertures may be more natural than rectangular apertures. The circ function is 1 inside a circle of radius 1 centered at the origin, and 0 outside. It is not a separable function in Cartesian coordinates.

$$ \mathrm{circ}(x, y) = \begin{cases} 1 & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.4} $$

Figure 3.2 (a) A circularly symmetric image, and (b) its polar transform; (c) a radially constant image, and (d) its polar transform. (Axes: x and y for the images; r and θ for the polar transforms.)

Unit Gaussian function Both the rect() function and the circ() function are binary in that they take on values of 1 and 0. A useful, circularly symmetric, 2-D function that takes on a continuum of amplitude values is the unit Gaussian centered at the origin. It is separable as shown below:

$$ \mathrm{Gaus}(x, y) = \frac{1}{2\pi} \exp\!\left(-\frac{x^2 + y^2}{2}\right) = \left[\frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right)\right] \left[\frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{y^2}{2}\right)\right] = \mathrm{Gaus}(x)\, \mathrm{Gaus}(y) \tag{3.5} $$

The above definition considers a unit Gaussian (also known as standard Gaussian) that is centered at the origin and has a standard deviation of 1 along both x and y. More general Gaussian functions can be obtained by changing the variables.
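The three special 2-D functions of Eqs. (3.3) to (3.5) can be sampled on a grid to check the separability claims numerically. This is a sketch under assumed grid parameters; the function names are illustrative. A sampled separable function is an outer product of two vectors and therefore has matrix rank 1, whereas the sampled circ function does not.

```python
import numpy as np

def rect2d(x, y):
    """Eq. (3.3): 1 inside the unit square centered at the origin, 0 outside."""
    return ((np.abs(x) <= 0.5) & (np.abs(y) <= 0.5)).astype(float)

def circ(x, y):
    """Eq. (3.4): 1 inside the unit-radius circle centered at the origin, 0 outside."""
    return (np.hypot(x, y) <= 1.0).astype(float)

def gaus1d(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def gaus2d(x, y):
    """Eq. (3.5): unit Gaussian centered at the origin."""
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

axis = np.linspace(-3.0, 3.0, 121)               # assumed sampling grid
X, Y = np.meshgrid(axis, axis, indexing="ij")

rank_rect = np.linalg.matrix_rank(rect2d(X, Y))  # separable: rank 1
rank_circ = np.linalg.matrix_rank(circ(X, Y))    # not separable: rank > 1
```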

3.3 Linear shift-invariant systems

In Section 3.1, we defined a system as producing the output o(x) in response to the input i(x). If both signals are CT, we will refer to that system as a CT system, and if both signals are DT, we consider it to be a DT system. There can be occasions where the input signal is of one type and the output of a different type. We will refer to such systems as hybrid or mixed systems.

Linear, shift-invariant systems offer much in terms of their properties. We will first define what linearity is and what shift-invariance means. This will be followed by a look at the properties of LSI systems. One property of an LSI system of particular interest is that its output can be obtained by convolving the input signal and its impulse response (impulse response is the output of the LSI system when the input is a delta function). We will see that the correlation operation is similar to the convolution operation. We will demonstrate the fact that sinusoids are eigenfunctions of LSI systems.¹ While we will use 1-D CT signals, our discussion is easily extended to higher dimensions and to DT systems. We will point out any differences only when they are significant.

Linearity In simple words, linearity requires that a weighted sum of inputs should lead to an identically weighted sum of output signals. More rigorously, a linear system must satisfy the following:

$$ \text{If } i_1(x) \to o_1(x) \text{ and } i_2(x) \to o_2(x), \text{ then } a\,i_1(x) + b\,i_2(x) \to a\,o_1(x) + b\,o_2(x) \text{ for any scalars } a, b \text{ and any inputs } i_1(x) \text{ and } i_2(x) \tag{3.6} $$

¹ The term eigenfunction derives from linear algebra, in which an eigenvector (discussed in Chapter 2) of a matrix is one changed by only a complex factor when multiplied by the matrix. An LSI system may change the phase and magnitude of an input sinusoid, but not the sinusoid's frequency.

If the condition in Eq. (3.6) is not satisfied for even one set of weights or for even one particular signal, then the system is nonlinear. If a = b = 1, the above requirement in Eq. (3.6) means that a new input that is the sum of two old inputs results in a new output that is the sum of the corresponding two old outputs. This special case is known as additivity; the general requirement in Eq. (3.6), which also covers scaling of the input (homogeneity), is the principle of superposition. For a linear system, it is easy to show that an all-zero input signal must lead to an all-zero output signal (set a = b = 0 in Eq. (3.6)). What this does not mean is that if an input signal is zero over a certain time interval (let us say from $t_1$ to $t_2$), then the resulting output signal is also zero over that time interval.
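The linearity condition of Eq. (3.6) can be spot-checked numerically for DT systems. The sketch below is illustrative only: the helper `is_linear_on` and the three example systems are assumptions, and a passing check for one pair of inputs and weights does not prove linearity, whereas a single failure does prove nonlinearity.

```python
import numpy as np

def is_linear_on(system, i1, i2, a, b):
    """Check Eq. (3.6) for one pair of inputs and one pair of weights."""
    lhs = system(a * i1 + b * i2)
    rhs = a * system(i1) + b * system(i2)
    return bool(np.allclose(lhs, rhs))

rng = np.random.default_rng(0)
i1, i2 = rng.standard_normal(32), rng.standard_normal(32)

def smoother(s):
    return np.convolve(s, [0.25, 0.5, 0.25])   # weighted moving sum: linear

def squarer(s):
    return s ** 2                              # nonlinear

def offset(s):
    return 2.0 * s + 1.0                       # nonlinear: zero input gives nonzero output

results = {
    "smoother": is_linear_on(smoother, i1, i2, 2.0, -3.0),
    "squarer": is_linear_on(squarer, i1, i2, 2.0, -3.0),
    "offset": is_linear_on(offset, i1, i2, 2.0, -3.0),
}
```

The `offset` system shows why the all-zero-input test is a quick screen: it maps a zero input to a nonzero output, so it cannot be linear even though its graph is a straight line.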

The advantage of linearity is that we can find the output of a system for an input signal by knowing the outputs for some basic signals. We will show in Section 3.4 that we can represent arbitrary signals as a weighted sum of sinusoids. Thus, knowing the outputs of a linear system to input sinusoids is very attractive. Instead of documenting every possible input–output pair for a linear system, we only have to know the outputs for sinusoids. Since sinusoids are eigenfunctions for LSI systems, we need to document only the magnitude and phase response of that system as a function of input frequency.
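The eigenfunction property can be illustrated for a DT LSI system with a short impulse response. In the sketch below, the impulse response `h` and the frequency `f` are assumed values. Because the input is switched on at n = 0 rather than extending over all n, the first few output samples contain a start-up transient and are excluded from the comparison.

```python
import numpy as np

h = np.array([0.5, 0.3, 0.2])              # assumed impulse response (3 taps)
f = 0.1                                    # assumed frequency, cycles per sample
n = np.arange(200)
x_in = np.exp(2j * np.pi * f * n)          # complex exponential input

# Output of the LSI system, truncated to align with the input samples.
y = np.convolve(x_in, h)[: len(n)]

# Complex scale factor applied by the system at frequency f.
H = np.sum(h * np.exp(-2j * np.pi * f * np.arange(len(h))))

steady = slice(len(h) - 1, len(n))         # skip the start-up transient
magnitude_response, phase_response = np.abs(H), np.angle(H)
```

In the steady-state region the output equals `H * x_in`: the same frequency, with only the magnitude and phase changed. That single complex number per frequency is what needs to be documented for the system.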

Shift invariance If shifting the input signal by $x_0$ results in an output that is shifted by the same amount, then we have a shift-invariant system. In 1-D systems, these are more commonly referred to as time-invariant systems, as the independent variable is time. More precisely, if $i(x) \to o(x)$, then $i(x - x_0) \to o(x - x_0)$ for any $i(x)$ and any $x_0$.

Shift-invariance tells us that if we know the output for a particular input, then we know the outputs for every shifted version of that input signal. We will see in the next section that this shift-invariance, coupled with linearity, enables us to characterize an LSI system completely by its impulse response.
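Shift-invariance can be spot-checked in the same spirit for finite DT sequences. The following sketch compares "shift, then apply the system" with "apply the system, then shift". The helper `delay`, the impulse response, and the counter-example system `ramp_gain` are assumptions chosen for illustration.

```python
import numpy as np

def delay(signal, n0):
    """Shift a finite DT sequence to the right by n0 >= 0 samples, filling with zeros."""
    return np.concatenate([np.zeros(n0), signal])

h = np.array([1.0, -0.5, 0.25])          # assumed impulse response

def lsi(s):
    return np.convolve(s, h)             # linear and shift-invariant

def ramp_gain(s):
    return np.arange(len(s)) * s         # o[n] = n * i[n]: linear but shift-variant

i_n = np.array([1.0, 2.0, 3.0, 0.0, 0.0])
n0 = 2

lsi_ok = np.allclose(lsi(delay(i_n, n0)), delay(lsi(i_n), n0))
ramp_ok = np.allclose(ramp_gain(delay(i_n, n0)), delay(ramp_gain(i_n), n0))
```

The `ramp_gain` system satisfies Eq. (3.6) but fails this test, which shows that linearity and shift-invariance are independent requirements.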

3.3.1 Impulse response, convolution, and correlation

Consider a unit impulse function $\delta(t)$ or $\delta[n]$ (depending on whether the system is CT or DT, respectively) as the input signal. Irrespective of whether the system is LSI or not, we will refer to the corresponding output as its impulse response (h(t) or h[n]). Let us now look at a DT LSI system with impulse response h[n]. An example impulse response is shown in Figure 3.3(a). Suppose the input signal to this system is the sequence i[n] shown in Figure 3.3(b). What is the resulting output? This signal comprises three DT delta functions, each weighted by different amounts and each shifted by different amounts as shown in Figure 3.3(c). In general, a DT signal can always be expressed as a weighted sum of shifted delta functions as follows:

$$ i[n] = \sum_{k=-\infty}^{\infty} i[k]\, \delta[n - k] \tag{3.7} $$

Since the system is shift-invariant, the input $\delta[n - k]$ should lead to the output $h[n - k]$. Since the system is linear, weighting that input signal $\delta[n - k]$ by weight $i[k]$ results in the output $i[k]\,h[n - k]$. These output signals are shown in Figure 3.3(d) for the example being considered. Equation (3.7) tells us that the input signal is a sum over k of these weighted, shifted delta function inputs, and, by linearity of the system, the output must be an identical sum over k, i.e.,

Figure 3.3 (a) The impulse response of a DT system, (b) an example input DT signal, (c) expressed as a sum of weighted, shifted delta functions, (d) output signals for the input-weighted, shifted delta functions, and (e) output signal for the input signal in (b)

$$ o[n] = \sum_{k=-\infty}^{\infty} i[k]\, h[n - k] = i[n] * h[n] \tag{3.8} $$

where $i[n] * h[n]$ is used as a shorthand notation for the summation operation in Eq. (3.8). This operation, called the convolution between i[n] and h[n], is at the core of LSI systems. We have used both linearity and shift-invariance in obtaining the above result, that the output of the LSI system is the convolution of the input with the impulse response. Figure 3.3(e) shows the output of the LSI system.
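The derivation of Eq. (3.8) translates directly into code: each input sample i[k] contributes a copy of the impulse response shifted to position k and weighted by i[k], and the output is the sum of these copies. The sketch below is illustrative; the function name and the sample values of `h` and `i_seq` are assumptions (both sequences are taken to start at n = 0), and `np.convolve` is used only as a cross-check.

```python
import numpy as np

def convolve_by_superposition(i_seq, h):
    """Eq. (3.8): add up copies of h[n - k], each weighted by i[k]."""
    out = np.zeros(len(i_seq) + len(h) - 1)
    for k, weight in enumerate(i_seq):
        out[k:k + len(h)] += weight * h     # response to i[k] * delta[n - k]
    return out

h = np.array([2.0, 1.0])            # assumed impulse response, starting at n = 0
i_seq = np.array([1.0, 2.0, 3.0])   # assumed input, starting at n = 0

o = convolve_by_superposition(i_seq, h)   # -> [2., 5., 8., 3.]
```

The three shifted, weighted copies are [2, 1], [4, 2] (starting at n = 1), and [6, 3] (starting at n = 2); adding them sample by sample gives the output.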

It is important to note that in the convolution sum in Eq. (3.8), the sign on k is different in $i[k]$ and $h[n - k]$. If we use the same sign for k, we get a correlation c[n], i.e.,

$$ c[n] = \sum_{k=-\infty}^{\infty} i[k]\, h[n + k] = i[n] \otimes h[n] \tag{3.9} $$

where we use the symbol $\otimes$ to denote the correlation operation. The convolution in Eq. (3.8) and the correlation in Eq. (3.9) look similar, but can produce entirely different results.
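To make the difference between Eqs. (3.8) and (3.9) concrete, the following sketch evaluates the correlation sum directly for two short sequences and compares it with the convolution of the same pair. The function name and sample values are assumptions; both sequences are taken to start at index 0, so the correlation output covers lags from −(len(i) − 1) to len(h) − 1.

```python
import numpy as np

def correlate_direct(i_seq, h):
    """Eq. (3.9): c[n] = sum_k i[k] h[n + k], with both sequences starting at index 0.

    Returns (lags, c), where lags runs from -(len(i_seq) - 1) to len(h) - 1.
    """
    lags = np.arange(-(len(i_seq) - 1), len(h))
    c = np.zeros(len(lags))
    for idx, n in enumerate(lags):
        for k in range(len(i_seq)):
            if 0 <= n + k < len(h):          # h is zero outside its support
                c[idx] += i_seq[k] * h[n + k]
    return lags, c

h = np.array([2.0, 1.0])            # assumed values
i_seq = np.array([1.0, 2.0, 3.0])   # assumed values

lags, c = correlate_direct(i_seq, h)     # c -> [6., 7., 4., 1.] at lags -2..1
conv = np.convolve(i_seq, h)             # -> [2., 5., 8., 3.] at n = 0..3
```

With the definition in Eq. (3.9), the correlation equals the convolution of h with the reflected input, `np.convolve(h, i_seq[::-1])`, so the two operations agree only in special cases such as an input that is symmetric under reflection.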

Region of support Suppose the two functions being convolved are of finite support. (More precisely, this means i(x) is zero outside $L_i \le x \le U_i$, and h(x) is zero outside $L_h \le x \le U_h$, with all of $L_i$, $U_i$, $L_h$, and $U_h$ being finite numbers.) Then the convolution of the two, $o(x) = i(x) * h(x)$, is zero outside $L_o \le x \le U_o$, where:

$$ L_o = L_i + L_h \quad \text{and} \quad U_o = U_i + U_h \tag{3.10} $$

Denoting the lengths of the two signals by $A_i = U_i - L_i$ and $A_h = U_h - L_h$, we see that the output convolution length is at most $A_o = U_o - L_o = A_i + A_h$.

The results are slightly different for DT convolution. Assuming that i[n] is zero outside $L_i \le n \le U_i$, and h[n] is zero outside $L_h \le n \le U_h$, the DT convolution $o[n] = i[n] * h[n]$ is zero outside $L_o \le n \le U_o$, where $L_o$ and $U_o$ are as in Eq. (3.10). But for discrete signals, the length of a signal is given by $A_i = U_i - L_i + 1$. Thus, the DT convolution length is at most $A_o = A_i + A_h - 1$. As an example, convolving a DT sequence of length 64 points with itself would result in an output signal length of at most 127 points. For 2-D CT convolution (which is discussed in the next section), these region-of-support results hold for each axis. The length results are the same for correlation, although the end points differ: with the definition in Eq. (3.9), c[n] is zero outside $L_h - U_i \le n \le U_h - L_i$.
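The region-of-support bookkeeping for DT convolution can be captured in a few lines. The helper `conv_support` is a hypothetical name; it applies Eq. (3.10) and the DT length rule, and the 64-point example from the text is checked against `np.convolve`.

```python
import numpy as np

def conv_support(L_i, U_i, L_h, U_h):
    """Eq. (3.10) for DT signals: output bounds and maximum output length."""
    L_o, U_o = L_i + L_h, U_i + U_h
    return L_o, U_o, U_o - L_o + 1      # DT length counts both end points

# A 64-point sequence convolved with itself (supports 0..63 and 0..63).
seq = np.ones(64)
out = np.convolve(seq, seq)
bounds = conv_support(0, 63, 0, 63)     # -> (0, 126, 127)
```

The same helper handles supports that do not start at zero; for example, inputs supported on −2..3 (length 6) and 5..6 (length 2) give an output supported on 3..9, of length 6 + 2 − 1 = 7.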

3.3.2 Two-dimensional LSI systems

We can define something similar to the impulse response for 2-D systems. The
