[Figure 5.1 block diagram; stages labelled: Spatial Filters, Adaptive IIR Temporal Filters, Velocity Tuning, Compute Velocity]
Figure 5.1: Schematic diagram of the method of image velocity measurement described in this chapter. Velocity measurements are computed from the phase outputs of space-time separable band-pass filters and fed back to adapt the filters' temporal tunings so that they match the space
where $W(\mathbf{x},t)$ is a real-valued window function, and $\exp[jkx] = \cos(kx) + j\sin(kx)$ with $j^2 = -1$. The convolution of these filters with the image sequence is given by:
$$R(\mathbf{x},t) = I(\mathbf{x},t) * \Psi(\mathbf{x},t). \tag{5.3}$$
Because $R(\mathbf{x},t)$ is complex-valued it may be written as:
$$R(\mathbf{x},t) = \rho(\mathbf{x},t)\exp[j\phi(\mathbf{x},t)], \tag{5.4}$$
where $\rho(\mathbf{x},t)$ and $\phi(\mathbf{x},t)$ are its amplitude and phase components.
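Equations 5.3 and 5.4 can be illustrated with a minimal one-dimensional sketch. The Gaussian window, the parameter values and the function names `complex_filter` and `amplitude_phase` are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def complex_filter(k, sigma, radius):
    """Kernel W(x) * exp(j*k*x); the Gaussian window W is an assumption."""
    x = np.arange(-radius, radius + 1)
    window = np.exp(-x**2 / (2.0 * sigma**2))
    return window * np.exp(1j * k * x)

def amplitude_phase(signal, kernel):
    """Convolve (eq. 5.3) and return (rho, phi) with R = rho * exp(j*phi)."""
    response = np.convolve(signal, kernel, mode="same")
    return np.abs(response), np.angle(response)

# Test signal: a sinusoid at 0.2 cycles per pixel (k = 0.4*pi rad/pixel).
x = np.arange(200)
signal = np.cos(0.4 * np.pi * x)
kernel = complex_filter(k=0.4 * np.pi, sigma=2.5, radius=10)
rho, phi = amplitude_phase(signal, kernel)
```

Away from the borders of this test signal the amplitude `rho` is nearly constant, while the phase `phi` advances by roughly 0.4π radians per pixel.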
Assuming the conservation of the logarithm of the band-pass filtered image signal, $R(\mathbf{x},t)$, i.e. $d[\ln R(\mathbf{x},t)]/dt = 0$ (Fleet, 1992), image velocity can be measured with first-order differential constraints on the amplitude and phase components of $R(\mathbf{x},t)$, as in:
$$\phi_x(\mathbf{x},t)\,u + \phi_y(\mathbf{x},t)\,v + \phi_t(\mathbf{x},t) = 0, \tag{5.6}$$
$$\rho_x(\mathbf{x},t)\,u + \rho_y(\mathbf{x},t)\,v + \rho_t(\mathbf{x},t) = 0, \tag{5.7}$$
where $\mathbf{v}(\mathbf{x},t) = (u,v)$ is the local image velocity and subscripts denote partial derivatives.
Here, the phase derivatives of the filter outputs are combined over a small spatiotemporal region to arrive at an estimate of local velocity and a measure of confidence in that estimate. In principle, given image translation, both constraints might be used to compute image velocity. The preference for phase over amplitude is a result of robustness considerations (Fleet & Jepson, 1989; Fleet, 1992). Phase is relatively stable under deviations from image translation that commonly occur with projections of 3-D scenes, while the amplitude of the filter response is not. Amplitude is not conserved, for example, under dilation of the image as a camera approaches an object. However, phase has also been shown to exhibit occasional instability, most often in the neighbourhoods of phase singularities. For this reason it is desirable to impose a stability constraint (Jepson & Fleet, 1990; Fleet, 1992) to detect unreliable estimates of velocity. Here, a "signal/certainty philosophy" (Knutsson & Westin, 1993) is followed such that for each local estimate of velocity at any given time there is an associated measure of the confidence of that velocity estimate. The calculation of these confidence measures is discussed below.
The phase derivatives are computed from the filter output and its derivatives (Fleet & Jepson, 1989; Franks, 1969):
$$\phi_x(\mathbf{x},t) = \frac{\mathrm{Im}[R^*(\mathbf{x},t)\,R_x(\mathbf{x},t)]}{|R(\mathbf{x},t)|^2}. \tag{5.8}$$
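The phase derivative can be evaluated without unwrapping the phase, using the identity $\phi_x = \mathrm{Im}[R^* R_x]/|R|^2$. The sketch below is illustrative only: it approximates the derivative of $R$ with central differences (`np.gradient`), which is an assumption of the example, and the function name `phase_derivative` and the synthetic test response are hypothetical.

```python
import numpy as np

def phase_derivative(R, axis, spacing=1.0):
    """Phase derivative of complex R along `axis` via Im[conj(R)*dR]/|R|^2.

    dR is a finite-difference estimate (an assumption of this sketch).
    """
    dR = np.gradient(R, spacing, axis=axis)
    power = np.abs(R) ** 2
    # Guard against division by zero where the response vanishes.
    return np.imag(np.conj(R) * dR) / np.maximum(power, 1e-12)

# Synthetic response: a translating complex sinusoid, axes ordered (t, x).
t, x = np.mgrid[0:16, 0:32]
k, w = 0.10, -0.05
R = 2.0 * np.exp(1j * (k * x + w * t))

phi_x = phase_derivative(R, axis=1)
phi_t = phase_derivative(R, axis=0)
speed = -phi_t / phi_x   # 1-D form of the phase constraint: phi_x*u + phi_t = 0
```

For this test pattern the recovered speed is close to $-w/k = 0.5$ pixels per frame; the small discrepancy comes from the finite-difference approximation.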
Velocity is computed using a weighted least squares fit of the local first-order constraints on phase to a model of constant velocity in each small spatiotemporal neighbourhood, $N$, by minimizing:
$$\sum_{(\mathbf{x},t) \in N} W(\mathbf{x},t)\,[\nabla\phi(\mathbf{x},t)\cdot\mathbf{v}(\mathbf{x},t) + \phi_t(\mathbf{x},t)]^2, \tag{5.9}$$
where $W(\mathbf{x},t)$ is a window that gives more weight to constraints near the centre of the neighbourhood. The minimization of equation 5.9 leads to a linear system of the form $\Phi_{\mathbf{x}}^T W \Phi_{\mathbf{x}}\,\mathbf{v} = -\Phi_{\mathbf{x}}^T W \Phi_t$, the solution of which is given by:
$$\mathbf{v} = -(\Phi_{\mathbf{x}}^T W \Phi_{\mathbf{x}})^{-1}\,\Phi_{\mathbf{x}}^T W \Phi_t, \tag{5.10}$$
where $\mathbf{v}$ is the image velocity vector, $\Phi_{\mathbf{x}}$ and $\Phi_t$ are the $n \times 2$ matrix and $n \times 1$ vector of spatial and temporal phase derivatives, and $W$ is the diagonal weight matrix.
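A minimal numerical sketch of the weighted least squares fit follows. It solves the normal equations obtained by minimizing the weighted sum of squared phase constraints over a neighbourhood; the function name `velocity_wls`, the Gaussian-shaped weights and the synthetic derivatives are assumptions made for illustration.

```python
import numpy as np

def velocity_wls(phi_x, phi_y, phi_t, weights):
    """Minimize sum w * (phi_x*u + phi_y*v + phi_t)^2 and return (u, v)."""
    Phi_x = np.column_stack([np.ravel(phi_x), np.ravel(phi_y)])  # n x 2
    Phi_t = np.ravel(phi_t)                                      # n
    w = np.ravel(weights)                                        # diagonal of W
    A = Phi_x.T @ (w[:, None] * Phi_x)                           # 2 x 2
    b = -Phi_x.T @ (w * Phi_t)
    return np.linalg.solve(A, b)

# Synthetic neighbourhood of 25 constraints consistent with one velocity.
rng = np.random.default_rng(0)
phi_x, phi_y = rng.normal(size=(2, 25))
true_v = np.array([1.5, -0.5])
phi_t = -(phi_x * true_v[0] + phi_y * true_v[1])
weights = np.exp(-np.linspace(-2, 2, 25) ** 2)   # more weight near the centre

v = velocity_wls(phi_x, phi_y, phi_t, weights)
```

Because the synthetic constraints are noise-free, `v` reproduces `true_v` to numerical precision. With real filter outputs the 2 x 2 system can be poorly conditioned, which motivates the confidence measure discussed next.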
Calculating confidence estimates
As well as producing reliable measurements of image velocity, it is desirable for an image motion analysis scheme to indicate image regions where the fitting of a local translational velocity model is poor. This is particularly important if the output of the motion analysis scheme is to be combined with data from other low-level visual modules (e.g. form or stereo) to reconstruct the 3-D visual environment and segment it into objects. Here, unreliable estimates of velocity are identified using the eigenvalues of the spatial covariance matrix (Barron et al., 1994; Fleet & Langley, 1995). The magnitude of the smallest eigenvalue is used as a measure of confidence in the associated velocity estimate. This value depends upon the range of orientations and the magnitude of spatial gradients present locally within the image.
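The dependence of the smallest eigenvalue on the range of local orientations can be seen in a small sketch. It assumes that the matrix in question is the weighted 2 x 2 matrix of spatial phase derivatives from the least squares system; this form, and the function name `confidence`, are assumptions of the example and not a statement of the exact matrix used here.

```python
import numpy as np

def confidence(phi_x, phi_y, weights):
    """Smallest eigenvalue of the weighted 2x2 spatial-derivative matrix."""
    G = np.column_stack([np.ravel(phi_x), np.ravel(phi_y)])  # n x 2
    w = np.ravel(weights)
    M = G.T @ (w[:, None] * G)                               # symmetric 2 x 2
    return np.linalg.eigvalsh(M)[0]                          # ascending order

w = np.ones(8)
# All gradients share one orientation: the aperture problem, no confidence.
one_orientation = confidence(np.ones(8), np.zeros(8), w)
# Gradients spread over two orthogonal orientations: well constrained.
two_orientations = confidence(np.r_[np.ones(4), np.zeros(4)],
                              np.r_[np.zeros(4), np.ones(4)], w)
```

In the first case the matrix is singular and the measure is zero; in the second both eigenvalues equal 4, so the velocity estimate is constrained in every direction.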
Spatial filters
The spatial filters implemented here are complex band-pass Gabor filters (Gabor, 1946) and their derivatives. The real and imaginary parts of the filter form a Hilbert transform pair, and are thus said to be in quadrature (Figure 5.2). The filters are tuned to each of 6 spatial orientations, with centre-frequency spatial tunings of 0.2 cycles per pixel and envelope standard deviations of 2.5 pixels. Gabor filters are used because of their favourable resolution in the signal and frequency domains (Wilson & Granlund, 1984). The width of the Gaussian
Figure 5.2: Plot of the form of the spatial filters used to implement the phase-based scheme for velocity measurement. The complex Gabor kernel is plotted over space as its real (solid line) and imaginary (dotted line) parts. In the computer simulations, the filters have a centre-frequency tuning of 0.2 cycles per pixel and an envelope standard deviation of 2.5 pixels.
negligible (Langley, 1990).
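A bank of oriented complex Gabor kernels with the stated tuning (0.2 cycles per pixel, envelope standard deviation 2.5 pixels, 6 orientations) might be constructed as in the sketch below. The isotropic envelope, the even spacing of orientations over 180 degrees, the kernel radius and the function name `gabor_2d` are assumptions made for illustration.

```python
import numpy as np

def gabor_2d(theta, freq=0.2, sigma=2.5, radius=8):
    """Complex 2-D Gabor kernel tuned to orientation `theta` (radians)."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    # Carrier frequency in cycles per pixel along direction theta.
    carrier = np.exp(2j * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

# Six orientations, assumed evenly spaced over 180 degrees.
bank = [gabor_2d(np.pi * i / 6) for i in range(6)]
kernel = bank[0]
```

The real part of each kernel is even-symmetric and the imaginary part odd-symmetric about the centre, which corresponds to the solid and dotted curves described for Figure 5.2.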
Space-time oriented filters
The phase-based scheme requires spatiotemporally band-pass filters oriented in space-time. Such filters are separable into one-dimensional spatial and temporal filters. To achieve quadrature the real and imaginary parts of the constituent spatial and temporal filters are combined according to trigonometrical identities in the same way as for energy models (e.g. Adelson & Bergen, 1985). A schematic diagram of the construction of complex space-time oriented linear filters is shown in Figure 5.3.
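The combination by trigonometrical identities can be sketched as follows, using $\cos(kx + \omega t) = \cos kx \cos \omega t - \sin kx \sin \omega t$ and $\sin(kx + \omega t) = \sin kx \cos \omega t + \cos kx \sin \omega t$. The Gaussian temporal envelope is used only to keep the example short and is an assumption; the function name `oriented_pair` and the parameter values are likewise hypothetical.

```python
import numpy as np

def oriented_pair(b_even, b_odd, a_even, a_odd):
    """Quadrature space-time kernels from separable 1-D parts.

    b_* are spatial kernels, a_* temporal kernels; rows index time.
    """
    real = np.outer(a_even, b_even) - np.outer(a_odd, b_odd)   # cos(kx + wt)
    imag = np.outer(a_even, b_odd) + np.outer(a_odd, b_even)   # sin(kx + wt)
    return real, imag

x = np.arange(-8, 9)
t = np.arange(-6, 7)
k, w = 0.4 * np.pi, 0.2 * np.pi
wx = np.exp(-x**2 / (2 * 2.5**2))   # spatial envelope
wt = np.exp(-t**2 / (2 * 2.0**2))   # temporal envelope (assumed Gaussian)

real, imag = oriented_pair(wx * np.cos(k * x), wx * np.sin(k * x),
                           wt * np.cos(w * t), wt * np.sin(w * t))
```

Each output is a sum of two separable products, so the oriented pair can be applied with one-dimensional convolutions only.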
Causal recursive temporal filters
The above phase-based scheme is implemented using space-time separable filters ($\Psi(\mathbf{x},t) = A(t)B(\mathbf{x})$), allowing temporal filtering to be considered in isolation:
$$R(\mathbf{x},t) = A(t) * [B(\mathbf{x}) * I(\mathbf{x},t)]. \tag{5.11}$$
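The order of operations in equation 5.11 can be sketched with a spatial convolution applied to each frame, followed by a causal recursion along time. The first-order recursion used here is only a stand-in for $A(t)$ and is an assumption of the example, not the filter class used in this chapter; the function name `spatial_then_temporal` and the parameter `alpha` are hypothetical.

```python
import numpy as np

def spatial_then_temporal(frames, spatial_kernel, alpha=0.5):
    """Apply B(x) to every frame, then a causal recursive filter over time.

    frames: array of shape (T, X), one 1-D image row per time step.
    Placeholder recursion: y[t] = alpha * y[t-1] + (1 - alpha) * s[t].
    """
    spatial = np.array([np.convolve(f, spatial_kernel, mode="same")
                        for f in frames], dtype=complex)
    out = np.zeros_like(spatial)
    state = np.zeros(spatial.shape[1], dtype=complex)
    for t, s in enumerate(spatial):
        # Only the previous output is needed, so the filter is causal.
        state = alpha * state + (1.0 - alpha) * s
        out[t] = state
    return out

# Impulse in space and time, with an identity spatial kernel.
frames = np.zeros((5, 7))
frames[0, 3] = 1.0
out = spatial_then_temporal(frames, np.array([1.0]), alpha=0.5)
```

A recursion of this kind needs only its previous output at each time step, so frames can be processed as they arrive without buffering the sequence.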
The temporal filters are designed in the continuous-time domain and then transformed to obtain a discrete-time transfer function. The class of filters used