Concise Introduction to Wavelets


Contents

Preface

This text is based on research I pursued during my master's studies. It may be a valuable resource for MSc students interested in the theory of wavelets.

The text is a compilation: it draws on the research and ideas of respected mathematicians and authors. A bibliography is provided at the end of this document.


Chapter 1

Introduction

In this chapter we introduce the basic mathematical tools which are exploited throughout the text. For a more complete treatise we refer to [3, 7, 10, 11, 12, 17, 18, 23].

1.1 The space $L^p$

For $p \in \mathbb{N}$, we define

$$L^p = \left\{ f : \mathbb{R} \to \mathbb{C} \;\middle|\; \int |f(x)|^p \, dx < +\infty \right\},$$

where the integration is over the whole real line. For each $p$, $L^p$ is a Banach space with the corresponding norm

$$\|f\|_p = \left( \int |f(x)|^p \, dx \right)^{1/p}.$$

Of particular importance are $L^1$, the space of absolutely integrable functions, and $L^2$, the space of functions of finite energy. The space $L^2$ is a Hilbert space with the inner product

$$\langle f, g \rangle = \int f(x) \overline{g(x)} \, dx,$$

where $\overline{g(x)}$ denotes the complex conjugate of $g(x)$. We will also work with a more general space, $L^2 = L^2(X, S, \mu)$, where $X \subset \mathbb{R}^n$, $S$ is a $\sigma$-algebra and $\mu$ a corresponding measure. The inner product is then defined as

$$\langle f, g \rangle = \int_X f \, \overline{g} \, d\mu.$$

A similar discussion applies to sequences (of complex numbers), denoted by

$$c := (c_j)_{j \in J},$$

where $J$ is a countable index set, which can be finite or infinite. If we define

$$\ell^p(J) = \left\{ (c_j)_{j \in J} \;\middle|\; \sum_{n \in J} |c_n|^p < +\infty \right\}$$

for $p \in \mathbb{N}$, then $\ell^p(J)$ is a Banach space with the corresponding norm

$$\|c\|_p = \Big( \sum_{j \in J} |c_j|^p \Big)^{1/p}$$

and in particular, $\ell^2(J)$ is a Hilbert space with the inner product

$$\langle c, d \rangle = \sum_{n} c_n \overline{d_n}.$$

In case $J$ is infinite, we will write just $\ell^p$ instead of $\ell^p(J)$.
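For a finite index set the definitions above can be evaluated directly. The following is a minimal sketch in Python; the helper names `lp_norm` and `l2_inner` and the sample sequences are chosen here purely for illustration and do not appear elsewhere in the text.

```python
def lp_norm(c, p):
    """Return (sum_j |c_j|^p)^(1/p) for a finite sequence c."""
    return sum(abs(cj) ** p for cj in c) ** (1.0 / p)


def l2_inner(c, d):
    """Return sum_n c_n * conj(d_n), the inner product on l^2(J)."""
    return sum(complex(cn) * complex(dn).conjugate() for cn, dn in zip(c, d))


# Hypothetical sample sequences over the index set J = {0, 1, 2}.
c = [3 + 4j, 0, 1j]
d = [1, 2, 1j]

norm1 = lp_norm(c, 1)       # |3+4i| + |0| + |i|
norm2 = lp_norm(c, 2)       # square root of 25 + 0 + 1
inner_cc = l2_inner(c, c)   # coincides with norm2 squared
inner_cd = l2_inner(c, d)
```

Note that the conjugate sits on the second argument, so the inner product is linear in the first argument, as assumed later in the discussion of frames.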

1.2 Fourier transforms

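For $f \in L^1$, the Fourier transform is defined and denoted by

$$F(\xi) := (\mathcal{F}f)(\xi) = \frac{1}{\sqrt{2\pi}} \int f(x) e^{-i\xi x} \, dx. \tag{1.1}$$

Given $F = \mathcal{F}f \in L^1$, the function $f$ can be reconstructed by the inverse formula as

$$f(x) = \frac{1}{\sqrt{2\pi}} \int F(\xi) e^{i\xi x} \, d\xi. \tag{1.2}$$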

The reason for this particular normalization is the fact that the Fourier transform is then an isometry (see below).

The convolution of two functions $f, h \in L^1$ is defined by

$$g(x) = (f * h)(x) = \int f(t) h(x - t) \, dt.$$

In terms of the Fourier transform, and with the normalization (1.1), this can be stated as

$$G(\xi) = \sqrt{2\pi} \, F(\xi) H(\xi),$$

which is the content of the convolution theorem.

A direct consequence of the convolution theorem is the Parseval formula: for $f, h \in L^1 \cap L^2$, we have

$$\langle \mathcal{F}f, \mathcal{F}h \rangle = \langle f, h \rangle, \tag{1.3}$$

which means that the Fourier transform preserves the inner product. The same applies to the $L^2$-norm:

$$\|\mathcal{F}f\|_2 = \|f\|_2, \tag{1.4}$$

for $f \in L^1 \cap L^2$. This follows immediately from the previous formula by setting $h := f$, and is referred to as the Plancherel formula. The Fourier transform is thus an isometric operator.
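The normalization in (1.1) and the Plancherel formula (1.4) can be checked numerically. The sketch below approximates the integrals by a midpoint rule on a truncated interval; the test function (a Gaussian, whose transform under (1.1) is again $e^{-\xi^2/2}$), the truncation to $[-10, 10]$ and the helper names are assumptions made for this illustration only.

```python
import cmath
import math


def fourier(f, xi, lo=-10.0, hi=10.0, n=400):
    """Midpoint-rule approximation of (1/sqrt(2*pi)) * int f(x) exp(-i*xi*x) dx."""
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * xi * x)
    return total * h / math.sqrt(2 * math.pi)


def sq_norm(g, lo=-10.0, hi=10.0, n=400):
    """Midpoint-rule approximation of the squared L^2 norm of g."""
    h = (hi - lo) / n
    return sum(abs(g(lo + (k + 0.5) * h)) ** 2 for k in range(n)) * h


def gauss(x):
    return math.exp(-x * x / 2)


F1 = fourier(gauss, 1.0)                            # close to exp(-1/2)
norm_f = sq_norm(gauss)                             # close to sqrt(pi)
norm_F = sq_norm(lambda xi: fourier(gauss, xi))     # should match norm_f
```

With a different prefactor in (1.1), the two squared norms would differ by a constant factor; this is the sense in which the chosen normalization makes the transform an isometry.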

When dealing with functions of the Hilbert space $L^2$, the formula (1.1) cannot be used directly. By a limiting process, however, it can be extended to functions of finite energy, provided the transform and the Plancherel formula are available for a more specific family of functions that is dense in $L^2$.

To illustrate this process, take $f \in L^2$. Since the space of smooth functions with compact support (denoted by $C_0^\infty$) is dense in $L^2$, there exists a Cauchy sequence $(f_n)$ of functions in $C_0^\infty$ such that $\lim_{n \to +\infty} \|f - f_n\| = 0$. The Fourier transform is well defined for all $f_n \in C_0^\infty$, lies in $L^2$, and will be denoted by $F_n$. By the Plancherel formula (which also holds for functions in $C_0^\infty$), $(F_n)$ is also a Cauchy sequence. Since $F_n \in L^2$ and $L^2$ is a Hilbert space, hence complete, $(F_n)$ has a (strong) $L^2$-limit

$$F = \lim_{n \to +\infty} F_n,$$

which is in $L^2$ and will be called the Fourier transform of $f$. The extension defined in this way enjoys all the properties of the transform for $C_0^\infty$ functions [10]. For convenience, we will use the same notation as in the case of absolutely integrable functions.

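The Discrete Time Fourier Transform of $(x_n)_{n \in \mathbb{Z}} \in \ell^1$ is defined by

$$X(\xi) = \sum_{n \in \mathbb{Z}} x_n e^{-i\xi n}. \tag{1.5}$$

The inverse transform is given by

$$x_n = \frac{1}{2\pi} \int_0^{2\pi} X(\xi) e^{i\xi n} \, d\xi, \tag{1.6}$$

under the assumption $X(\xi) \in L^1$. Using similar arguments as above, the transform can be extended to $\ell^2$.

With this normalization, the analogues of the Parseval and Plancherel formulas take the form

$$\int_0^{2\pi} X(\xi) \overline{Y(\xi)} \, d\xi = 2\pi \sum_{n \in \mathbb{Z}} x_n \overline{y_n}, \tag{1.7}$$

and

$$\int_0^{2\pi} |X(\xi)|^2 \, d\xi = 2\pi \sum_{n \in \mathbb{Z}} |x_n|^2, \tag{1.8}$$

respectively.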
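For a finitely supported sequence, (1.6) and (1.8) can be verified exactly, because the integrands are trigonometric polynomials and an equispaced rule with enough nodes integrates them without error. The sketch below assumes a hypothetical four-term sequence indexed from $n = 0$; the names `dtft`, `x`, `M` are illustrative.

```python
import cmath
import math


def dtft(x, xi):
    """X(xi) = sum_n x_n exp(-i*xi*n) for a finite sequence indexed from n = 0."""
    return sum(xn * cmath.exp(-1j * xi * n) for n, xn in enumerate(x))


x = [1.0, -2.0, 0.5, 3.0]          # hypothetical sample sequence
M = 64                             # number of equispaced nodes on [0, 2*pi)
grid = [2 * math.pi * m / M for m in range(M)]

# Left and right sides of the Plancherel-type formula (1.8).
lhs = sum(abs(dtft(x, xi)) ** 2 for xi in grid) * (2 * math.pi / M)
rhs = 2 * math.pi * sum(abs(xn) ** 2 for xn in x)

# Inverse formula (1.6) for n = 1, discretized on the same grid.
x1_rec = sum(dtft(x, xi) * cmath.exp(1j * xi) for xi in grid) / M
```

The factor $2\pi$ on the right side of (1.8) is a consequence of the missing prefactor in (1.5); it reappears in the proofs of Chapter 3.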

1.3 Frames and bases

Frames were introduced in [9] in connection with a class of nonharmonic Fourier series. They provide a general yet stable framework for various function representations. An example of such a representation is

$$f = \sum_{j \in J} \langle f, e_j \rangle e_j, \qquad \forall f \in L^2, \tag{1.9}$$

which applies if the system $(e_j)_{j \in J}$ constitutes an orthonormal basis of $L^2$. An orthonormal [...]

An interested reader can find an in-depth discussion of frames, also in connection with wavelets, in [3, 7, 12].

A system $(e_j)_{j \in J}$ of functions in $L^2$ is called a frame if for some $\alpha, \beta > 0$ and for all $f \in L^2$,

$$\alpha \|f\|^2 \le \sum_{j \in J} |\langle f, e_j \rangle|^2 \le \beta \|f\|^2. \tag{1.10}$$

A frame is said to be exact if, after removing any single element from it, the inequality (1.10) no longer holds.

In case $\alpha = \beta$, the frame is called a tight frame and the definition reads

$$\|f\|^2 = \frac{1}{\alpha} \sum_{j \in J} |\langle f, e_j \rangle|^2, \tag{1.11}$$

which after some manipulations with inner products (the polarization identity) yields

$$\langle f, g \rangle = \frac{1}{\alpha} \sum_{j \in J} \langle f, e_j \rangle \langle e_j, g \rangle,$$

where $g$ is any function in $L^2$. Since the inner product is linear in the first argument, it is indeed possible to write

$$\langle f, g \rangle = \Big\langle \frac{1}{\alpha} \sum_{j \in J} \langle f, e_j \rangle e_j, \, g \Big\rangle.$$

With a slight abuse of notation one can rewrite this formula as

$$f = \frac{1}{\alpha} \sum_{j \in J} \langle f, e_j \rangle e_j. \tag{1.12}$$

Such a representation is merely formal: only after taking the inner product with an arbitrary $g \in L^2$ does it lead to a valid formula. It is usual to say that such a formula holds “in the weak sense” [7]. We will use this notation whenever possible.
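A finite-dimensional example may help to fix ideas. The sketch below works in $\mathbb{R}^2$ instead of $L^2$ and uses three unit vectors at mutual angles of 120 degrees (a standard textbook example, not taken from the references above). These vectors form a tight frame with $\alpha = 3/2$, and (1.12) reconstructs any vector even though the system is not a basis.

```python
import math

# Three unit vectors in R^2 at 120-degree spacing (illustrative choice).
angles = (math.pi / 2, math.pi / 2 + 2 * math.pi / 3, math.pi / 2 + 4 * math.pi / 3)
frame = [(math.cos(t), math.sin(t)) for t in angles]


def dot(u, v):
    """Euclidean inner product in R^2."""
    return u[0] * v[0] + u[1] * v[1]


f = (0.7, -1.3)                                   # arbitrary test vector
coeffs = [dot(f, e) for e in frame]               # frame coefficients <f, e_j>
alpha = sum(c * c for c in coeffs) / dot(f, f)    # tight-frame constant from (1.11)

# Reconstruction (1.12): f = (1/alpha) * sum_j <f, e_j> e_j
recon = tuple(sum(c * e[i] for c, e in zip(coeffs, frame)) / alpha for i in range(2))
```

Here $\alpha > 1$ reflects the redundancy of the system: three unit vectors span a two-dimensional space.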

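If $(e_j)_{j \in J}$ is a tight frame with $\alpha = \beta = 1$ and is such that $\|e_j\| = 1$ for all $j \in J$, then the equation (1.11) is precisely the Parseval identity, and since

$$\|e_k\|^2 = \sum_{j \in J} |\langle e_k, e_j \rangle|^2 = \|e_k\|^4 + \sum_{j \in J, \, j \neq k} |\langle e_k, e_j \rangle|^2,$$

it follows that $\langle e_k, e_j \rangle = 0$ for $j \neq k$, so that $(e_j)_{j \in J}$ is nothing more than an orthonormal basis.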

In the following we derive a general expansion for arbitrary frames, similar to (1.9) or (1.12). If $(e_j)_{j \in J}$ is a frame, we define a linear map $T : L^2 \to \ell^2(J)$ by

$$(Tf)_j = \langle f, e_j \rangle, \qquad \forall j \in J. \tag{1.13}$$

By (1.10), $T$ is bounded, with $\|Tf\|^2 \le \beta \|f\|^2$. Moreover, its adjoint $T^*$ can be computed as

$$T^* c = \sum_{j \in J} c_j e_j \tag{1.14}$$

in the weak sense. 1)

If we define a linear operator $S : L^2 \to L^2$ by $S = T^* T$, then

$$\langle Sf, f \rangle = \|Tf\|^2 = \sum_{j \in J} |\langle f, e_j \rangle|^2.$$

We will call $S$ the frame operator. It can be verified that $S$ is bounded, self-adjoint and invertible.

The frame condition (1.10) can now be stated in operator form ($I$ denotes the identity operator) as

$$\alpha I \le S \le \beta I,$$

which implies that $S$ is bounded below by $\alpha > 0$ and is therefore invertible, and that $S^{-1}$ is bounded above by $\alpha^{-1}$; see [7].

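If we now define $\tilde{e}_j = S^{-1} e_j$, then it can be checked that the system $(\tilde{e}_j)_{j \in J}$ is a frame as well [7], satisfying

$$\frac{1}{\beta} \|f\|^2 \le \sum_{j \in J} |\langle f, \tilde{e}_j \rangle|^2 \le \frac{1}{\alpha} \|f\|^2,$$

or, in operator form,

$$\frac{1}{\beta} I \le S^{-1} \le \frac{1}{\alpha} I.$$

The system $(\tilde{e}_j)_{j \in J}$ is called the dual frame, because the dual frame of $(\tilde{e}_j)_{j \in J}$ is again $(e_j)_{j \in J}$. We will also denote by $\tilde{T}$ the linear map associated with the dual frame as in (1.13):

$$(\tilde{T} f)_j = \langle f, \tilde{e}_j \rangle, \qquad \forall j \in J,$$

and the dual frame operator $\tilde{S}$ is defined as $\tilde{S} = \tilde{T}^* \tilde{T}$. Since $S$ is self-adjoint, we have

$$(T S^{-1} f)_j = \langle S^{-1} f, e_j \rangle = \langle f, S^{-1} e_j \rangle = \langle f, \tilde{e}_j \rangle = (\tilde{T} f)_j,$$

or $\tilde{T} = T S^{-1}$, which implies

$$\tilde{T}^* T = (T S^{-1})^* T = S^{-1} T^* T = S^{-1} S = I$$

and

$$T^* \tilde{T} = T^* T S^{-1} = S S^{-1} = I.$$

Inserting (1.13) and (1.14) in the last two equalities results in

$$f = \sum_{j \in J} \langle f, e_j \rangle \tilde{e}_j = \sum_{j \in J} \langle f, \tilde{e}_j \rangle e_j, \tag{1.15}$$

in the weak sense.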
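The construction of the dual frame can again be illustrated in $\mathbb{R}^2$, where the frame operator is a $2 \times 2$ matrix. The sketch below uses a hypothetical, non-tight frame of three vectors; it builds $S$, the dual frame $\tilde{e}_j = S^{-1} e_j$, both expansions in (1.15), and the optimal frame bounds as the extreme eigenvalues of $S$.

```python
frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # illustrative frame in R^2


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def apply(mat, v):
    """Multiply a 2x2 matrix by a vector."""
    return (mat[0][0] * v[0] + mat[0][1] * v[1],
            mat[1][0] * v[0] + mat[1][1] * v[1])


# Frame operator S = T*T, here the matrix sum_j e_j e_j^T.
S = [[sum(e[r] * e[c] for e in frame) for c in range(2)] for r in range(2)]

# Inverse of the 2x2 matrix S via the adjugate formula.
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det, -S[0][1] / det],
         [-S[1][0] / det, S[0][0] / det]]

dual = [apply(S_inv, e) for e in frame]        # dual frame vectors

f = (2.0, -1.0)
# Both expansions in (1.15).
recon1 = tuple(sum(dot(f, e) * d[i] for e, d in zip(frame, dual)) for i in range(2))
recon2 = tuple(sum(dot(f, d) * e[i] for e, d in zip(frame, dual)) for i in range(2))

# Optimal frame bounds: smallest and largest eigenvalue of the symmetric matrix S.
tr = S[0][0] + S[1][1]
disc = (tr * tr / 4 - det) ** 0.5
alpha, beta = tr / 2 - disc, tr / 2 + disc
```

For this frame one finds $\alpha = 1$ and $\beta = 3$, so the frame is not tight and the dual frame differs from a rescaled copy of the original one.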

1) This follows from

$$\langle T^* c, f \rangle = \langle c, Tf \rangle = \sum_{j \in J} c_j \overline{\langle f, e_j \rangle} = \sum_{j \in J} c_j \langle e_j, f \rangle.$$

The reconstruction formula (1.15) also implies that any frame $(e_j)_{j \in J}$, $e_j \in L^2$, generates the space $L^2$, i.e. the closure of the linear span of $(e_j)_{j \in J}$ equals $L^2$. This means that, given a frame $(e_j)_{j \in J}$, $e_j \in L^2$, one can express any $f \in L^2$ as

$$f = \sum_{j \in J} c_j e_j.$$

If we additionally require this representation to be unique, the frame becomes a basis that also satisfies (1.10). A basis obtained in this way is called a Riesz basis. An interesting fact is that every Riesz basis is an exact frame (as defined above) and, conversely, every exact frame is a Riesz basis. The proof can be found in [23].

Assuming that the representation (1.15) is unique (i.e., the frame $(e_j)_{j \in J}$ is a Riesz basis) and substituting $f := e_k$, we deduce the biorthogonality relation

$$\langle e_k, \tilde{e}_j \rangle = \delta_{k,j},$$

where $\tilde{e}_j = S^{-1} e_j$ and $S$ is the frame operator defined above. The systems $(e_j)_{j \in J}$ and $(\tilde{e}_j)_{j \in J}$ are therefore said to be mutually biorthogonal. According to [23], a system $(e_j)_{j \in J}$ in a separable Hilbert space is a Riesz basis if and only if it possesses a biorthogonal system $(\tilde{e}_j)_{j \in J}$ which is likewise a Riesz basis.

1.4 The continuous wavelet transform

In this section we will briefly review the (historically) first example of a wavelet transform, neglecting some technical details. We will examine how the continuous transform is defined and what properties must be satisfied to have an inverse transform. For a complete reference, consult e.g. [4, 7, 17].

In subsequent chapters we will investigate a more sophisticated concept which leads to an elegant description of the same problem, but it will then be enlightening to see the connection with the continuous transform introduced here.

To define a wavelet one must first specify the function space where the analysis takes place. In this introductory chapter we will consider only the Hilbert space $L^2$ of functions defined on the whole real line (to avoid boundary problems).

A wavelet is then defined as any function $\psi \in L^2$ satisfying the admissibility condition

$$\int \frac{|\Psi(\xi)|^2}{|\xi|} \, d\xi = C_\psi, \qquad \text{with } 0 < C_\psi < +\infty, \tag{1.16}$$

where $\Psi$ denotes the Fourier transform of $\psi$ and $C_\psi$ is a constant depending only on $\psi$. This condition is needed in order to have an inverse wavelet transform (defined below). We will see that additional requirements are needed for practical analysis. Practical examples of wavelets will be given later.

In many practical situations $\psi$ will also be (absolutely) integrable, and thus its Fourier transform $\Psi$ will be continuous, which together with (1.16) implies $\Psi(0) = 0$, or equivalently,

$$\int \psi(x) \, dx = 0,$$

which means that in all practical situations the function $\psi$ must oscillate at least a little. This is the motivation behind the term “wavelet”.

As we already indicated, the continuous wavelet transform is defined as

$$\mathrm{CWT}_\psi(f, a, b) = \langle f, \psi_{a,b} \rangle, \tag{1.17}$$

where

$$\psi_{a,b}(x) = \frac{1}{|a|^{1/2}} \, \psi\!\left( \frac{x - b}{a} \right), \qquad a \in \mathbb{R} \setminus \{0\}, \ b \in \mathbb{R},$$

is a family of functions obtained from $\psi$ by translations and dilations. This particular normalization is chosen so that the norm is preserved, i.e. $\|\psi_{a,b}\| = \|\psi\|$ for all values of $a$ and $b$. One can also write

$$\mathrm{CWT}_\psi(f, a, b) = \frac{1}{|a|^{1/2}} \int f(x) \, \overline{\psi\!\left( \frac{x - b}{a} \right)} \, dx, \tag{1.18}$$

where the complex conjugation can be omitted if the analyzing wavelet is real. The operator $\mathrm{CWT}_\psi(\cdot, a, b)$ is, of course, linear with respect to the first variable.

To compute the inverse transform one can use the following observation [7]:

$$\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{1}{a^2} \, \mathrm{CWT}_\psi(f, a, b) \, \overline{\mathrm{CWT}_\psi(g, a, b)} \, da \, db = 2\pi C_\psi \langle f, g \rangle, \tag{1.19}$$

which can be derived by substituting (1.18) into the left side of (1.19) and using the isometry of the Fourier transform and the admissibility condition (1.16).

The inverse wavelet transform, holding in the weak sense, follows directly from (1.19):

$$f = \frac{1}{2\pi C_\psi} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \frac{1}{a^2} \, \mathrm{CWT}_\psi(f, a, b) \, \psi_{a,b} \, da \, db. \tag{1.20}$$

The continuous wavelet transform is usually depicted in a so-called scalogram, where the x-axis represents the translation (time) parameter $b$ and the y-axis the scale parameter $a$. One example of such a scalogram is given in Figure 1.1.
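The definitions of this section can be tried out numerically. The sketch below is an illustration under stated assumptions, not part of the cited theory: it takes the real "Mexican hat" function $\psi(x) = (1 - x^2) e^{-x^2/2}$ as analyzing wavelet, approximates all integrals by a midpoint rule on $[-40, 40]$, and checks the zero-mean property, the norm preservation of $\psi_{a,b}$, and one value of (1.18) for a Gaussian signal.

```python
import math


def psi(x):
    """Mexican-hat function (1 - x^2) exp(-x^2 / 2), used here as the wavelet."""
    return (1 - x * x) * math.exp(-x * x / 2)


def psi_ab(x, a, b):
    """Dilated and translated wavelet |a|^(-1/2) * psi((x - b) / a)."""
    return psi((x - b) / a) / math.sqrt(abs(a))


def integrate(g, lo=-40.0, hi=40.0, n=8000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h


def cwt(f, a, b):
    """CWT_psi(f, a, b) = <f, psi_ab> for real f and real psi, cf. (1.18)."""
    return integrate(lambda x: f(x) * psi_ab(x, a, b))


def signal(x):
    return math.exp(-x * x / 2)


mean_psi = integrate(psi)                                      # should vanish
norm_psi = integrate(lambda x: psi(x) ** 2)                    # 3*sqrt(pi)/4
norm_psi_ab = integrate(lambda x: psi_ab(x, 3.0, -2.0) ** 2)   # same value
c1 = cwt(signal, 1.0, 0.0)                                     # sqrt(pi)/2
```

Evaluating `cwt` on a grid of pairs $(a, b)$ would produce the data displayed in a scalogram.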

1.5 Discretization of the wavelet transform

The strength of the continuous wavelet transform lies in its better time-frequency resolution compared with the Short Time Fourier Transform. However, it turns out that, under modest requirements on the analyzing wavelet, the continuous representation is redundant for the purpose of representing functions. We will discuss this briefly; a full exposition is given in [7] or [4].

The idea is to restrict the “continuous” variables a and b in (1.17) to proper discrete values so that for all a we have

∪

b

Figure 1.1: Signal ‘chirp’ and its scalogram (the signal is plotted against time, the scalogram against time and scale).

This can be accomplished by putting $a = a_0^{-j}$, $b = b_0 a_0^{-j} k$, with $a_0, b_0 \in \mathbb{R}$, $a_0 \neq 1$ and $j, k \in \mathbb{Z}$, leading to

$$\psi_{j,k}(x) = a_0^{j/2} \, \psi(a_0^{j} x - k b_0).$$

The wavelet transform in its discrete form is then evaluated by means of the wavelet coefficients $\langle f, \psi_{j,k} \rangle$.
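As an illustration of the discretized family, the sketch below assumes the dyadic choice $a_0 = 2$, $b_0 = 1$ and the Haar function as $\psi$ (neither is fixed by the text at this point) and computes a small Gram matrix of the $\psi_{j,k}$. The midpoint rule is exact here because every function involved is piecewise constant on the grid cells.

```python
def haar(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def psi_jk(x, j, k, a0=2.0, b0=1.0):
    """Discretized family a0^(j/2) * psi(a0^j * x - k * b0)."""
    return a0 ** (j / 2) * haar(a0 ** j * x - k * b0)


def inner(p, q, lo=-1.0, hi=3.0, n=4096):
    """Midpoint-rule inner product of psi_p and psi_q for index pairs (j, k)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += psi_jk(x, *p) * psi_jk(x, *q)
    return total * h


indices = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 3), (-1, 0)]
gram = [[inner(p, q) for q in indices] for p in indices]

size = len(indices)
max_diag_err = max(abs(gram[i][i] - 1.0) for i in range(size))
max_offdiag = max(abs(gram[i][m]) for i in range(size) for m in range(size) if i != m)
```

For this particular choice the Gram matrix is the identity, i.e. the selected functions are orthonormal; for a general wavelet and general $a_0$, $b_0$ one can only hope for the frame property discussed next.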

Let us examine how such a discretized wavelet transform actually leads to a stable function representation.

A function $f$ can be represented by means of the wavelet coefficients $\{\langle f, \psi_{j,k} \rangle\}$ if for every other function $g \neq f$ the wavelet coefficients $\{\langle g, \psi_{j,k} \rangle\}$ differ from $\{\langle f, \psi_{j,k} \rangle\}$. By linearity, this is equivalent to saying that if $\langle f, \psi_{j,k} \rangle = 0$ for all $j, k \in \mathbb{Z}$, then $f$ is zero.

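Stability is generally defined as the property of having bounded output for bounded input. In our situation, a representation is said to be stable if a bound on the $\ell^2$-norm of the sequence $\langle f, \psi_{j,k} \rangle$ implies a bound on the $L^2$-norm of $f$, i.e., if

$$\sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2 \le \varepsilon,$$

then $\|f\|^2 \le \delta$. Assuming that the representation is stable, then by putting

$$g = \sqrt{\varepsilon} \cdot \frac{f}{\left( \sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2 \right)^{1/2}},$$

the property $\sum_{j,k} |\langle g, \psi_{j,k} \rangle|^2 \le \varepsilon$ is clearly satisfied and therefore

$$\|g\|^2 = \frac{\varepsilon \|f\|^2}{\sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2} \le \delta,$$

implying the lower bound estimate in the definition of a frame:

$$\alpha \|f\|^2 \le \sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2, \tag{1.21}$$

where, in this particular case, we have $\alpha = \varepsilon / \delta$.

The converse implication is even more obvious: if $\sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2$ is bounded then, due to (1.21), $\|f\|^2$ is also bounded.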

It turns out that for all practical purposes (decay of $\psi$ in frequency and in the independent variable) the second frame estimate,

$$\sum_{j,k} |\langle f, \psi_{j,k} \rangle|^2 \le \beta \|f\|^2,$$

is automatically satisfied (see below) and hence, for a stable function reconstruction, the discretized wavelet family $\{\psi_{j,k}\}$ needs to constitute a frame. The following proposition is taken from Daubechies's book [7]; it also gives estimates of the frame bounds $\alpha, \beta$. The proof is challenging and is not reproduced here.

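Proposition 1.1 (Daubechies, [7]). If $\psi$ and $a_0$ are such that

$$\inf_{|\xi| \in [1, a_0]} \sum_{m \in \mathbb{Z}} |\Psi(a_0^m \xi)|^2 > 0, \qquad \sup_{|\xi| \in [1, a_0]} \sum_{m \in \mathbb{Z}} |\Psi(a_0^m \xi)|^2 < +\infty, \tag{1.22}$$

and if $\theta(s) = \sup_{\xi \in \mathbb{R}} \sum_{m \in \mathbb{Z}} |\Psi(a_0^m \xi)| \, |\Psi(a_0^m \xi + s)|$ decays at least as fast as $(1 + |s|)^{-(1+\varepsilon)}$, with $\varepsilon > 0$, then there exists $b_0^{\mathrm{thr}} > 0$ such that the $\psi_{j,k}$ constitute a frame for all choices $b_0 < b_0^{\mathrm{thr}}$. For $b_0 < b_0^{\mathrm{thr}}$, the following expressions are frame bounds for the $\psi_{j,k}$:

The conditions on $\theta$ and (1.22) are satisfied if, e.g., $|\Psi(\xi)| \le C |\xi|^{\mu} (1 + |\xi|)^{-\nu}$ with $\mu > 0$, $\nu > \mu + 1$.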

We have seen that the wavelet transforms discussed so far lead to efficient function representations. Still, there is no indication of how to actually construct wavelets such that the family $\psi_{j,k}$ constitutes a frame. In some applications, one may also impose additional requirements on $\psi$, such as smoothness or compact support, or it may even be desired for the $\psi_{j,k}$ to constitute an orthonormal basis. The aim of the subsequent chapters is to present a concise exposition of several methods for constructing wavelets with desired properties. The underlying structure for these methods is the so-called multiresolution analysis.

Chapter 2

Multiresolution analysis

The concept of multiresolution is considered to be vital in the theory of wavelets. It originates from S. Mallat [16] and has since been used as a tool for various wavelet constructions.

We present the definition due to W. Sweldens [21] because of the generality it provides. After discussing the basic concept, we will briefly review three important cases, which will be further elaborated in subsequent chapters.

2.1 The definition, scaling functions

For the moment we will stay with a more general setting involving a general function space $L^2 = L^2(X, S, \mu)$; see the introduction. For the time being, in terms of complexity this amounts to the same setting as with our primary space $L^2$. In the subsequent two chapters we will, however, need to depart from this general concept; later, in Chapter 4, we will return to this general setting.

A (primal) multiresolution analysis 1) is defined as a sequence of closed subspaces $V_j \subset L^2$, $j \in J \subset \mathbb{Z}$, satisfying

1. For all $j \in J$, $V_j \subset V_{j+1}$,

2. $\overline{\bigcup_{j \in J} V_j} = L^2$,

3. If $J = \mathbb{Z}$, then also $\bigcap_{j \in J} V_j = \{0\}$,

4. For all $j \in J$, there exists a family of (primal) scaling functions $\{\varphi_{j,k} \mid k \in K_j\}$ constituting a Riesz basis of $V_j$, where $K_j$ is a set of indices satisfying $K_j \subset K_{j+1}$.

We assume that either $J = \mathbb{Z}$ or $J = \mathbb{N}_{j_0}$, where $\mathbb{N}_{j_0} = \{j_0, j_0 + 1, j_0 + 2, \ldots\}$ for some $j_0 \in \mathbb{Z}$.

Recalling the introductory material, we know that for a family of scaling functions $\{\varphi_{j,k} \mid k \in K_j\}$ constituting a Riesz basis there exists a family $\{\tilde{\varphi}_{j,k} \mid k \in K_j\}$, likewise constituting a Riesz basis of some $\tilde{V}_j$. The latter family is referred to as dual scaling functions and the sequence $\tilde{V}_j$ is naturally defined as a dual multiresolution analysis. Moreover, there is a biorthogonal relation between $\varphi_{j,k}$ and $\tilde{\varphi}_{j,k}$:

$$\langle \varphi_{j,k_1}, \tilde{\varphi}_{j,k_2} \rangle = \delta_{k_1,k_2},$$

for all $k_1, k_2 \in K_j$ and fixed $j \in J$.

1) Instead of “analysis”, one may also use the terms “representation” or “approximation”. We will [...]

A basic property of multiresolution analysis is a two-scale relation, also referred to as a refinement or dilation equation:

$$\varphi_{j,k} = \sum_{l \in K_{j+1}} h_{j,k,l} \, \varphi_{j+1,l}. \tag{2.1}$$

It follows from the fact that $\varphi_{j,k} \in V_j \subset V_{j+1}$ and that $\{\varphi_{j+1,l} \mid l \in K_{j+1}\}$ is a Riesz basis of $V_{j+1}$.

2.2 Wavelet functions

One can use the scaling functions just introduced to approximate any function $f$ in our space $L^2$, in the sense that

$$\lim_{j \to +\infty} \|f - P_j f\| = 0,$$

where $P_j = \sum_{k \in K_j} \langle \cdot, \tilde{\varphi}_{j,k} \rangle \varphi_{j,k}$ denotes the projection from $L^2$ onto $V_j$ and the norm is induced by the metric defined on $X$.

To capture the details missing from these approximations, Sweldens defined the space $W_j$ as a complement of $V_j$ in $V_{j+1}$, in the sense that $V_{j+1} = V_j \oplus W_j$, where the sum is direct. This is a slight extension of the original Mallat definition, which we will see later. If for all $j \in J$, $W_j$ has a Riesz basis denoted by $\{\psi_{j,m} \mid m \in M_j\}$, where $M_j = K_{j+1} \setminus K_j$, and $\tilde{V}_j \perp W_j$, we call $\{\psi_{j,m} \mid m \in M_j\}$ a family of wavelet functions, or just wavelets.

From this definition one may deduce several facts. First, for each $j \in J$ there exists a dual set of wavelet functions, $\{\tilde{\psi}_{j,m} \mid m \in M_j\}$. Let $\tilde{W}_j$ denote the space they generate. Analogously to the case of scaling functions, there is a relation between primal and dual wavelet functions:

$$\langle \psi_{j,m_1}, \tilde{\psi}_{j,m_2} \rangle = \delta_{m_1,m_2}, \qquad \forall m_1, m_2 \in M_j.$$

The equality $V_{j+1} = V_j \oplus W_j$ can be iterated. In case there is a coarsest level $j_0$, we obtain

$$V_{j+1} = V_{j_0} \oplus \bigoplus_{l = j_0}^{j} W_l,$$

whilst the second multiresolution property ensures that

$$L^2 = V_{j_0} \oplus \bigoplus_{l = j_0}^{+\infty} W_l. \tag{2.2}$$

If $J = \mathbb{Z}$, then due to the third property it is possible to write

$$V_{j+1} = \bigoplus_{l = -\infty}^{j} W_l, \qquad L^2 = \bigoplus_{l = -\infty}^{+\infty} W_l.$$

One may expect that in the case of wavelets there is likewise some relation between scales. Indeed, since $\psi_{j,m} \in W_j \subset V_{j+1}$, it follows that

$$\psi_{j,m} = \sum_{l \in K_{j+1}} g_{j,m,l} \, \varphi_{j+1,l}. \tag{2.3}$$

2.3 Vanishing moments and the order of multiresolution analysis

The dual wavelet functions $\tilde{\psi}_{j,m}$ are said to have $N$ vanishing moments if, for all $r \in \mathbb{N}_0$, $r < N$, and $j \in J$, $m \in M_j$,

$$\int_X P_r \, \tilde{\psi}_{j,m} \, d\mu = 0,$$

where $P_r \in C^\infty$ are linearly independent 2) functions defined on $X$ and $P_0 \equiv 1$.

The fact that the dual wavelets $\tilde{\psi}_{j,m}$ have $N$ vanishing moments can also be expressed in terms of the corresponding scaling functions. Since $\tilde{W}_j \perp V_j$, we have

$$\langle \tilde{\psi}_{j,m}, \varphi_{j,k} \rangle = 0$$

for all possible indices, implying

$$\Big\langle \sum_{k \in K_j} c^{(r)}_{j,k} \varphi_{j,k}, \, \tilde{\psi}_{j,m} \Big\rangle = 0,$$

where $c^{(r)}_{j,k} \neq 0$ are arbitrarily chosen. But we also have

$$\langle P_r, \tilde{\psi}_{j,m} \rangle = 0, \qquad \forall r \in \mathbb{N}_0, \ r < N.$$

Combining these two equations yields

$$P_r = \sum_{k \in K_j} c^{(r)}_{j,k} \varphi_{j,k}, \qquad \forall r \in \mathbb{N}_0, \ r < N,$$

at least in the weak sense. Hence the scaling functions reproduce up to $N$ linearly independent $C^\infty$ functions.

2) In the sense that “the restrictions of a finite number of these functions to any ε-ball are linearly [...]

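Conversely, if the scaling functions $\varphi_{j,k}$ reproduce up to $N$ linearly independent $C^\infty$ functions, then

$$\int_X P_r \, \tilde{\psi}_{j,m} \, d\mu = \langle P_r, \tilde{\psi}_{j,m} \rangle = \sum_{k \in K_j} c^{(r)}_{j,k} \langle \varphi_{j,k}, \tilde{\psi}_{j,m} \rangle, \qquad \forall r \in \mathbb{N}_0, \ r < N,$$

but since $\tilde{W}_j \perp V_j$, the right side of the equation evaluates to zero.

All in all, if the $\tilde{\psi}_{j,m}$ have $N$ vanishing moments or if the $\varphi_{j,k}$ reproduce $N$ linearly independent $C^\infty$ functions, we say that the order of the multiresolution analysis is $N$.

The same applies to the dual multiresolution analysis. The (primal) wavelets $\psi_{j,m}$ are said to have $\tilde{N}$ vanishing moments if

$$\int_X P_r \, \psi_{j,m} \, d\mu = 0, \qquad \forall r \in \mathbb{N}_0, \ r < \tilde{N},$$

or, by means of scaling functions,

$$\tilde{P}_r = \sum_{k \in K_j} \tilde{c}^{(r)}_{j,k} \tilde{\varphi}_{j,k}, \qquad \forall r \in \mathbb{N}_0, \ r < \tilde{N}.$$

In this case we say that the order of the dual multiresolution analysis is $\tilde{N}$.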

2.4 Special instances of multiresolution

Now that we have established a fairly general framework, we will examine specific properties of multiresolution that follow once additional conditions are imposed.

Multiresolution leading to (classical) orthogonal wavelets

Herein we impose these essential constraints on a multiresolution analysis $\{V_j \mid j \in J\}$:

1. The underlying space where the analysis is done is the space $L^2$ of functions of finite energy, and the index set $J$ is all of $\mathbb{Z}$.

2. The scaling functions $\varphi_{j,k}$ are obtained from one single function $\varphi$ by

$$\varphi_{j,k} = 2^{j/2} \varphi(2^j \cdot - k), \tag{2.4}$$

with $j, k \in \mathbb{Z}$.

3. The space $W_j$, which is now defined as the orthogonal complement of $V_j$ in $V_{j+1}$, is required to possess an orthonormal basis. If this is the case, we will call the wavelets orthogonal.

Without loss of generality, we may suppose that the scaling functions $\varphi_{j,k} = 2^{j/2} \varphi(2^j \cdot - k)$ constitute an orthonormal basis of $V_j$, for if this were not the case, then obtaining a new, orthonormal basis $\varphi^{\#}_{j,k}$ is only a matter of writing

$$\varphi^{\#}_{j,k} = \frac{2^{j/2}}{\sqrt{2\pi}} \left[ \mathcal{F}^{-1} \left( \frac{(\mathcal{F}\varphi)(\xi)}{\left( \sum_{l \in \mathbb{Z}} |(\mathcal{F}\varphi)(\xi + 2l\pi)|^2 \right)^{1/2}} \right) \right] (2^j \cdot - k),$$

where $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the forward and inverse Fourier transform operators, respectively. We will tackle the proof in the next chapter, which is devoted specifically to this kind of multiresolution. Note that, since any function $f \in V_j$ can now be expressed as

$$f = \sum_{k \in \mathbb{Z}} \langle f, \varphi_{j,k} \rangle \varphi_{j,k},$$

the dual multiresolution analysis $\tilde{V}_j$ coincides with the primal one, $V_j$.

In the following chapter, we will also discuss the third constraint in more depth, for it is not immediately obvious that the space spanned by wavelets automatically has an orthonormal basis. However, it will emerge that this is always possible, provided we have a proper multiresolution analysis with the first two additional requirements mentioned above. Moreover, it will turn out that the resulting wavelets likewise enjoy the property of being obtained from one fixed function, in the sense that

$$\psi_{j,k} = 2^{j/2} \psi(2^j \cdot - k). \tag{2.5}$$

It is no surprise that the two-scale relation takes a somewhat more specific form. Since $\varphi \in V_0 \subset V_1$ and due to (2.4), it is possible to write

$$\varphi = \sum_{k \in \mathbb{Z}} \sqrt{2} \, h_k \, \varphi(2 \cdot - k), \qquad (h_k) \in \ell^2, \tag{2.6}$$

or, in terms of the Fourier transform,

$$\Phi(\xi) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k e^{-i \frac{\xi}{2} k} \, \Phi\!\left( \frac{\xi}{2} \right).$$

This can be simplified to

$$\Phi(\xi) = \frac{1}{\sqrt{2}} H\!\left( \frac{\xi}{2} \right) \Phi\!\left( \frac{\xi}{2} \right), \tag{2.7}$$

where $H(\xi) = \sum_{k \in \mathbb{Z}} h_k e^{-i\xi k}$ is the Discrete Time Fourier Transform of the sequence $h_k$. We will refer to the sequence $h_k$ (or $H(\xi)$, interchangeably) as a filter; later it will become obvious why.

The sequence $h_k$ defined above is in $\ell^2$, for we have $h_k = \langle \varphi, \varphi_{1,k} \rangle$ and, by the Parseval identity,

$$\sum_{k \in \mathbb{Z}} |h_k|^2 = 1.$$

As a consequence of the Dominated Convergence Theorem [18, 11], the interchange of the sum and the integral when taking the Fourier transform of (2.6) is possible. This interchange appears commonly throughout the text; additional comments may be found in [22].
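The simplest instance of (2.6) may be worth seeing explicitly. The sketch below assumes the Haar scaling function (the indicator of $[0, 1)$, an example chosen here and not yet introduced in the text) with filter coefficients $h_0 = h_1 = 1/\sqrt{2}$, and checks the two-scale relation pointwise together with $\sum_k |h_k|^2 = 1$.

```python
import math


def phi(x):
    """Haar scaling function: indicator of the interval [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


h = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # filter coefficients h_0, h_1


def two_scale_rhs(x):
    """Right-hand side of (2.6): sum_k sqrt(2) * h_k * phi(2x - k)."""
    return sum(math.sqrt(2) * hk * phi(2 * x - k) for k, hk in enumerate(h))


samples = [i / 16 - 1 for i in range(48)]          # sample points in [-1, 2)
max_err = max(abs(phi(x) - two_scale_rhs(x)) for x in samples)
energy = sum(hk ** 2 for hk in h)                  # sum of |h_k|^2
```

In this case (2.6) simply states that the indicator of $[0, 1)$ is the sum of the indicators of $[0, 1/2)$ and $[1/2, 1)$.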

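A vital property of such a multiresolution is that of scale, meaning that $f \in V_j$ is equivalent to $f(2 \cdot) \in V_{j+1}$. This is, again, a direct consequence of (2.4), since $f \in V_j$ is expressible as

$$f = \sum_{k \in \mathbb{Z}} c_{j,k} \varphi_{j,k} = \sum_{k \in \mathbb{Z}} c_{j,k} 2^{j/2} \varphi(2^j \cdot - k),$$

but then also

$$f(2 \cdot) = \sum_{k \in \mathbb{Z}} c'_{j,k} 2^{\frac{j+1}{2}} \varphi(2^{j+1} \cdot - k) = \sum_{k \in \mathbb{Z}} c'_{j,k} \varphi_{j+1,k},$$

where $c_{j,k} = \langle f, \varphi_{j,k} \rangle$ and $c'_{j,k} = 2^{-1/2} \langle f, \varphi_{j,k} \rangle$. Hence $f(2 \cdot) \in V_{j+1}$. Equally obvious is the property of invariance under dyadic translation, namely, $f \in V_j$ is equivalent to $f(\cdot - 2^{-j} m) \in V_j$, with $m \in \mathbb{Z}$.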

Since we are working with the space $L^2$, the property of vanishing moments may be simplified to polynomial reproduction. In other words, one may take polynomials of degree $r$ (e.g. the monomials $x^r$) as the linearly independent $C^\infty$ functions we denoted by $P_r$. We then say that the wavelet $\psi$ has $N$ vanishing moments if

$$\int x^r \psi(x) \, dx = 0, \qquad \forall r \in \mathbb{N}_0, \ r < N.$$

With respect to the scaling function, this means that any polynomial of degree less than $N$ is expressible as $\sum_{k \in \mathbb{Z}} c_k \varphi(\cdot - k)$. Later we will see how the number of vanishing moments is connected with the regularity of wavelets.
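The moment condition is easy to test for a concrete function. As a sketch, assume the Haar wavelet (an example chosen here for illustration): its zeroth moment vanishes while its first moment does not, so it has exactly $N = 1$ vanishing moment. The midpoint rule used below is exact for these two integrals, since the integrand is piecewise linear on the grid cells.

```python
def haar(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def moment(r, n=1024):
    """Midpoint-rule approximation of the integral of x^r * haar(x) over [0, 1]."""
    h = 1.0 / n
    return sum(((i + 0.5) * h) ** r * haar((i + 0.5) * h) for i in range(n)) * h


m0 = moment(0)   # zeroth moment: vanishes
m1 = moment(1)   # first moment: 1/8 - 3/8, does not vanish
```

Correspondingly, the integer translates of the Haar scaling function reproduce constants but not linear polynomials.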

As we have already indicated, the details of this type of multiresolution analysis are further elaborated in the following chapter, along with a possible construction of compactly supported wavelets which are of particular importance.

Multiresolution leading to (classical) biorthogonal wavelets

This setting is fairly similar to that of orthogonal wavelets. The main difference is that now we work in a fully biorthogonal manner, which has several significant consequences.

We will assume that the space $V_j$ spanned by the scaling functions has merely a Riesz basis $\varphi_{j,k}$. The space spanned by the dual scaling functions $\tilde{\varphi}_{j,k}$ is again denoted by $\tilde{V}_j$. The space $W_j$ is defined as a complement of $V_j$ in $V_{j+1}$, in the sense that the sum $V_{j+1} = V_j \oplus W_j$ is direct, with $W_j \perp \tilde{V}_j$. Hence we respect this multiresolution property in its most general form.

What remains the same as in the previous case is the fact that all the work is accomplished in $L^2$, under the requirement

$$\varphi_{j,k} = 2^{j/2} \varphi(2^j \cdot - k), \qquad \tilde{\varphi}_{j,k} = 2^{j/2} \tilde{\varphi}(2^j \cdot - k).$$

This means that there are two scaling equations,

$$\varphi = \sum_{k \in \mathbb{Z}} \sqrt{2} \, h_k \, \varphi(2 \cdot - k), \qquad \tilde{\varphi} = \sum_{k \in \mathbb{Z}} \sqrt{2} \, \tilde{h}_k \, \tilde{\varphi}(2 \cdot - k), \tag{2.8}$$

or,

$$\Phi(\xi) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k e^{-i \frac{\xi}{2} k} \, \Phi\!\left( \frac{\xi}{2} \right), \qquad \tilde{\Phi}(\xi) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} \tilde{h}_k e^{-i \frac{\xi}{2} k} \, \tilde{\Phi}\!\left( \frac{\xi}{2} \right).$$

The property of scale and the notion of vanishing moments remain the same. We will study biorthogonal wavelets in more depth in Chapter 4.


Multiresolution leading to second generation wavelets

In this case, we impose no further restrictions on the underlying multiresolution. The space $L^2 = L^2(X, S, \mu)$ is much too general to utilize the Fourier transform as a construction tool. Besides, the scaling functions are not necessarily translated and dilated versions of one fixed function, so the Fourier transform would be of practically no use. Instead, we take advantage of a promising technique known as lifting, whose power lies also in the fact that it provides a consistent way of building the classical wavelets briefly discussed above as well. Lifting may even shed new light on the process and purpose of wavelet analysis, for the principle may seem somewhat more “digestible” than that of the classical treatment.


Chapter 3

Construction of orthogonal wavelets

In this chapter we consider a classical multiresolution setting as proposed by S. Mallat. From the vast range of literature, this survey is mainly based on [6, 7, 13, 15, 17].

For clarity, we begin by summarizing the requirements we have already imposed on a multiresolution analysis (leading to classical wavelets). It consists of closed subspaces $V_j \subset L^2$, satisfying

1. $V_j \subset V_{j+1}$, $\forall j \in \mathbb{Z}$,

2. $\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2$,

3. $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$,

4. For all $j \in \mathbb{Z}$, there is a family of scaling functions $\{\varphi_{j,k} \mid k \in \mathbb{Z}\}$ constituting a Riesz basis of $V_j$. The scaling functions $\varphi_{j,k}$ are obtained from one single function $\varphi$ by $\varphi_{j,k} = 2^{j/2} \varphi(2^j \cdot - k)$.

We have seen that this definition necessarily implies the property of scale, that is, $f \in V_j \Leftrightarrow f(2 \cdot) \in V_{j+1}$. In many situations, however, it will prove useful to require merely that the family $\varphi_{0,k} = \varphi(\cdot - k)$ be a Riesz basis of $V_0$, under the hypothesis of the property of scale. More precisely, if $\varphi_{0,k} = \varphi(\cdot - k)$ is a Riesz basis of $V_0$, and if $f \in V_j$ is equivalent to $f(2 \cdot) \in V_{j+1}$, then $\varphi_{j,k} = 2^{j/2} \varphi(2^j \cdot - k)$ is a Riesz basis of $V_j$. The reader can verify this fact directly from the definition of a Riesz basis.

In the following few sections we discuss further properties of a scaling function, followed by arguments leading to a simple evaluation of the corresponding wavelet. After discussing a criterion for good approximation properties of the desired wavelet we turn our attention to an actual possibility of wavelet construction. This, as we will see, reduces to finding an appropriate scaling function. We conclude by deriving a fast algorithm of the Discrete Wavelet Transform, which fits neatly into the concept of multiresolution.


3.1 Further properties of a scaling function

We begin by deriving two auxiliary lemmas for assessing the property of orthonormality and that of Riesz basis in terms of the Fourier transform.

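Lemma 3.1. Let $f \in L^2$. The system $\{f(\cdot - n) \mid n \in \mathbb{Z}\}$ is orthonormal if and only if

$$\sum_{k \in \mathbb{Z}} |F(\xi + 2k\pi)|^2 = \frac{1}{2\pi}, \quad \text{a.e.}$$

Proof. The system $\{f(\cdot - n) \mid n \in \mathbb{Z}\}$ is orthonormal if and only if $\langle f(\cdot - k), f(\cdot - l) \rangle = \delta_{k,l}$ or, equivalently, $\langle f, f(\cdot - l) \rangle = \delta_{0,l}$. Due to the isometry of the Fourier transform,

$$\begin{aligned}
\langle f, f(\cdot - l) \rangle &= \langle F, \mathcal{F} f(\cdot - l) \rangle = \int F(\xi) \overline{F(\xi)} e^{i\xi l} \, d\xi = \int |F(\xi)|^2 e^{i\xi l} \, d\xi \\
&= \sum_{k \in \mathbb{Z}} \int_{2k\pi}^{2(k+1)\pi} |F(\xi)|^2 e^{i\xi l} \, d\xi = \int_0^{2\pi} \sum_{k \in \mathbb{Z}} |F(\xi + 2k\pi)|^2 e^{i\xi l} e^{2\pi i k l} \, d\xi \\
&= \int_0^{2\pi} e^{i\xi l} \sum_{k \in \mathbb{Z}} |F(\xi + 2k\pi)|^2 \, d\xi.
\end{aligned}$$

The last expression has the form of an inverse Discrete Time Fourier Transform, implying that $\langle f, f(\cdot - n) \rangle = \delta_{0,n}$ is equivalent to

$$\sum_{k \in \mathbb{Z}} |F(\xi + 2k\pi)|^2 = \frac{1}{2\pi}.$$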

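Lemma 3.2. Let $S = \{\varphi(\cdot - n) \mid n \in \mathbb{Z}\}$ be a set of functions generating the space $V_0$. The set $S$ is a Riesz basis of $V_0$ if and only if

$$A \le \sum_{k \in \mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 \le B, \quad \text{a.e.,} \tag{3.1}$$

for some strictly positive $A, B$.

Proof. Let $f \in V_0$ with the expansion $f = \sum_n \lambda_n \varphi(\cdot - n)$, $\lambda_n$ not necessarily unique. Taking the Fourier transform of both sides gives

$$F(\xi) = \Lambda(\xi) \Phi(\xi),$$

where $\Lambda(\xi) = \sum_n \lambda_n e^{-i\xi n}$ is $2\pi$-periodic. It follows that

$$\begin{aligned}
\|f\|^2 &= \int |F(\xi)|^2 \, d\xi = \int |\Lambda(\xi)|^2 |\Phi(\xi)|^2 \, d\xi = \sum_{k \in \mathbb{Z}} \int_{2k\pi}^{2(k+1)\pi} |\Lambda(\xi)|^2 |\Phi(\xi)|^2 \, d\xi \\
&= \int_0^{2\pi} \sum_{k \in \mathbb{Z}} |\Lambda(\xi + 2k\pi)|^2 |\Phi(\xi + 2k\pi)|^2 \, d\xi = \int_0^{2\pi} |\Lambda(\xi)|^2 \sum_{k \in \mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 \, d\xi.
\end{aligned} \tag{3.2}$$

If $\Phi$ satisfies (3.1) then

$$\|f\|^2 \le B \int_0^{2\pi} |\Lambda(\xi)|^2 \, d\xi \quad \text{and} \quad \|f\|^2 \ge A \int_0^{2\pi} |\Lambda(\xi)|^2 \, d\xi,$$

that is to say,

$$B^{-1} \|f\|^2 \le \int_0^{2\pi} |\Lambda(\xi)|^2 \, d\xi \le A^{-1} \|f\|^2. \tag{3.3}$$

From the discrete version of the Parseval formula it follows that

$$(2\pi B)^{-1} \|f\|^2 \le \sum_n |\lambda_n|^2 \le (2\pi A)^{-1} \|f\|^2.$$

It remains to show linear independence. If $f = \sum_n \lambda_n \varphi(\cdot - n) = 0$ then by (3.3), $\Lambda(\xi)$ is zero almost everywhere and hence $\lambda_n = 0$, $\forall n \in \mathbb{Z}$. The set $S$ is therefore a Riesz basis of $V_0$.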

We show the converse implication by contradiction. Let $S$ be a Riesz basis of $V_0$, implying

$$\forall f \in V_0, \quad f(x) = \sum_n \lambda_n \varphi(x - n),$$

with the Riesz basis property (3.3). For any such $f$, (3.2) is also satisfied. If, for either $A > 0$ or $B > 0$, the condition (3.1) were not true, then in the case of the lower bound $A$ it is possible to construct a non-zero $2\pi$-periodic function $\Lambda(\xi)$ supported on the set of those $\tilde{\xi}$ for which $\sum_k |\Phi(\tilde{\xi} + 2k\pi)|^2 < A$. But then (3.2) yields a contradiction with the Riesz basis property (3.3). The case of the upper bound $B$ is analogous.

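It is then simple to orthonormalize any Riesz basis $\{\varphi(\cdot - n) \mid n \in \mathbb{Z}\}$ by putting

\[ \Phi^{\#}(\xi) = \frac{1}{\sqrt{2\pi}} \, \frac{\Phi(\xi)}{\left( \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 \right)^{1/2}}, \quad \text{a.e.} \]

By Lemma 3.2, the denominator is strictly positive almost everywhere and thus $\Phi^{\#}(\xi)$ is well defined almost everywhere. Since

\[ \Phi^{\#}(\xi) = \frac{1}{\sqrt{2\pi}} \, \frac{\Phi(\xi)}{\left( \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 \right)^{1/2}} = \Lambda(\xi)\Phi(\xi), \]

with $2\pi$-periodic $\Lambda(\xi)$, we have

\[ \varphi^{\#} = \sum_{n\in\mathbb{Z}} \lambda_n \varphi(\cdot - n), \]

therefore the system $\varphi^{\#}(\cdot - n)$ generates $V_0$. Orthonormality follows from Lemma 3.1.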
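The orthonormalization step can be illustrated numerically. The following is a minimal sketch, not part of the original text, assuming the hat function $\varphi(x) = \max(0, 1 - |x|)$ as an example of a Riesz (but not orthonormal) generator; under convention (1.1) its Fourier transform is $\Phi(\xi) = \frac{1}{\sqrt{2\pi}} \left( \frac{\sin(\xi/2)}{\xi/2} \right)^2$, and the periodized sum has the closed form $\frac{1}{2\pi}\left(1 - \frac{2}{3}\sin^2\frac{\xi}{2}\right)$. The function names and the truncation length are illustrative choices.

```python
import numpy as np

def hat_ft(xi):
    """Fourier transform (convention (1.1)) of the hat function max(0, 1 - |x|)."""
    # np.sinc(t) = sin(pi t) / (pi t), so np.sinc(xi / (2 pi)) = sin(xi/2) / (xi/2)
    return np.sinc(xi / (2.0 * np.pi)) ** 2 / np.sqrt(2.0 * np.pi)

def periodized_energy(ft, xi, n_terms=2000):
    """Truncated sum over |k| <= n_terms of |ft(xi + 2 k pi)|^2."""
    k = np.arange(-n_terms, n_terms + 1)
    return np.sum(np.abs(ft(xi[:, None] + 2.0 * np.pi * k[None, :])) ** 2, axis=1)

def hat_ft_orthonormalized(xi):
    """Phi^#(xi) = Phi(xi) / sqrt(2 pi * sum_k |Phi(xi + 2 k pi)|^2)."""
    # Closed form of the (2 pi)-periodic sum for the hat function.
    energy = (1.0 - (2.0 / 3.0) * np.sin(xi / 2.0) ** 2) / (2.0 * np.pi)
    return hat_ft(xi) / np.sqrt(2.0 * np.pi * energy)

xi = np.linspace(0.0, 2.0 * np.pi, 201)
energy = periodized_energy(hat_ft, xi)                    # varies with xi
energy_sharp = periodized_energy(hat_ft_orthonormalized, xi)  # should be flat

# Numerical Riesz bounds A, B of (3.1) for the hat function.
riesz_A, riesz_B = energy.min(), energy.max()
print("2*pi*A =", 2 * np.pi * riesz_A, " 2*pi*B =", 2 * np.pi * riesz_B)
print("max deviation of orthonormalized sum from 1/(2*pi):",
      np.max(np.abs(energy_sharp - 1.0 / (2.0 * np.pi))))
```

In this sketch the periodized sum of the original generator stays between two positive bounds, as (3.1) requires, while the periodized sum of $\Phi^{\#}$ is constant up to truncation error, which is the orthonormality criterion of Lemma 3.1.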

The following necessary condition will prove to be fundamental in our development. It states how the orthonormality of {φ(· − n) | n ∈ Z} reflects on the filter H(ξ) appearing in the two-scale relation.


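Theorem 3.3. Let $\{\varphi(\cdot - n) \mid n \in \mathbb{Z}\}$ be an orthonormal system in $V_0$. Then

\[ |H(\xi)|^2 + |H(\xi + \pi)|^2 = 2, \quad \text{a.e.,} \]

where $H(\xi)$ is given by the two-scale relation

\[ \Phi(\xi) = \frac{1}{\sqrt{2}} H\!\left(\frac{\xi}{2}\right) \Phi\!\left(\frac{\xi}{2}\right). \tag{3.4} \]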

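Proof. Orthonormality implies

\[ \sum_{k\in\mathbb{Z}} |\Phi(\zeta + 2k\pi)|^2 = \frac{1}{2\pi}, \quad \text{a.e.} \]

Applying (3.4) yields

\[ \sum_{k\in\mathbb{Z}} \frac{1}{2} \left| H\!\left(\tfrac{\zeta}{2} + k\pi\right) \right|^2 \left| \Phi\!\left(\tfrac{\zeta}{2} + k\pi\right) \right|^2 = \frac{1}{2\pi}, \quad \text{a.e.,} \]

or, substituting $\xi = \zeta/2$,

\[
\begin{aligned}
\frac{1}{2\pi} &= \sum_{k\in\mathbb{Z}} \frac{1}{2} |H(\xi + k\pi)|^2 |\Phi(\xi + k\pi)|^2 \\
 &= \frac{1}{2} \sum_{k\in\mathbb{Z}} \left[ |H(\xi + 2k\pi)|^2 |\Phi(\xi + 2k\pi)|^2 + |H(\xi + 2k\pi + \pi)|^2 |\Phi(\xi + 2k\pi + \pi)|^2 \right] \\
 &= \frac{1}{2} \left[ |H(\xi)|^2 \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 + |H(\xi + \pi)|^2 \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi + \pi)|^2 \right] \\
 &= \frac{1}{4\pi} \left( |H(\xi)|^2 + |H(\xi + \pi)|^2 \right).
\end{aligned}
\]

We have split the sum into even and odd indices, which is permissible because the series converges absolutely, and we have used the $2\pi$-periodicity of $H$.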
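The identity of Theorem 3.3 is easy to test on concrete filters. The sketch below is an illustration under the assumption that $H(\xi) = \sum_n h_n e^{-i\xi n}$ with $\sum_n h_n = \sqrt{2}$; it uses the Haar filter and the four-tap Daubechies filter as orthonormal examples, and the two-scale filter of the hat function as a contrasting example that generates a Riesz basis but not an orthonormal one. The helper names are hypothetical.

```python
import numpy as np

def filter_response(h, xi):
    """H(xi) = sum_n h[n] * exp(-i * xi * n) for a finite filter indexed from n = 0."""
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(xi, n)) @ h

def qmf_sum(h, xi):
    """|H(xi)|^2 + |H(xi + pi)|^2, which equals 2 for an orthonormal system."""
    return (np.abs(filter_response(h, xi)) ** 2
            + np.abs(filter_response(h, xi + np.pi)) ** 2)

s3 = np.sqrt(3.0)
haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
daub4 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
# Hat function filter; an index shift only changes the phase of H, not |H|.
hat = np.array([0.5, 1.0, 0.5]) / np.sqrt(2.0)

xi = np.linspace(-np.pi, np.pi, 513)
for name, h in [("haar", haar), ("daub4", daub4), ("hat", hat)]:
    s = qmf_sum(h, xi)
    print(name, "min:", s.min(), "max:", s.max())
```

For the two orthonormal filters the printed minimum and maximum coincide at 2 up to rounding, whereas for the hat function filter the sum varies with $\xi$, consistent with the condition being necessary for orthonormality.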

The lemma below states that a reasonable scaling function $\varphi$ satisfies $|\Phi(0)| = \frac{1}{\sqrt{2\pi}}$, which will prove useful in later development.

Lemma 3.4. Let $\{\varphi(\cdot - n) \mid n \in \mathbb{Z}\}$ be an orthonormal system of functions such that $\Phi$ is $2\pi$-periodic and continuous at 0. Let $H(\xi)$, given by (2.7), be $2\pi$-periodic and continuous and let $H(\xi)$ satisfy $|H(\xi)|^2 + |H(\xi + \pi)|^2 = 2$. Then necessarily

\[ |\Phi(0)| = \frac{1}{\sqrt{2\pi}} \left| \int \varphi(x) \, dx \right| = \frac{1}{\sqrt{2\pi}}. \]

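Proof. Evaluating the equation

\[ |H(\xi)|^2 + |H(\xi + \pi)|^2 = 2 \]

at $\xi = 0$ yields

\[ |H(0)| \le \sqrt{2}. \]

We show that $|H(0)| = \sqrt{2}$ by contradiction. If $|H(0)| < \sqrt{2}$, then $\frac{1}{\sqrt{2}} |H(\zeta)| < 1$ for $\zeta$ small enough. By (2.7), for all $\xi \in \mathbb{R}$,

\[
\begin{aligned}
\Phi(\xi) &= \frac{1}{\sqrt{2}} \Phi\!\left(\frac{\xi}{2}\right) H\!\left(\frac{\xi}{2}\right)
 = \frac{1}{(\sqrt{2})^2} \Phi\!\left(\frac{\xi}{2^2}\right) H\!\left(\frac{\xi}{2^2}\right) H\!\left(\frac{\xi}{2}\right) = \dots \\
 &= \frac{1}{(\sqrt{2})^m} \Phi\!\left(\frac{\xi}{2^m}\right) H\!\left(\frac{\xi}{2^m}\right) \cdots H\!\left(\frac{\xi}{2}\right),
\end{aligned}
\]

for $m \ge 1$. Taking the limit $m \to +\infty$, we would obtain $\Phi(\xi) = 0$ for all $\xi$, which would mean that $\varphi$ is identically zero (up to a set of measure zero). This case was excluded in the definition.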

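Knowing that $|H(0)| = \sqrt{2}$ and that $H$ is $2\pi$-periodic, we have

\[ |H(2k\pi)| = \sqrt{2}. \]

Since

\[ |\Phi(2^{q+1} \, 2k\pi)| = \frac{1}{\sqrt{2}} |\Phi(2^{q} \, 2k\pi)| \, |H(2^{q} \, 2k\pi)| = |\Phi(2^{q} \, 2k\pi)|, \quad \forall q \ge 0, \ k \in \mathbb{Z}, \]

we deduce $|\Phi(2^{q+1} \, 2k\pi)| = |\Phi(2k\pi)|$.

For $k \ne 0$, letting $q \to +\infty$, the Riemann-Lebesgue lemma ($\lim_{|\xi| \to +\infty} \Phi(\xi) = 0$) ensures $\Phi(2k\pi) = 0$. By Lemma 3.1,

\[ \frac{1}{2\pi} = \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 = \sum_{k\in\mathbb{Z},\, k \ne 0} |\Phi(\xi + 2k\pi)|^2 + |\Phi(\xi)|^2. \]

In particular, taking $\xi = 0$, we obtain $|\Phi(0)| = \frac{1}{\sqrt{2\pi}}$.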
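The iterated two-scale relation used in the proof can be checked numerically. The following is a minimal sketch for the Haar case, assuming $\varphi$ is the indicator of $[0, 1)$, so that $H(\xi) = (1 + e^{-i\xi})/\sqrt{2}$ and, under convention (1.1), $\Phi(\xi) = \frac{1 - e^{-i\xi}}{i\xi\sqrt{2\pi}}$. The truncation depth `m` and the function names are illustrative choices, and $\Phi(\xi/2^m)$ is replaced by $\Phi(0) = 1/\sqrt{2\pi}$.

```python
import numpy as np

def haar_filter(xi):
    """H(xi) = (1 + exp(-i xi)) / sqrt(2), so that |H(0)| = sqrt(2)."""
    return (1.0 + np.exp(-1j * xi)) / np.sqrt(2.0)

def haar_ft_exact(xi):
    """Fourier transform (convention (1.1)) of the indicator function of [0, 1)."""
    xi = np.asarray(xi, dtype=float)
    out = np.full(xi.shape, 1.0 / np.sqrt(2.0 * np.pi), dtype=complex)  # value at xi = 0
    nz = xi != 0.0
    out[nz] = (1.0 - np.exp(-1j * xi[nz])) / (1j * xi[nz] * np.sqrt(2.0 * np.pi))
    return out

def haar_ft_product(xi, m=40):
    """Phi(0) * prod_{j=1}^{m} H(xi / 2^j) / sqrt(2): the truncated iteration of (2.7)."""
    xi = np.asarray(xi, dtype=float)
    prod = np.ones(xi.shape, dtype=complex)
    for j in range(1, m + 1):
        prod *= haar_filter(xi / 2.0 ** j) / np.sqrt(2.0)
    return prod / np.sqrt(2.0 * np.pi)

xi = np.linspace(-20.0, 20.0, 401)
err = np.max(np.abs(haar_ft_product(xi) - haar_ft_exact(xi)))
print("max deviation between product and closed form:", err)
print("|Phi(0)| * sqrt(2*pi) =", abs(haar_ft_exact(np.array([0.0]))[0]) * np.sqrt(2 * np.pi))
```

Because $|H(0)| = \sqrt{2}$, each factor $H(\xi/2^j)/\sqrt{2}$ tends to 1 and the product converges; if $|H(0)|$ were smaller than $\sqrt{2}$, the same product would shrink to zero, which is the contradiction exploited in the proof.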

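We conclude this section by discussing a possible construction of a multiresolution analysis, starting from a suitable choice of a scaling function $\varphi$, not from the sequence $V_j$ itself. If $\varphi$ satisfies the two-scale relation

\[ \varphi = \sum_{n\in\mathbb{Z}} h_n \varphi(2\cdot - n), \tag{3.5} \]

with $(h_n) \in \ell^2$, and the Riesz basis property

\[ A \le \sum_{k\in\mathbb{Z}} |\Phi(\xi + 2k\pi)|^2 \le B, \quad \text{a.e.,} \quad A, B > 0, \tag{3.6} \]

we may define

\[ V_j = \operatorname{clos} \operatorname{span} \{ \varphi_{j,k} = 2^{j/2} \varphi(2^j x - k) \mid k \in \mathbb{Z} \}. \tag{3.7} \]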

The property $V_j \subset V_{j+1}$ is a result of (3.5), and together with (3.6) it follows that $\{\varphi_{j,k} \mid k \in \mathbb{Z}\}$ is a Riesz basis of $V_j$. The question arises whether the conditions $\bigcup_{j\in J} V_j = L^2$ and $\bigcap_{j\in J} V_j = \{0\}$ are satisfied. This is addressed by the following proposition. The proof in

Proposition 3.5. If $\varphi$ is such that (3.5) and (3.6) hold, then the $\{V_j \mid j \in \mathbb{Z}\}$ defined by (3.7) satisfy $\bigcap_{j\in J} V_j = \{0\}$. Furthermore, if $|\Phi(\xi)|$ is continuous at 0 and if $\Phi(0) \ne 0$, then $\bigcup_{j\in J} V_j = L^2$.

3.2 Wavelets induced by a scaling function

In this section we present a derivation of an explicit formula for an orthogonal wavelet $\psi \in W_0$, meaning that the family $\{\psi_{0,m} \mid m \in \mathbb{Z}\}$ is to constitute an orthonormal basis of $W_0$. As in the case of a scaling function, the multiresolution framework ensures that the family $\{\psi_{j,m} = 2^{j/2} \psi(2^j \cdot - m) \mid m \in \mathbb{Z}\}$ then automatically constitutes an orthonormal basis of $W_j$.

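We begin by characterizing the space $W_0$. Since $W_0$ is the orthogonal complement of $V_0$ in $V_1$, the condition $f \in W_0$ is equivalent to $f \in V_1$ and $f \perp V_0$. The property $f \in V_1$ is equivalent to $f = \sum_{k\in\mathbb{Z}} \sqrt{2} \lambda_k \varphi(2\cdot - k)$, or

\[ F(\xi) = \frac{1}{\sqrt{2}} \Lambda\!\left(\frac{\xi}{2}\right) \Phi\!\left(\frac{\xi}{2}\right), \]

for $2\pi$-periodic $\Lambda \in L^2$. On the other hand, the property $f \perp V_0$ is equivalent to $\langle f, \varphi(\cdot - k) \rangle = 0$ for all $k \in \mathbb{Z}$, or

\[
0 = \int F(\xi) \overline{\Phi(\xi)} e^{i\xi k} \, d\xi
 = \sum_{l\in\mathbb{Z}} \int_{2\pi l}^{2\pi(l+1)} F(\xi) \overline{\Phi(\xi)} e^{i\xi k} \, d\xi
 = \int_0^{2\pi} e^{i\xi k} \sum_{l\in\mathbb{Z}} F(\xi + 2\pi l) \overline{\Phi(\xi + 2\pi l)} \, d\xi.
\]

That is to say,

\[ \sum_{l\in\mathbb{Z}} F(\xi + 2\pi l) \overline{\Phi(\xi + 2\pi l)} = 0, \quad \text{a.e.} \]

Substituting

\[ F(\xi + 2\pi l) = \frac{1}{\sqrt{2}} \Lambda\!\left(\frac{\xi}{2} + \pi l\right) \Phi\!\left(\frac{\xi}{2} + \pi l\right) \quad \text{and} \quad \Phi(\xi + 2\pi l) = \frac{1}{\sqrt{2}} H\!\left(\frac{\xi}{2} + \pi l\right) \Phi\!\left(\frac{\xi}{2} + \pi l\right) \]

yields

\[ \sum_{l\in\mathbb{Z}} \Lambda\!\left(\frac{\xi}{2} + \pi l\right) \overline{H\!\left(\frac{\xi}{2} + \pi l\right)} \left| \Phi\!\left(\frac{\xi}{2} + \pi l\right) \right|^2 = 0, \quad \text{a.e.,} \]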

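which can be rewritten, putting $\xi := \xi/2$ and splitting the sum into even and odd indices, as

\[
\begin{aligned}
&\sum_{l\in\mathbb{Z}} \Lambda(\xi) \overline{H(\xi)} |\Phi(\xi + 2\pi l)|^2 + \sum_{l\in\mathbb{Z}} \Lambda(\xi + \pi) \overline{H(\xi + \pi)} |\Phi(\xi + \pi + 2\pi l)|^2 \\
&\qquad = \frac{1}{2\pi} \left( \Lambda(\xi) \overline{H(\xi)} + \Lambda(\xi + \pi) \overline{H(\xi + \pi)} \right) = 0, \quad \text{a.e.,}
\end{aligned}
\tag{3.8}
\]

where we utilized the orthonormality of $\varphi(\cdot - k)$.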

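From the last equation we can determine $\Lambda(\xi)$ more closely. For $\xi$ such that $H(\xi) = 0$ we have $\Lambda(\xi + \pi) \overline{H(\xi + \pi)} = 0$, but due to orthonormality of $\varphi(\cdot - k)$ we also have $|H(\xi)|^2 + |H(\xi + \pi)|^2 = |H(\xi + \pi)|^2 = 2$, implying $\Lambda(\xi + \pi) = 0$, so the equation (3.8) holds if $H(\xi) = 0$. For $\xi$ such that $H(\xi) \ne 0$ we can write

\[ \Lambda(\xi) = -\frac{\Lambda(\xi + \pi) \overline{H(\xi + \pi)}}{\overline{H(\xi)}} = \theta(\xi) \overline{H(\xi + \pi)}, \quad \text{a.e.,} \]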

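where $\theta(\xi) \in L^2$ is $2\pi$-periodic and satisfies $\theta(\xi) = -\theta(\xi + \pi)$, a.e. If we now define

\[ \nu(\xi) = e^{i\xi/2} \, \theta\!\left(\frac{\xi}{2}\right), \]

then $\nu(\xi) \in L^2$ is $2\pi$-periodic and it follows that $\theta(\xi) = e^{-i\xi} \nu(2\xi)$.

Together, $\Lambda(\xi) = e^{-i\xi} \overline{H(\xi + \pi)} \nu(2\xi)$.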

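Combining the pieces, we finally obtain

\[ F(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \nu(\xi), \]

thus

\[ W_0 = \left\{ f \in L^2 \ \middle|\ (\mathcal{F}f)(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \nu(\xi) \right\}, \]

where $\nu \in L^2$ is $2\pi$-periodic.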

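We would like to characterize a $\psi \in W_0$ such that the functions $\psi(\cdot - m)$ constitute an orthonormal basis of $W_0$. Our assertion is that such $\psi$ is given in terms of its Fourier transform by

\[ \Psi(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \mu(\xi), \tag{3.9} \]

where $\mu$ is $2\pi$-periodic and $|\mu(\xi)| = 1$ for almost all $\xi$. In particular, taking $\mu(\xi) \equiv 1$, the formula can be rewritten as

\[ \psi = \sum_{n\in\mathbb{Z}} (-1)^n h_{1-n} \varphi(2\cdot - n). \tag{3.10} \]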

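We first prove that such a $\psi$ leads to an orthonormal basis of $W_0$, in the sense that the functions $\psi(\cdot - m)$ are orthonormal and generate $W_0$.

Orthonormality follows from

\[
\begin{aligned}
\sum_{k\in\mathbb{Z}} |\Psi(\xi + 2k\pi)|^2
 &= \frac{1}{2} \sum_{k\in\mathbb{Z}} \left| H\!\left(\tfrac{\xi}{2} + k\pi + \pi\right) \right|^2 \left| \Phi\!\left(\tfrac{\xi}{2} + k\pi\right) \right|^2 \\
 &= \frac{1}{2} \left( \left| H\!\left(\tfrac{\xi}{2} + \pi\right) \right|^2 \sum_{k\in\mathbb{Z}} \left| \Phi\!\left(\tfrac{\xi}{2} + 2k\pi\right) \right|^2 + \left| H\!\left(\tfrac{\xi}{2}\right) \right|^2 \sum_{k\in\mathbb{Z}} \left| \Phi\!\left(\tfrac{\xi}{2} + 2k\pi + \pi\right) \right|^2 \right) \\
 &= \frac{1}{2} \cdot \frac{1}{2\pi} \cdot 2 = \frac{1}{2\pi}.
\end{aligned}
\]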

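We already know that $f \in W_0$ is equivalent to

\[ F(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \nu(\xi), \]

or,

\[ F(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \frac{\nu(\xi)}{\mu(\xi)} \mu(\xi) = \frac{\nu(\xi)}{\mu(\xi)} \Psi(\xi) = \Gamma(\xi) \Psi(\xi), \]

with $2\pi$-periodic $\Gamma \in L^2$. This can be rewritten as

\[ f = \sum_{m\in\mathbb{Z}} c_m \psi(\cdot - m), \]

where $(c_m) \in \ell^2$. Hence the system $\psi(\cdot - m)$ generates $W_0$.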

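It remains to check that there are no orthogonal wavelets other than those given by (3.9). Since $\psi \in W_0$, it must satisfy

\[ \Psi(\xi) = \frac{1}{\sqrt{2}} e^{-i\xi/2} \, \overline{H\!\left(\frac{\xi}{2} + \pi\right)} \Phi\!\left(\frac{\xi}{2}\right) \mu(\xi), \]

with $2\pi$-periodic $\mu \in L^2$. Orthonormality of $\psi(\cdot - m)$ implies

\[
\begin{aligned}
\frac{1}{2\pi} &= \sum_{k\in\mathbb{Z}} |\Psi(\xi + 2k\pi)|^2
 = \frac{1}{2} |\mu(\xi)|^2 \sum_{k\in\mathbb{Z}} \left| H\!\left(\tfrac{\xi}{2} + k\pi + \pi\right) \right|^2 \left| \Phi\!\left(\tfrac{\xi}{2} + k\pi\right) \right|^2 \\
 &= \frac{1}{2} |\mu(\xi)|^2 \left( \left| H\!\left(\tfrac{\xi}{2} + \pi\right) \right|^2 \sum_{k\in\mathbb{Z}} \left| \Phi\!\left(\tfrac{\xi}{2} + 2k\pi\right) \right|^2 + \left| H\!\left(\tfrac{\xi}{2}\right) \right|^2 \sum_{k\in\mathbb{Z}} \left| \Phi\!\left(\tfrac{\xi}{2} + 2k\pi + \pi\right) \right|^2 \right) \\
 &= \frac{1}{2} |\mu(\xi)|^2 \, \frac{1}{2\pi} \left( \left| H\!\left(\tfrac{\xi}{2} + \pi\right) \right|^2 + \left| H\!\left(\tfrac{\xi}{2}\right) \right|^2 \right) = \frac{1}{2\pi} |\mu(\xi)|^2.
\end{aligned}
\]

Therefore $|\mu(\xi)|^2 = 1$, which was to be proved.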
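The relation between the scaling filter and the wavelet filter can be checked on a concrete example. The sketch below is an illustration, assuming real coefficients, $H(\xi) = \sum_n h_n e^{-i\xi n}$, and the wavelet coefficients $g_n = (-1)^n h_{1-n}$ read off from (3.10); the four-tap Daubechies filter is used as test data, and the names `g`, `response`, `cross` and `power` are hypothetical. It verifies that $G(\xi) = \sum_n g_n e^{-i\xi n}$ satisfies the orthogonality condition (3.8) in the role of $\Lambda$, together with $|G(\xi)|^2 + |G(\xi + \pi)|^2 = 2$.

```python
import numpy as np

s3 = np.sqrt(3.0)
# Scaling filter h_n, n = 0..3 (four-tap Daubechies filter, sum equal to sqrt(2)).
h = dict(enumerate(np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))))

# Wavelet filter g_n = (-1)^n h_{1-n}; nonzero for n = -2, ..., 1.
g = {n: (1.0 if n % 2 == 0 else -1.0) * h[1 - n] for n in range(-2, 2)}

def response(coeffs, xi):
    """sum_n c_n exp(-i xi n) for a finite filter given as {index: coefficient}."""
    return sum(c * np.exp(-1j * xi * n) for n, c in coeffs.items())

xi = np.linspace(-np.pi, np.pi, 257)
H0, H1 = response(h, xi), response(h, xi + np.pi)
G0, G1 = response(g, xi), response(g, xi + np.pi)

cross = G0 * np.conj(H0) + G1 * np.conj(H1)   # condition (3.8) with Lambda = G
power = np.abs(G0) ** 2 + np.abs(G1) ** 2     # orthonormality of the wavelet shifts
zero_mean = sum(g.values())                   # G(0), proportional to the integral of psi

print("max |cross| :", np.max(np.abs(cross)))
print("power range :", power.min(), power.max())
print("sum of g_n  :", zero_mean)
```

The cross term vanishes and the power sum equals 2 up to rounding, mirroring the two parts of the argument above: membership in $W_0$ and orthonormality of the shifts.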

3.3 Vanishing moments and regularity

In this section we address regularity questions regarding orthogonal wavelet bases. In particular, we show that a compactly supported orthogonal wavelet $\psi$ cannot belong to $C^\infty$.

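Lemma 3.6. Let $r \in \mathbb{N}_0$, $\psi \in C^r$ such that $\psi$ is not identically constant, let $\psi^{(l)}$ be bounded for $l \le r$ and let

\[ |\psi(x)| \le \frac{C_0}{(1 + |x|)^{r+1+\varepsilon}}, \tag{3.11} \]

for some $\varepsilon > 0$, $C_0 \ge 0$. Let $\{\psi_{j,k} \mid j, k \in \mathbb{Z}\}$ constitute an orthonormal basis in $L^2$. Then $\psi$ has $r + 1$ vanishing moments, i.e.,

\[ \int x^l \psi(x) \, dx = 0, \quad l = 0, 1, \dots, r. \]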

Proof. The proof, by induction on $r$, is based on [13].

1. Let $r = 0$. Since $\psi$ is continuous and not identically constant, and since the dyadic rationals $\{2^{-j}k \mid j, k \in \mathbb{Z}\}$ are dense in $\mathbb{R}$, there exist $j_0, k_0 \in \mathbb{Z}$ such that $\psi(2^{-j_0}k_0) \ne 0$. Orthogonality implies

\[ \int \psi(x) \psi(2^j x - k) \, dx = 0, \quad \text{for all } (j, k) \ne (0, 0). \]

We choose $j > \max(j_0, 0)$ and $k = 2^{j-j_0}k_0$. This yields

\[ \int \psi(x) \psi(2^j(x - 2^{-j_0}k_0)) \, dx = 0. \]

By the change of variables $t = 2^j(x - 2^{-j_0}k_0)$,

\[ \int \psi(2^{-j}t + 2^{-j_0}k_0) \psi(t) \, dt = 0. \]

Applying the Dominated Convergence Theorem for $j \to +\infty$,

\[ \psi(2^{-j_0}k_0) \int \psi(t) \, dt = 0, \]

but $\psi(2^{-j_0}k_0) \ne 0$, implying

\[ \int \psi(t) \, dt = 0. \]

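2. Let us now assume that for all $l = 0, 1, \dots, r - 1$,

\[ \int x^l \psi(x) \, dx = 0. \]

We want to show that $\int x^r \psi(x) \, dx = 0$. Let us therefore define auxiliary functions $\nu_0, \nu_1, \dots, \nu_r$ so that $\nu_0 = \psi$ and, for all $n = 1, \dots, r$,

\[ \nu_n(x) = \int_{-\infty}^{x} \nu_{n-1}(t) \, dt. \]

Since

\[ |\nu_0(x)| = |\psi(x)| \le \frac{C_0}{(1 + |x|)^{r+1+\varepsilon}}, \]

it follows that

\[ |\nu_1(x)| \le \int_{-\infty}^{x} |\psi(t)| \, dt \le C_0 \int_{-\infty}^{x} \frac{1}{(1 + |t|)^{r+1+\varepsilon}} \, dt = \frac{C_1}{(1 + |x|)^{r+\varepsilon}}. \]

Applying this argument $r$ times and using the definition of $\nu_n$, we end up with

\[ |\nu_n(x)| \le \frac{C_n}{(1 + |x|)^{r-n+1+\varepsilon}}, \tag{3.12} \]

where $n = 0, 1, \dots, r$. Integrating by parts,

\[ \int \nu_r(x) \, dx = \left[ x \nu_r(x) \right]_{-\infty}^{+\infty} - \int x \nu_{r-1}(x) \, dx. \]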

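Since

\[ 0 \le |x \nu_r(x)| \le \frac{C_r |x|}{(1 + |x|)^{1+\varepsilon}} \to 0 \quad \text{as } x \to \pm\infty, \]

the first term equals zero. Integrating by parts $r$ times gives

\[ \int \nu_r(x) \, dx = -\int x \nu_{r-1}(x) \, dx = \alpha_1 \int x \nu_{r-1}(x) \, dx = \alpha_2 \int x^2 \nu_{r-2}(x) \, dx = \dots = \alpha_r \int x^r \psi(x) \, dx, \]

for some nonzero constants $\alpha_n$. Therefore

\[ \int x^r \psi(x) \, dx = 0 \iff \int \nu_r(x) \, dx = 0. \tag{3.13} \]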

We know that $\{2^{-j}k \mid j, k \in \mathbb{Z}\}$ is dense in $\mathbb{R}$, $\psi$ is not a polynomial (otherwise it would not be bounded) and $\psi^{(r)}$ is continuous. Hence there exist $j_0, k_0 \in \mathbb{Z}$ such that $\psi^{(r)}(2^{-j_0}k_0) \ne 0$. Orthogonality implies

\[ \int \psi(x) \psi(2^j x - k) \, dx = 0. \]

Taking again $j > \max(j_0, 0)$ and $k = 2^{j-j_0}k_0$,

\[ \int \psi(x) \psi(2^j(x - 2^{-j_0}k_0)) \, dx = 0. \]

Now let us consider the expression

\[ \int \psi^{(r)}(x) \, \nu_r(2^j(x - 2^{-j_0}k_0)) \, dx. \]

Using (3.12) and the fact that $\psi^{(n)}$ is bounded for $n \le r$, one can derive, integrating by parts,

\[ \int \psi^{(r)}(x) \, \nu_r(2^j(x - 2^{-j_0}k_0)) \, dx = K \int \psi(x) \psi(2^j(x - 2^{-j_0}k_0)) \, dx = 0. \]

By a change of variables,

\[ \int \psi^{(r)}(2^{-j}x + 2^{-j_0}k_0) \, \nu_r(x) \, dx = 0. \]

Applying again the Dominated Convergence Theorem and using the fact that $\psi^{(r)}(2^{-j_0}k_0) \ne 0$, it follows that

\[ \int \nu_r(x) \, dx = 0. \]

By (3.13), this concludes the proof.


Corollary 3.7. Let $\psi \in L^2 \cap C^\infty$ and let $\psi$ have compact support. Then $\{\psi_{j,k} \mid j, k \in \mathbb{Z}\}$ cannot constitute an orthonormal system.

Proof. Since $\psi$ has compact support, for every $r$ there certainly exists $C_0$ so that (3.11) is satisfied. Suppose that $\{\psi_{j,k} \mid j, k \in \mathbb{Z}\}$ constitute an orthonormal system. By Lemma 3.6, applied for every $r$, $\psi$ is then orthogonal to any polynomial, i.e.

\[ \int p(x) \psi(x) \, dx = 0, \quad \text{for every polynomial } p. \]

Due to the Weierstrass Approximation Theorem^{1)}, for each $\varepsilon > 0$ there exists a polynomial $p$ such that

\[ \sup_{x \in K} |\psi(x) - p(x)| < \varepsilon, \]

where $K$ denotes the support of $\psi$. We have

\[
\begin{aligned}
\|\psi\|_{L^2}^2 &= \langle \psi, \psi \rangle = \int \psi(x) \overline{\psi(x)} \, dx = \int \psi(x) \overline{\psi(x)} \, dx - \int p(x) \overline{\psi(x)} \, dx \\
 &= \int_K [\psi(x) - p(x)] \overline{\psi(x)} \, dx \le \int_K |\psi(x) - p(x)| \, |\psi(x)| \, dx \le \varepsilon \int_K |\psi(x)| \, dx.
\end{aligned}
\]

We see that the norm $\|\psi\|_{L^2}^2$ can be made arbitrarily small, which contradicts orthonormality ($\|\psi\|_{L^2}^2$ must equal 1). We are thus limited in smoothness when designing compactly supported wavelets: one can only construct compactly supported wavelets that belong to $C^N$, for some fixed $N$.

3.4 Construction of compactly supported wavelets

In this section we finally examine a method for constructing wavelets that are orthogonal and compactly supported, with a prescribed number of vanishing moments.

First, by (2.6) it is clear that if $\varphi$ has compact support then the sequence $h_n$ is finite, and by (3.10), $\psi$ likewise has compact support.

Orthogonality is reflected in the filter $H$, defined by (2.7), as

\[ |H(\xi)|^2 + |H(\xi + \pi)|^2 = 2, \]

where we dropped the “a.e.” because $H$ is now a trigonometric polynomial.

The following lemma shows how the number of vanishing moments is reflected on the filter H.

1) Let $f$ be a continuous function with compact support $K$. Then for each $\varepsilon > 0$ there exists a polynomial $p$ such that $\sup_{x \in K} |f(x) - p(x)| < \varepsilon$.

Lemma 3.8. Let $\varphi$ be an orthogonal scaling function with compact support, $\psi$ an associated wavelet defined by (3.10) such that $\varphi, \psi \in C^r$. Then $H(\xi)$ defined by (2.6) can be expressed as

\[ H(\xi) = \left( \frac{1 + \exp(-i\xi)}{2} \right)^{r+1} P(\xi), \]

with $P \in C^r$ and $2\pi$-periodic.

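Proof. Since $\varphi$ has compact support, by the discussion above $\psi$ also has compact support. Moreover, the compact support ensures that $\psi^{(l)}$ is bounded for $0 \le l \le r$ and that

\[ |\psi(x)| \le \frac{C}{(1 + |x|)^{r+1+\varepsilon}}, \]

for some $\varepsilon > 0$. According to Lemma 3.6 it follows that

\[ \int x^l \psi(x) \, dx = 0, \quad \forall l = 0, 1, \dots, r. \]

Due to a standard property of the Fourier transform,

\[ \xi^j \Psi^{(l)}(\xi) = (-i)^{l-j} (-1)^j \frac{1}{\sqrt{2\pi}} \int \frac{d^j}{dx^j} \left[ x^l \psi(x) \right] \exp(-i\xi x) \, dx, \]

we have

\[ \Psi^{(l)}(\xi) \big|_{\xi=0} = 0, \quad \text{for all } 0 \le l \le r. \tag{3.14} \]

Since the sequence $h_n$ is finite, it follows that $H(\xi)$ is $2\pi$-periodic and continuous. By Lemma 3.4, $\Phi(0)$ is nonzero. Since

\[ \Psi(\xi) = \frac{1}{\sqrt{2}} \exp(-i\xi/2) \, \overline{H(\xi/2 + \pi)} \, \Phi(\xi/2) \]

and since both $\Phi, \Psi \in C^r$, it follows that $H(\xi) \in C^r$. Applying (3.14) gives

\[ H^{(l)}(\xi) \big|_{\xi=\pi} = 0, \quad \forall \, 0 \le l \le r. \]

This means that $H$ has a zero of order at least $r + 1$ at $\xi = \pi$, implying

\[ H(\xi) = \left( \frac{1 + \exp(-i\xi)}{2} \right)^{r+1} P(\xi), \]

with $2\pi$-periodic $P \in C^r$.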

The strategy is then to find a trigonometric polynomial $H$ such that the following two properties are satisfied:

\[ |H(\xi)|^2 + |H(\xi + \pi)|^2 = 2, \tag{3.15} \]

and

\[ H(\xi) = \left( \frac{1 + \exp(-i\xi)}{2} \right)^{r+1} P(\xi). \tag{3.16} \]
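As a concrete illustration of the factorization (3.16), the sketch below tests the four-tap Daubechies filter, which is assumed here as example data with $H(\xi) = \sum_n h_n e^{-i\xi n}$. A zero of order 2 at $\xi = \pi$ is equivalent to $\sum_n (-1)^n n^l h_n = 0$ for $l = 0, 1$, since $H^{(l)}(\pi) = (-i)^l \sum_n n^l (-1)^n h_n$. The code also divides $H$, viewed as a polynomial in $z = e^{-i\xi}$, by $\left(\frac{1+z}{2}\right)^2$ to recover $P$. Variable names are illustrative.

```python
import numpy as np

s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
n = np.arange(len(h))
signs = (-1.0) ** n

# Alternating moments sum_n (-1)^n n^l h_n for l = 0, 1, 2.
# The first two vanish (zero of order 2 at pi); the third does not.
alternating_moments = [np.sum(signs * n ** l * h) for l in range(3)]
print("alternating moments:", alternating_moments)

# Polynomial division in z = exp(-i xi); np.polydiv expects highest degree first.
# Divisor: ((1 + z) / 2)^2 = 0.25 z^2 + 0.5 z + 0.25.
quotient, remainder = np.polydiv(h[::-1], [0.25, 0.5, 0.25])
print("P coefficients (highest degree first):", quotient)
print("remainder:", remainder)
```

The remainder vanishes up to rounding, so the factor $\left(\frac{1+z}{2}\right)^2$ divides $H$ exactly, and the quotient is a first-degree polynomial playing the role of $P$.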


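Theorem 3.9 (Daubechies). Let $H$ be a trigonometric polynomial satisfying

\[ H(\xi) = \sqrt{2} \left( \frac{1 + \exp(-i\xi)}{2} \right)^{N} P(\xi), \tag{3.17} \]

where $P$ is a trigonometric polynomial. We have

\[ |H(\xi)|^2 + |H(\xi + \pi)|^2 = 2, \tag{3.18} \]

if and only if $L(\xi) := |P(\xi)|^2$ is of the form

\[ L(\xi) = q\!\left( \sin^2 \frac{\xi}{2} \right), \]

with $q(y) = q_N(y) + y^N R(y)$, where

\[ q_N(y) = \sum_{k=0}^{N-1} \binom{N - 1 + k}{k} y^k \]

and $R$ is a polynomial, antisymmetric with respect to $\frac{1}{2}$, such that $q(y) \ge 0$ for all $y \in [0, 1]$.

Proof. Inserting (3.17) in (3.18) yields $|H(\xi)|^2 + |H(\xi + \pi)|^2 = 2$ if and only if

\[ \left( \cos^2 \frac{\xi}{2} \right)^{N} L(\xi) + \left( \sin^2 \frac{\xi}{2} \right)^{N} L(\xi + \pi) = 1, \tag{3.19} \]

because

\[
\begin{aligned}
|H(\xi)|^2 = H(\xi) \overline{H(\xi)} &= 2 \left( \frac{1 + \exp(-i\xi)}{2} \right)^{N} \left( \frac{1 + \exp(i\xi)}{2} \right)^{N} |P(\xi)|^2 \\
 &= 2 \left( \frac{1 + \cos \xi}{2} \right)^{N} L(\xi) = 2 \left( \cos^2 \frac{\xi}{2} \right)^{N} L(\xi)
\end{aligned}
\tag{3.20}
\]

and

\[ |H(\xi + \pi)|^2 = 2 \left[ \cos^2 \left( \frac{\xi + \pi}{2} \right) \right]^{N} L(\xi + \pi) = 2 \left( \sin^2 \frac{\xi}{2} \right)^{N} L(\xi + \pi). \]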

Since we assume that the $h_n$ are real, it follows that

\[ H(\xi) = \sum_{n \text{ finite}} h_n \exp(-i\xi n), \qquad \overline{H(\xi)} = \sum_{k \text{ finite}} h_k \exp(i\xi k) \]

and

\[ |H(\xi)|^2 = \sum_{n \text{ finite}} \sum_{k \text{ finite}} h_n h_k \exp[i\xi(k - n)]. \]

Moreover, $|H(\xi)|^2$ is an even trigonometric polynomial, which can be verified by a simple change of variables $\tilde n := k$, $\tilde k := n$. Taking (3.20) into account, we see that $L(\xi) = |P(\xi)|^2$ is an even trigonometric polynomial and hence it can be expressed as $L(\xi) = q_0(\cos \xi)$, for some polynomial $q_0$. Using the identity $\cos \xi = 1 - 2 \sin^2 \frac{\xi}{2}$, $L(\xi)$ can be expressed as

\[ L(\xi) = q\!\left( \sin^2 \frac{\xi}{2} \right), \tag{3.21} \]

where $q$ is some polynomial. Substituting this in (3.19) yields, after some manipulation,

\[ \left( 1 - \sin^2 \frac{\xi}{2} \right)^{N} q\!\left( \sin^2 \frac{\xi}{2} \right) + \left( \sin^2 \frac{\xi}{2} \right)^{N} q\!\left( 1 - \sin^2 \frac{\xi}{2} \right) = 1. \]

It is natural to substitute $y = \sin^2 \frac{\xi}{2} \in [0, 1]$. The condition (3.18) is then equivalent to

\[ (1 - y)^N q(y) + y^N q(1 - y) = 1, \quad \forall y \in [0, 1], \tag{3.22} \]

with $q$ such that $L(\xi) = q\!\left( \sin^2 \frac{\xi}{2} \right) = q(y)$. It remains to determine $q$. By the Bezout theorem^{2)}, there exist unique polynomials $q_1, q_2$ of degree less than or equal to $N - 1$, such that

\[ (1 - y)^N q_1(y) + y^N q_2(y) = 1. \tag{3.23} \]

After the change of variables $y \mapsto 1 - y$, this gives

\[ (1 - y)^N q_2(1 - y) + y^N q_1(1 - y) = 1. \]

Since $q_1, q_2$ are unique, this means that $q_2(y) = q_1(1 - y)$. Hence $q_1$ is a solution to (3.22).

The equation (3.23) can be rewritten as

\[ q_1(y) = (1 - y)^{-N} \left[ 1 - y^N q_1(1 - y) \right]. \]

Since $(1 - y)^{-N}$ can be expanded as a Taylor polynomial plus a remainder,

\[ (1 - y)^{-N} = \sum_{k=0}^{N-1} \binom{N + k - 1}{k} y^k + O(y^N), \]

where $O(y^N)$ carries the terms of power $N$ or higher, the polynomial $q_1$ then becomes

\[ q_1(y) = \sum_{k=0}^{N-1} \binom{N + k - 1}{k} y^k + O(y^N). \]

But since $q_1$ has degree $\deg(q_1) \le N - 1$, it follows that

\[ q_1(y) = \sum_{k=0}^{N-1} \binom{N + k - 1}{k} y^k. \tag{3.24} \]

Note that according to (3.21), $q$ must be nonnegative for $y \in [0, 1]$, which is clearly satisfied by $q_1$, since all its coefficients are positive. The polynomial $q_1$ defined by (3.24) is thus the unique lowest-degree solution of (3.22).

2) If $p_1, p_2$ are two polynomials of degree $n_1, n_2$ respectively, with no common zeros, then there exist unique polynomials $q_1, q_2$ of degree at most $n_2 - 1$, $n_1 - 1$ respectively, such that $p_1 q_1 + p_2 q_2 = 1$.

We will denote $q_N := q_1$. To give a complete characterization of the equation (3.22), we need to obtain all solutions. If $q$ is a solution to (3.22), then (subtracting the two equations for $q$ and $q_N$),

\[ (1 - y)^N [q(y) - q_N(y)] + y^N [q(1 - y) - q_N(1 - y)] = 0. \tag{3.25} \]

If $R(y)$ is a polynomial antisymmetric with respect to $\frac{1}{2}$, i.e.

\[ R(1 - y) = -R(y), \]

then $q(y) = q_N(y) + y^N R(y)$ is clearly a solution to (3.25), assuming that $R$ is chosen such that $q(y) \ge 0$ for $y \in [0, 1]$. This concludes the proof.

Now we have $|H(\xi)|^2$, but we need $H(\xi)$ itself. For this purpose we have the following theorem.

Theorem 3.10 (Riesz, Fejér). Let $p$ be a positive trigonometric polynomial of the form

\[ p(\xi) = \sum_{n=0}^{M} \alpha_n \cos(n\xi), \quad \text{where } \alpha_n \in \mathbb{R}. \]

There exists a trigonometric polynomial $q$ with real coefficients and of the same order as $p$, such that $|q(\xi)|^2 = p(\xi)$.
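Before the proof, a small numerical illustration may help to see how Theorems 3.9 and 3.10 work together. The sketch below is an assumption-laden example rather than a general algorithm: it first checks the identity (3.22) for $q_N$ with $N = 1, \dots, 6$, and then treats the case $N = 2$, $R = 0$, where $L(\xi) = q_2(\sin^2 \frac{\xi}{2}) = 2 - \cos \xi$. With $z = e^{-i\xi}$ one has $z L = -\frac{1}{2}(z^2 - 4z + 1)$, and picking the root outside the unit circle (one of two possible choices) gives a first-degree $P$ with $|P|^2 = L$. All function and variable names are hypothetical.

```python
import numpy as np
from math import comb
from numpy.polynomial import polynomial as npoly  # lowest-degree-first coefficients

def q_lowest(N):
    """Coefficients (lowest degree first) of q_N(y) = sum_{k<N} C(N-1+k, k) y^k."""
    return np.array([comb(N - 1 + k, k) for k in range(N)], dtype=float)

def bezout_residual(N, y):
    """(1 - y)^N q_N(y) + y^N q_N(1 - y) - 1, which should vanish by (3.22)."""
    c = q_lowest(N)
    return (1 - y) ** N * npoly.polyval(y, c) + y ** N * npoly.polyval(1 - y, c) - 1.0

y = np.linspace(0.0, 1.0, 101)
max_residual = max(np.max(np.abs(bezout_residual(N, y))) for N in range(1, 7))
print("max residual of (3.22):", max_residual)

# Case N = 2, R = 0: factor L(xi) = 2 - cos(xi) as |P(exp(-i xi))|^2.
roots = np.roots([1.0, -4.0, 1.0])                 # roots of z^2 - 4 z + 1: 2 +- sqrt(3)
z0 = roots[np.abs(roots) > 1.0][0].real            # root outside the unit circle (a choice)
P = np.array([-z0, 1.0]) / (1.0 - z0)              # P(z) = (z - z0) / (1 - z0), so P(1) = 1

# H(z) = sqrt(2) * ((1 + z) / 2)^2 * P(z), following (3.17) with N = 2.
h = np.sqrt(2.0) * npoly.polymul([0.25, 0.5, 0.25], P)
print("filter coefficients h_0..h_3:", h)

# Check |P|^2 = 2 - cos(xi) on a grid.
xi = np.linspace(-np.pi, np.pi, 129)
P_on_circle = npoly.polyval(np.exp(-1j * xi), P)
factorization_error = np.max(np.abs(np.abs(P_on_circle) ** 2 - (2.0 - np.cos(xi))))
print("max | |P|^2 - L | :", factorization_error)
```

Under these assumptions the resulting coefficients coincide with the four-tap Daubechies filter $\frac{1}{4\sqrt{2}}(1 + \sqrt{3},\ 3 + \sqrt{3},\ 3 - \sqrt{3},\ 1 - \sqrt{3})$; choosing the other root would give the same coefficients in reversed order.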

Proof. We are looking for a trigonometric polynomial $q$ of the form

\[ q(\xi) = \sum_{n=0}^{M} \beta_n \exp(i\xi n), \]

where $\beta_n \in \mathbb{R}$. Using the identity

\[ \cos(n\xi) = \cos^n \xi - \binom{n}{2} \sin^2 \xi \cos^{n-2} \xi + \binom{n}{4} \sin^4 \xi \cos^{n-4} \xi - \dots, \]

it is indeed possible to express the trigonometric polynomial $p$ as an "ordinary" polynomial of the variable $\cos \xi$, i.e.

\[ p(\xi) = \tilde p(\cos \xi), \]

where $\tilde p$ is a polynomial with real coefficients and of degree $M$. Let $v_1, \dots, v_M$ be the roots of $\tilde p$ (not necessarily all different). Then we can write

\[ \tilde p(v) = C \prod_{j=1}^{M} (v - v_j). \]

Let

\[ P(z) = C z^M \prod_{j=1}^{M} \left( \frac{z + z^{-1}}{2} - v_j \right) = C \prod_{j=1}^{M} \left( \frac{1}{2} - v_j z + \frac{1}{2} z^2 \right) = C \prod_{j=1}^{M} p_j(z), \]

where $z \in \mathbb{C}$, $z = e^{-i\xi}$ and $p_j(z) = \frac{1}{2} - v_j z + \frac{1}{2} z^2$. Clearly, the degree of $P$ is $2M$ and