Sparse Signal Recovery: Theory, Applications
and Algorithms
Bhaskar Rao
Department of Electrical and Computer Engineering
University of California, San Diego
Collaborators: I. Gorodnitsky, S. Cotter, D. Wipf, J. Palmer, K. Kreutz-Delgado
Outline
1. Talk Objective
2. Sparse Signal Recovery Problem
3. Applications
4. Computational Algorithms
5. Performance Evaluation
Importance of Problem
Organized, with Prof. Bresler, a special session at the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
From the ICASSP 1998 proceedings, Special Session SPEC-DSP: Signal Processing with the Sparseness Constraint:
Signal Processing with the Sparseness Constraint, B. Rao (University of California, San Diego, USA)
Application of Basis Pursuit in Spectrum Estimation, S. Chen (IBM, USA); D. Donoho (Stanford University, USA)
Parsimony and Wavelet Method for Denoising, H. Krim (MIT, USA); J. Pesquet (University Paris Sud, France); I. Schick (GTE Internetworking and Harvard Univ., USA)
Parsimonious Side Propagation, P. Bradley, O. Mangasarian (University of Wisconsin-Madison, USA)
Fast Optimal and Suboptimal Algorithms for Sparse Solutions to Linear Inverse Problems, G. Harikumar (Tellabs Research, USA); C. Couvreur, Y. Bresler (University of Illinois, Urbana-Champaign, USA)
Measures and Algorithms for Best Basis Selection, K. Kreutz-Delgado, B. Rao (University of California, San Diego, USA)
Sparse Inverse Solution Methods for Signal and Image Processing Applications, B. Jeffs (Brigham Young University, USA)
Image Denoising Using Multiple Compaction Domains, P. Ishwar, K. Ratakonda, P. Moulin, N. Ahuja (University of Illinois, Urbana-Champaign, USA)
Talk Goals
Sparse Signal Recovery is an interesting area with many potential applications.
Tools developed for solving the Sparse Signal Recovery problem are useful for signal processing practitioners to know.
Problem Description
The measurement model is t = Φw + ε, where:
t is the N × 1 measurement vector.
Φ is the N × M dictionary matrix, with M ≫ N.
w is the M × 1 desired vector, which is sparse with K non-zero entries.
ε is the additive noise, modeled as white Gaussian.
Problem Statement
Noise-Free Case: Given a target signal t and a dictionary Φ, find the weights w that solve:

min_w Σ_{i=1}^M I(w_i ≠ 0) subject to t = Φw

where I(·) is the indicator function.
Noisy Case: Given a target signal t and a dictionary Φ, find the weights w that solve:

min_w Σ_{i=1}^M I(w_i ≠ 0) subject to ‖t − Φw‖₂² ≤ β
Complexity
An exhaustive search over all possible subsets would mean a search over a total of C(M, K) subsets; the problem is NP-hard. With M = 30, N = 20, and K = 10 there are about 3 × 10⁷ subsets (very complex; see the sketch below).
A branch and bound algorithm can be used to find the optimal solution. The space of subsets searched is pruned, but the search may still be very complex.
The indicator function is not continuous and so is not amenable to standard optimization tools.
Challenge: Find low complexity methods with acceptable performance.
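To make the combinatorial cost concrete, here is a minimal brute-force sketch in Python/NumPy (the function name and interface are illustrative, not from the talk); it solves one least-squares fit per candidate support and is only feasible for tiny M and K:

```python
import numpy as np
from itertools import combinations

def exhaustive_l0(Phi, t, K):
    """Brute-force search over all C(M, K) supports of size K."""
    M = Phi.shape[1]
    best_err, best_w = np.inf, None
    for support in combinations(range(M), K):
        cols = Phi[:, list(support)]                      # candidate sub-dictionary
        coef, *_ = np.linalg.lstsq(cols, t, rcond=None)   # best fit on this support
        err = np.linalg.norm(t - cols @ coef)
        if err < best_err:
            best_err, best_w = err, np.zeros(M)
            best_w[list(support)] = coef
    return best_w, best_err
```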
Applications
Signal Representation (Mallat, Coifman, Wickerhauser, Donoho, ...)
EEG/MEG (Leahy, Gorodnitsky, Ioannides, ...)
Functional Approximation and Neural Networks (Chen, Natarajan, Cun, Hassibi, ...)
Bandlimited extrapolations and spectral estimation (Papoulis, Lee, Cabrera, Parks, ...)
Speech Coding (Ozawa, Ono, Kroon, Atal, ...)
Sparse channel equalization (Fevrier, Greenstein, Proakis, ...)
Compressive Sampling (Donoho, Candès, Tao, ...)
DFT Example
N is chosen to be 64 in this example.
Measurement: t[n] = cos(ω₀ n), n = 0, 1, 2, ..., N − 1, with ω₀ = (2π/64)(33/2).
Dictionary elements: φ_k = [1, e^{−jω_k}, e^{−j2ω_k}, ..., e^{−j(N−1)ω_k}]^T, with ω_k = 2πk/M.
Consider M = 64, 128, 256, and 512.
NOTE: The frequency components of t are included in the dictionary Φ for M = 128, 256, and 512.
FFT Results
[Figure: FFT results for M = 64 and M = 128.]
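The experiment is easy to reproduce in spirit. A minimal sketch (assuming the zero-padded FFT as the analysis of t against the M-atom dictionary): the FFT coefficients are far from K = 2 sparse even when ω₀ lies on the grid, which is the gap sparse recovery methods aim to close.

```python
import numpy as np

N = 64
n = np.arange(N)
w0 = (2 * np.pi / 64) * (33 / 2)   # on the dictionary grid only for M >= 128
t = np.cos(w0 * n)

for M in (64, 128, 256, 512):
    coeffs = np.fft.fft(t, n=M)    # zero-padded FFT: correlations with the M atoms
    frac = np.abs(coeffs) / np.abs(coeffs).max()
    print(f"M = {M:3d}: {np.sum(frac > 0.1)} coefficients above 10% of the peak")
```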
Magnetoencephalography (MEG)
Given measured magnetic fields outside of the head, the goal is to
locate the responsible current sources inside the head.
MEG Example
[Figure: MEG measurement points outside the head and candidate source locations inside the head.]
At any given time, typically only a few sources are active (SPARSE).
MEG Formulation
Forming the overcomplete dictionary Φ
The number of rows equals the number of sensors.
The number of columns equals the number of possible source
locations.
Φ_ij = the magnetic field measured at sensor i produced by a unit current at location j.
We can compute Φ using a boundary element brain model and Maxwell's equations.
Many different combinations of current sources can produce the same observed magnetic field t (demonstrated in the sketch below).
By finding the sparsest signal representation/basis, we find the smallest number of sources capable of producing the observed field.
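The non-uniqueness is easy to demonstrate numerically. A toy sketch with a random stand-in for the lead field (not a real boundary element model): any component in the null space of Φ leaves the measured field unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.standard_normal((10, 50))        # stand-in lead field: 10 sensors, 50 locations
w = np.zeros(50)
w[[3, 27]] = [1.0, -0.5]                   # two active sources (sparse)
t = Phi @ w

_, _, Vt = np.linalg.svd(Phi)              # rows 10..49 of Vt span the null space N(Phi)
v = Vt[-1]                                 # one null-space direction
print(np.allclose(Phi @ (w + 5 * v), t))   # True: a different source pattern, same field
```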
Compressive Sampling
[Figure: transform coding vs. compressed sensing, showing the transform Ψ, coefficient vector w, encoded coefficients b, sensing matrix A, and measurement t.]
Computation: t → w → b
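A minimal sketch of the t → w → b pipeline, read here as the transform coding path (the FFT as the transform and the 8-of-64 coefficient budget are illustrative assumptions, not from the slides):

```python
import numpy as np

N = 64
n = np.arange(N)
t = np.cos(2 * np.pi * 5 * n / N) + 0.5 * np.cos(2 * np.pi * 12 * n / N)

w = np.fft.fft(t)                    # t -> w: transform step
idx = np.argsort(np.abs(w))[-8:]     # keep the 8 largest coefficients
b = np.zeros_like(w)
b[idx] = w[idx]                      # w -> b: keep/encode step (no quantization here)
t_hat = np.real(np.fft.ifft(b))      # decode from the compressed representation
print("reconstruction error:", np.linalg.norm(t - t_hat))
```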
Potential Approaches
The problem is NP-hard, so we need alternate strategies.
Greedy Search Techniques: Matching Pursuit, Orthogonal Matching Pursuit.
Minimizing Diversity Measures: The indicator function is not continuous, so we define surrogate cost functions that are more tractable and whose minimization leads to sparse solutions, e.g., ℓ1 minimization.
Bayesian Methods: Make appropriate statistical assumptions on the solution and apply estimation techniques to identify the desired sparse solution.
Greedy Search Methods: Matching Pursuit
Select a column that is most aligned with the current residual
[Figure: the model t = Φw + ε, with example column correlations 0.74, ..., −0.3, ... against the residual.]
Remove its contribution.
Stop when the residual becomes small enough or if we have exceeded some sparsity threshold.
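A minimal NumPy sketch of this loop (the function name and stopping parameters are illustrative): correlate, pick the best-aligned column, deflate the residual, repeat.

```python
import numpy as np

def matching_pursuit(Phi, t, k_max=10, tol=1e-6):
    """Matching pursuit: greedily select the column most aligned with
    the current residual and remove its contribution."""
    N, M = Phi.shape
    norms = np.linalg.norm(Phi, axis=0)
    w = np.zeros(M)
    r = t.copy()
    for _ in range(k_max):
        c = Phi.T @ r / norms            # normalized correlations with the residual
        j = np.argmax(np.abs(c))         # best-aligned column
        coef = c[j] / norms[j]           # projection coefficient onto column j
        w[j] += coef
        r -= coef * Phi[:, j]            # deflate the residual
        if np.linalg.norm(r) < tol:
            break
    return w
```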
Some Variations
Matching Pursuit
[Mallat & Zhang]
Inverse Techniques
For the system of equations Φw = t, the solution set is characterized by {w_s : w_s = Φ⁺t + v, v ∈ N(Φ)}, where N(Φ) denotes the null space of Φ and Φ⁺ = Φ^T(ΦΦ^T)⁻¹.
Minimum Norm Solution: The minimum ℓ2 norm solution w_mn = Φ⁺t is a popular solution.
Noisy Case: A regularized ℓ2 norm solution is often employed, given by w_reg = Φ^T(ΦΦ^T + λI)⁻¹ t.
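Both solutions are one-line computations; a sketch implementing the formulas above:

```python
import numpy as np

def min_norm(Phi, t):
    # Minimum l2-norm solution: w_mn = Phi^T (Phi Phi^T)^{-1} t
    return Phi.T @ np.linalg.solve(Phi @ Phi.T, t)

def reg_min_norm(Phi, t, lam):
    # Regularized solution (noisy case): w_reg = Phi^T (Phi Phi^T + lam*I)^{-1} t
    N = Phi.shape[0]
    return Phi.T @ np.linalg.solve(Phi @ Phi.T + lam * np.eye(N), t)
```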
Diversity Measures
Functionals whose minimization leads to sparse solutions.
Many examples are found in the fields of economics, social science, and information theory.
These functionals are concave, which leads to difficult optimization problems.
Examples of Diversity Measures
ℓ_p Diversity Measure (p ≤ 1):

E^(p)(w) = Σ_{l=1}^M |w_l|^p, p ≤ 1

Gaussian Entropy:

H_G(w) = Σ_{l=1}^M ln |w_l|²

Shannon Entropy:

H_S(w) = −Σ_{l=1}^M w̃_l ln w̃_l, where w̃_l = w_l² / ‖w‖²
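A small sketch evaluating the three measures above (the eps guard is an added assumption to keep the logarithms finite at zero entries):

```python
import numpy as np

def diversity_measures(w, p=0.5, eps=1e-12):
    """Evaluate the l_p, Gaussian-entropy, and Shannon-entropy diversity measures."""
    w = np.asarray(w, dtype=float)
    e_p = np.sum(np.abs(w) ** p)             # l_p diversity measure, p <= 1
    h_g = np.sum(np.log(w**2 + eps))         # Gaussian entropy
    wt = w**2 / np.sum(w**2)                 # normalized energies w~_l
    h_s = -np.sum(wt * np.log(wt + eps))     # Shannon entropy
    return e_p, h_g, h_s
```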
Diversity Minimization
Noiseless Case:

min_w E^(p)(w) = Σ_{l=1}^M |w_l|^p subject to Φw = t

Noisy Case:

min_w [ ‖t − Φw‖² + λ Σ_{l=1}^M |w_l|^p ]

p = 1 is a very popular choice because of the convex nature of the optimization problem (Basis Pursuit and Lasso).
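For p = 1 in the noiseless case, the problem is Basis Pursuit and can be posed as a linear program. A sketch using SciPy, with the standard split w = u − v (a textbook reformulation, not specific to these slides):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, t):
    """min ||w||_1 subject to Phi w = t, as an LP:
    write w = u - v with u, v >= 0 and minimize sum(u + v)
    subject to [Phi, -Phi][u; v] = t."""
    N, M = Phi.shape
    c = np.ones(2 * M)                       # objective: sum(u) + sum(v)
    A_eq = np.hstack([Phi, -Phi])            # equality constraint matrix
    res = linprog(c, A_eq=A_eq, b_eq=t,
                  bounds=[(0, None)] * (2 * M), method="highs")
    u, v = res.x[:M], res.x[M:]
    return u - v
```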
Bayesian Methods
Maximum A Posteriori (MAP) Approach:
Assume a sparsity-inducing prior on the latent variable w.
Develop an appropriate MAP estimation algorithm.
Empirical Bayes:
Assume a parameterized prior on the latent variable w (hyperparameters).
Marginalize over the latent variable w and estimate the hyperparameters.
Determine the posterior distribution of w and obtain a point estimate.
Generalized Gaussian Distribution
Density function:

f(x) = p / (2σ Γ(1/p)) · exp(−(|x|/σ)^p)

Subgaussian: p > 2; Supergaussian: p < 2.
[Figure: generalized Gaussian densities for p = 0.5, 1, 2, and 10. p < 2: supergaussian; p > 2: subgaussian.]
Student-t Distribution
Density function:

f(x) = Γ((ν+1)/2) / (√(πν) Γ(ν/2)) · (1 + x²/ν)^{−(ν+1)/2}
[Figure: Student-t densities for ν = 1, 2, and 5, compared with the Gaussian.]

MAP Using a Supergaussian Prior
Assuming a Gaussian likelihood model for f(t|w), we can find MAP weight estimates:

w_MAP = arg max_w log f(w|t)
      = arg max_w [log f(t|w) + log f(w)]
      = arg min_w [ ‖Φw − t‖² + λ Σ_{l=1}^M |w_l|^p ]

This is essentially a regularized LS framework; the interesting range for sparsity is p ≤ 1.
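For p = 1, one standard solver for this regularized LS objective (not covered in the slides) is iterative soft thresholding (ISTA); a sketch:

```python
import numpy as np

def ista(Phi, t, lam=0.1, iters=500):
    """Proximal-gradient (ISTA) sketch for min_w ||Phi w - t||^2 + lam * sum |w_l|."""
    lip = 2 * np.linalg.norm(Phi, 2) ** 2        # Lipschitz constant of the LS gradient
    eta = 1.0 / lip                              # safe step size
    w = np.zeros(Phi.shape[1])
    for _ in range(iters):
        g = 2 * Phi.T @ (Phi @ w - t)            # gradient of the LS term
        z = w - eta * g                          # gradient step
        w = np.sign(z) * np.maximum(np.abs(z) - eta * lam, 0.0)  # soft threshold
    return w
```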
MAP Estimate: FOCal Underdetermined System Solver (FOCUSS)
The approach involves solving a sequence of regularized weighted minimum norm problems:

q_{k+1} = arg min_q [ ‖Φ_{k+1} q − t‖² + λ ‖q‖² ]

where Φ_{k+1} = Φ M_{k+1} and M_{k+1} = diag(|w_{k,l}|^{1−p/2}), so that w_{k+1} = M_{k+1} q_{k+1}.
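A sketch of the resulting iteration (assuming the standard FOCUSS update w_{k+1} = M_{k+1} q_{k+1} and a regularized minimum norm initialization):

```python
import numpy as np

def focuss(Phi, t, p=0.8, lam=1e-3, iters=30):
    """Regularized FOCUSS: a sequence of regularized weighted minimum norm problems."""
    N, M = Phi.shape
    # Initialize with the regularized minimum norm solution
    w = Phi.T @ np.linalg.solve(Phi @ Phi.T + lam * np.eye(N), t)
    for _ in range(iters):
        m = np.abs(w) ** (1 - p / 2)      # diagonal of M_{k+1}
        Phi_k = Phi * m                   # Φ_{k+1} = Φ M_{k+1} (column scaling)
        # Ridge solution of min_q ||Phi_k q - t||^2 + lam ||q||^2
        q = Phi_k.T @ np.linalg.solve(Phi_k @ Phi_k.T + lam * np.eye(N), t)
        w = m * q                         # w_{k+1} = M_{k+1} q_{k+1}
    return w
```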