Single Depth Image Super Resolution and Denoising Using Coupled Dictionary Learning with Local Constraints and Shock Filtering
Jun Xie 1, Cheng-Chuan Chou 2, Rogerio Feris 3, Ming-Ting Sun 1
1 University of Washington
2 Industrial Technology Research Institute (ITRI), Taiwan
3 IBM T. J. Watson Research Center, Hawthorne, U.S.
Outline
• Introduction
• Our contribution
• Simulation results
• Conclusion
Motivation
• Depth images are often low-resolution and noisy, which degrades the quality of applications that rely on them
• Humans are sensitive to 3D noise and jagged edges
2D patch
3D patch
Objective
• Input: Single noisy, low-resolution depth map
• Output: A clean, higher-resolution depth map
Related Work on Depth Super Resolution
• Fusion of multiple depth images
• Use a guiding high resolution color image
However, multiple depth maps or guiding color images at the target resolution are often unavailable.
Q. Yang et al., “Spatial-depth Super Resolution for Range Images,” CVPR 2007.
Related Work
• Learning-based single-image super resolution
J. Yang, J. Wright, T. Huang, Y. Ma, “Image Super-resolution as
Sparse Representation of Raw Image Patches,” CVPR, 2008.
Problems from the Properties of Low-Resolution Depth Maps
• Lack of texture -> Overfitting
• Noisy and jagged edges
Our Contribution
We propose a dictionary-learning-based algorithm that
• Adds local constraints to the coupled dictionary learning process
to prevent the dictionary from overfitting
• Incorporates an adaptively regularized shock filter
to suppress jagged edges and noise in the depth map
Our Coupled Dictionary Learning
• Training set
- Divide images into patches
• Feature Extraction
Low-res images: [Gx, Gy, Gxx, Gyy]
High-res images: $f_h = y_h - y_l'$ ($y_l'$ is the bilinear interpolation result of $y_l$)
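The feature extraction step can be sketched as follows. This is a minimal illustration, assuming that Gx, Gy, Gxx, Gyy denote first- and second-order derivatives and approximating them with NumPy finite differences. The function names and the use of `np.gradient` are assumptions for illustration, not details given on the slide. The low-res input is assumed to be already bilinearly interpolated to the target size (y_l').

```python
import numpy as np

def lowres_features(y_l_up):
    """Stack [Gx, Gy, Gxx, Gyy] for an upsampled low-res depth map (assumed filters)."""
    y = np.asarray(y_l_up, dtype=float)
    gy, gx = np.gradient(y)          # first derivatives along rows (y) and columns (x)
    gxx = np.gradient(gx, axis=1)    # second derivative along x
    gyy = np.gradient(gy, axis=0)    # second derivative along y
    return np.stack([gx, gy, gxx, gyy], axis=-1)

def highres_feature(y_h, y_l_up):
    """High-res feature f_h = y_h - y_l' (residual over the interpolated low-res map)."""
    return np.asarray(y_h, dtype=float) - np.asarray(y_l_up, dtype=float)

# Toy example: a planar depth ramp has constant first derivatives and zero curvature.
rows, cols = np.mgrid[0:6, 0:8]
ramp = 3.0 * rows + 2.0 * cols
feats = lowres_features(ramp)
print(feats.shape)  # (6, 8, 4)
```

Patches would then be cut from these feature maps; the patch size is not stated on the slide.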
Our Coupled Dictionary Learning
• Impose a local constraint
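Objective: Given training feature patches x, learn a dictionary d such that:
$$\min_{d,c} \sum_i \Big( \underbrace{\| x_i - d\,c_i \|^2}_{\text{linear combination of dictionary bases}} + \lambda \underbrace{\sum_j \| d_j - x_i \|^2 \, |c_{ij}|}_{\text{local constraint}} \Big)$$
c: weighting coefficient vector; λ: weight of the local constraint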
• For each low-resolution patch, only the dictionary bases most similar to it are selected, which helps prevent overfitting
• Preserves the manifold assumption in the feature space and keeps the locality constraint
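To make the role of the local constraint concrete, the sketch below evaluates a locality-regularized objective for a fixed dictionary and fixed coefficients. It assumes the objective is a reconstruction error plus a distance-weighted penalty on the coefficients, with an assumed weight `lam`. The exact penalty form, the array layout, and the function name are illustrative assumptions rather than the authors' definition.

```python
import numpy as np

def local_objective(X, D, C, lam):
    """Assumed objective: sum_i ||x_i - c_i D||^2 + lam * sum_ij ||d_j - x_i||^2 |c_ij|.

    X: (n, p) feature patches, D: (k, p) dictionary atoms, C: (n, k) coefficients.
    """
    X, D, C = (np.asarray(a, dtype=float) for a in (X, D, C))
    recon_err = np.sum((X - C @ D) ** 2)
    # Squared distance between every patch and every atom, shape (n, k).
    dist2 = np.sum((X[:, None, :] - D[None, :, :]) ** 2, axis=-1)
    locality = np.sum(dist2 * np.abs(C))
    return recon_err + lam * locality

D = np.eye(3)                       # three toy atoms
X = D[[0, 2]]                       # two patches that coincide with atoms 0 and 2
C_near = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])   # each patch uses its own atom
C_far = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])    # patch 0 uses a distant atom
print(local_objective(X, D, C_near, lam=0.5))  # 0.0
print(local_objective(X, D, C_far, lam=0.5))   # 3.0
```

Coefficients placed on atoms far from the patch are penalized twice: once through the reconstruction error and once through the locality term.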
Sparse Reconstruction Based on the Learned Coupled Dictionary
$$\min_{c_i} \| s_l^i - d_l'\, c_i \|^2 \quad \text{s.t.} \quad \| c_i \|_0 \le L$$
$$s_h^i = d_h'\, c_i$$
The coefficients $c_i$ are shared: the high-res patch is a linear combination of high-res dictionary bases.
d' contains the 10% of dictionary atoms closest to the low-resolution patch
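The reconstruction step can be sketched as below. The slide does not name a solver for the sparsity-constrained problem, so this example assumes orthogonal matching pursuit (OMP), a common greedy choice. The function names, the column-wise atom layout, and the default sparsity `L=3` are illustrative assumptions.

```python
import numpy as np

def omp(D, s, L):
    """Greedy OMP: approximate s with at most L columns (atoms) of D."""
    s = np.asarray(s, dtype=float)
    residual = s.copy()
    support, coef = [], np.zeros(0)
    for _ in range(L):
        if np.linalg.norm(residual) < 1e-10:
            break
        j = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], s, rcond=None)
        residual = s - D[:, support] @ coef
    c = np.zeros(D.shape[1])
    if support:
        c[support] = coef
    return c

def reconstruct_patch(s_l, D_l, D_h, L=3, keep=0.10):
    """Code s_l over the nearest 10% of low-res atoms, reuse the code for high-res."""
    k = max(L, int(np.ceil(keep * D_l.shape[1])))
    dist = np.linalg.norm(D_l - s_l[:, None], axis=0)
    nearest = np.argsort(dist)[:k]          # sub-dictionary d'
    c = omp(D_l[:, nearest], s_l, L)        # shared coefficients
    return D_h[:, nearest] @ c

rng = np.random.default_rng(0)
D_l = rng.standard_normal((8, 40))
D_l /= np.linalg.norm(D_l, axis=0)          # unit-norm low-res atoms
D_h = rng.standard_normal((16, 40))
s_h = reconstruct_patch(D_l[:, 5], D_l, D_h)
print(np.allclose(s_h, D_h[:, 5]))
```

In the toy call the low-res patch equals atom 5, so the shared code selects the paired high-res atom.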
Edge Denoising Based on
Adaptively Regularized Shock Filter
Why Shock filter?
• Edge preserving
• Removes jagged noise
• Smooths depth images well, since they have little texture
Edge Denoising Based on Regularized Shock Filter
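$$I_t = -\frac{2}{\pi}\arctan\!\big(a\,\mathrm{Im}(I/\theta)\big)\,|\nabla I| + \lambda I_{\eta\eta} + \tilde{\lambda} I_{\xi\xi}$$
• First term: shock term for edge enhancement
• $I_{\eta\eta}$ term: smoothing in the gradient direction
• $I_{\xi\xi}$ term: smoothing in the tangent direction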
G. Gilboa, N. Sochen, Y. Y. Zeevi, “Image Enhancement and Denoising by
Complex Diffusion Processes,” PAMI, vol. 26, issue 8, pp. 1020-1036, 2004.
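A single explicit time step of a regularized shock filter can be sketched as follows. This is a real-valued simplification, not the complex diffusion scheme of Gilboa et al.: the edge detector inside the arctan is the second derivative along the gradient direction rather than the imaginary part of a complex image. Central differences are used for brevity, whereas practical shock filters usually rely on upwind schemes. All parameter names and default values are assumptions.

```python
import numpy as np

def shock_filter_step(I, a=2.0, lam_eta=0.1, lam_xi=0.5, dt=0.1, eps=1e-8):
    """One explicit step: shock term + smoothing along gradient and tangent directions."""
    I = np.asarray(I, dtype=float)
    Iy, Ix = np.gradient(I)
    Ixx = np.gradient(Ix, axis=1)
    Iyy = np.gradient(Iy, axis=0)
    Ixy = np.gradient(Ix, axis=0)
    grad2 = Ix ** 2 + Iy ** 2
    # Second derivatives along the gradient (eta) and tangent (xi) directions.
    I_eta = (Ixx * Ix ** 2 + 2 * Ixy * Ix * Iy + Iyy * Iy ** 2) / (grad2 + eps)
    I_xi = (Ixx * Iy ** 2 - 2 * Ixy * Ix * Iy + Iyy * Ix ** 2) / (grad2 + eps)
    # Soft sign of I_eta steers the shock: sharpen on both sides of an edge.
    shock = -(2.0 / np.pi) * np.arctan(a * I_eta) * np.sqrt(grad2)
    return I + dt * (shock + lam_eta * I_eta + lam_xi * I_xi)

# Toy example: a blurred vertical step edge in a 16x16 depth map.
x = np.linspace(-4.0, 4.0, 16)
edge = np.tile(1.0 / (1.0 + np.exp(-2.0 * x)), (16, 1))
out = edge
for _ in range(5):
    out = shock_filter_step(out)
print(out.shape)  # (16, 16)
```

Flat and planar regions have zero second derivatives, so this step leaves them unchanged; only curved regions near edges are modified.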
• Adaptive weight
Adaptively Regularized Denoising Shock Filter
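Figure labels: Large beta, Small beta
$$I_t = -\frac{2}{\pi}\arctan\!\big(a\,\mathrm{Im}(I/\theta)\big)\,|\nabla I| + \lambda I_{\eta\eta} + \tilde{\lambda} I_{\xi\xi}$$
Quad tree regions:
• Plain region (little smoothing)
• Edges (smoothing along the tangent direction)
• Corners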
Edge Denoising Based on
Adaptively Regularized Shock Filter
• Filtering result
Panels: Low-res | Result of [1] | Ours
[1] J. Yang et al., “Image Super-resolution as Sparse Representation of Raw Image Patches,” CVPR, 2008.
Quantitative Results
RMSE comparison at scaling factors ×3 and ×4

                            Scale ×3                            Scale ×4
Method                      Cones   Venus   Teddy   Tsukuba     Cones   Venus   Teddy   Tsukuba
Nearest Neighbor            1.172   0.309   0.925   0.672       1.498   0.367   1.348   0.832
Sparse coding [1]           1.291   0.420   1.133   1.504       2.908   1.126   2.140   0.840
K-SVD based [2]             1.030   0.284   0.782   0.636       1.268   0.320   1.186   0.730
Aodha et al. [3]            1.319   0.311   0.987   0.844       1.504   0.337   1.026   0.833
Tsai et al. [4]             1.049   0.278   0.781   0.646       1.246   0.321   1.178   0.714
Hornacek et al. [5]         0.927   0.273   0.835   0.878       1.375   0.452   1.129   0.727
Ours (w/o shock filter)     0.957   0.258   0.706   0.613       1.188   0.284   1.147   0.712
Ours (with shock filter)    0.842   0.220   0.657   0.531       1.111   0.265   1.108   0.635
[1] J. Yang et al., “Image Super-resolution as Sparse Representation of Raw Image Patches,” CVPR, 2008.
[2] R. Zeyde et al., “On Single Image Scale-up using Sparse Representations,” Curves and Surfaces, 2010.
[3] O. M. Aodha et al., “Patch based Synthesis for Single Depth Image Super-resolution,” ECCV, 2012.
[4] C. Tsai et al., “Context-aware Single Image Super-resolution Using Locality-constrained Group Sparse Representation,” VCIP, 2012.
[5] M. Hornacek et al., “Depth Super Resolution by Rigid Body Self-similarity in 3D,” CVPR, 2013.
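For reference, the RMSE metric reported in the table can be computed as in the sketch below. It assumes the error is taken over all pixels of the upsampled map against the ground truth; the slides do not state whether any pixels are masked out, so that detail is an assumption.

```python
import numpy as np

def rmse(estimate, ground_truth):
    """Root-mean-square error over all pixels of two equally sized depth maps."""
    e = np.asarray(estimate, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    return float(np.sqrt(np.mean((e - g) ** 2)))

# Two-pixel toy example: errors of 3 and 4 give sqrt((9 + 16) / 2) = sqrt(12.5).
print(rmse([[0.0, 0.0]], [[3.0, 4.0]]))
```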
Visual Results
Nearest Neighbor (NN) [1] [2] Ours
[1] J. Yang, J. Wright, T. Huang, and Y. Ma, “Image Super-resolution as Sparse Representation of Raw Image Patches,” CVPR, 2008.
[2] O. M. Aodha, N. D. Campbell, A. Nair, and G. J. Brostow, “Patch based Synthesis for Single Depth Image Super-resolution,” ECCV, 2012.
Visual Results
NN [1] [2] Ours
[1] R. Zeyde, M. Elad, M. Protter, “On Single Image Scale-up using Sparse Representations,” in Curves and Surfaces, 2010.
[2] O. M. Aodha, N. D. Campbell, A. Nair, and G. J. Brostow, “Patch based Synthesis for Single Depth Image Super-resolution,” in ECCV, 2012.
3D Visual Results
NN [1] [2] Ours
[1] R. Zeyde, M. Elad, M. Protter, “On Single Image Scale-up using Sparse Representations,” Curves and Surfaces, 2010.
[2] O. M. Aodha, N. D. Campbell, A. Nair, and G. J. Brostow, “Patch based Synthesis for Single Depth Image Super-resolution,” ECCV, 2012.
View Synthesis Results
Panels: GT | Aodha et al. | Tsai et al. | Ours
O. M. Aodha et al., “Patch based Synthesis for Single Depth Image Super-resolution,” ECCV, 2012.
C. Tsai et al., “Context-aware Single Image Super-resolution Using Locality-constrained Group Sparse Representation,” VCIP, 2012.