J.A. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2011, LNCS 6761, pp. 677–683, 2011. © Springer-Verlag Berlin Heidelberg 2011
Algorithm Based on CUDA
Qingqing Xu¹, Xin Zheng¹, and Jie Chen²
¹ Image Processing and Pattern Recognition Laboratory, College of Information Science and Technology, Beijing Normal University, Beijing 100875, P.R. China
² Information Center, CITIC Guoan Group, Beijing, P.R. China
[email protected]
Abstract. Although many upsampling methods have been proposed, none of them produces result images of satisfactory quality in real time. In this paper, we propose a CUDA-based image upsampling algorithm that efficiently generates sharp edges with reduced grid-related artifacts. By analyzing an existing method, we identify the bottlenecks that most limit its efficiency, use CUDA to accelerate them, and improve the implementation model of the algorithm. In this way we not only preserve the quality of the result image but also serve the goal of real-time human-computer interaction. Experimental results show that our method produces high-quality upsampled images efficiently.
Keywords: image upsampling, CUDA, large-scale data parallel computing,
image interpolation, super-resolution.
1 Introduction
As one of the most elementary image operations, image resizing (resampling) is used for many purposes and is supported by almost all image editing software. For downsampling, proper linear pre-filtering yields satisfactory images. Upsampling is different: because the needed information is missing, the result images usually lack small-scale texture-related features, sharp edges become blurry, and the original pixel grid remains noticeable. Although many interpolation-based upsampling methods [1-5] have been proposed, the quality of their result images is not satisfactory.
To avoid blurring at sharp edges, another class of upsampling methods relies on edge detection. Typical examples are POCS-based adaptive image magnification [6] and upsampling via imposed edge statistics [9]. These algorithms consider not only the connections between pixels but also sharp edges, so their results are satisfactory. However, they all use fairly complex edge-detection computations, so they are too slow for real-time texture mapping.
In this paper, we propose a high-quality, fast image upsampling algorithm based on CUDA that efficiently generates sharp edges with reduced grid-related artifacts. Like upsampling via imposed edge statistics, our method is based on a statistical edge dependency that relates certain edge features at two different resolutions and is generically exhibited by real-world images. This edge dependency is calculated in a preprocessing step and then used as a known condition for increasing the image resolution. In this process the image intensity must also be conserved, so that the output image, when downsampled to the original resolution, has the same content as the input image.
Furthermore, complex matrix operations limit the efficiency of the algorithm throughout the upsampling process. We therefore use CUDA [10] to accelerate our algorithm and improve its implementation model. In this way we not only preserve the quality of the upsampled images but also serve the goal of real-time human-computer interaction during upsampling. The rest of the paper is organized as follows. Section 2 discusses related work on upsampling methods and CUDA [10]. Section 3 describes our algorithm in detail. Section 4 presents the implementation and experiments. Section 5 concludes the paper.
2 Related Work
Upsampling is an important image processing operation and is widely used in different stages of image editing. There are many kinds of upsampling methods, such as classical methods, weight-adapting methods, methods that store additional data, and methods based on edge statistics.
Classical methods include nearest-neighbor, bilinear, and bicubic interpolation. These methods [1-5] are simple and widely used in current commercial image processing software. However, they rely on the assumption that the image data is either spatially smooth or band-limited. As a result, they can produce obvious visual artifacts such as ringing, aliasing, blocking, and blurring.
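To make the smoothness assumption of the classical methods concrete, the following is a minimal C++ sketch of bilinear upsampling by an integer factor. The `Image` type and the function names are hypothetical helpers introduced only for this illustration, and the pixel-centre alignment with clamped borders is one common convention, not necessarily the one used by any particular software package.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch: a grayscale image in row-major order.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    float at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Map a destination pixel centre back to a source coordinate, clamped to the image.
inline float sourceCoord(int dstIndex, int factor, int srcSize) {
    const float s = (dstIndex + 0.5f) / factor - 0.5f;
    return std::min(std::max(s, 0.0f), static_cast<float>(srcSize - 1));
}

// Bilinear upsampling by an integer factor.
Image upsampleBilinear(const Image& src, int factor) {
    Image dst{src.width * factor, src.height * factor, {}};
    dst.pixels.resize(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        const float sy = sourceCoord(y, factor, src.height);
        const int y0 = static_cast<int>(sy);
        const int y1 = std::min(y0 + 1, src.height - 1);
        const float fy = sy - y0;
        for (int x = 0; x < dst.width; ++x) {
            const float sx = sourceCoord(x, factor, src.width);
            const int x0 = static_cast<int>(sx);
            const int x1 = std::min(x0 + 1, src.width - 1);
            const float fx = sx - x0;
            // Blend the four nearest source pixels; this averaging is what blurs edges.
            const float top = (1.0f - fx) * src.at(x0, y0) + fx * src.at(x1, y0);
            const float bottom = (1.0f - fx) * src.at(x0, y1) + fx * src.at(x1, y1);
            dst.pixels[static_cast<std::size_t>(y) * dst.width + x] =
                (1.0f - fy) * top + fy * bottom;
        }
    }
    return dst;
}
```

Applied to a step edge, the sketch replaces the single sharp transition with intermediate values, which is the blurring described above.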
Weight-adapting methods, on the other hand, were proposed to avoid blurring, ringing, and other artifacts. One of them is POCS-based adaptive image magnification [6], proposed by Ratakonda and Ahuja in 1998. Its selective interpolation is implemented with an iterative Projection Onto Convex Sets (POCS) scheme. The method has three basic steps: first, find the edges of the input image; second, obtain an initial image; finally, apply the iterative POCS algorithm to obtain a good upsampled image. Another weight-adapting method is image interpolation by pixel-level data-dependent triangulation, proposed by Su and Willis [7]. In this method the interpolation weights are adjusted locally by choosing three of the four nearest pixels, which reduces the number of values that are averaged. This choice produces a noticeable block-like effect with strong continuity along one of the two diagonals.
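The idea of interpolating from three of the four nearest pixels can be sketched as follows. This is a simplified C++ illustration, not the implementation of Su and Willis: the function name is hypothetical, and the rule used here to pick the diagonal (connect the pair of opposite corners whose values differ less) is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cmath>

// Interpolate inside one 2x2 pixel cell with corner values
//   a (top-left)     b (top-right)
//   c (bottom-left)  d (bottom-right)
// at the fractional position (fx, fy), both in [0, 1].
// The cell is split into two triangles along one diagonal, and only the
// three corners of the triangle containing the point are blended.
float interpolateTriangulated(float a, float b, float c, float d,
                              float fx, float fy) {
    // Assumed rule: the diagonal joins the two corners with the smaller difference.
    if (std::fabs(a - d) < std::fabs(b - c)) {
        // Diagonal a-d.
        if (fx >= fy) {
            return a + (b - a) * fx + (d - b) * fy;  // triangle a, b, d
        }
        return a + (c - a) * fy + (d - c) * fx;      // triangle a, c, d
    }
    // Diagonal b-c.
    if (fx + fy <= 1.0f) {
        return a + (b - a) * fx + (c - a) * fy;      // triangle a, b, c
    }
    return d + (c - d) * (1.0f - fx) + (b - d) * (1.0f - fy);  // triangle b, c, d
}
```

For the cell a = 1, b = 0, c = 1, d = 1, the value at the centre stays on the a-d diagonal and equals 1, whereas a four-pixel bilinear blend would give 0.75. This is the strong continuity along a diagonal mentioned above.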
Several methods store additional data. Tumblin and Choudhury [8] store additional image data in the form of discontinuity graphs and avoid averaging pixels across boundaries. Fattal [9] uses edge statistics to upsample images. Although Fattal’s method produces high-quality result images, it has a serious weakness: its precise computation makes it very slow, so it cannot be used for real-time image processing.
CUDA is NVIDIA’s parallel computing architecture, which enables dramatic increases in computing performance by harnessing the power of the GPU (Graphics Processing Unit) [10]. With millions of CUDA-enabled GPUs sold to date, software developers, scientists, and researchers are finding broad-ranging uses for CUDA, including image and video processing, computational biology and chemistry, fluid dynamics simulation, CT image reconstruction, seismic analysis, ray tracing, and much more.
In this paper, we propose a method that uses CUDA to accelerate the processing, produces good results, and can be used for real-time image processing. Our method is therefore applicable in the many fields that need image upsampling.
3 Upsampling Algorithm Based on CUDA
From our study of upsampling methods, we found that image upsampling via imposed edge statistics produces the best result images. This algorithm was proposed by Raanan Fattal in 2007. It uses edge statistics to make the edges of upsampled images sharp without ringing, aliasing, blocking, or blurring. The basic idea of Fattal’s method is illustrated in Fig. 1. Guided by this idea, our goal is to improve Fattal’s algorithm and accelerate it using CUDA.
Fig. 1. Sharp upsampled image resulting from a low-resolution image plus edge statistics
3.1 Using Gradients Model to Get Good Upsampling Results
We use the change of the gradient along one direction to identify the edges of an image. We sum the gradients of the pixels along an edge and compute their average, from which we also obtain the variance of the gradients. From the average and the variance we construct a Gaussian distribution, and from this Gaussian distribution we construct a Gibbs distribution. The Gibbs distribution serves to reduce the difficulty of computing with the Gaussian distribution. Under weak assumptions, the error rate of the Gibbs distribution is only twice that of the optimal (Bayes) classification. The Gibbs distribution is therefore a good choice for reducing the amount of computation.
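The chain from gradients to a log-probability can be illustrated with a short C++ sketch. It is a simplified illustration under stated assumptions, not the statistical model of Fattal’s method or of our implementation: all names are hypothetical, gradients are plain forward differences along the horizontal direction, the gradients are treated as independent samples, and the input is assumed to yield at least one gradient and a non-zero variance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and (population) variance of a set of gradient samples.
struct GradientStats {
    double mean;
    double variance;
};

// Forward differences along the horizontal direction of a row-major image.
std::vector<double> horizontalGradients(const std::vector<double>& image,
                                        int width, int height) {
    std::vector<double> gradients;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x + 1 < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            gradients.push_back(image[i + 1] - image[i]);
        }
    }
    return gradients;
}

// Average of the gradients and their variance (assumes a non-empty input).
GradientStats estimateStats(const std::vector<double>& gradients) {
    double sum = 0.0;
    for (double g : gradients) sum += g;
    const double mean = sum / gradients.size();
    double squared = 0.0;
    for (double g : gradients) squared += (g - mean) * (g - mean);
    return GradientStats{mean, squared / gradients.size()};
}

// Log of the Gaussian density built from the mean and variance (variance > 0).
double gaussianLogDensity(double value, const GradientStats& s) {
    const double pi = 3.14159265358979323846;
    const double d = value - s.mean;
    return -0.5 * std::log(2.0 * pi * s.variance) - d * d / (2.0 * s.variance);
}

// Sum of the per-gradient log densities. Its negative plays the role of an
// energy, so increasing log P corresponds to lowering that energy.
double logLikelihood(const std::vector<double>& gradients, const GradientStats& s) {
    double total = 0.0;
    for (double g : gradients) total += gaussianLogDensity(g, s);
    return total;
}
```

Working with log P instead of P turns the product of densities into a sum, which is why the maximization step operates on the logarithm.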
With this in mind, we describe the details below. The flow chart of the whole algorithm is shown in Fig. 2.
[Flow chart: input image I → image I’ → use I’ to calculate the gradient → calculate the whole gradient of one gradient direction → get the average of the sum gradients → get the variance of the gradient → construct a Gaussian distribution → construct a Gibbs distribution P(I) → Max(log(P(I))) → upsampling]
Fig. 2. The whole algorithm flow chart
3.2 Using CUBLAS to Accelerate the Algorithm
Analyzing the procedure above, we find that all steps involving matrix operations are time-consuming. Since matrix operations are well suited to acceleration with CUDA, we propose our CUDA-based upsampling algorithm. We compared four ways of performing matrix operations with CUDA: without shared memory, with shared memory and blocks, with registers, and using CUBLAS. We chose the most efficient one, CUBLAS, to accelerate the computation. The whole procedure is shown in Fig. 3.
CUBLAS [10] is a BLAS (Basic Linear Algebra Subprograms) library ported to CUDA, which gives access to fast GPU computation without direct interaction with the CUDA driver. The library uses one GPU per process.
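As an illustration of the kind of operation that is offloaded, the following C++ sketch is a plain CPU reference for the single-precision general matrix multiply C = alpha·A·B + beta·C in column-major storage, which is the storage convention of BLAS-style libraries. It runs on the CPU and does not call CUBLAS. That the GPU counterpart would be a routine such as `cublasSgemm` is our assumption for this illustration, since the text does not list the routines used, and the function and helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of element (row, col) in a column-major matrix with `rows` rows.
inline std::size_t columnMajor(int row, int col, int rows) {
    return static_cast<std::size_t>(col) * rows + row;
}

// CPU reference for C = alpha * A * B + beta * C without transposition.
// A is m x k, B is k x n, C is m x n, all stored column-major.
void sgemmReference(int m, int n, int k, float alpha,
                    const std::vector<float>& A,
                    const std::vector<float>& B,
                    float beta, std::vector<float>& C) {
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            // Each output element is independent of the others, which is
            // what makes this operation well suited to data-parallel hardware.
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[columnMajor(i, p, m)] * B[columnMajor(p, j, k)];
            }
            const std::size_t out = columnMajor(i, j, m);
            C[out] = alpha * acc + beta * C[out];
        }
    }
}
```

A reference of this kind is also useful for checking GPU results, since single-precision output from the GPU can be compared element by element against it.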
4 Result
We ran our experiments on a computer with an Intel(R) Core(TM) i3 CPU at 3.07 GHz and 2 GB of RAM. The GPU is an NVIDIA GeForce 310, and the implementation is written in C++. We discuss the results of our method from three aspects: run-time comparison, result images, and use on large images.
Fig. 3. The flow chart of the calculation of matrix by using CUBLAS
4.1 Comparison in Run-Time
With the CUBLAS library accelerating our algorithm, our method performs well in the experiments. Compared to Fattal’s algorithm, our method is about three to four times faster (between 2.8 and 3.9 times for the image sizes tested). The comparison data is given in Table 1.
We used different color images of size 128×128, 512×512 and 1024×1024 pixels in these experiments. Table 1 shows that our method reduces the run-time considerably.
Table 1. Run-time comparison
Image size (pixels)    Fattal’s method (s)    Our method (s)
128×128                1.4                    0.5
512×512                5.1                    1.4
1024×1024              18.6                   4.8
4.2 The Result Images of Our Method
Comparing the result images of our method with Fattal’s, we find them almost identical. The only differences are introduced by the CUBLAS computation, which uses float instead of double as its data type and may therefore lose some numerical accuracy. Because the results of our method are otherwise the same as Fattal’s, we do not show them in this paper. Instead, we show a failure case: the failed result is shown in Fig. 4.
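The loss of accuracy from using float instead of double can be reproduced with a small C++ sketch. It is an illustration on the CPU with a hypothetical function name, not a measurement of CUBLAS itself: the same dot product is accumulated once in single and once in double precision.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Accumulate a dot product in the precision given by T (float or double).
// The inputs are stored as double so that both runs see the same data.
template <typename T>
T accumulateDot(const std::vector<double>& a, const std::vector<double>& b) {
    T sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += static_cast<T>(a[i]) * static_cast<T>(b[i]);
    }
    return sum;
}
```

For example, with one million terms of 0.1 × 1.0, whose exact sum is 100000, the result of `accumulateDot<float>` lies noticeably further from the exact value than that of `accumulateDot<double>`. Errors of this kind accumulate over large matrix operations and can become visible in a few pixels.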
4.3 Usage in Large Images
Because our method is fast and produces good results, it can be used to process large images. However, the method still needs improvement. In the future, we will apply it to real-time drafting.
Fig. 4. (a) The input image. (b) A failed result of our method, caused by limited computational accuracy. The failure is in the center of the child’s eyes. (c) Fattal’s result.
5 Conclusion
In this paper, we proposed a high-quality, fast image upsampling algorithm based on CUDA. The method efficiently generates sharp edges with reduced grid-related artifacts. We use edge statistics to ensure the quality of the results and CUDA to accelerate the algorithm. The method builds on an edge dependency that relates certain edge features at two different resolutions and is generically exhibited by real-world images. This edge dependency is calculated in a preprocessing step.
Acknowledgements. We’d like to thank Raanan Fattal for his help. This paper is
supported by grants from the National Natural Science Foundation of China (Project No.60703070).
References
1. Shi, J.: The basis of Virtual reality and its practical algorithm. Science Press, Beijing (2002)
2. Williams, L.: Pyramidal Parametrics. Computer (1983)
3. Han, H., Zhang, C.: The research of the real-time interactive technology Clipmap in scene with huge texture, Jinan (2003)
4. Losasso, F., Hoppe, H.: Geometry clipmaps: terrain rendering using nested regular grids. ACM Transactions on Graphics 23(3) (2004)
5. Tian, D., Zheng, X.: Adaptive Large-scale Terrain texture Mapping Technology. The IADIS Computer Graphics and Visualization (2008)
6. Ratakonda, K., Ahuja, N.: POCS based adaptive image magnification. Image Processing (1998)
7. Su, Willis: Image interpolation by pixel-level data-dependent triangulation. Computer Graphics Forum 23, 189–202 (2004)
8. Tumblin, Choudhury: Bixels: Picture samples with sharp embedded boundaries. Eurographics Symposium on Rendering (2004)
9. Fattal, R.: Upsampling via Imposed Edges Statistics. In: Proceedings of Siggraph (2007)
10. NVIDIA Corporation, http://developer.nvidia.com/