AI Forum 2015; Kaohsiung, Taiwan; June 5-6, 2015
Jun Wang
Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data
Department of Mechanical and Automation Engineering
The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
School of Control Science and Engineering
Dalian University of Technology, Dalian, Liaoning, China
[email protected]
Outline
Introduction
Problem formulations
kWTA networks
Simulation results
Sorting application
Filtering application
Concluding remarks
Future works
References
Multiple Winners-take-all Operation
The k-winners-take-all (kWTA) operation selects the k largest inputs out of n inputs (1 ≤ k < n).
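As a point of reference, the operation defined above can be written as a short sequential routine. This is only a sorting-based baseline for checking results, not the neurodynamic approach discussed in this talk; the function name `kwta` is chosen here for illustration.

```python
def kwta(u, k):
    """Return a 0/1 list marking the k largest entries of u (sequential baseline)."""
    n = len(u)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    # Indices of u ordered from largest to smallest input.
    ranked = sorted(range(n), key=lambda i: u[i], reverse=True)
    winners = set(ranked[:k])
    return [1 if i in winners else 0 for i in range(n)]


x = kwta([0.3, 0.9, 0.1, 0.7, 0.5], k=2)
```

Here the two largest inputs (0.9 and 0.7) are marked with 1 and all others with 0.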
kWTA is a general rule in nature and society.
kWTA has widespread applications in data mining, machine learning, classification, clustering, computer vision, etc.
It is a common building block for many models, such as adaptive resonance theory (ART) and self-organizing maps (SOM).
k Winners-take-all Operation
When the number of inputs is large and/or the selection must run in real time, parallel algorithms and hardware implementations are desirable.
Parallel k Winners-take-all Operation
[Figure: kWTA block with inputs u_1, u_2, ..., u_n, parameter k, and outputs x_1, x_2, ..., x_n]
Problem Formulations
"The mere formulation of a problem is far more essential than its solution, which may be merely a matter of mathematical or experimental skills. To raise new questions, new possibilities, to regard old problems from a new angle requires creative imagination and marks real advances in science."
Albert Einstein
Problem Formulations
Problem Formulations (cont’d)
Problem Formulations (cont’d)
Problem Formulations (cont’d)
Model Selection and Redesign
The kWTA problem has been formulated as equivalent linear and quadratic programming problems.
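The formulas on the formulation slides did not survive in this text, so the block below is only a sketch of one standard way to write kWTA as a linear program (LP) and a quadratic program (QP). The symbols \bar{u}_k (the k-th largest input) and the parameter a > 0 are assumptions introduced here, not notation confirmed by the slides.

```latex
% kWTA as a function of the inputs u_1, ..., u_n
x_i = \begin{cases} 1, & u_i \in \{\text{the } k \text{ largest inputs}\} \\ 0, & \text{otherwise} \end{cases}
\qquad i = 1, \dots, n.

% LP form (relaxation of x_i \in \{0,1\} to the unit interval)
\min_{x} \; -u^{\mathsf T} x
\quad \text{s.t.} \quad \sum_{i=1}^{n} x_i = k, \qquad 0 \le x_i \le 1.

% QP form with a small parameter a > 0
\min_{x} \; \frac{a}{2}\, x^{\mathsf T} x - u^{\mathsf T} x
\quad \text{s.t.} \quad \sum_{i=1}^{n} x_i = k, \qquad 0 \le x_i \le 1.

% KKT conditions of the QP give x_i = \min\{1, \max\{0, (u_i - \lambda)/a\}\},
% so the solution is binary whenever a \le \bar{u}_k - \bar{u}_{k+1}.
```

In both forms the relaxed problem returns the binary kWTA vector when the k-th and (k+1)-th largest inputs are distinct (and, for the QP, when a is no larger than their gap).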
All existing neurodynamic optimization models for linear and quadratic programming can be applied.
Now the question is: which is the best in terms of model complexity and computational efficiency?
QP-based Primal-Dual Network
QP-based Projection Network
LP-based Projection Network
QP-based Simplified Dual Net
LP-based Discontinuous Network
Discontinuous Activation Function
Convergence Conditions
Simulation Results
Simulation Results (cont’d)
Simulation Results (cont’d)
QP-based Discontinuous Network
Discontinuous Activation Function
Convergence Condition
Simulation Results
Simulation Results (cont’d)
Simulation Results (cont’d)
QP-based Improved Dual Network
Model Comparisons
Model                               Layers   Neurons   Connections
LP-based primal-dual network        4        3n + 1    6n + 2
QP-based primal-dual network        4        3n + 1    6n + 2
LP-based projection network         2        n + 1     2n + 2
QP-based projection network         2        n + 1     2n + 2
QP-based simplified dual network    1        n         3n
LP-based discontinuous network      1        n         2n
QP-based discontinuous network      1        n         2n
QP-based improved dual network      1        1         n
Simulation Results
Discrete-time Counterpart
Activation Function with High Gain
A New Model
Desirable Properties
The kWTA model with the Heaviside activation function has been proven to be globally stable and globally convergent to the kWTA solution in finite time.
Lower and upper bounds on the convergence time have been derived [expressions not recovered from the slide].
It essentially solves the dual problem of the linear programming formulation.
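The model equations themselves are missing from this text. The sketch below therefore assumes a single-state form consistent with the output relation x = g(u_i - y(t)) quoted later in the talk: the state y is driven by the difference between the number of active outputs and k, and g is a Heaviside step with g(0) = 1. The Euler discretization, the step-size rule, and the function names are illustrative choices, not the author's implementation.

```python
def heaviside(z):
    """Step activation; g(0) = 1 is an assumption made for this sketch."""
    return 1 if z >= 0 else 0


def kwta_single_state(u, k, y0=0.0, max_iters=100000):
    """Euler simulation of an assumed single-state kWTA model.

    Assumed dynamics: dy/dt is proportional to sum_i g(u_i - y) - k,
    with outputs x_i = g(u_i - y). Inputs must be distinct.
    """
    n = len(u)
    s = sorted(u)
    min_gap = min(b - a for a, b in zip(s, s[1:]))
    if min_gap <= 0:
        raise ValueError("inputs must be distinct")
    # Each update moves y by less than the smallest input gap, so the
    # number of active outputs changes by at most one per iteration.
    step = 0.5 * min_gap / n
    y = y0
    for iteration in range(max_iters):
        x = [heaviside(ui - y) for ui in u]
        error = sum(x) - k
        if error == 0:
            return x, y, iteration
        y += step * error
    raise RuntimeError("no convergence within max_iters")


x, y, iters = kwta_single_state([0.3, 0.9, 0.1, 0.7, 0.5], k=2)
```

With this conservative step the state stops as soon as exactly k outputs are active, which places y between the (k+1)-th and k-th largest inputs.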
Convergence Time
As a linear system with a discontinuous bias, the convergence time of the kWTA network can be computed as a function of the input vector u.
The expectation and variance of the convergence time can also be computed, based on the binomial distribution, as functions of the initial state.
Y. Xiao, Y. Liu, C.-S. Leung, J. P.-F. Sum, K. Ho, “Analysis on the convergence time of dual neural network-based kWTA,” IEEE Trans. Neural Networks and Learning Systems, vol. 23, pp. 676-682, 2012.
J. P.-F. Sum, C.-S. Leung, K. Ho, “Effect of Input Noise and Output Node Stochastic on Wang's kWTA,” IEEE Trans. Neural Networks and Learning Systems, vol. 24, pp. 1472-1478, 2013.
Reformulated Problem
Reformulated Problem (cont’d)
Reformulated Problem (cont’d)
Simulation Results with Randomized Integer Inputs
Simulation Results with Low-Resolution Inputs
Initial State Estimation
Although the state of the kWTA model is guaranteed to be globally convergent in finite time from any initial state, prior information helps to initialize the state close to the steady state.
The steady state y ∈ (u_{k+1}, u_k] depends on the distribution of u_1, u_2, ..., u_n, as well as on the values of k and n.
Initial State Estimation (cont’d)
General distribution
Uniform distribution
Normal distribution
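The estimation formulas for these cases are not present in this text. As a hedged illustration of the idea only, one simple heuristic is to start the state at the population quantile that leaves a fraction k/n of the inputs above it. The quantile rule and both function names below are assumptions made for this sketch and may differ from the estimates on the slides.

```python
from statistics import NormalDist


def initial_state_uniform(a, b, n, k):
    """Quantile heuristic for inputs assumed uniform on [a, b]."""
    # A fraction k/n of the interval length lies above the returned value.
    return b - (b - a) * k / n


def initial_state_normal(mu, sigma, n, k):
    """Quantile heuristic for inputs assumed normal with mean mu and std sigma."""
    # inv_cdf(p) returns the value below which a fraction p of the mass lies.
    return NormalDist(mu, sigma).inv_cdf(1.0 - k / n)


y0_uniform = initial_state_uniform(0.0, 1.0, n=100, k=25)
y0_normal = initial_state_normal(5.0, 2.0, n=1000, k=500)
```

For k = n/2 the normal-case estimate is simply the mean, and for uniform inputs on [0, 1] with k/n = 0.25 it is 0.75.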
Initial State Estimation (cont’d)
Uniform Distribution
Normal Distribution
Simulation Results (convergence time) with Infinite Gain
Simulation Results (convergence time) with Unity Gain
Discrete-time Version
Simulation Results (n = 10^6, k = n/2)
Simulation Results (n = 10^6, k = n/2)
Monte Carlo Simulation Results
Monte Carlo Simulation Results
Estimated Complexity (uniform)
Estimated Complexity (normal)
For data with a dimension of 10^100 (1 googol), it would need about 8.44 iterations on average!
Histograms of Convergence Iterations
Histograms of Convergence Iterations
Histograms of Convergence Iterations
Histograms of Convergence Iterations
Sorting Operation
Sorting is a fundamental process that arranges data in order according to their values.
It accounts for 25% of data processing time (Knuth).
For sorting large volumes of data or high-dimensional data, parallel sorting approaches are more desirable.
Numerous sorting algorithms and models have been developed with varied efficiencies.
Parallel Sorting Representation
For example, a permutation matrix:
Parallel Sorting Representation (cont’d)
A modified version:
Logic Reversal
A simple logic can be used to flip over the redundant '1' elements after the first '1' in each row; i.e.,
Parallel Sorting Based on kWTA
Let each kWTA network compute one column of the above sorting matrix from left to right, with k increasing from 1 to n - 1.
Specifically, a WTA network with a single state variable (i.e., k = 1) is adopted to determine the largest element of the list.
Next, a kWTA network with k = 2 computes the second item in the list without recounting the first item.
Parallel Sorting Based on kWTA
As such, the whole list of n items can be sorted using n - 1 kWTA networks without the need for computing the last item.
As a result, only n - 1 neurons will be needed.
This is a substantial reduction of the model complexity compared with the analog sorting networks with n^2 neurons.
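The sorting scheme described above can be sketched in a few lines. In this illustration each kWTA network is replaced by a sorting-based stand-in (the helper `kwta`), and the loop runs the n - 1 selections one after another, whereas the slides intend them to run in parallel. The comparison of consecutive outputs plays the role of the logic that keeps only the first '1' in each row. All names are hypothetical.

```python
def kwta(u, k):
    """Stand-in for one kWTA network: 0/1 list marking the k largest inputs."""
    ranked = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    winners = set(ranked[:k])
    return [1 if i in winners else 0 for i in range(len(u))]


def sort_by_kwta(u):
    """Sort u in descending order using n - 1 kWTA selections."""
    n = len(u)
    order = []
    previous = [0] * n
    for k in range(1, n):
        current = kwta(u, k)
        # The item ranked k is selected with k winners but not with k - 1.
        newly_selected = [i for i in range(n)
                          if current[i] == 1 and previous[i] == 0]
        order.append(newly_selected[0])
        previous = current
    # The last item is the only one never selected, so it needs no network.
    order.append(previous.index(0))
    return [u[i] for i in order]


result = sort_by_kwta([3, 9, 1, 7, 5, 2])
```

For a list of six items this uses five selections, matching the n - 1 count given above.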
Illustrative Example
In this case, only five neurons are needed, one in each of the five kWTA networks.
In contrast, 36 neurons are needed in the analog sorting network (Wang, 1995).
Simulation Results (state variable)
Simulation Results (output variables)
Rank-order Filter
Rank order filters are nonlinear filters with many applications, including digital image processing, speech processing, coding, and digital TV.
A rank order filter works by selecting the input with a certain rank as its output.
Rank order filters entail substantial processing power to implement, which limits their real-time signal processing applications.
Rank-order Filter Based on kWTA
Nevertheless, rank order filters can benefit from parallel realizations.
Specifically, a kWTA network with k = r is used in parallel with another kWTA network with k = r - 1 to select the input whose rank order is r.
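A minimal sketch of this construction follows. The two kWTA networks are again replaced by a sorting-based stand-in, and the input of rank r is taken as the one selected with k = r but not with k = r - 1. The median filter wrapper, its edge handling (end samples are repeated), and all names are assumptions made for illustration.

```python
def kwta(u, k):
    """Stand-in for one kWTA network: 0/1 list marking the k largest inputs."""
    ranked = sorted(range(len(u)), key=lambda i: u[i], reverse=True)
    winners = set(ranked[:k])
    return [1 if i in winners else 0 for i in range(len(u))]


def rank_select(u, r):
    """Return the input of rank r (r = 1 is the largest)."""
    with_r = kwta(u, r)
    with_r_minus_1 = kwta(u, r - 1)
    for i, (a, b) in enumerate(zip(with_r, with_r_minus_1)):
        if a == 1 and b == 0:
            return u[i]
    raise ValueError("r must satisfy 1 <= r <= len(u)")


def median_filter(signal, window=3):
    """1-D median filter with an odd window; end samples are repeated."""
    half = window // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    r = (window + 1) // 2  # rank of the median in an odd-length window
    return [rank_select(padded[i:i + window], r) for i in range(len(signal))]


filtered = median_filter([2, 80, 6, 3])
```

The isolated spike of 80 is removed because it never holds the middle rank in any window.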
Simulation Results (median filter)
Simulation Results (median filter)
Simulation Results (median filter)
Image Processing
Percentage of speckle noise in the image: 10%
Image Filtering (cont’d)
Image Filtering (cont’d)
Image Filtering (cont’d)
Image Filtering (cont’d)
Image Filtering (cont’d)
Applying the median filter to the original image
[Figure panels: the original image; the original image after median filtering]
Color Image Filtering
Percentage of speckle noise in the image: 10%
Color Image Filtering (cont’d)
Percentage of speckle noise in the image: 10%
Color Image Filtering (cont’d)
Color Image Filtering (cont’d)
Color Image Filtering (cont’d)
Results & Discussion
- Image Processing
Color Image Filtering (cont’d)
Color Image Filtering (cont’d)
Information Retrieval
The efficiency of information retrieval from large databases is essential.
Techniques for information retrieval from large data sets play a very important role, as the size of the World Wide Web may now exceed 30 billion pages.
Web Information Retrieval
There are basically two parts in web information retrieval:
One is calculating the weight of all the pages or data.
The other is finding the k most “wanted” results with the highest weights.
The second one is the top-k query or front page problem.
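The two parts above can be illustrated on a tiny example. The sketch below computes PageRank weights by plain power iteration and then answers a top-k query with a sorting-based stand-in for the kWTA network. The three-page graph, the damping factor of 0.85, and the function names are hypothetical and are not taken from the examples in this talk.

```python
def pagerank(links, n, damping=0.85, iters=100):
    """Power iteration for a graph given as {source: [targets]}.

    Assumes every page has at least one outgoing link.
    """
    rank = [1.0 / n] * n
    for _ in range(iters):
        new_rank = [(1.0 - damping) / n] * n
        for source, targets in links.items():
            # Each page splits its damped rank evenly among its targets.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def top_k(weights, k):
    """Stand-in for the kWTA selection: 0/1 list marking the k largest weights."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    winners = set(order[:k])
    return [1 if i in winners else 0 for i in range(len(weights))]


# Hypothetical graph: pages 0 and 1 link to page 2, and page 2 links to page 0.
links = {0: [2], 1: [2], 2: [0]}
weights = pagerank(links, n=3)
x = top_k(weights, k=2)
```

Page 1 receives no incoming links, so the top-2 query selects pages 0 and 2.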
A Toy Problem from Wikipedia
7 pages, 17 links
The PageRank weight of each page and link is provided.
Selection Results (k = 3)
Output vector x = [1, 1, 0, 0, 1, 0, 0]^T
Pages 1, 2, and 5 have the highest PageRank weights.
Film-director-actor-writer Network
Crawled from Wikipedia under the category of English-language films
34,279 pages
142,426 links
Part of the square adjacency matrix is shown in the figure, where a dot in the i-th column and the j-th row indicates a directed link from the i-th page to the j-th page.
All other entries of the matrix are 0.
Selection Results (k = 10)
The answer to this query, [3111, 3869, 4058, 4621, 6938, 8974, 10341, 11502, 13320, 15326]^T, can be easily obtained from the sparse representation of the output vector x, whose elements are x_i = g(u_i - y(t)) and of which 10 are nonzero.
Conclusions and Future Works
Neurodynamic optimization approaches are demonstrated to be powerful for k-winners-take-all operations.
k-winners-take-all neural networks provide parallel distributed computational models with guaranteed global convergence to the optimal solutions.
Neurodynamic optimization approaches are more suitable for real-time applications with big data.
GPU-based implementation is under way.
Applications to other problems such as recommender systems are yet to be done.
Acknowledgments
Prof. Yousheng Xia (Fuzhou University)
Prof. Yunong Zhang (Sun Yat-sen University)
Prof. Xiaolin Hu (Tsinghua University)
Prof. Qingshan Liu (Huazhong Univ. of Sci. and Tech.)
Dr. Shubao Liu (GE Global Research)
Dr. Zheng Yan (Huawei Shannon Laboratory)
Mr. Yunpeng Pan (Georgia Institute of Technology)
Mr. Zhishan Guo (University of North Carolina)
Mr. Shaofu Yang and Miss Xinyi Le (Chinese University of Hong Kong)
Many projects funded by the Hong Kong Research Grants Council.