2018 International Conference on Communication, Network and Artificial Intelligence (CNAI 2018) ISBN: 978-1-60595-065-5
Research Progress in Bayesian Program Learning
Zong-jian ZHU, Ming-qiang PAN*, Cheng SUN, Lei JIU and Li-ning SUN
College of Mechatronics Engineering & Collaborative Innovation Center of Suzhou Nano Science and Technology, Soochow University, Suzhou 215123, China
Jiangsu Key Laboratory for Advanced Robotics Technology, Soochow University, Suzhou 215123, China
Robotics and Microsystems Center, Soochow University, Suzhou 215123, China
*Corresponding author
Keywords: Bayesian program learning, Bayesian network, Bayesian model.
Abstract. Bayesian Program Learning (BPL) is an important area of machine learning. In recent years, small-sample learning models with BPL at their core have made breakthroughs in methodology and performance, attracting great attention from both industry and academia. This paper reviews the research progress and applications of BPL. First, it introduces the research background, development history and research route of BPL. Second, it gives a brief overview of Bayesian models and reasoning algorithms. On this basis, it reviews in detail Bayesian learning applied to speech, assembly, motion learning, deviation diagnosis, learning effectiveness assessment and classification, explaining the methods, tools and experimental results, and briefly surveys open source tools for probabilistic programming. Finally, the paper summarizes the field and points out future research directions for BPL.
Introduction
Owing to its wide range of applications in speech, images, classification, motion and other fields, machine learning has enjoyed a golden period of development since the mid-1990s. Bayesian program learning is an important branch of machine learning[1] that addresses learning without big data. It can form a concept from a single example and supports quick decision-making, providing a model of ‘small sample learning’[2][3]. Bayesian approaches initially implemented AI functions over particular Bayesian networks[4][5][6][7], such as speech recognition via Hidden Markov Models (HMM). Further foundations were laid by Professor Leslie Valiant, winner of the 2010 Turing Award, who established the probably approximately correct (PAC) theory of learning[8][9], and by Professor Judea Pearl, winner of the 2011 Turing Award, who established an approach to artificial intelligence based on probability and statistics[10][11][12]. In 2015, Brenden M. Lake and two co-authors published [13] and put forward the concept of Bayesian program learning. These results have promoted the development of Bayesian program learning.
Bayesian program learning uses probabilities to represent all forms of uncertainty and performs learning and reasoning through the rules of probability[1]. It requires prior probabilities for the target hypotheses; when these are unknown, they can be estimated from background knowledge, previously collected data, and an assumed baseline distribution[13][14].
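As a minimal sketch of how an assumed baseline distribution can act as a prior, the example below updates a Beta prior with a handful of binary observations. The Beta-Bernoulli model, the function names and the data counts are illustrative assumptions and are not taken from the cited works.

```python
from fractions import Fraction


def beta_posterior(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus Bernoulli outcomes."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact as a fraction."""
    return Fraction(alpha, alpha + beta)


# Assumed baseline: a uniform Beta(1, 1) prior over an unknown probability.
prior = (1, 1)
# Hypothetical small sample: 3 successes and 1 failure.
posterior = beta_posterior(*prior, successes=3, failures=1)

print("prior mean:", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

With only four observations the posterior mean already moves away from the baseline, which illustrates why a sensible prior matters most in the small-sample regime.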
Research on Bayesian methods at home and abroad follows two main directions. In China, work focuses mainly on applications of the Bayesian network model, whereas abroad Bayesian program learning is the main research direction. For example, the Zhu team at Tsinghua University combines Bayesian methods with deep learning[15], and Zhang Zhihua of Peking University works in depth on Bayesian methods[16]. Brenden M. Lake published [17] in 2016 with the goal of building machines that learn and think like people. This article summarizes and analyzes the research progress of Bayesian program learning.
The Brief Introduction of Bayesian Model and Reasoning Algorithm
Bayesian Model
A Bayesian model is built on a Bayesian network, which is a directed acyclic graph: its nodes represent variables, its directed edges connect dependent variables, and its structure encodes the independence relations between variables[18]. With the development of machine learning, Bayesian models are now divided into classical parametric models and non-parametric Bayesian models, according to whether the number of parameters is fixed in advance[1][19][20][21].
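The factorization implied by a directed acyclic graph can be sketched with a three-node network. The variables (rain, sprinkler, wet grass) and all probability values below are hypothetical and chosen only for illustration; inference is done by brute-force enumeration, which is feasible only for very small networks.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
# P(WetGrass = True | Rain, Sprinkler)
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node given its parents."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), summing out the sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence


print("P(rain | wet grass) =", round(posterior_rain_given_wet(), 4))
```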
1) The Deep Belief Network (DBN) is a neural network model proposed by Professor Hinton in 2006[22]. The DBN is a parametric, generative model. By training the weights between its neurons, a DBN makes the whole network generate the training data with maximum probability. DBNs are widely used for feature identification, classification and data generation[22][23].
2) The Dirichlet process was introduced in 1973 by the American statistician Thomas S. Ferguson[24]. Since then, it has been used as a prior distribution in non-parametric Bayesian methods.
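One common way to draw from a Dirichlet process prior is the stick-breaking construction, which is not described in the source and is sketched here only as an illustration. The truncation level, the concentration value and the standard normal base distribution are assumptions made for the example.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking weights for concentration parameter alpha.

    Returns the first num_atoms weights and the leftover stick length.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_atoms):
        fraction = rng.betavariate(1.0, alpha)  # break off a Beta(1, alpha) share
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


def sample_truncated_dp(alpha, base_sampler, num_atoms, rng):
    """Approximate draw from a Dirichlet process: atoms from the base measure plus weights."""
    weights, leftover = stick_breaking_weights(alpha, num_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(num_atoms)]
    return atoms, weights, leftover


rng = random.Random(0)  # fixed seed so the sketch is repeatable
atoms, weights, leftover = sample_truncated_dp(
    alpha=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base distribution N(0, 1)
    num_atoms=20,
    rng=rng,
)
print("mass covered by 20 atoms:", round(sum(weights), 4))
```

A small alpha concentrates the mass on a few atoms, while a large alpha spreads it over many, which is how the model lets the effective number of components grow with the data.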
Reasoning Algorithm
Bayes' formula is shown in Eq. 1, where h is the hypothesis and D is the data set. One of the key problems of Bayesian learning is that the posterior distribution usually cannot be computed in closed form. An inference algorithm is a method for computing or approximating the posterior distribution.

$$P(h \mid D) \propto P(D \mid h)\, P(h) \qquad (1)$$

Various reasoning algorithms have been proposed in the development of Bayesian methods, such as variational inference, Monte Carlo methods, online learning algorithms and distributed inference algorithms. Among them, variational inference is the most widely used. Its basic idea is to turn inference into an optimization problem over an approximating distribution q(h). By Jensen's inequality, a lower bound on the log-likelihood can be obtained (Eq. 2)[1].

$$\log P(D) \ge E_{q}[\log P(h, D)] - E_{q}[\log q(h)] \qquad (2)$$

The lower bound is then maximized over q (Eq. 3), which is equivalent to minimizing the KL divergence between q(h) and P(h|D), the posterior distribution to be solved.

$$\max_{q}\; E_{q}[\log P(h, D)] - E_{q}[\log q(h)] \qquad (3)$$
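The variational lower bound log P(D) >= E_q[log P(h, D)] - E_q[log q(h)] can be checked numerically on a toy model. The sketch below assumes a hypothetical discrete hypothesis space with three values and made-up prior and likelihood numbers, so that the exact evidence is available for comparison.

```python
import math

# Hypothetical model with three hypotheses and one fixed observed data set D.
PRIOR = [0.5, 0.3, 0.2]        # P(h)
LIKELIHOOD = [0.1, 0.4, 0.7]   # P(D | h)


def log_evidence():
    """Exact log P(D), tractable here because h takes only three values."""
    return math.log(sum(p * l for p, l in zip(PRIOR, LIKELIHOOD)))


def exact_posterior():
    """Exact P(h | D) obtained by normalizing the joint P(h, D)."""
    joint = [p * l for p, l in zip(PRIOR, LIKELIHOOD)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def elbo(q):
    """Lower bound E_q[log P(h, D)] - E_q[log q(h)] for a distribution q over h."""
    total = 0.0
    for q_h, p, l in zip(q, PRIOR, LIKELIHOOD):
        if q_h > 0.0:  # terms with q(h) = 0 contribute nothing
            total += q_h * (math.log(p * l) - math.log(q_h))
    return total


uniform_q = [1.0 / 3.0] * 3
print("log P(D)         :", log_evidence())
print("ELBO, uniform q  :", elbo(uniform_q))
print("ELBO, q=posterior:", elbo(exact_posterior()))
```

The bound is loose for a poorly chosen q and becomes tight when q equals the exact posterior, which is why maximizing it over q yields a posterior approximation.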
Application Based on Bayesian Program Learning
The Application of Bayesian Method in Speech
1) In speech processing, Li et al.[25] use dynamic kernel features and a Bayesian maximum a posteriori estimate of the conversion function parameters to perform voice conversion. In experiments on male and female voice conversion, the Bayesian method reached accuracies of 75.0% and 77.5% respectively, higher than the 62.5% and 67.5% of MPLR. Herman[26] uses correspondence autoencoders to address the difficulties of combining top-down and bottom-up learning in zero-resource speech processing. Bayesian methods can also be used for speech enhancement: Wang[27] uses an adaptive beta-order Bayesian perceptually motivated algorithm based on the chi-square distribution, combined through a linear rule with the masking threshold of the auditory masking effect. Maas[28] uses Bayesian inference to describe a class of related speech signal processing algorithms. Lake[29] uses Bayesian analysis to evaluate how informative the questions asked by humans are.
2) [...] 62.6% accuracy. For speech recognition, the GMTK toolkit is widely used. For example, Sun[31] builds training and recognition models for continuous speech based on a dynamic Bayesian network (DBN) using GMTK. Similarly, Wang[32] uses GMTK to build the AWA-DBN model and trains it with the EM algorithm, achieving a best recognition rate of 60.43% at a volume of 5 dB. Because it is challenging to give each speaker a dedicated recognition model, Huang[33] uses Bayesian unsupervised batch and online speaker adaptation of activation function parameters to implement a speaker-oriented ‘personalized’ model. Saon and Chien[34] construct Bayesian sensing hidden Markov models for speech recognition.
3) In biometrics, Yu[35] uses the Bayesian method for voiceprint recognition, enabling unimodal biometrics. Wang et al.[36] use a DBN to fuse voiceprint and face information, achieving speaker recognition by bimodal information fusion and improving on unimodal voice recognition and unimodal face recognition by 3.3% and 6.1% respectively. Chen et al.[37] use a log-likelihood ratio method based on Bayesian theory to verify voiceprints.
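The log-likelihood ratio test used in voiceprint verification can be sketched as follows. This is not the method of [37]: one-dimensional Gaussian models stand in for the real speaker and background models, and the feature values, model parameters and threshold are all hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_likelihood_ratio(features, speaker_model, background_model):
    """Average per-frame log P(x | claimed speaker) - log P(x | background)."""
    total = 0.0
    for x in features:
        total += gaussian_log_pdf(x, *speaker_model)
        total -= gaussian_log_pdf(x, *background_model)
    return total / len(features)


def verify(features, speaker_model, background_model, threshold=0.0):
    """Accept the identity claim when the averaged ratio exceeds the threshold."""
    return log_likelihood_ratio(features, speaker_model, background_model) > threshold


# Hypothetical (mean, std) models and scalar features for illustration.
speaker = (1.0, 1.0)
background = (0.0, 1.0)
print(verify([1.0, 1.2, 0.8], speaker, background))
print(verify([-1.0, 0.0], speaker, background))
```

In practice the threshold trades false acceptances against false rejections, so it would be tuned on held-out data rather than fixed at zero.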
The Brief Introduction of Bayesian Method for Precision Assembly and Movement
Precision assembly, like the precise positioning of mobile robots, uses state estimation to calculate the position of the assembly hand[38]. Wu[39] uses a Bayesian optimization algorithm, with Gaussian process regression as the surrogate model and an orthogonal search, to achieve force-control-based high-precision assembly for industrial robots. Zhang[40] builds an intelligent assembly system from a Bayesian-network-based assembly reasoning model and a recognition library. Cheng[41] uses a probability generation algorithm and a motion data generation algorithm to build a generative model; the similarity between the generated motion data and captured human motion data was 79.64%.
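Bayesian state estimation of a position can be illustrated with a discrete Bayes (histogram) filter. The sketch below is not the method of [38]; it assumes a hypothetical one-dimensional cyclic grid of cells, a simple one-step motion model and a made-up sensor likelihood.

```python
def normalize(belief):
    """Rescale a list of non-negative values so that it sums to one."""
    total = sum(belief)
    return [b / total for b in belief]


def predict(belief, move_prob):
    """Motion step: move one cell to the right with probability move_prob, else stay."""
    n = len(belief)
    return [
        move_prob * belief[(i - 1) % n] + (1.0 - move_prob) * belief[i]
        for i in range(n)
    ]


def update(belief, likelihood):
    """Measurement step: multiply by P(measurement | cell) and renormalize."""
    return normalize([b * l for b, l in zip(belief, likelihood)])


# Start with no knowledge of the position over four cells.
belief = [0.25, 0.25, 0.25, 0.25]
# Hypothetical sensor reading that favors cell 2.
belief = update(belief, [0.1, 0.1, 0.7, 0.1])
# Commanded move to the right that succeeds 80% of the time.
belief = predict(belief, move_prob=0.8)
print([round(b, 3) for b in belief])
```

Alternating the two steps keeps a full probability distribution over positions instead of a single estimate, so uncertainty from imperfect motion and sensing is carried forward explicitly.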
The Brief Introduction of Bayesian Method for Diagnostic Evaluation, Learning Effectiveness Assessment and Classification
Because the causal relationships between the nodes of a Bayesian network are clear and easy to understand, probabilistic reasoning over the network can be used to solve inference problems in complex systems with many input nodes.
Liu et al.[42][43] carry out small-sample tests on assembly process deviations and, combined with a flexible assembly deviation model, diagnose the deviation sources in the left-side assembly process of a vehicle model. Xie[44] uses the reasoning mechanism of a Bayesian network to evaluate students' mastery of knowledge points and to update each student's latest knowledge state according to the student's responses. Naveed et al.[45] propose a nonparametric Bayesian method for learning discriminative dictionaries for sparse representation of data, which they report to be superior to existing methods in accuracy and time consumption.
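Updating a student's knowledge state from responses can be sketched with a two-node network (mastery and response). This is an illustration under assumed parameters and not the system of [44]: the slip and guess probabilities, the prior and the function names are hypothetical.

```python
def update_mastery(p_known, correct, slip=0.1, guess=0.2):
    """Posterior probability that a knowledge point is mastered after one response.

    slip  = P(wrong answer | mastered), guess = P(right answer | not mastered).
    """
    if correct:
        numerator = p_known * (1.0 - slip)
        evidence = numerator + (1.0 - p_known) * guess
    else:
        numerator = p_known * slip
        evidence = numerator + (1.0 - p_known) * (1.0 - guess)
    return numerator / evidence


def track_student(prior, responses, slip=0.1, guess=0.2):
    """Apply the update after every response and return the belief history."""
    p_known = prior
    history = []
    for correct in responses:
        p_known = update_mastery(p_known, correct, slip, guess)
        history.append(p_known)
    return history


# Hypothetical response sequence: right, right, wrong.
print([round(p, 3) for p in track_student(0.5, [True, True, False])])
```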
The Brief Introduction of Mainstream Open Source Tools
Table 1. Some mainstream open source toolboxes for Bayesian program learning and their download addresses at present.
Name of the tools | Description and Remarks | Download Address
Edward | Bayesian network tools | https://github.com/blei-lab/edward/
Stan | Bayesian network tools | http://mc-stan.org/
PyMC | Python library for Bayesian analysis | https://sourceforge.net/projects/pymc/
ZhuSuan | Bayesian deep learning Python library | https://github.com/thu-ml/zhusuan
TensorFlow | Deep learning visual development tools developed by Google | https://github.com/tensorflow/tensorflow
Theano | Deep learning open source tools based on the Python language | https://github.com/Theano/Theano
Caffe | Deep learning open source tools published by UC Berkeley BVLC labs | https://github.com/BVLC/caffe
Summary
This paper has briefly introduced the research background, history and research route of Bayesian learning. Within the broad framework of machine learning, Bayesian program learning and deep learning sit at opposite extremes[15]. Deep learning requires a large number of training samples and large amounts of computing resources to achieve high accuracy in a particular environment[48]. For example, AlphaGo was trained on millions of positions from games played by human Go masters[49-50]. In contrast, Bayesian program learning requires very little data; it is the other extreme, where machine learning needs very few samples[15].
At present, the Bayesian deep learning proposed by the Zhu team is a learning model with the potential to advance toward real artificial intelligence. It combines Bayesian program learning and deep learning. According to Zhu Jun, this model has the interpretability of Bayesian methods and can learn from a small amount of data, while retaining the very powerful fitting ability of deep learning[15]. Overall, the future of Bayesian program learning combined with deep learning is very bright. By reviewing and analyzing the research progress in the relevant fields, we hope this paper brings new information and research ideas to the field and helps promote the further development and prosperity of Bayesian program learning and related fields.
Acknowledgement
This research was financially supported by the General Armaments Department Pre-research Fund (6140863020216JW30001), the Applied Basic Research Project of Suzhou (SYG201540), and the National Key Foundation for Exploring Scientific Instrument (2013YQ470767).
References
[1] Jun Zhu, Wenbo Hu, Recent Advances in Bayesian Machine Learning, Journal of Computer Research and Development, 2015.
[2] Yong Liu, Yiyi Liao, Human-like concept learning: the next leap in machine learning?, Science and Technology Review, 2016, 34(7).
[3] Xiaoliang Chen, Small-sample human-like concept learning and intensive learning of large data, Science and Technology Review, 2016, 34(7).
[4] Xiangyang Zhang, Ming Liu, A summary of the research on Bayesian reasoning, Progress in Psychosocial Science, 2002.
[6] Zhang Peng, Tang Shiwei, Privacy Preserving Naive Bayes Classification, Chinese Journal of Computer, 2007.
[7] Shinji Watanabe and Jen-Tzung Chien, Bayesian Speech and Language Processing, Cambridge University, 2011.
[8] Leslie Valiant, Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World, Basic Books, 2013.
[9] John Pavlus, Edward Frenkel, Looking for an ecological algorithm: an interview with the Turing prize winner Leslie Valiant, Sohu net, 2016.
[10] Judea Pearl, E. Bareinboim, Transportability of causal and statistical relations: A formal approach, IEEE, 2011.
[11] S. Greenland, J. Pearl, Adjustments and their consequences—collapsibility analysis using graphical models, University of California, 2011.
[12] J. Pearl, Bayesian networks, University of California, 2011.
[13] Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum, Human-level concept learning through probabilistic program induction, Science, 2015.
[14] Huiliang Luo, Study on Models and Methods of Pattern Recognition of Bayesian Network Based on Rough Set, Chongqing University, 2008.
[15] Almost Human, Zhu Jun of Tsinghua University details ZhuSuan, a Bayesian deep learning GPU library, GMIS the first global Machine Intelligence Summit, 2017.
[16] Zhang Zhihua, Machine learning - the love of statistics and computing, The capital of statistics COS, 2016.
[17] Brenden M. Lake, T.D. Ullman, J.B. Tenenbaum, S.J. Gershman, Building Machines That Learn and Think Like People, Behavioral and Brain Sciences, 2016.
[18] Wang Hui, Bayesian network for prediction, Journal of northeastern Normal University (Natural Science), 2002.
[19] Jun Zhu, When Bayes meet big data, Technology survey, 2015.
[20] Information on https://www.zhihu.com/question/22422121.
[21] Zhu Ming, Guo Chunsheng, Hidden Markov model and its latest application and development, Computer System Application, 2010.
[22] Jin Lianwen, Yang Zhao, Yang Weixin, Zhong Zhuoyao, Xie Zecheng, Sun Jun, Applications of Deep Learning for Handwritten Chinese Character Recognition: A Review, Acta Automatica Sinica, 2016, 42(8): 1125-1141.
[23] Geoffrey E. Hinton, Simon Osindero, A fast learning algorithm for deep belief nets, Neural Computation, 2006.
[24] Thomas S. Ferguson, A Bayesian Analysis of Some Nonparametric Problems, The Annals of Statistics, 1973.
[25] Li Na, Zeng Xiangyang, Qiao Yu, Li Zhifeng, Voice conversion using bayesian analysis and dynamic kernel features, Acta Acustica, 2015.
[27] Wang Lei, Adaptive beta-order Bayesian perceptually motivated speech enhancement algorithm based on auditory masking, Information Technology, 2017.
[28] Roland Maas, On Bayesian Networks in Speech Signal Processing, ITG-Fachbericht, 2014.
[29] Brenden M. Lake, Asking and evaluating natural language questions, 2017.
[30] Fabio Valente, Variational Bayesian GMM for Speech Recognition, EUROSPEECH 2003, 2016.
[31] Sun Ali, Jiang Dongmei, Lv Guoyun, Research on DBN Based Continuous Speech Recognition, The Fifth National Conference on "signal and information processing" Joint Academic Conference, 2006.
[32] Wang Feng-na, Jiang Dong-mei, Song Pei-yan, Novel Articulatory Feature based Dynamic Bayesian Network model for speech recognition, Computer Engineering and Application, 2009, 45(8): 178-181.
[33] Zhen Huang, Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition, IEEE, 2017.
[34] George Saon, Jen-Tzung Chien, Bayesian Sensing Hidden Markov Models for Speech Recognition, 2011.
[35] Yu Xian, Pattern Matching of Voiceprint Recognition based on GMM, Communication Technology, 2015.
[36] Wang Runtuo, A speaker recognition method based on information fusion of DBN, Journal of Guilin University of Electronic Technology, 2010.
[37] Chen Quanjin, Study on the application of Voiceprint Recognition Based on LIR, Criminal Technique, 2014.
[38] Yao Cong, The Research of Mobile Robot Localization Method Based on Bayesian Theory, Chongqing University, 2015.
[39] Binglong Wu, Industrial Robot High Precision Assembly Based on force control, University of Chinese Academy of Sciences, 2017.
[40] Zhang Shuai, Intelligent Assembly Technology Based on Bayesian Network, Modular Machine Tool and Automatic Manufacturing Technology, 2013.
[41] Meng-Zhen Cheng, Quan-Hua Tang, Long-Jun Huang, Motion learning based on Bayesian program learning, Web of Conferences, 2017.
[42] Liu Yinhua, Study on variation source diagnosis based on Bayesian networks in Auto-Body Assembly Processes, Shanghai Jiao Tong University, 2013.
[43] Liu Yinhua, Bayesian Networks Modeling Based on Small Samples for Variation-Source Diagnosis, Journal of Shanghai Jiao Tong University, 2012.
[44] Xie Zhaohui, Research on the evaluation system of network learning effect based on Bayes, Development Technology, 2015.
[45] Naveed Akhtar, Faisal Shafait, Ajmal Mian, Discriminative Bayesian Dictionary Learning for Classification, IEEE, 2016.
[48] DeepTech technology, "Google brain": 9 basic direction research, 6 specific domain results definition Google AI progress, Sohu net, 2018.
[49] Honestsun, Father of AlphaGo: About weiqi, man made a big mistake for 3000 years, Tencent Science and Technology, 2017.