
The origin of Bayesian theory can be traced to the source of its name: Reverend Thomas Bayes, an 18th-century mathematician and theologian. His paper, “An Essay towards Solving a Problem in the Doctrine of Chances”, was published in 1763, two years after his death. In it he presented a particular case of what is now known as Bayes' theorem. However, Bayesian probability as it is known today was pioneered by Pierre-Simon Laplace. The development of Bayesian theory was slow for the first 200 years. Gradually, the number of Bayesian researchers grew, and more books and articles were published. Berger (2000) provided a detailed record of this growth in terms of publications: possibly 15 books were written between 1769 and 1969, roughly 30 between 1970 and 1989, and about 60 between 1990 and 1999. Since 1990 there have also been more Bayesian conference proceedings, papers and Bayesian organisations.

The application of Bayesian theory cuts across several fields of research. Berger (2000) listed books and articles published in each field of application, some of which are mentioned here. In archaeology, see Buck, Cavanagh, and Litton (1996); in atmospheric sciences, Berliner et al. (1999); in education, Johnson (1997); in epidemiology, Greenland (1998); in engineering, Godsill and Rayner (1998); in hydrology, Parent, Hubert, Bobée and Miquel (1998); in law, DeGroot, Fienberg, and Kadane (1986). Applications in medicine can be found in Berry and Stangl (1996), and in the physical sciences in Bretthorst (1988). In econometrics, the available books and journal articles on Bayesian theory include Dreze (1962), Zellner (1965), Cyert and DeGroot (1987), Poirier (1995), Perlman and Blaug (1997), Kim, Shephard and Chib (1998), Kleibergen and van Dijk (1998), Geweke (1999, 2005), Lancaster (2004), Greenberg (2007), and Koop, Poirier and Tobias (2007).

There are many approaches to Bayesian analysis. The most common are the objective, subjective, robust, frequentist-Bayes and quasi-Bayes approaches. Beginning with the first Bayesians, Bayes (1763) and Laplace (1812), who carried out Bayesian analysis using a constant prior distribution for unknown parameters, Bayesian analysis has been regarded as an objective theory. The use of a uniform or flat prior, more generally known as a noninformative prior, is a common objective Bayesian approach; the Jeffreys prior, presented in Jeffreys (1961), is the most popular in this school of thought. Although these priors are often called noninformative, they also reflect certain informative features of the system being analysed; some Bayesians, for example Gelman et al. (2008), have argued that they should rather be called “weakly informative” priors. Maximum entropy priors are another choice within the objective Bayesian approach. More recently, reference priors have been developed; see Bernardo (1979) and Yang and Berger (1997). A review of ways of selecting noninformative priors can be found in Kass and Wasserman (1996). A major concern is that objective Bayesian procedures often involve improper prior distributions, which do not automatically have desirable Bayesian properties. Moreover, improper priors can lead to improper posteriors requiring difficult evaluation techniques. Proposed objective Bayesian procedures are therefore typically studied to ensure that such problems do not arise.
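To make the flat and Jeffreys priors discussed above concrete, the following minimal sketch compares the two for a Bernoulli success probability, where the uniform prior is Beta(1, 1) and the Jeffreys prior is Beta(1/2, 1/2). The data values and function names are hypothetical and chosen only for illustration. Both priors happen to be proper in this model; the impropriety concern arises in other settings, such as a flat prior over the whole real line.

```python
def beta_posterior(successes, trials, a, b):
    """Parameters of the Beta posterior under a Beta(a, b) prior (conjugate update)."""
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical data: 3 successes in 10 Bernoulli trials.
s, n = 3, 10

flat_post = beta_posterior(s, n, 1.0, 1.0)      # uniform (flat) prior, Beta(1, 1)
jeffreys_post = beta_posterior(s, n, 0.5, 0.5)  # Jeffreys prior, Beta(1/2, 1/2)

flat_mean = beta_mean(*flat_post)          # (3 + 1) / (10 + 2)
jeffreys_mean = beta_mean(*jeffreys_post)  # (3 + 0.5) / (10 + 1)

print(f"posterior mean, flat prior:     {flat_mean:.4f}")
print(f"posterior mean, Jeffreys prior: {jeffreys_mean:.4f}")
```

With only ten observations the two posterior means differ slightly, which illustrates the point that a so-called noninformative prior still carries some information.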

The subjective Bayesian school is comparatively new in Bayesian theory. Many statisticians regard it as appealing when the needed inputs (models and subjective prior distributions) can be fully and accurately specified. Although the difficulty of this specification often limits application of the approach (Kahneman, Slovic, and Tversky 1986), there has been considerable research effort to further develop elicitation techniques for subjective Bayesian analysis (Lad, 1996). In some problems the subjective prior is essential, while in others it is readily available, in which case using it brings much gain.

Robust Bayesian analysis takes into consideration the fact that complete subjective specification of the model is impossible, since it requires an infinite number of assessments even in the simplest situation. The robust Bayesian approach thus involves the use of classes of models and classes of prior distributions, where the classes reflect the uncertainty remaining after the efforts at discovering and specifying the priors. Walley (1991) and Berger (1985, 1994) presented the foundational basis, history and developments of robust Bayesian analysis. The robust Bayesian approach, as reviewed in Berger (2000), is an attractive tool for implementing a general subjective Bayesian elicitation program. It is also useful in directing the elicitation effort for subjective Bayesian analysis, by first assessing whether the current information is sufficient for solving the problem and then, if not, determining which additional elicitations would be most valuable (Berger, 2000).
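As a rough illustration of working with a class of priors rather than a single prior, the sketch below computes the range of the posterior mean of a Bernoulli success probability over a small, hypothetical class of Beta(a, b) priors. The class, the data and the function names are assumptions made for this example and are not taken from the works cited above.

```python
def posterior_mean(successes, trials, a, b):
    """Posterior mean of a Bernoulli probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def posterior_mean_range(successes, trials, prior_class):
    """Smallest and largest posterior mean over a finite class of Beta priors."""
    means = [posterior_mean(successes, trials, a, b) for a, b in prior_class]
    return min(means), max(means)


# Hypothetical prior class: every Beta(a, b) with a and b drawn from this grid.
grid = [0.5, 1.0, 2.0, 4.0]
prior_class = [(a, b) for a in grid for b in grid]

# Hypothetical data: 3 successes in 10 trials.
low, high = posterior_mean_range(3, 10, prior_class)
print(f"posterior mean ranges from {low:.3f} to {high:.3f}")
```

A narrow range suggests that the conclusion is insensitive to the choice of prior within the class; a wide range suggests that further elicitation would be worthwhile.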

Another school of thought is what may be called frequentist (classical) Bayes analysis. Berger (2000) presented this approach as a unification of statistical theory and methods that draws on both Bayesian and frequentist ideas. Berger (2000) was of the opinion that statistics will be discussed in Bayesian terms, since research over the years has shown that the only coherent language for discussing uncertainty, which statistics seeks to measure, is the Bayesian language. In terms of methodology, both approaches are important in this unification. For instance, the Bayesian approach is considered to have an edge for parametric problems, while the frequentist approach can be useful in determining good objective Bayesian procedures. In nonparametric problems, studies have shown that Bayesian procedures can behave poorly by frequentist standards (Diaconis and Freedman 1986). This should draw attention to the possibility that something is wrong, especially when the information contained in the prior distribution is small compared to that which is “hidden”.

Another unification issue raised by Berger (2000) is that there are many cases where frequentist arguments yield satisfactory answers easily, whereas Bayesian analysis requires considerably more work. An example is Markov Chain Monte Carlo (MCMC), where one evaluates an integral by a sample average rather than by a formal Bayesian estimate. Berger (2000) suggested that Bayesians accept the frequentist answer as an approximate Bayesian answer, although the appropriateness of this still needs to be verified. Berger also raised the need for unification from the frequentist side: optimal unconditional frequentist procedures must be Bayesian even from a frequentist perspective (Berger, Boukai and Wang 1997).

The last of the Bayesian approaches discussed here is the quasi-Bayesian approach. It is a type of Bayesian analysis in which priors are chosen by various ad hoc methods, such as choosing vague proper priors, choosing priors to span the range of the likelihood, and choosing priors with tuning parameters that are adjusted until the answer “looks nice”. Berger referred to such analyses as quasi-Bayes because, although they make use of Bayesian methods, they do not carry the assurance of good performance that comes with either true subjective analysis or well-studied objective Bayesian analysis. However, in the hands of an expert Bayesian analyst, quasi-Bayesian procedures can be quite reasonable, in that the expert may have the experience and skill to know when the procedures are likely to be successful. There are also many instances in which results from a quasi-Bayes analysis can be trusted more than those of any alternative.

Another important area in the development of Bayesian statistics is computation and software. Until about 20 years ago, the most serious challenge in using Bayesian methods was computational difficulty. This is because Bayesian methods often require posterior expectations whose integrals are complex or intractable and have no analytical solution. Events have since overtaken this problem, because there is now a range of numerically intensive software that can handle the computational and even the mathematical difficulties. In fact, truly complex models can at present, in most cases, only be handled by Bayesian techniques (Berger 2000). The usual methods for computing posterior expectations have been numerical integration, Laplace approximation and Monte Carlo importance sampling. Numerical integration is effective in problems with few dimensions, not more than about 10, since the complexity increases with the dimension until the method becomes impracticable. In past years, posterior expectations were traditionally computed mostly by Monte Carlo importance sampling. The method works well even in large dimensions and has the advantage of producing reliable measures of the accuracy of the computation.

Several specific methods have been used to obtain Bayesian posterior estimates. The maximum a posteriori (MAP) approach is used to obtain the mode of the posterior distribution. It is most easily applied with conjugate priors, where the posterior distribution has a form that can be solved analytically. The MAP estimate can also be obtained by numerical optimization, such as the conjugate gradient method or Newton's method; by a modification of an expectation-maximization algorithm, which does not require derivatives of the posterior density; and by Monte Carlo methods. The Kalman filter is an algorithm that operates recursively on streams of input data containing random variations to produce a statistically optimal estimate of the parameter of interest. It was named after Rudolf E. Kalman, one of the developers of its theory, and is applicable to linear models with Gaussian distributions. The particle filter is another method of obtaining Bayesian posterior estimates. It is a sequential Monte Carlo method based on point mass (or “particle”) representations of probability densities, and it can be applied to a model whether linear or nonlinear. It generalizes the Kalman filter in that it can be applied to non-Gaussian models (Arulampalam et al., 2002).
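The recursive nature of the Kalman filter described above can be sketched for the simplest possible case: a scalar state that follows a random walk and is observed with Gaussian noise. This is a minimal illustration under assumed model settings; the function name, the variances and the observations are hypothetical.

```python
def kalman_filter_1d(observations, prior_mean, prior_var, process_var, obs_var):
    """Scalar Kalman filter for the assumed model
        x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
        y_t = x_t + v_t,      v_t ~ N(0, obs_var)
    Returns the filtered (mean, variance) of x_t after each observation."""
    mean, var = prior_mean, prior_var
    estimates = []
    for y in observations:
        # Prediction step: the state mean is carried forward, uncertainty grows.
        pred_mean = mean
        pred_var = var + process_var
        # Update step: blend prediction and observation using the Kalman gain.
        gain = pred_var / (pred_var + obs_var)
        mean = pred_mean + gain * (y - pred_mean)
        var = (1.0 - gain) * pred_var
        estimates.append((mean, var))
    return estimates


# Hypothetical stream of noisy measurements of a slowly drifting quantity.
readings = [1.1, 0.9, 1.3, 1.0, 1.2]
for step, (m, v) in enumerate(
    kalman_filter_1d(readings, prior_mean=0.0, prior_var=1.0,
                     process_var=0.01, obs_var=0.25),
    start=1,
):
    print(f"t={step}: mean={m:.3f}, variance={v:.3f}")
```

When `process_var` is set to zero, the state is constant and each update reduces to the conjugate normal posterior for a fixed mean, which links the filter to the conjugate-prior case mentioned above.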

In recent years (less than 20 years), the Markov Chain Monte Carlo (MCMC) method has become the most popular way of carrying out the integration needed to obtain posterior point estimates. This is because of its ability to handle complex situations and its ease of programming compared to other methods. The approach usually employed by MCMC is to simulate draws from the complex posterior distribution of interest. It has its roots in the Metropolis-Hastings algorithm (Metropolis and Ulam, 1949; Metropolis et al., 1953) and also uses Gibbs sampling as one way of carrying out the simulation; discussions of these are presented in Gilks et al. (1996) and Chen et al. (2000).
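The idea of simulating draws from a posterior and averaging them can be sketched with a random-walk Metropolis sampler, a special case of Metropolis-Hastings with a symmetric proposal. The model (normal data with known noise and a normal prior on the mean), the data, the tuning values and all names below are assumptions chosen so that the exact posterior mean is available for comparison.

```python
import math
import random


def log_posterior(theta, data, prior_mean=0.0, prior_sd=10.0, noise_sd=1.0):
    """Unnormalised log posterior for a normal mean with a normal prior."""
    log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - theta) / noise_sd) ** 2 for y in data)
    return log_prior + log_lik


def random_walk_metropolis(log_target, start, n_draws, step_sd, rng):
    """Draw a chain whose stationary distribution is proportional to exp(log_target)."""
    theta = start
    current = log_target(theta)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


# Hypothetical data and sampler settings.
data = [1.2, 0.8, 1.1, 0.9, 1.0]
rng = random.Random(0)
draws = random_walk_metropolis(lambda t: log_posterior(t, data),
                               start=0.0, n_draws=20000, step_sd=1.0, rng=rng)

burn_in = 2000
kept = draws[burn_in:]
estimate = sum(kept) / len(kept)  # posterior mean as a sample average

# Exact posterior mean from the conjugate normal formula, for comparison.
exact = sum(data) / (len(data) + 1.0 / 10.0 ** 2)
print(f"MCMC estimate: {estimate:.3f}, exact posterior mean: {exact:.3f}")
```

The step size and burn-in length here are arbitrary; in practice they are tuned and the chain is checked for convergence before the sample average is trusted.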

Software packages that use the MCMC approach are now available for carrying out Bayesian analysis more conveniently. Examples are OpenBUGS, consisting of WinBUGS and ClassicBUGS for the Windows and Linux operating systems respectively (available at http://www.mrc-bsu.cam.ac.uk), and BayesX (available at http://www.bayesx.org).

A list of Bayesian software available before 1990 was provided by Press (1989), while packages available from 1990 onward are listed in Berger (2000).