Computing EP’s Evidence Approximation

Let us define

$$\log Z(m, V) \equiv \tfrac{1}{2}\, m^T V^{-1} m + \tfrac{1}{2} \log\det V + \tfrac{n}{2} \log(2\pi)$$

and

$$\log Z_i(m, V) \equiv \log \int dx\, N(x \mid m, V)\, t_i^{\alpha}(U_i x).$$

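Expectation propagation approximates the evidence $p(y \mid \theta)$ by $Z_{EP} = Z^{1-n/\alpha} \prod_i Z_i^{1/\alpha}$. Using the notation introduced above, this can be written as

$$\log Z_{EP} = \log Z(m, V) + \frac{1}{\alpha} \sum_i \left[ \log Z_i(m^{\setminus i}, V^{\setminus i}) + \log Z(m^{\setminus i}, V^{\setminus i}) - \log Z(m, V) \right],$$

which, in the case when $t_i$ depends on $U_i x$, leads to

$$\log Z_{EP} = \log Z(m, V) + \frac{1}{\alpha} \sum_i \log Z_i(U_i m^{\setminus i}, U_i V^{\setminus i} U_i^T) + \frac{1}{\alpha} \sum_i \left[ \log Z(U_i m^{\setminus i}, U_i V^{\setminus i} U_i^T) - \log Z(U_i m, U_i V U_i^T) \right].$$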
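The second expression above can be evaluated directly once an EP run has produced the global and cavity parameters. The following Python sketch shows one way to do this, under assumptions that are not taken from the paper: each $U_i$ is a single row vector $u_i$, so that every site integral is univariate; the site integral is computed with Gauss-Hermite quadrature; and the function names (`log_z_gauss`, `log_z_site`, `log_z_ep`) and the argument layout are hypothetical.

```python
import numpy as np


def log_z_gauss(m, V):
    """log Z(m, V) = 1/2 m^T V^{-1} m + 1/2 log det V + n/2 log(2 pi)."""
    m = np.atleast_1d(np.asarray(m, dtype=float))
    V = np.atleast_2d(np.asarray(V, dtype=float))
    sign, logdet = np.linalg.slogdet(V)
    if sign <= 0:
        raise ValueError("V must be positive definite")
    quad = m @ np.linalg.solve(V, m)
    return 0.5 * quad + 0.5 * logdet + 0.5 * m.size * np.log(2.0 * np.pi)


def log_z_site(log_t, mean, var, alpha=1.0, deg=32):
    """log of the integral of N(s | mean, var) t(s)^alpha over scalar s.

    Assumption: Gauss-Hermite quadrature is accurate enough for t.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(deg)
    s = mean + np.sqrt(2.0 * var) * nodes      # change of variables
    vals = alpha * log_t(s) + np.log(weights)  # log of each quadrature term
    top = vals.max()                           # log-sum-exp for stability
    return top + np.log(np.exp(vals - top).sum()) - 0.5 * np.log(np.pi)


def log_z_ep(m, V, cavities, log_ts, us, alpha=1.0):
    """Evaluate the projected form of log Z_EP.

    cavities: list of (m_cav, V_cav), one pair per site (from an EP run)
    log_ts:   list of callables returning log t_i at scalar projections
    us:       list of 1-D arrays u_i (hypothetical single-row U_i)
    """
    total = log_z_gauss(m, V)
    for (m_c, V_c), log_t, u in zip(cavities, log_ts, us):
        mu_c, var_c = u @ m_c, u @ V_c @ u     # projected cavity moments
        mu, var = u @ m, u @ V @ u             # projected global moments
        total += (log_z_site(log_t, mu_c, var_c, alpha)
                  + log_z_gauss(mu_c, var_c)
                  - log_z_gauss(mu, var)) / alpha
    return total
```

Working with `slogdet` and a log-sum-exp reduction keeps the computation in the log domain, which matters when the site terms are very small.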

Appendix D. A Summary of the Marginal Approximations

An explanatory list of the approximation methods in Figure 13.

• LA-TK. The Laplace approximation of Tierney and Kadane (1986). The approximation $\tilde{p}^{\text{LA-TK}}(x_i)$ is computed by using the Laplace method to approximate $c_i(x_i)$ (Section 3.1).

• EP-FULL. The full EP approximation of the marginal. This approximation is computed by using EP to approximate $c_i(x_i)$ (Section 4.1.1).

• EP-L. EP local. The approximation $\tilde{p}^{\text{EP-L}}(x_i) \propto \epsilon_i(x_i)\, q(x_i)$ is obtained from $c_i(x_i) \approx 1$, where $\epsilon_i(x_i) = t_i(x_i)/\tilde{t}_i(x_i)$ and $q(x)$ are computed by EP (Section 3).

• LM-L. Laplace method local. The approximation $\tilde{p}^{\text{LM-L}}(x_i) \propto \epsilon_i(x_i)\, q(x_i)$ is obtained from $c_i(x_i) \approx 1$, where $\epsilon_i(x_i) = t_i(x_i)/\tilde{t}_i(x_i)$ and $q(x)$ are computed by the Laplace method (Section 3). In this case $\log \epsilon_i(x_i) = R_2[\log t_i](x_i)$.

• LA-CM. The Laplace approximation with the conditional mode approximated by the conditional mean. The approximation $\tilde{p}^{\text{LA-CM}}(x_i)$ is computed as proposed in Rue et al. (2009), that is, by using the approximation $x^*_{\setminus i}(x_i) \approx E_q[x_{\setminus i} \mid x_i]$, where $q(x)$ is given by the Laplace method (Section 4.1.2).

• LA-CM2. An approximation similar to LA-CM, but with an additional term that accounts for the approximation $x^*_{\setminus i}(x_i) \approx E_q[x_{\setminus i} \mid x_i]$ (Section 4.1.2).

[Figure 13: diagram. Column headings: Expectation propagation (EP); Laplace method (LM). Method labels, row by row: EP-L, LM-L; EP-1STEP, LA-CM / LA-CM2; EP-FACT, LA-FACT; EP-FACTN, EP-OPW; EP-FULL, LA-TK. Annotations: "Use global method with some simplifications"; "Factorize and use the univariate global method"; "(1st order) Expansions with regard to"; "Gaussian approximation"; "Latent Gaussian model".]

Figure 13: A schematic view of the approximation methods introduced or referred to in this paper. For details see Section D of the Appendix.

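• EP-1STEP. The one-step EP approximation. The approximation $\tilde{p}^{\text{EP-1STEP}}(x_i)$ is computed by defining $\tilde{\epsilon}_j(x_j; x_i) \equiv \mathrm{Collapse}\big(q(x_j \mid x_i)\, \epsilon_j(x_j)\big) / q(x_j \mid x_i)$ and using the approximation $c_i(x_i) \approx \int dx_{\setminus i}\, q(x_{\setminus i} \mid x_i) \prod_{j \neq i} \tilde{\epsilon}_j(x_j; x_i)$ (see Section 4.1.1). This corresponds to one EP step for computing $c_i(x_i)$ with the initialization $\tilde{\epsilon}_j(x_j; x_i) = 1$.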

• EP-OPW. The Taylor expansion of Opper et al. (2009). The approximation $\tilde{p}^{\text{EP-OPW}}(x_i)$ is computed by expanding $p(x) \propto p_0(x) \prod_j \epsilon_j(x_j)$ to first order with respect to $\epsilon_j(x_j) - 1$ for all $j = 1, \ldots, n$ and integrating with respect to $x_{\setminus i}$. When expanding only for $j \neq i$, the approximation is equal in first order to $\tilde{p}^{\text{EP-FACT}}(x_i)$ (Section 4.3).

• EP-FACT. The factorized EP approximation. The approximation $\tilde{p}^{\text{EP-FACT}}(x_i)$ is computed using the approximation $c_i(x_i) \approx \prod_{j \neq i} \int dx_j\, q(x_j \mid x_i)\, \epsilon_j(x_j)$, where the univariate integrals are computed analytically when possible and numerically otherwise. For further details see Section 4.2.

• LA-FACT. An approximation similar to EP-FACT, but here the univariate integrals are computed with the Laplace method and using the approximation $x^*_j(x_i) \approx E_q[x_j \mid x_i]$, with $q(x)$ being the global approximation resulting from the Laplace method. For further details see Section 4.2.

• EP-FACTN. Higher order approximations obtained by using the factorization recursively. For further details see Section 4.2.
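As an illustration of the list above, the sketch below computes two of the approximations on a grid: the local one (EP-L or LM-L, obtained from $c_i(x_i) \approx 1$) and the factorized one (EP-FACT). It is a minimal sketch under stated assumptions, not the implementation used in the paper: the global Gaussian approximation $q(x) = N(x \mid m, V)$ and the functions $\log \epsilon_j$ are assumed to be available from a previous EP or Laplace run, the marginal is taken to be proportional to $\epsilon_i(x_i)\, q(x_i)\, c_i(x_i)$ as the EP-L item implies, the univariate integrals are computed by Gauss-Hermite quadrature, and the function name and its arguments are hypothetical.

```python
import numpy as np


def local_and_factorized_marginals(i, grid, m, V, log_eps, deg=32):
    """Return (local, factorized) approximations of p(x_i) on `grid`.

    m, V:    mean and covariance of the global Gaussian q(x)
    log_eps: list of callables, log_eps[j](x) = log epsilon_j(x)
    """
    nodes, weights = np.polynomial.hermite.hermgauss(deg)

    # log q(x_i) up to an additive constant
    log_q = -0.5 * (grid - m[i]) ** 2 / V[i, i]
    log_local = log_eps[i](grid) + log_q

    # log c_i(x_i) as a sum over j != i of log univariate integrals
    log_c = np.zeros_like(grid)
    for j in range(len(m)):
        if j == i:
            continue
        gain = V[j, i] / V[i, i]
        cond_mean = m[j] + gain * (grid - m[i])  # E_q[x_j | x_i]
        cond_var = V[j, j] - gain * V[i, j]      # Var_q[x_j | x_i]
        xj = cond_mean[:, None] + np.sqrt(2.0 * cond_var) * nodes[None, :]
        integral = (np.exp(log_eps[j](xj)) * weights).sum(axis=1)
        log_c += np.log(integral / np.sqrt(np.pi))

    def normalize(log_p):
        p = np.exp(log_p - log_p.max())
        area = np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(grid))  # trapezoid rule
        return p / area

    return normalize(log_local), normalize(log_local + log_c)
```

When $x_j$ is uncorrelated with $x_i$ under $q$, or when $\epsilon_j \equiv 1$ for all $j \neq i$, the correction $c_i(x_i)$ is constant and the two approximations coincide after normalization.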

References

P. R. Amestoy, T. A. Davis, and I. S. Duff. An approximate minimum degree ordering algorithm. SIAM Journal on Matrix Analysis and Applications, 17(4):886–905, October 1996.

A. Birlutiu and T. Heskes. Expectation propagation for rating players in sports competitions. In Joost N. Kok, Jacek Koronacki, Ramon López de Mántaras, Stan Matwin, Dunja Mladenic, and Andrzej Skowron, editors, Proceedings ECML/PKDD, volume 4702 of Lecture Notes in Computer Science, pages 374–381. Springer, 2007.

L. Csató and M. Opper. Sparse representation for Gaussian process models. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, Cambridge, MA, USA, 2001. MIT Press.

P. Dangauthier, R. Herbrich, T. Minka, and T. Graepel. Trueskill through time: Revisiting the history of chess. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 337–344. MIT Press, Cambridge, MA, 2008.

A. M. Erisman and W. F. Tinney. On computing certain elements of the inverse of a sparse matrix. Communications of the ACM, 18(3):177–179, 1975. ISSN 0001-0782.

T. Heskes, M. Opper, W. Wiegerinck, O. Winther, and O. Zoeter. Approximate inference techniques with expectation constraints. Journal of Statistical Mechanics: Theory and Experiment, 2005:P11015, 2005.

S. Ingram. Minimum degree reordering algorithms: A tutorial, 2006. URL http://www.cs.ubc.

M. Kuss and C. E. Rasmussen. Assessing approximate inference for binary Gaussian process classification. Journal of Machine Learning Research, 6:1679–1704, 2005. ISSN 1533-7928.

S. Martino and H. Rue. Implementing approximate Bayesian inference using integrated nested Laplace approximation: a manual for the INLA program. Technical report, Department of Mathematical Sciences, NTNU, Norway, 2009.

T. P. Minka. A Family of Algorithms for Approximate Bayesian Inference. PhD thesis, MIT, 2001.

T. P. Minka. Divergence measures and message passing. Technical Report MSR-TR-2005-173, Microsoft Research Ltd., Cambridge, UK, December 2005.

K. Murphy, Y. Weiss, and M. I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, volume 9, pages 467–475, San Francisco, USA, 1999. Morgan Kaufmann.

I. Murray, R. P. Adams, and D. J. C. MacKay. Elliptical slice sampling. In Y. W. Teh and M. Titterington, editors, Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, pages 541–548. 2010.

M. Opper and C. Archambeau. The variational Gaussian approximation revisited. Neural Computation, 21(3):786–792, 2009.

M. Opper and O. Winther. Gaussian processes for classification: Mean-field algorithms. Neural Computation, 12(11):2655–2684, 2000.

M. Opper, U. Paquet, and O. Winther. Improving on expectation propagation. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1241–1248. MIT, Cambridge, MA, US, 2009.

H. Rue and L. Held. Gaussian Markov Random Fields: Theory and Applications, volume 104 of Monographs on Statistics and Applied Probability. Chapman & Hall, London, UK, 2005.

H. Rue, S. Martino, and N. Chopin. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society (Series B), 71(2):319–392, 2009.

M. W. Seeger. Bayesian inference and optimal design for the sparse linear model. Journal of Machine Learning Research, 9:759–813, 2008. ISSN 1533-7928.

K. Takahashi, J. Fagan, and M.-S. Chin. Formation of a sparse impedance matrix and its application to short circuit study. In Proceedings of the 8th PICA Conference, 1973.

L. Tierney and J. B. Kadane. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81(393):82–86, 1986.

M. van Gerven, B. Cseke, R. Oostenveld, and T. Heskes. Bayesian source localization with the multivariate Laplace prior. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 1901–1909, 2009.

M. van Gerven, B. Cseke, F. de Lange, and T. Heskes. Efficient Bayesian multivariate fMRI analysis using a sparsifying spatio-temporal prior. Neuroimage, 50(1):150–161, March 2010.

O. Zoeter and T. Heskes. Gaussian quadrature based expectation propagation. In Z. Ghahramani and R. Cowell, editors, Proceedings of the Tenth International Workshop on Artificial Intelligence
