Workshop on Recent Advances in Bayesian Computation
(20 - 22 Sep 2010)


~ Abstracts ~

 

Adaptive Monte Carlo sampling and model uncertainty
Merlise Clyde, Duke University, USA


Nott and Kohn provided one of the first adaptive MCMC (AMCMC) algorithms for variable selection, using the past history of the chain to construct approximations to full conditional distributions. Single-variable-at-a-time updates, however, may lead to chains with poor mixing in the presence of high correlations among predictor variables, missing regions of the model space with high posterior probability. Several stochastic search algorithms utilize adaptive estimates of marginal inclusion probabilities to sample models near the median probability model, allowing global moves, but in doing so give up constructing samples that are representative of the population of models, leading to biased estimates of quantities of interest. To alleviate these problems, we construct an adaptive MCMC algorithm based on a global approximation to the joint posterior distribution, representing the model space as a binary tree, which facilitates efficient sampling and adaptive updates of the joint proposal distribution. We discuss several methods for estimating the sequence of conditional probabilities, using best linear unbiased estimates from past samples, Bayesian updating of Monte Carlo frequencies, and Fisher consistent estimators based on re-normalized marginal likelihoods.
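To convey the flavour of such adaptive schemes, here is a minimal sketch (not the authors' algorithm): an independence-type Metropolis-Hastings sampler over binary inclusion vectors whose proposal probabilities are adapted from the Monte Carlo inclusion frequencies of past samples. The score function, prior and tuning constants are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def adaptive_model_sampler(log_marg_lik, p, n_iter=5000, eps=0.01):
    """Adaptive independence-type Metropolis-Hastings over binary inclusion
    vectors.  The proposal includes variable j with probability incl_prob[j],
    and these probabilities are adapted from the Monte Carlo inclusion
    frequencies of past samples (a simple Beta-Binomial-style update)."""
    incl_prob = np.full(p, 0.5)                 # adaptive proposal probabilities
    gamma = (rng.random(p) < 0.5).astype(int)   # current model
    counts = np.zeros(p)                        # running inclusion counts
    samples = []

    def log_q(g):
        # log proposal probability of model g under independent Bernoullis
        return np.sum(g * np.log(incl_prob) + (1 - g) * np.log(1 - incl_prob))

    for t in range(1, n_iter + 1):
        prop = (rng.random(p) < incl_prob).astype(int)
        log_alpha = (log_marg_lik(prop) - log_marg_lik(gamma)
                     + log_q(gamma) - log_q(prop))   # uniform model prior assumed
        if np.log(rng.random()) < log_alpha:
            gamma = prop
        samples.append(gamma.copy())
        counts += gamma
        # diminishing adaptation of the proposal inclusion probabilities
        incl_prob = np.clip((counts + 1.0) / (t + 2.0), eps, 1 - eps)
    return np.array(samples)

# toy usage with a BIC-like score for a simulated linear regression (illustrative)
n, p = 100, 8
X = rng.standard_normal((n, p))
y = X[:, 0] - 0.5 * X[:, 1] + rng.standard_normal(n)

def log_marg_lik(gamma):
    k = int(gamma.sum())
    if k == 0:
        return -0.5 * n * np.log(y @ y)
    Xg = X[:, gamma.astype(bool)]
    resid = y - Xg @ np.linalg.lstsq(Xg, y, rcond=None)[0]
    return -0.5 * n * np.log(resid @ resid) - 0.5 * k * np.log(n)

draws = adaptive_model_sampler(log_marg_lik, p, n_iter=2000)
print(draws.mean(axis=0))    # estimated marginal inclusion probabilities
```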


 

Help! Fitting process-based models to infectious disease data, Bayesianly
Alex Cook, National University of Singapore


Fitting infectious disease models is difficult because (i) the lack of independence and (ii) the severe censoring combine to make explicit calculation of the likelihood difficult (at best) or impossible (usually). This is a shame, because infectious disease models are very useful in understanding, seeking to control, and predicting epidemics. In this talk, I'll describe the kind of models and data people would like to bring together, and outline the approaches that I have taken in trying to fit such models, and where I think there is scope for improvement. The hope is that someone out there has a brilliant idea for how to do it better.


 

The expected auxiliary variable method for Bayesian computation
Arnaud Doucet, University of British Columbia, Canada


The expected auxiliary variable method is a general framework for Monte Carlo simulation in situations where the target distribution of interest is intractable, preventing the implementation of classical methods. The method finds application where marginal computations are of interest, where trans-dimensional move design is difficult in model selection setups, or where the normalizing constant of a particular distribution is unknown but required for exact computations. I will present a general construction that allows us to use the expected auxiliary variable method in a very wide range of applications. Several applications where Bayesian computation was previously impossible will be presented.


 

Finite dimensional simulation methods for infinite dimensional posteriors
Jim Griffin, University of Kent, UK


There has been increased interest in Bayesian nonparametric methods over the last 10 to 15 years, which has driven the development of increasingly complicated priors. These priors are necessarily infinite dimensional, but simulation-based methods can only be defined on finite dimensional objects, which suggests that inference will generally involve some form of truncation of the prior. This talk discusses several techniques for avoiding truncation by working with finite dimensional objects of varying dimension (and so making exact inference possible).
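One well-known technique in this spirit is slice sampling for stick-breaking priors (e.g. Walker's slice sampler), in which only a finite, random number of atoms ever needs to be generated. A minimal sketch of that key step, with an illustrative Dirichlet process concentration parameter and slice values, is:

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking_up_to_slice(alpha, u):
    """Generate stick-breaking weights of a Dirichlet process with concentration
    alpha, but only as many as a slice variable u requires: once the leftover
    stick mass drops below u, no remaining atom can have weight greater than u,
    so the finite, random-length representation is exact for that slice."""
    weights = []
    leftover = 1.0
    while leftover >= u:
        v = rng.beta(1.0, alpha)          # stick-breaking proportion
        weights.append(leftover * v)
        leftover *= 1.0 - v
    return np.array(weights)

# the dimension of the finite representation varies with the slice variable
for u in (0.2, 0.05, 0.01):
    w = stick_breaking_up_to_slice(alpha=2.0, u=u)
    print(u, len(w), round(w.sum(), 3))
```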


 

Bayesian computation on graphics cards
Chris Holmes, University of Oxford, UK


Advances in computational methods and computing power have been instrumental in the uptake and development of Bayesian statistics in the last forty years. Recent trends in desktop computing offer the potential to make substantial further improvements.

We have been working on Monte Carlo methods for Bayesian computation run on graphics cards (GPUs). Graphics cards were originally designed to deliver real-time graphics rendering for computer games and other high-end graphics applications. However, there is an emerging literature on the use of GPUs for scientific computing. The advantages are clear. A typical graphics card has effectively around 250 parallel processors designed for fast arithmetic computation. The cards are cheap, dedicated, low-maintenance, have low energy consumption, and plug directly into a standard desktop computer or laptop. For certain classes of scientific computing algorithms, GPUs offer the potential speed-up of traditional massively parallel cluster-based computing at a fraction of the cost, power consumption, uptake time and programming effort.

We will review GPU architectures and the class of algorithms which are suited to GPU simulation, as well as those which are not. In particular, we discuss SIMD (Single Instruction Multiple Data) processing structures and how they relate to Monte Carlo methods. We have migrated a number of methods onto GPUs, including population-based MCMC, sequential Monte Carlo samplers and particle filters, with speed-ups ranging from 35- to 500-fold over conventional single-threaded computation, reducing weeks' worth of computing to hours. We discuss the algorithmic structures, developments and hurdles that lead to such improvements. We illustrate these methods on a number of challenging examples in Bayesian statistical modelling.
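As a toy illustration of the SIMD structure referred to above (not code from the talk), here is a single bootstrap particle filter step written so that every particle executes the same instructions on its own data; numpy vectorisation stands in for a GPU kernel, and the model and settings are made up.

```python
import numpy as np

rng = np.random.default_rng(2)

def particle_filter_step(particles, y_obs, sigma_state=1.0, sigma_obs=0.5):
    """One bootstrap particle filter step for a toy Gaussian random-walk
    state-space model.  Every particle applies the same instructions to its own
    data, which is the data-parallel (SIMD) structure that maps well to GPUs."""
    # propagate all particles at once
    particles = particles + sigma_state * rng.standard_normal(particles.shape)
    # weight all particles at once with the same Gaussian likelihood
    log_w = -0.5 * ((y_obs - particles) / sigma_obs) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # multinomial resampling keeps the particle count fixed
    idx = rng.choice(particles.size, size=particles.size, p=w)
    return particles[idx]

particles = rng.standard_normal(100_000)   # many particles = many parallel "threads"
particles = particle_filter_step(particles, y_obs=1.3)
print(particles.mean())
```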


 

Flexible modeling of conditional distributions using smooth mixtures of asymmetric Student-t densities
Robert Kohn, University of New South Wales, Australia


A general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student-t densities with covariate-dependent mixture weights. The four component parameters (the mean, degrees of freedom, scale and skewness) are all modeled as functions of the covariates. Inference is Bayesian and the computation is carried out using Markov chain Monte Carlo simulation. To enable model parsimony, a variable selection prior is used in each set of covariates, including among the covariates in the mixing weights. The model is used to analyze the distribution of daily stock market returns, and is shown to forecast the distribution of returns more accurately than other widely used models for financial data.
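A rough sketch of how such a conditional density can be evaluated is given below. It uses a two-piece (split) Student-t as one common "asymmetric Student-t" construction and a multinomial logit for the covariate-dependent weights; the exact parameterisation in the paper may differ, and all names and values are illustrative.

```python
import numpy as np
from scipy.stats import t as student_t

def two_piece_t_pdf(y, mu, s_left, s_right, df):
    """A two-piece (split) Student-t density: one common way to build an
    asymmetric Student-t, with different scales on either side of the mode."""
    scale = np.where(y < mu, s_left, s_right)
    return 2.0 / (s_left + s_right) * student_t.pdf((y - mu) / scale, df)

def smooth_mixture_density(y, x, params):
    """Conditional density p(y | x) for a K-component smooth mixture with
    multinomial-logit (softmax) weights and component parameters that all
    depend on the covariate vector x.  All coefficient names are illustrative."""
    eta = np.array([w @ x for w in params["weight_coefs"]])
    weights = np.exp(eta - eta.max())
    weights /= weights.sum()
    dens = 0.0
    for k, wk in enumerate(weights):
        mu = params["mean_coefs"][k] @ x
        s_l = np.exp(params["scale_left_coefs"][k] @ x)    # keep scales positive
        s_r = np.exp(params["scale_right_coefs"][k] @ x)
        df = np.exp(params["df_coefs"][k] @ x) + 2.0        # keep df above 2
        dens += wk * two_piece_t_pdf(y, mu, s_l, s_r, df)
    return dens

# toy usage: two components, two covariates (intercept plus one predictor)
x = np.array([1.0, 0.3])
params = {
    "weight_coefs":      [np.zeros(2), np.array([0.5, -1.0])],
    "mean_coefs":        [np.array([0.0, 1.0]), np.array([2.0, -0.5])],
    "scale_left_coefs":  [np.zeros(2), np.zeros(2)],
    "scale_right_coefs": [np.array([0.3, 0.0]), np.zeros(2)],
    "df_coefs":          [np.zeros(2), np.zeros(2)],
}
print(smooth_mixture_density(0.5, x, params))
```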


 

Large-scale Bayesian logistic regression
David Madigan, Columbia University, USA


This talk will describe some large-scale applications of logistic regression involving millions of observations and predictors. Online algorithms are appropriate for some of these applications and I will present a number of possible approaches. I will also describe a model variant that deals with correlated predictors.
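As one generic example of an online scheme for this kind of problem (not necessarily the approach taken in the talk), a single-pass stochastic-gradient MAP estimate for penalised logistic regression might look like the following; the penalty, step sizes and simulated stream are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def online_logistic_map(stream, dim, lam=1e-4, lr=0.05):
    """Single-pass stochastic-gradient MAP estimation for logistic regression
    with a Gaussian (ridge-type) penalty applied per example.  'stream' yields
    (x, y) pairs with y in {0, 1}; the penalty and step sizes are illustrative."""
    beta = np.zeros(dim)
    for t, (x, y) in enumerate(stream, start=1):
        p = 1.0 / (1.0 + np.exp(-(beta @ x)))      # predicted probability
        grad = (y - p) * x - lam * beta            # per-example penalised gradient
        beta += (lr / np.sqrt(t)) * grad           # decaying step size
    return beta

# simulated stream: the data never need to be held in memory all at once
def simulate(n, d):
    for _ in range(n):
        x = rng.standard_normal(d)
        y = int(rng.random() < 1.0 / (1.0 + np.exp(-(1.5 * x[0] - 0.5 * x[1]))))
        yield x, y

beta_hat = online_logistic_map(simulate(100_000, 10), dim=10)
print(beta_hat[:3])
```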


 

Recent advances in ABC (Approximate Bayesian Computation) methodology
Jean-Michel Marin, Université Montpellier 2, France


In the Bayesian paradigm, when the likelihood function is expensive or impossible to calculate, it is almost impossible to sample from the posterior distribution. Approximate Bayesian Computation (ABC) is a recent and very promising technique that only requires being able to sample from the likelihood. In this talk, we will survey the flurry of recent results on the ABC algorithm that have appeared in the literature, including our own on the use of exact ABC algorithms for the selection of Ising models.
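For readers unfamiliar with ABC, the basic rejection version of the algorithm is easy to state; the sketch below is a generic illustration with placeholder functions for the prior, simulator and summary statistics, not the exact algorithms discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(4)

def abc_rejection(y_obs, prior_sampler, simulator, summary, n_keep=500, tol=0.1):
    """Basic ABC rejection sampling: draw a parameter from the prior, simulate
    data from the model, and keep the draw if the simulated summary statistics
    fall within 'tol' of the observed ones.  All functions are placeholders."""
    s_obs = summary(y_obs)
    accepted = []
    while len(accepted) < n_keep:
        theta = prior_sampler()
        s_sim = summary(simulator(theta))
        if np.linalg.norm(s_sim - s_obs) < tol:
            accepted.append(theta)
    return np.array(accepted)

# toy example: infer the mean of a normal sample using its sample mean
y_obs = rng.normal(2.0, 1.0, size=100)
post = abc_rejection(
    y_obs,
    prior_sampler=lambda: rng.normal(0.0, 5.0),
    simulator=lambda th: rng.normal(th, 1.0, size=100),
    summary=lambda y: np.atleast_1d(y.mean()),
)
print(post.mean(), post.std())
```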


 

Variational Bayes for spatial data analysis
Clare McGrory, Queensland University of Technology, Australia


The Variational Bayes method is emerging as a viable alternative to MCMC-based approaches for performing Bayesian inference. The key strengths of the variational Bayes approach are its computational efficiency and ease of implementation. This makes it particularly useful for practical applications where large datasets are frequently encountered. In this talk we will discuss its use for spatial data or image analysis with a focus on hidden Markov random field modelling. The variational approach that we will outline will be illustrated with applications in medical imaging and environmental modelling.


 

Regression density estimation with variational methods and stochastic approximation
David Nott, National University of Singapore


Regression density estimation is the problem of flexibly estimating a response distribution, assuming that it varies smoothly as a function of covariates. An important approach to regression density estimation uses mixtures of experts (ME) models, and in this talk flexible mixtures of heteroscedastic experts (MHE) regression models will be considered, where the response distribution is a normal mixture with the component means, variances and mixture weights all varying as functions of the covariates. The mixture component means and variances are described by heteroscedastic linear regression models, and the mixture weights by a multinomial logit model. Using heteroscedastic rather than homoscedastic linear regressions for the mixture components is important, since it is known that when the number of covariates is moderate or large, ME models with homoscedastic components do not give good estimates. Fast variational approximation methods for inference in MHE models will be developed. The motivation is that the alternative, computationally intensive MCMC methods for fitting mixture models are difficult to apply when models must be fitted repeatedly, such as in exploratory analysis and in cross-validation for model choice.

The work we describe makes three contributions. First, a variational approximation for MHE models will be discussed in which the variational lower bound is in closed form and convenient parameter updates are possible. Second, the basic approximation can be improved by using stochastic approximation methods to perturb the initial closed-form solution to attain higher accuracy; the computational methods developed here are applicable beyond the context of MHE models. Third, the advantages of our approach for model choice compared to MCMC-based approaches will be illustrated.


 

Skew-normal variational approximations for Bayesian inference
John Ormerod, University of Sydney, Australia


High-dimensional analytically intractable integrals are a pervasive problem in Bayesian inference. Monte Carlo methods can be used in the analysis of models where such problems arise. However, for large datasets or complex models such methods become computationally burdensome and it may become desirable to seek alternatives.

Popular deterministic alternatives include variational Bayes and Laplace's method. However, variational Bayes only performs well under particular conjugacy and independence assumptions, and Laplace's method only works well when the posterior is nearly normal in shape.

In this talk I introduce the skew-normal variational approximation, which minimises the Kullback-Leibler distance between a posterior density and a multivariate skew-normal density. The resulting approximation often simplifies calculations to the maximisation of a sum of univariate integrals which may be handled using a combination of standard optimisation and quadrature techniques. We show for a number of examples that the approach is more accurate than variational Bayes and Laplace's method whilst remaining faster than standard Monte Carlo methods.
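A one-dimensional toy version of the idea, in which a skew-normal is fitted to a fixed skewed target by minimising the Kullback-Leibler divergence via quadrature and a standard optimiser, might look as follows; the target density and optimisation details are illustrative and not those of the talk.

```python
import numpy as np
from scipy import stats, integrate, optimize

# skewed "posterior" used purely as a test target: a two-component normal mixture
def log_p(x):
    return np.log(0.7 * stats.norm.pdf(x, 0.0, 1.0)
                  + 0.3 * stats.norm.pdf(x, 2.0, 0.5))

def kl_q_to_p(params):
    """Kullback-Leibler divergence KL(q || p) for a skew-normal q, computed by
    one-dimensional quadrature over a range that covers both densities."""
    loc, log_scale, shape = params
    scale = np.exp(log_scale)
    def integrand(x):
        q = stats.skewnorm.pdf(x, shape, loc=loc, scale=scale)
        if q < 1e-12:
            return 0.0
        return q * (stats.skewnorm.logpdf(x, shape, loc=loc, scale=scale) - log_p(x))
    value, _ = integrate.quad(integrand, -10.0, 12.0, limit=200)
    return value

# minimise the divergence over the skew-normal location, scale and shape
res = optimize.minimize(kl_q_to_p, x0=np.array([0.5, 0.0, 0.0]), method="Nelder-Mead")
loc, log_scale, shape = res.x
print("fitted skew-normal:", loc, np.exp(log_scale), shape)
```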


 

Optimizing MCMC algorithms in high dimensions: a new perspective
Natesh Pillai, Harvard University, USA


MCMC (Markov chain Monte Carlo) algorithms are an extremely powerful set of tools for sampling from complex probability distributions. Understanding and quantifying their behaviour in high dimensions thus constitutes an essential part of modern statistical inference. In this regard, most research effort so far has focussed on obtaining estimates of the mixing times of the corresponding Markov chains.

In this talk we offer a new perspective for studying the efficiency of commonly used algorithms. We will discuss optimal scaling of MCMC algorithms in high dimensions, where the key idea is to study the properties of the proposal distribution as a function of the dimension. This point of view gives us new insights into the behaviour of the algorithm, such as precise estimates of the number of steps required to explore the target measure, in stationarity, as a function of the dimension of the state space.
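A small, purely illustrative sketch of the scaling regime referred to here: random-walk Metropolis on a d-dimensional standard normal target with proposal variance l^2/d, whose acceptance rate near the classical optimal value of l is close to the well-known 0.234.

```python
import numpy as np

rng = np.random.default_rng(5)

def rwm_acceptance_rate(dim, ell, n_iter=20_000):
    """Random-walk Metropolis on a standard normal target in 'dim' dimensions,
    with proposal standard deviation ell / sqrt(dim): the scaling regime under
    which the classical optimal-acceptance results are derived."""
    x = rng.standard_normal(dim)
    step = ell / np.sqrt(dim)
    accepts = 0
    for _ in range(n_iter):
        prop = x + step * rng.standard_normal(dim)
        log_alpha = 0.5 * (x @ x - prop @ prop)   # log target ratio for N(0, I)
        if np.log(rng.random()) < log_alpha:
            x = prop
            accepts += 1
    return accepts / n_iter

# near the optimal value of ell (about 2.38) the acceptance rate is close to 0.234
for ell in (0.5, 1.0, 2.38, 5.0):
    print(ell, rwm_acceptance_rate(dim=50, ell=ell))
```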


In the first part of the talk, we will describe the main ideas and discuss recent results on high dimensional target measures arising in the context of statistical inference for mathematical models representing physical phenomena. In the second part of the talk, we discuss the Hybrid Monte Carlo algorithm (HMC) and answer a few open questions about its efficiency in high dimensions. We will also briefly discuss applications to parallel tempering and Gibbs samplers, and conclude with concrete problems for future work.


 

The r-inla.org project: an overview
Håvard Rue, Norwegian University of Science and Technology, Norway


I'll give an overview of our r-inla.org project, which targets latent Gaussian models. The r-inla.org project consists of two parts. The first part is the development of fast and accurate methods for approximate inference, which is solved using the INLA approach. The second part is our SPDE approach towards GMRF representations of Gaussian spatial fields.


 

Adaptive optimal scaling of Metropolis-Hastings algorithms
Scott Sisson, University of New South Wales, Australia


In Metropolis-Hastings algorithms it is common to manually adjust the scaling parameter of the proposal distribution so that the sampler achieves a reasonable overall acceptance probability. Some theoretical results suggest that the overall acceptance probability should be around 0.44 for univariate and 0.234 for multivariate proposal distributions. However, manually tuning the scaling parameter(s) to obtain this can be time-consuming, and impractical in high dimensions.

I'll present an adaptive method for the automatic scaling of Random-Walk Metropolis-Hastings algorithms. This method adaptively updates the scaling parameter of the proposal distribution to achieve a pre-specified overall acceptance probability. Our approach relies on the use of the Robbins-Monro search process, whose performance is determined by an unknown steplength constant, for which we give a very simple estimator. I'll demonstrate how to incorporate the Robbins-Monro process into Metropolis-Hastings algorithms and demonstrate its effectiveness through simulated and real data examples. The algorithm is a quick, robust method for finding the scaling parameter that yields a specified acceptance probability.
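A minimal sketch of this kind of adaptation (using a generic Robbins-Monro recursion rather than the paper's particular steplength estimator) is given below; the target distribution and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def adaptive_rwm(log_target, x0, p_target=0.234, c=1.0, n_iter=20_000):
    """Random-walk Metropolis whose proposal scale is tuned by a Robbins-Monro
    recursion towards a target acceptance probability.  Here 'c' is simply a
    user-chosen steplength constant; the paper referred to above also provides
    an estimator for it."""
    x = np.asarray(x0, dtype=float)
    log_sigma = 0.0
    samples = np.empty((n_iter, x.size))
    for i in range(1, n_iter + 1):
        prop = x + np.exp(log_sigma) * rng.standard_normal(x.size)
        accept = np.log(rng.random()) < log_target(prop) - log_target(x)
        if accept:
            x = prop
        # raise the scale after an acceptance, lower it otherwise (diminishing steps)
        log_sigma += (c / i) * ((1.0 if accept else 0.0) - p_target)
        samples[i - 1] = x
    return samples, np.exp(log_sigma)

# example: a correlated bivariate normal target
cov_inv = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))
log_target = lambda z: -0.5 * z @ cov_inv @ z
draws, sigma = adaptive_rwm(log_target, x0=[0.0, 0.0])
print("tuned proposal scale:", sigma)
```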

Based on work by P. H. Garthwaite, Y. Fan and S. A. Sisson


 

EM, variational Bayes and expectation propagation
Mike Titterington, University of Glasgow, UK


The main aim of the talk will be to discuss the Expectation Propagation (EP) approach of T. Minka. This will follow a preamble based on the EM algorithm and the variational approximation approach. The relationship between EP and other parts of the literature will be outlined, and a semi-theoretical investigation of its properties in some very simple problems will be presented.


 

Variational Bayes for elaborate distributions
Matt Wand, University of Wollongong, Australia


We describe strategies for variational Bayes approximate inference for models containing elaborate distributions. Examples of elaborate distributions are the Skew-Normal, Generalized Extreme Value and Half-Cauchy distributions. Such distributions suffer from the difficulty that the variational Bayes parameter updates do not admit closed-form solutions. We circumvent this problem through a combination of
        (i) specially tailored auxiliary variables,
       (ii) univariate quadrature schemes, and
      (iii) finite mixture approximations of troublesome density functions.
An accuracy assessment is conducted and the new methodology is illustrated in an application. This talk represents joint research with J. Ormerod, S. Padoan and R. Fruhwirth.


 

Bayesian hypothesis testing in latent variable models
Jun Yu, Singapore Management University


MCMC has proved to be a highly useful and efficient approach for Bayesian estimation of latent variable models. However, hypothesis testing via Bayes factors (BF) suffers from several practical difficulties. One is that the BF is difficult to compute for many latent variable models. Another is that the BF is subject to the Jeffreys-Lindley paradox. In this paper, a new Bayesian method for hypothesis testing, based on decision theory, is introduced. The new statistic is a by-product of Bayesian MCMC output and hence is easy to compute. It is further shown that it does not suffer from the Jeffreys-Lindley paradox. The finite sample properties are examined using simulated data. We illustrate the empirical performance of the method using real data to fit a factor asset pricing model and a stochastic volatility model.
Authors: Yong Li (Sun Yat-sen University) and Jun Yu (Singapore Management University)


 