[1] C. Yau, O. Papaspiliopoulos, G. O. Roberts, and C. Holmes. Bayesian non-parametric hidden Markov models with applications in genomics. J. R. Stat. Soc. Ser. B Stat. Methodol., 73(1):37-57, 2011.
We propose a flexible non-parametric specification of the emission distribution in hidden Markov models and we introduce a novel methodology for carrying out the computations. Whereas current approaches use a finite mixture model, we argue in favour of an infinite mixture model given by a mixture of Dirichlet processes. The computational framework is based on auxiliary variable representations of the Dirichlet process and consists of a forward–backward Gibbs sampling algorithm of similar complexity to that used in the analysis of parametric hidden Markov models. The algorithm involves analytic marginalizations of latent variables to improve the mixing, facilitated by exchangeability properties of the Dirichlet process that we uncover in the paper. A by-product of this work is an efficient Gibbs sampler for learning Dirichlet process hierarchical models. We test the Monte Carlo algorithm proposed against a wide variety of alternatives and find significant advantages. We also investigate by simulations the sensitivity of the proposed model to prior specification and data-generating mechanisms. We apply our methodology to the analysis of genomic copy number variation. Analysing various real data sets, we find significantly more accurate inference compared with state-of-the-art hidden Markov models which use finite mixture emission distributions.

Keywords: Block Gibbs sampler; Copy number variation; Local and global clustering; Partial exchangeability; Partition models; Retrospective sampling
[2] O. Papaspiliopoulos, Y. Pokern, G. O. Roberts, and A. M. Stuart. Nonparametric estimation of diffusions: A differential equations approach. Biometrika, 99(3):511-531, 2012.
We consider estimation of scalar functions that determine the dynamics of diffusion processes. It has been recently shown that nonparametric maximum likelihood estimation is ill-posed in this context. We adopt a probabilistic approach to regularize the problem by the adoption of a prior distribution for the unknown functional. A Gaussian prior measure is chosen in the function space by specifying its precision operator as an appropriate differential operator. We establish that a Bayesian-Gaussian conjugate analysis for the drift of one-dimensional nonlinear diffusions is feasible using high-frequency data, by expressing the log-likelihood as a quadratic function of the drift, with sufficient statistics given by the local time process and the end points of the observed path. Computationally efficient posterior inference is carried out using a finite element method. We embed this technology in partially observed situations and adopt a data augmentation approach whereby we iteratively generate missing data paths and draws from the unknown functional. Our methodology is applied to estimate the drift of models used in molecular dynamics and financial econometrics using high- and low-frequency observations. We discuss extensions to other partially observed schemes and connections to other types of nonparametric inference.

Keywords: Finite element method; Gaussian measure; Inverse problem; Local time; Markov chain Monte Carlo; Markov process
