My research is at the interface of computational statistics, machine learning, and applied probability, focusing on sampling, generative modelling, and optimisation. I develop stochastic and probabilistic mathematical machinery for statistical inference and machine learning.
See the works page for preprints, papers, slides, posters, and other material related to my work. I maintain a research blog called almost stochastic for short notes that might be of interest to other people.
If you are interested in joining for a PhD, please see this page.
A few highlights are:
B. Boys, M. Girolami, J. Pidstrigach, S. Reich, A. Mosca, ÖDA
Diffusion generative models allow strong empirical priors to be incorporated into scientific inference. We leverage higher-order information via Tweedie's formula to solve inverse problems and provide a theoretical guarantee specifically for posterior sampling, which can lead to a better theoretical understanding of diffusion-based conditional sampling. We show that our method (i) removes the time-dependent step-size hyperparameters required by earlier methods, (ii) brings stability and better sample quality across multiple noise levels, and (iii) is the only method, in contrast to earlier works, that works stably with variance-exploding (VE) forward processes.
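As a rough illustration of the kind of identity involved (not the method of the paper), the sketch below evaluates the first- and second-order Tweedie formulas in one dimension for a variance-exploding corruption x_t = x_0 + sigma * noise. The Gaussian prior, the parameter values, and the function name `tweedie_moments` are assumptions made for this example, chosen because the exact posterior is available for comparison.

```python
def tweedie_moments(x_t, sigma, score, score_derivative):
    """Posterior mean and variance of x_0 given x_t = x_0 + sigma * noise (1-D)."""
    # First-order Tweedie: E[x_0 | x_t] = x_t + sigma^2 * d/dx log p_t(x_t)
    mean = x_t + sigma**2 * score(x_t)
    # Second-order Tweedie: Var[x_0 | x_t] = sigma^2 * (1 + sigma^2 * d^2/dx^2 log p_t(x_t))
    var = sigma**2 * (1.0 + sigma**2 * score_derivative(x_t))
    return mean, var


# Hypothetical toy setup: prior x_0 ~ N(m, s^2), so the noisy marginal is
# x_t ~ N(m, s^2 + sigma^2) and its score is known in closed form.
m, s, sigma = 1.0, 2.0, 0.5
marginal_var = s**2 + sigma**2


def score(x):
    return -(x - m) / marginal_var


def score_derivative(x):
    return -1.0 / marginal_var


x_t = 0.3
post_mean, post_var = tweedie_moments(x_t, sigma, score, score_derivative)

# Conjugate Gaussian posterior for comparison.
exact_mean = (s**2 * x_t + sigma**2 * m) / marginal_var
exact_var = s**2 * sigma**2 / marginal_var

print(post_mean, exact_mean)
print(post_var, exact_var)
```

In this Gaussian case the Tweedie moments agree with the conjugate posterior; with a learned score network the same formulas would only give approximations.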
We develop a physics-informed dynamical variational autoencoder (Φ-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model and a VAE, to assimilate the unstructured data into the latent dynamical system. Unstructured data, in our example systems, comes in the form of video data and velocity field measurements; however, the methodology is suitably generic to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters.
ÖDA, F. R. Crucinio, M. Girolami, T. Johnston, S. Sabanis
We study a class of interacting particle systems for implementing a marginal maximum likelihood estimation (MLE) procedure to optimize over the parameters of a latent variable model. We prove nonasymptotic concentration bounds for the optimisation error of the maximum marginal likelihood estimator in terms of the number of particles in the particle system, the number of iterations of the algorithm, and the step-size parameter for the time discretisation analysis.
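The abstract above does not spell out the dynamics, so the following is only a minimal sketch of one possible interacting particle scheme for marginal maximum likelihood: a Langevin-type update of the parameter driven by the particle average, coupled with Langevin updates of the latent particles. The toy model (x ~ N(theta, 1), y | x ~ N(x, 1), whose marginal MLE is theta = y), the function name `particle_mle_toy`, and all numerical settings are assumptions for illustration.

```python
import math
import random


def particle_mle_toy(y, n_particles, step, n_steps, rng):
    """Toy interacting particle system for marginal MLE.

    Assumed model: x ~ N(theta, 1), y | x ~ N(x, 1), with
    U(theta, x) = (x - theta)^2 / 2 + (y - x)^2 / 2 = -log p_theta(x, y) + const.
    """
    theta = 0.0
    xs = [0.0] * n_particles
    thetas = []
    for _ in range(n_steps):
        # Parameter update: averaged gradient in theta, noise scaled by 1 / N.
        grad_theta = sum(theta - x for x in xs) / n_particles
        new_theta = (theta - step * grad_theta
                     + math.sqrt(2.0 * step / n_particles) * rng.gauss(0.0, 1.0))
        # Particle updates: one Langevin step each, using the current theta.
        xs = [x - step * (2.0 * x - theta - y)
              + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
              for x in xs]
        theta = new_theta
        thetas.append(theta)
    return thetas


rng = random.Random(1)
thetas = particle_mle_toy(y=2.0, n_particles=50, step=0.05, n_steps=4000, rng=rng)
burn_in = 1000
estimate = sum(thetas[burn_in:]) / len(thetas[burn_in:])
print(estimate)  # time-averaged parameter estimate; the exact marginal MLE is y = 2.0
```

The three quantities named in the abstract appear here as `n_particles`, `n_steps`, and `step`; increasing the number of particles reduces the fluctuation of the parameter around the maximiser in this toy example.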
SIAM/ASA Journal on Uncertainty Quantification (2022).
ÖDA, C. Duffin, S. Sabanis, M. Girolami
We use Langevin dynamics to solve the statFEM forward problem, studying the utility of the unadjusted Langevin algorithm (ULA), a Metropolis-free Markov chain Monte Carlo sampler, to build a sample-based characterisation of this otherwise intractable measure. Leveraging the theory behind Langevin-based samplers, we provide theoretical guarantees on sampler performance, demonstrating convergence for both the prior and the posterior in Kullback-Leibler divergence and in Wasserstein-2, with further results on the effect of preconditioning. Numerical experiments for both the prior and the posterior demonstrate the efficacy of the sampler, and a Python package is also included.
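For readers unfamiliar with ULA, here is a minimal one-dimensional sketch of the generic algorithm, applied to a standard Gaussian target rather than to a statFEM measure. The target, step size, and function name `ula` are assumptions for illustration and are unrelated to the Python package mentioned above.

```python
import math
import random


def ula(grad_potential, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm for a target density proportional to exp(-U(x)).

    Each iteration takes a gradient step on U and adds Gaussian noise;
    there is no Metropolis accept/reject correction.
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        x = x - step * grad_potential(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


# Toy target: standard Gaussian, U(x) = x^2 / 2, so grad U(x) = x.
rng = random.Random(0)
samples = ula(lambda x: x, x0=3.0, step=0.1, n_steps=50_000, rng=rng)

kept = samples[1_000:]  # discard an initial transient
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(mean, var)
```

Because the chain is not Metropolis-corrected, its stationary law is biased: for this Gaussian target the stationary variance is 1 / (1 - step / 2) rather than 1, which illustrates why non-asymptotic guarantees for ULA depend on the step size.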
The 24th International Conference on Artificial Intelligence and Statistics (AISTATS), 2021.
ÖDA, G. G. J. van den Burg, T. Damoulas, M. Steel
We introduce the probabilistic sequential matrix factorization (PSMF) method for factorizing time-varying and non-stationary datasets consisting of high-dimensional time-series. In particular, we consider nonlinear Gaussian state-space models where sequential approximate inference results in the factorization of a data matrix into a dictionary and time-varying coefficients with (possibly nonlinear) Markovian dependencies. The assumed Markovian structure on the coefficients enables us to encode temporal dependencies into a low-dimensional feature space.
We investigate an adaptation strategy based on convex optimisation which leads to a class of adaptive samplers. These samplers rely on the iterative minimisation of the \(\chi^2\)-divergence between an exponential family proposal and the target. We prove non-asymptotic error bounds for the mean squared errors (MSEs) of these algorithms, which explicitly depend on the number of iterations and the number of samples together. We also demonstrate explicit links between hyperparameters of these samplers, the number of samples, and the number of iterations.
L. Richter, A. Boustati, N. Nüsken, F. J. R. Ruiz, ÖDA
We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the log-variance loss. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call VarGrad due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones.
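The estimator described above can be sketched without automatic differentiation in a toy case. The example below assumes a variational family q = N(mu, 1) and a target proportional to N(3, 1), for which the exact gradient of the negative ELBO with respect to mu is mu - 3; these choices and the function name `vargrad_estimate` are illustrative assumptions, not the experimental setup of the paper.

```python
import random


def vargrad_estimate(mu, n_samples, rng, target_mean=3.0):
    """Score-function gradient of the negative ELBO with leave-one-out
    control variates, for q = N(mu, 1) and a target proportional to N(target_mean, 1)."""
    zs = [rng.gauss(mu, 1.0) for _ in range(n_samples)]
    # f(z) = log q(z) - log p(z); the Gaussian normalising constants cancel.
    fs = [-0.5 * (z - mu) ** 2 + 0.5 * (z - target_mean) ** 2 for z in zs]
    f_bar = sum(fs) / n_samples
    # Score of q with respect to mu: d/dmu log q(z) = z - mu.
    scores = [z - mu for z in zs]
    # Leave-one-out control variates reduce algebraically to centring f
    # with the sample mean and dividing by (n_samples - 1).
    return sum((f - f_bar) * sc for f, sc in zip(fs, scores)) / (n_samples - 1)


rng = random.Random(0)
estimates = [vargrad_estimate(mu=0.0, n_samples=10, rng=rng) for _ in range(2000)]
average = sum(estimates) / len(estimates)
print(average)  # the exact gradient at mu = 0 is mu - target_mean = -3
```

The centred form used in the last line of the function is the same quantity one obtains by differentiating the sample variance of the log-ratio with the samples held fixed, which is the connection to the log-variance loss.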
We introduce a framework for inference in general state-space hidden Markov models (HMMs) under likelihood misspecification. In particular, we leverage the loss-theoretic perspective of Generalized Bayesian Inference (GBI) to define generalised filtering recursions in HMMs that can tackle the problem of inference under model misspecification. In doing so, we arrive at principled procedures for robust inference against observation contamination by utilising the β-divergence. Operationalising the proposed framework is made possible via sequential Monte Carlo (SMC) methods, where most standard particle methods, and their associated convergence results, are readily adapted to the new setting.
Statistics and Computing, volume 30, pages 1645–1663 (2020)
ÖDA, D. Crisan, J. Miguez
We introduce and analyze a parallel sequential Monte Carlo methodology for the numerical solution of optimization problems that involve the minimization of a cost function that consists of the sum of many individual components. The proposed scheme is a stochastic zeroth-order optimization algorithm which demands only the capability to evaluate small subsets of components of the cost function. It can be depicted as a bank of samplers that generate particle approximations of several sequences of probability measures. These measures are constructed in such a way that they have associated probability density functions whose global maxima coincide with the global minima of the original cost function. We provide explicit convergence rates in terms of the number of generated Monte Carlo samples and the dimension of the search space.
Statistics and Computing, volume 30, pages 305–330 (2020)
ÖDA, J. Miguez
We investigate a new sampling scheme aimed at improving the performance of particle filters whenever (a) there is a significant mismatch between the assumed model dynamics and the actual system, or (b) the posterior probability tends to concentrate in relatively small regions of the state space. The proposed scheme pushes some particles toward specific regions where the likelihood is expected to be high, an operation known as nudging in the geophysics literature. We reinterpret nudging in a form applicable to any particle filtering scheme, as it does not involve any changes in the rest of the algorithm. We prove analytically that nudged particle filters can still attain asymptotic convergence with the same error rates as conventional particle methods. Simple analysis also yields an alternative interpretation of the nudging operation that explains its robustness to model errors.
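To make the nudging step concrete, the sketch below adds it to a bootstrap particle filter for a scalar linear-Gaussian model. The model, the gradient-based form of the nudge, the number of nudged particles, and the names `simulate` and `nudged_bootstrap_pf` are all assumptions for illustration; the weighting and resampling steps are left exactly as in the plain bootstrap filter.

```python
import math
import random


def simulate(n_steps, rng, a=0.9, q_std=1.0, r_std=0.5):
    """Assumed toy model: x_t = a * x_{t-1} + q_std * noise, y_t = x_t + r_std * noise."""
    x, xs, ys = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + q_std * rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(x + r_std * rng.gauss(0.0, 1.0))
    return xs, ys


def nudged_bootstrap_pf(ys, n_particles, n_nudged, nudge_step, rng,
                        a=0.9, q_std=1.0, r_std=0.5):
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in ys:
        # 1. Propagate every particle through the assumed dynamics.
        particles = [a * x + q_std * rng.gauss(0.0, 1.0) for x in particles]
        # 2. Nudge a small random subset along the gradient of the log-likelihood.
        for i in rng.sample(range(n_particles), n_nudged):
            particles[i] += nudge_step * (y - particles[i]) / r_std**2
        # 3. Weight with the usual likelihood (unchanged by nudging).
        log_w = [-0.5 * ((y - x) / r_std) ** 2 for x in particles]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        means.append(sum(wi * x for wi, x in zip(w, particles)))
        # 4. Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return means


rng = random.Random(0)
xs, ys = simulate(100, rng)
means = nudged_bootstrap_pf(ys, n_particles=200, n_nudged=14, nudge_step=0.1, rng=rng)
rmse = math.sqrt(sum((m - x) ** 2 for m, x in zip(means, xs)) / len(xs))
print(rmse)
```

Only step 2 differs from a conventional bootstrap filter, which reflects the point above that nudging can be added to any particle filtering scheme without changing the rest of the algorithm.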