Jonas Latz (Heriot-Watt University, Edinburgh) & Jeremie Houssineau (University of Warwick, UK)
Jonas Latz:
Optimisation problems with discrete and continuous data appear in statistical estimation, machine learning, functional data science, robust optimal control, and variational inference. The 'full' target function in such an optimisation problem is given by the integral of a family of parameterised target functions with respect to a discrete or continuous probability measure. Such problems can often be solved by stochastic optimisation methods: performing optimisation steps with respect to the parameterised target function while randomly switching the parameter value. In this talk, we discuss a continuous-time variant of the stochastic gradient descent algorithm. This so-called stochastic gradient process couples a gradient flow minimising a parameterised target function with a continuous-time 'index' process that determines the parameter.
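In symbols (a brief sketch of the setting; the notation below is ours, not necessarily that of the talk): with parameterised targets $f(\theta, x)$ and a data distribution $\mu$, the full target and the coupled dynamics read
\[
  \bar f(\theta) = \int f(\theta, x)\, \mu(\mathrm{d}x),
  \qquad
  \dot\theta_t = -\nabla_\theta f(\theta_t, X_t),
\]
where $(X_t)_{t \ge 0}$ is the continuous-time index process that randomly switches which parameterised target the gradient flow is currently descending.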
We first briefly introduce the stochastic gradient process for finite, discrete data, which uses pure jump index processes. Then, we move on to continuous data. Here, we allow for very general index processes: reflected diffusions, pure jump processes, as well as other Lévy processes on compact spaces. Thus, we study multiple sampling patterns for the continuous data space. We show that the stochastic gradient process can approximate the gradient flow minimising the full target function to arbitrary accuracy. Moreover, we give convexity assumptions under which the stochastic gradient process with constant learning rate is geometrically ergodic. In the same setting, we also obtain ergodicity and convergence to the minimiser of the full target function when the learning rate decreases sufficiently slowly over time.
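To make the finite, discrete-data case concrete, here is a minimal simulation sketch using Euler discretisation; the quadratic target family, the jump rate, and all numerical values are hypothetical choices of ours, not taken from the talk.

import numpy as np

# A minimal sketch (all modelling choices are hypothetical): the parameterised
# targets are f(theta, i) = 0.5 * (theta - x_i)^2 for finite data x_1, ..., x_N,
# so the 'full' target (their average) is minimised at the sample mean.
rng = np.random.default_rng(seed=1)
x = rng.normal(loc=2.0, scale=1.0, size=10)   # finite, discrete data set

def grad(theta, i):
    """Gradient of the i-th parameterised target at theta."""
    return theta - x[i]

T, dt = 50.0, 1e-3          # time horizon and Euler step size
rate = 5.0                  # jump rate of the pure jump index process
theta, i, t = 0.0, 0, 0.0
next_jump = rng.exponential(1.0 / rate)

while t < T:
    theta -= dt * grad(theta, i)    # Euler step of the gradient flow in f(., i)
    t += dt
    while t >= next_jump:           # pure jump index process: at exponential
        i = rng.integers(len(x))    # waiting times, resample the data index
        next_jump += rng.exponential(1.0 / rate)

print(f"theta_T = {theta:.3f}  vs  sample mean = {x.mean():.3f}")

With the jump rate held constant, this sketch sits in the constant-learning-rate regime: theta_T fluctuates around the minimiser rather than converging to it, which is precisely the behaviour that the ergodicity results above quantify.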
Jeremie Houssineau:
It is widely acknowledged that there are two types of uncertainty: random/aleatory and deterministic/epistemic. Yet Bayesian inference conflates the two and uses probability theory throughout. In this talk, I will argue that there are cases where possibility theory can be successfully used instead of, or in conjunction with, probabilistic Bayesian inference to yield intuitive yet principled solutions to complex problems. This will be illustrated via applications to robust inference, data assimilation, and space domain awareness.
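To give a flavour of the possibilistic counterpart of Bayes' rule (a standard formulation in the possibilistic inference literature; that the talk uses exactly this rule is our assumption): given a prior possibility function $\pi(\theta)$ and a likelihood $L(y \mid \theta)$, the posterior possibility function is obtained by sup-normalisation instead of integration,
\[
  \pi(\theta \mid y) = \frac{L(y \mid \theta)\, \pi(\theta)}{\sup_{\theta'} L(y \mid \theta')\, \pi(\theta')},
\]
so that $\sup_\theta \pi(\theta \mid y) = 1$. The normaliser is a supremum rather than an integral, reflecting that possibility functions encode epistemic rather than aleatory uncertainty.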