A Brief Introduction to Stein Variational Gradient Descent
Context
- Scientific Domain: SVGD belongs to machine learning and computational statistics, specifically in the subfields of variational inference, Bayesian computation, and particle-based approximate inference. It bridges optimization and Monte Carlo sampling.
- Inception of the Field: The method was introduced in 2016, building on concepts from Stein's method in probability theory and kernel methods in machine learning.
- Key Researchers: The algorithm was developed by Qiang Liu and Dilin Wang, building on earlier work by researchers in Stein's method and kernelized Stein discrepancies.
Main Results
- Mathematical Setup: The goal is to approximate a target distribution $p(x)$ (e.g., a Bayesian posterior) using a set of particles $\{x_i\}_{i=1}^n$. SVGD minimizes the KL divergence between the particle distribution $q$ and $p$ by iteratively transporting particles along a smooth vector field $\phi$ chosen from a reproducing kernel Hilbert space (RKHS). The optimal direction, whose RKHS norm equals the kernelized Stein discrepancy between $q$ and $p$, is $$\phi^*(\cdot) = \mathbb{E}_{x \sim q}[\mathcal{A}_p \kappa(x, \cdot)]$$ where $\mathcal{A}_p \kappa(x, \cdot) = \kappa(x, \cdot)\nabla_x \log p(x) + \nabla_x \kappa(x, \cdot)$ is the Stein operator applied to a positive definite kernel $\kappa$.
- Main Results: The main algorithmic achievement is the SVGD update rule $$x_i \leftarrow x_i + \epsilon \hat{\phi}^*(x_i)$$ where $$\hat{\phi}^*(x) = \frac{1}{n}\sum_{j=1}^n\left[\kappa(x_j, x)\nabla_{x_j}\log p(x_j) + \nabla_{x_j}\kappa(x_j, x)\right]$$
This transforms sampling into a deterministic particle evolution: the first (attractive) term drives particles toward high-probability regions of $p$, while the kernel-gradient term acts as a repulsive force that keeps particles spread out, so the ensemble minimizes KL divergence without collapsing to the mode.
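The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' reference implementation: it assumes an RBF kernel $\kappa(x, x') = \exp(-\|x - x'\|^2 / h)$ with the median-heuristic bandwidth, a fixed step size (the original paper uses AdaGrad), and a standard-normal target for the demo, for which $\nabla \log p(x) = -x$.

```python
import numpy as np

def svgd_update(X, grad_log_p, h=None):
    """One SVGD step with an RBF kernel.

    X          : (n, d) array of particles
    grad_log_p : (n, d) array of scores grad log p evaluated at X
    h          : kernel bandwidth; median heuristic if None
    Returns the update direction phi_hat, shape (n, d).
    """
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]        # diffs[i, j] = x_i - x_j
    sq = np.sum(diffs ** 2, axis=-1)             # pairwise squared distances
    if h is None:                                # median heuristic from the SVGD paper
        h = np.median(sq) / np.log(n + 1) + 1e-8
    K = np.exp(-sq / h)                          # K[i, j] = kappa(x_i, x_j)
    # Attractive term: (1/n) sum_j kappa(x_j, x_i) grad log p(x_j)
    attract = K @ grad_log_p
    # Repulsive term: (1/n) sum_j grad_{x_j} kappa(x_j, x_i)
    #               = (2/h) sum_j kappa(x_i, x_j) (x_i - x_j)
    repulse = (2.0 / h) * np.einsum('ij,ijk->ik', K, diffs)
    return (attract + repulse) / n

# Demo: transport particles initialized far from a N(0, 1) target.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=1.0, size=(100, 1))
for _ in range(500):
    X += 0.1 * svgd_update(X, -X)                # grad log p(x) = -x for N(0, 1)
```

After the loop, the particle mean and standard deviation should be close to 0 and 1; the repulsive term is what keeps the empirical standard deviation near 1 rather than letting all particles collapse onto the mode.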
References
- Foundational:
- Liu, Q., & Wang, D. (2016). Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm. Advances in Neural Information Processing Systems.
- Theoretical Development:
- Liu, Q., Lee, J. D., & Jordan, M. I. (2016). A Kernelized Stein Discrepancy for Goodness-of-fit Tests. International Conference on Machine Learning.
- Gradient-Flow Perspective:
- Liu, Q. (2017). Stein Variational Gradient Descent as Gradient Flow. Advances in Neural Information Processing Systems.