Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

Yu, Hanlin; Hartmann, Marcelo; Williams, Bernardo; Klami, Arto

arXiv:2303.05101 (cs)

[Submitted on 9 Mar 2023 (v1), last revised 31 Mar 2024 (this version, v4)]

Abstract: Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. Methods that incorporate notions of differential geometry have been observed to perform better, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, existing methods often resort to simple diagonal metrics to remain computationally efficient, which loses some of the gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration while adding only a minor computational overhead over diagonal metrics. We show that for fully connected neural networks (NNs) with sparsity-inducing priors and convolutional NNs with correlated priors, using these metrics can provide improvements. For some other choices, the posterior is easy enough that the simpler metrics suffice as well.
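As a rough illustration of why a non-diagonal metric can help, the sketch below runs a Langevin sampler preconditioned by a constant metric G on a toy two-dimensional Gaussian with correlated coordinates. This is not the paper's method: the two metrics proposed in the paper are not reproduced here, the gradient is exact rather than a minibatch estimate, and all names (`langevin_step`, `run_chain`, `COV`, and so on) are hypothetical. Because G is constant, the correction term that a position-dependent Riemannian metric would require vanishes.

```python
import math
import random

def matvec(m, v):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def cholesky_2x2(m):
    """Lower-triangular L with L @ L.T == m, for a 2x2 SPD matrix."""
    l11 = math.sqrt(m[0][0])
    l21 = m[1][0] / l11
    l22 = math.sqrt(m[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def langevin_step(theta, grad_log_post, g_inv, g_inv_chol, step, rng):
    """One unadjusted Langevin step preconditioned by a constant metric G.

    theta' = theta + (step / 2) * G^{-1} grad + sqrt(step) * L * noise,
    where L @ L.T == G^{-1}. A position-dependent metric would need an
    extra term built from derivatives of G^{-1}; it is zero here.
    """
    drift = matvec(g_inv, grad_log_post(theta))
    noise = matvec(g_inv_chol, [rng.gauss(0.0, 1.0) for _ in theta])
    return [t + 0.5 * step * d + math.sqrt(step) * n
            for t, d, n in zip(theta, drift, noise)]

# Toy target (an assumption for illustration): zero-mean Gaussian
# whose two coordinates are strongly correlated.
COV = [[1.0, 0.9], [0.9, 1.0]]
_det = COV[0][0] * COV[1][1] - COV[0][1] * COV[1][0]
PREC = [[COV[1][1] / _det, -COV[0][1] / _det],
        [-COV[1][0] / _det, COV[0][0] / _det]]

def grad_log_post(theta):
    # Gradient of log N(theta | 0, COV) is -PREC @ theta.
    return [-g for g in matvec(PREC, theta)]

def run_chain(g_inv, n_steps=20000, step=0.05, seed=0):
    """Run the sampler with inverse metric g_inv and return all samples."""
    rng = random.Random(seed)
    chol = cholesky_2x2(g_inv)
    theta, samples = [3.0, -3.0], []
    for _ in range(n_steps):
        theta = langevin_step(theta, grad_log_post, g_inv, chol, step, rng)
        samples.append(theta)
    return samples

def cov01(samples):
    """Empirical covariance between the two coordinates."""
    n = len(samples)
    m0 = sum(s[0] for s in samples) / n
    m1 = sum(s[1] for s in samples) / n
    return sum((s[0] - m0) * (s[1] - m1) for s in samples) / n

if __name__ == "__main__":
    diagonal = [[1.0, 0.0], [0.0, 1.0]]  # diagonal metric ignores correlation
    print("diagonal metric:    ", cov01(run_chain(diagonal, step=0.05)))
    print("non-diagonal metric:", cov01(run_chain(COV, step=0.5)))
```

With the inverse metric set to the target covariance, the drift reduces to a uniform pull toward the mode in every direction, so the toy chain tolerates a much larger step size than the diagonal variant, whose step is limited by the most tightly constrained direction. Real neural-network posteriors are far less benign, and the unadjusted discretization used here carries a step-size-dependent bias.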

Comments:	Adjust the template and minor fixes
Subjects:	Machine Learning (cs.LG); Computation (stat.CO)
Cite as:	arXiv:2303.05101 [cs.LG]
	(or arXiv:2303.05101v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2303.05101

Submission history

From: Hanlin Yu
[v1] Thu, 9 Mar 2023 08:20:28 UTC (1,236 KB)
[v2] Tue, 25 Jul 2023 15:51:16 UTC (2,222 KB)
[v3] Mon, 21 Aug 2023 12:33:30 UTC (2,223 KB)
[v4] Sun, 31 Mar 2024 22:58:36 UTC (2,223 KB)
