[Computer Science] [2017] Variational Inference and Deep Learning


This is a PhD thesis from the University of Amsterdam, the Netherlands (author: D. P. Kingma), 174 pages in total.

In this thesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.

• We propose an efficient algorithm for variational inference [Kingma and Welling, 2013] (Chapter 2), suitable for solving high-dimensional inference problems with large models. The method uses first-order gradients of the model w.r.t. the latent variables and/or parameters; such gradients are efficient to compute using the backpropagation algorithm, which makes the method especially well suited for inference and learning with deep neural networks.
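
To make this concrete, here is a minimal NumPy sketch of the reparameterization-based gradient estimator at the heart of the method, for a one-dimensional Gaussian posterior; the toy objective f(z) = z² and all variable names are illustrative, not code from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Gaussian variational posterior q(z) = N(mu, sigma^2); toy objective f(z) = z^2.
mu, sigma = 0.5, 1.2

# Reparameterize: z = mu + sigma * eps with eps ~ N(0, 1), so the sample z is a
# deterministic, differentiable function of (mu, sigma).
eps = rng.standard_normal(100_000)
z = mu + sigma * eps

# Pathwise gradient estimator: dz/dmu = 1 and dz/dsigma = eps, so by the chain
# rule grad_mu E[f(z)] ~ mean(f'(z)) and grad_sigma E[f(z)] ~ mean(f'(z) * eps).
f_prime = 2.0 * z
print(f_prime.mean())          # estimates d/dmu E[z^2] = 2*mu = 1.0
print((f_prime * eps).mean())  # estimates d/dsigma E[z^2] = 2*sigma = 2.4
```

In a deep model, the same chain rule is applied automatically by backpropagation, which is what makes the estimator cheap.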

• We propose variational autoencoders (VAEs) [Kingma and Welling, 2013] (Chapter 2). The VAE framework combines a neural-network-based inference model with a neural-network-based generative model, and provides a simple method for jointly optimizing both networks toward a bound on the log-likelihood of the parameters given the data. A doubly stochastic gradient descent procedure allows scaling to very large datasets. We demonstrate the use of variational autoencoders for generative modeling and representation learning.
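
To sketch how the two networks interact in the objective, here is a single-sample Monte Carlo estimate of the evidence lower bound (ELBO) for a diagonal-Gaussian posterior and a Bernoulli likelihood; the toy linear "networks" are stand-ins for the deep networks used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo(x, enc, dec):
    """Single-sample Monte Carlo estimate of the VAE evidence lower bound.

    enc(x) -> (mu, log_var) of the Gaussian approximate posterior q(z|x);
    dec(z) -> Bernoulli means of the likelihood p(x|z).
    """
    mu, log_var = enc(x)
    # Reparameterized sample keeps z differentiable w.r.t. the encoder outputs.
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    p = dec(z)
    log_px_z = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))  # log p(x|z)
    # Analytic KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return log_px_z - kl

# Toy stand-ins for the two networks: 5-d binary data, 2-d latent space.
W_enc, W_dec = rng.standard_normal((5, 4)), rng.standard_normal((2, 5))
enc = lambda x: (x @ W_enc[:, :2], x @ W_enc[:, 2:])
dec = lambda z: 1.0 / (1.0 + np.exp(-(z @ W_dec)))

x = (rng.random(5) < 0.5).astype(float)
print(elbo(x, enc, dec))
```

Maximizing this bound over minibatches of data, with one fresh noise sample per datapoint, gives the doubly stochastic procedure mentioned above.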

• We demonstrate how the VAE framework can be used to tackle the problem of semi-supervised learning [Kingma et al., 2014] (Chapter 3), yielding state-of-the-art results on standard semi-supervised image classification benchmarks at the time of publication.
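
As a sketch of the objective used there (the "M2" model), written from the cited paper rather than from this summary: for a labeled pair $(\mathbf{x}, y)$ and an unlabeled point $\mathbf{x}$, the model bounds the likelihood as

$$-\mathcal{L}(\mathbf{x},y)=\mathbb{E}_{q_\phi(\mathbf{z}\mid\mathbf{x},y)}\left[\log p_\theta(\mathbf{x}\mid y,\mathbf{z})+\log p_\theta(y)+\log p(\mathbf{z})-\log q_\phi(\mathbf{z}\mid\mathbf{x},y)\right],$$

$$-\mathcal{U}(\mathbf{x})=\sum_{y} q_\phi(y\mid\mathbf{x})\bigl(-\mathcal{L}(\mathbf{x},y)\bigr)+\mathcal{H}\bigl(q_\phi(y\mid\mathbf{x})\bigr),$$

where the unobserved label is marginalized out by the classifier $q_\phi(y\mid\mathbf{x})$, which is additionally trained on the labeled examples through a cross-entropy term weighted by a hyperparameter $\alpha$.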

• We propose inverse autoregressive flows [Kingma et al., 2016] (Chapter 5), a flexible class of posterior distributions based on normalizing flows, allowing inference of highly non-Gaussian posterior distributions over high-dimensional latent spaces. We demonstrate how the method can be used to learn a VAE whose log-likelihood performance is comparable to autoregressive models, while allowing for orders-of-magnitude faster synthesis.
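
The core transformation is compact. Below is a NumPy sketch of one IAF step in the numerically stable gated form, using a strictly lower-triangular linear map as a stand-in for the autoregressive neural network of the paper; sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4  # latent dimensionality

def iaf_step(z, W_m, W_s):
    """One inverse autoregressive flow step (gated form).

    The strict lower-triangular mask makes the outputs for dimension i depend
    only on z_{<i}, so the Jacobian is triangular and its log-determinant is
    just the sum of the log gates.
    """
    mask = np.tril(np.ones((D, D)), k=-1)
    m = (W_m * mask) @ z
    s = (W_s * mask) @ z
    gate = 1.0 / (1.0 + np.exp(-s))   # sigmoid gate in (0, 1)
    z_new = gate * z + (1.0 - gate) * m
    log_det = np.sum(np.log(gate))    # log |det dz_new/dz|
    return z_new, log_det

# Start from a reparameterized standard-normal sample and track log q(z).
eps = rng.standard_normal(D)
z = eps
log_q = -0.5 * np.sum(eps**2 + np.log(2 * np.pi))
for _ in range(3):  # chain several flow steps for extra flexibility
    z, log_det = iaf_step(z, rng.standard_normal((D, D)), rng.standard_normal((D, D)))
    log_q -= log_det  # change of variables: subtract log|det J| per step
print(z, log_q)
```

Because drawing a sample only requires forward passes, with no sequential inversion across dimensions, sampling from the flexible posterior stays cheap.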

• We propose the local reparameterization trick (Chapter 6) for further improving the efficiency of variational inference of a Gaussian posterior over model parameters [Kingma et al., 2015]. This method provides a new (Bayesian) perspective on dropout, a popular regularization method; making use of this connection, we propose variational dropout, which allows us to learn the dropout rate.
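
Concretely, for a fully connected layer with a factorized Gaussian posterior over its weights, the trick samples the pre-activations rather than the weights; a minimal NumPy sketch, with illustrative shapes and names:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_reparam_layer(X, W_mu, W_logvar):
    """Linear layer under a factorized Gaussian weight posterior, sampled via
    the local reparameterization trick: the pre-activations of a Gaussian
    weight layer are themselves Gaussian, so we sample them directly instead
    of sampling a full weight matrix for the whole minibatch.
    """
    act_mu = X @ W_mu                        # mean of the pre-activations
    act_var = (X ** 2) @ np.exp(W_logvar)    # variance of the pre-activations
    eps = rng.standard_normal(act_mu.shape)  # independent noise per example
    return act_mu + np.sqrt(act_var) * eps

# Toy usage: a batch of 8 inputs through a 10 -> 3 layer.
X = rng.standard_normal((8, 10))
W_mu = rng.standard_normal((10, 3))
W_logvar = -5.0 + rng.standard_normal((10, 3))
print(local_reparam_layer(X, W_mu, W_logvar).shape)  # (8, 3)
```

Sampling at the activation level decorrelates the noise across examples in the minibatch, which is where the variance reduction comes from.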

• We propose Adam [Kingma and Ba, 2015] (Chapter 7), a method for stochastic gradient-based optimization based on adaptive moment estimation.
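
The update rule is compact enough to sketch in full. The following NumPy step mirrors Algorithm 1 of the paper, including its default hyperparameters; the quadratic toy objective is only for illustration:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (first
    moment) and squared gradient (second moment), with bias correction for
    their initialization at zero.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
m = v = np.zeros_like(theta)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, theta, m, v, t)
print(theta)  # approaches the minimum at the origin
```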


  1. Introduction and Background
  2. Variational Autoencoders
  3. Semi-Supervised Learning
  4. Deep Generative Models
  5. Inverse Autoregressive Flow
  6. Variational Dropout and Local Reparameterization
  7. ADAM: A Method for Stochastic Optimization
