Title: Modeling Documents with a Deep Boltzmann Machine
Authors: Nitish Srivastava, Ruslan Salakhutdinov, Geoffrey Hinton
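Published in: UAI 2013
Summary:
This paper uses a neural network to model text (documents). Building on the Replicated Softmax model (RSM), it adds one more hidden layer without adding any parameters, in order to improve model performance. The authors report that the new model outperforms both the standard RSM and LDA.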
1. Replicated Softmax model
It is essentially an RBM (restricted Boltzmann machine): a two-layer network, as shown in the figure:
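The left panel is the unrolled model: a two-layer network in which each input node represents one word of the document (if words are represented as vectors, each node holds one vector), and the hidden nodes are binary (0/1) random variables that represent the document's topics. Overall it follows the RBM framework and is trained with the CD-k algorithm. It differs from a standard RBM in two ways. 1. The number of input nodes is not fixed: it equals the number of words in the document, and document length varies. 2. The weights are shared, which solves the problem created by point 1. In an ordinary RBM the weights are indexed by both the input nodes and the hidden nodes, so a variable number of input nodes would mean a variable number of weights, and training would be impossible. The authors therefore require that the weights not depend on word position: every input node connects to a given hidden node through the same weights, so the number of parameters depends only on the vocabulary size and the number of hidden nodes. The paper describes this as building a separate RBM for each document, which I think overstates it: documents simply differ in length, hence in the number of input nodes, so the networks only look different.
The right panel is an alternative view: there is a single input node connected to every hidden node. This node represents one random variable that is sampled N times, where N is the document length. Note that if each word is represented as a vector, the connection between the input layer (a single word) and the hidden layer is still a two-dimensional weight matrix.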
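The two views can be checked against each other with a small NumPy sketch. It assumes one-hot words, a shared weight matrix `W` of shape (vocabulary size, hidden units), and a hidden bias scaled by the document length N; the sizes, the names `hidden_probs_unrolled` and `hidden_probs_counts`, and the random initial values are illustrative, not taken from the paper's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs_unrolled(word_ids, W, a):
    """Left-panel view: one input node per word position, all sharing W."""
    n_words = len(word_ids)
    # Each position contributes the weight row of the word it holds.
    return sigmoid(W[word_ids].sum(axis=0) + n_words * a)

def hidden_probs_counts(counts, W, a):
    """Right-panel view: one input node summarized by a word-count vector."""
    n_words = counts.sum()
    return sigmoid(counts @ W + n_words * a)

rng = np.random.default_rng(0)
K, F = 6, 3                                # vocabulary size, hidden units (illustrative)
W = rng.normal(scale=0.1, size=(K, F))     # the only weights, whatever the document length
a = rng.normal(scale=0.1, size=F)

word_ids = np.array([2, 0, 2, 5, 5, 5, 1])            # a 7-word document
counts = np.bincount(word_ids, minlength=K)           # the same document as counts

p_left = hidden_probs_unrolled(word_ids, W, a)
p_right = hidden_probs_counts(counts, W, a)
```

Both functions give the same hidden probabilities, and `W` keeps the shape (K, F) for a document of any length, which is why a variable number of input nodes does not change the parameter count.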
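The right panel also makes the training procedure clear. Roughly, each word in the document is one sample drawn from the input node; the N samples (for example, N word vectors) act on the hidden layer together, and from there CD-k applies as usual.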
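A minimal sketch of one CD-1 update on a single document, under the same assumptions as the description above: word counts as input, binary hidden units, one weight matrix shared by all word positions, and a hidden bias scaled by the document length. The function name `cd1_update`, the learning rate, and the toy sizes are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cd1_update(counts, W, a, b, rng, lr=0.01):
    """One CD-1 step on one document; updates W, a, b in place."""
    n_words = int(counts.sum())
    # Positive phase: hidden probabilities given the observed word counts.
    h_pos = sigmoid(counts @ W + n_words * a)
    h_sample = (rng.random(h_pos.shape) < h_pos).astype(float)
    # Negative phase: redraw N words from the shared softmax, then
    # recompute the hidden probabilities for the reconstruction.
    word_probs = softmax(W @ h_sample + b)
    counts_neg = rng.multinomial(n_words, word_probs).astype(float)
    h_neg = sigmoid(counts_neg @ W + n_words * a)
    # Update: data statistics minus reconstruction statistics.
    W += lr * (np.outer(counts, h_pos) - np.outer(counts_neg, h_neg))
    b += lr * (counts - counts_neg)
    a += lr * n_words * (h_pos - h_neg)
    return counts_neg

rng = np.random.default_rng(0)
K, F = 6, 3                                  # vocabulary size, hidden units (illustrative)
W = rng.normal(scale=0.01, size=(K, F))
a = np.zeros(F)                              # hidden biases
b = np.zeros(K)                              # word biases
doc = np.array([3, 1, 4, 0, 2, 5], dtype=float)

recon = cd1_update(doc, W, a, b, rng)
```

The reconstruction always has the same number of words as the input document, because the negative phase draws exactly N words from the softmax.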
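This reminds me of naive Bayes text classification, which also has two variants in practice. The Bernoulli model records only whether each vocabulary word occurs in the document. The multinomial model, which handles variable-length text naturally, treats a document as N draws of one random variable (N is the document length), and the outcomes of those draws together determine the document's class.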
2. Over-Replicated Softmax Model
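The authors propose this model as an extension of the Replicated Softmax model: a top layer (in effect another hidden layer) is added above the existing hidden layer. The weights between the hidden layer and the top layer reuse the weights between the hidden layer and the input layer, so the total number of parameters does not increase. As shown in the figure: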
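A rather odd model. The hidden layer and the input layer follow RBM logic, and so do the hidden layer and the top layer; I read both as auto-encoders. The top layer consists of M softmax units over the same vocabulary as the input layer (M is a hyperparameter), and its weights are tied to the input-to-hidden weights. The idea, as I understand it, is to use the top-layer auto-encoder to correct what the hidden layer learns from the input layer. Three layers and two auto-encoders amount to a stacked auto-encoder, which would normally be trained with two separate CD-k passes. Because of the weight constraint in the paper, however, a single CD-k pass is enough.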
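The weight tying can be sketched as follows. The sketch assumes the top layer is summarized by expected counts over the vocabulary for M softmax draws (M = 4 is an arbitrary illustrative value), that it reuses the input-to-hidden matrix `W`, and that the two layers are updated by simple alternation. The helper names `hidden_given_both` and `top_given_hidden` are hypothetical, and this is a reading of the model rather than the paper's exact inference procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hidden_given_both(counts, top_counts, W, a):
    """Hidden probabilities given the input counts and the top-layer counts."""
    n, m = counts.sum(), top_counts.sum()
    # The top layer enters exactly like extra words, through the same W.
    return sigmoid((counts + top_counts) @ W + (n + m) * a)

def top_given_hidden(h, W, b, m):
    """Expected top-layer counts: m draws from the same softmax used for words."""
    return m * softmax(W @ h + b)

rng = np.random.default_rng(0)
K, F, M = 6, 3, 4                          # vocabulary, hidden units, top-layer units
W = rng.normal(scale=0.1, size=(K, F))     # shared by both pairs of layers
a = np.zeros(F)
b = np.zeros(K)
doc = np.array([3, 1, 4, 0, 2, 5], dtype=float)

top = np.full(K, M / K)                    # start from uniform expected counts
for _ in range(10):                        # alternate the two updates
    h = hidden_given_both(doc, top, W, a)
    top = top_given_hidden(h, W, b, M)
```

No array other than `W`, `a`, and `b` is introduced, which is the sense in which the extra layer adds no parameters.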
What I still wonder is why the authors chose this design (adding one hidden layer). If it works well, why not add several more hidden layers?
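3. Experiments
The experiments cover document perplexity and information retrieval. I did not read closely how the topics are turned into word (n-gram) probabilities or how perplexity is computed.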
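For orientation, average per-word perplexity is commonly computed from per-document log probabilities as in the sketch below. It assumes those log probabilities are already available; for an RBM-style model, obtaining them requires an estimate of the partition function, which this sketch does not cover. The function name `per_word_perplexity` and the toy numbers are illustrative.

```python
import numpy as np

def per_word_perplexity(log_probs, doc_lengths):
    """exp of the negative mean of (log p(document) / document length)."""
    log_probs = np.asarray(log_probs, dtype=float)
    doc_lengths = np.asarray(doc_lengths, dtype=float)
    return float(np.exp(-np.mean(log_probs / doc_lengths)))

# Sanity check with a model that assigns every word probability 1/K.
K = 8
lengths = [5, 20]
log_probs = [n * np.log(1.0 / K) for n in lengths]
ppl = per_word_perplexity(log_probs, lengths)
```

For the uniform model the result equals the vocabulary size, 8, up to floating-point error; a model that predicts words better than chance scores lower.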
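Because of the input and output format that neural networks require, using them to represent documents is still uncommon. But with Deep Learning as popular as it is now, it is natural that even a leading figure like Hinton would give it a try. The approach in this paper is still, at heart, the auto-encoder approach.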
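4. A few more words
The original Replicated Softmax paper is by Salakhutdinov and Hinton, published at NIPS 2009 under the title "Replicated Softmax: an Undirected Topic Model". Its basic idea is as described above. That paper also relies on AIS (annealed importance sampling), a Monte Carlo method (CD-k is itself a simplified Monte Carlo sampling procedure). AIS is used there to estimate the partition function, so that the Replicated Softmax model can be evaluated as a generative model; training still uses CD.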