Deep Compression

Date: 2020-12-23


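The paper compresses a model in three steps: prune the network (keep only the important connections), quantize the weights (share weights through weight quantization), and apply Huffman coding (compress the result further).

1. Prune the network

The pruning procedure is: train a network, then prune the small values in the model's weight matrix. This is done by choosing a threshold and setting every weight whose magnitude is below that threshold to 0.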
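The thresholding step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name `prune_by_magnitude`, the example matrix, and the threshold of 0.1 are all assumptions chosen for demonstration, and the weight matrix is taken to come from an already trained network.

```python
import numpy as np


def prune_by_magnitude(weights, threshold):
    """Set weights with |w| < threshold to 0 (hypothetical helper).

    Returns the pruned weights and a boolean mask marking the
    connections that were kept.
    """
    mask = np.abs(weights) >= threshold
    pruned = np.where(mask, weights, 0.0)
    return pruned, mask


# Made-up weight matrix standing in for one trained layer.
w = np.array([[0.8, -0.05, 0.3],
              [0.01, -0.6, 0.02]])

pruned, mask = prune_by_magnitude(w, threshold=0.1)
kept = int(mask.sum())             # number of surviving connections
sparsity = 1.0 - kept / mask.size  # fraction of weights set to 0
print(pruned)
print(kept, sparsity)
```

Returning the mask alongside the pruned weights is a design choice of this sketch: it records which connections survived, so that later processing can leave the zeroed entries untouched.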