Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Date: 2020-12-20


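Abstract

The authors propose a quantization scheme that uses only integer arithmetic, which is more efficient than floating-point arithmetic. They also propose a corresponding training procedure that preserves accuracy after quantization. The method improves the trade-off between accuracy and on-device latency, and it can be applied to MobileNets.

1 Introduction

The authors summarize the current approaches for effectively deploying large neural networks on more resource-constrained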