ON LARGE BATCH TRAINING FOR DEEP LEARNING: GENERALIZATION GAP AND SHARP MINIMA

文章目录 概 主要内容 一些解决办法 Keskar N S, Mudigere D, Nocedal J, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima[J]. arXiv: Learning, 2016. 作者代码 @article{keskar2016on, title
相关文章
相关标签/搜索