Batch Normalization

Batch normalization rescales the elements of each mini-batch to a common scale, similar in spirit to input normalization. Advantages: because internal covariate shift is reduced, the learning rate can be set higher; vanishing-gradient problems are less severe; and training is less sensitive to weight initialization.
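A minimal NumPy sketch of the normalization step described above (the function name `batch_norm` and the parameters `gamma`/`beta` are illustrative, not from the original text):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch dimension (axis 0),
    # then apply a learnable scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Toy batch of 3 samples with 2 features each.
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
out = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
# With gamma=1, beta=0, each feature column now has
# approximately zero mean and unit variance.
```

At training time the mean and variance come from the current mini-batch as above; at inference time, frameworks typically substitute running averages collected during training.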