[coursera/ImprovingDL/week2] Optimization algorithms (summary & questions)

summary

2.1 Mini-batch gradient descent
- batch size = m: batch gradient descent (BGD) — each iteration processes the whole training set, so a single step takes too long on large data
- batch size = 1: stochastic gradient descent (SGD) — loses the speedup from vectorization, since each step uses only one example
- batch size in between: mini-batch gradient descent, the practical compromise

2.2 Bias correction
At the beginning of an exponentially weighted average, the running estimate v_t is biased toward zero (it is initialized at 0); dividing by (1 - beta^t) corrects this early-phase bias.
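A minimal sketch of the mini-batch split described above (the function name and array layout — features in rows, examples in columns — are my assumptions, not from the course): setting `batch_size = m` recovers BGD, and `batch_size = 1` recovers SGD.

```python
import numpy as np

def make_mini_batches(X, Y, batch_size=64, seed=0):
    """Shuffle the training set and split it into mini-batches.

    X: (n_features, m) inputs, Y: (1, m) labels, m training examples.
    batch_size = m gives batch GD; batch_size = 1 gives SGD.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    perm = rng.permutation(m)          # shuffle examples before splitting
    X, Y = X[:, perm], Y[:, perm]
    batches = []
    for start in range(0, m, batch_size):
        end = min(start + batch_size, m)
        batches.append((X[:, start:end], Y[:, start:end]))
    return batches

# 1000 examples, batch size 64 -> 15 full batches + one smaller batch of 40
X = np.random.randn(3, 1000)
Y = np.random.randn(1, 1000)
batches = make_mini_batches(X, Y, batch_size=64)
print(len(batches), batches[-1][0].shape)  # 16 (3, 40)
```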