CS231N-6&7-Training Neural Networks

Activation functions, Data Preprocessing, Weight Initialization, Batch Normalization, Learning rate, Optimization, condition number, saddle point, SGD with momentum, AdaGrad, RMSProp, Adam, Learning rate decay, Sec