import numpy as np
import time

# Vectorization demo: the dot product of two 10^6-element vectors runs in
# optimized C code, far faster than an explicit Python loop.
a = np.random.rand(1000000)
b = np.random.rand(1000000)
tic = time.time()
c = np.dot(a, b)
print("cost " + str((time.time() - tic) * 1000) + "ms")
import numpy as np

a = np.random.randn(5)       # rank-1 array, shape (5,): avoid these in neural-net code
print("a:", a.shape, "\n", a)
b = np.random.randn(5, 1)    # column vector, shape (5, 1)
print("b:", b.shape, "\n", b)
c = np.random.randn(1, 5)    # row vector, shape (1, 5)
print("c:", c.shape, "\n", c)
a = a.reshape(5, 1)          # turn the rank-1 array into a proper column vector
assert(a.shape == (5, 1))
3. Shallow Neural Networks
3.1 Neural Network Overview
3.2 Neural Network Representation
3.5 Explanation of the Vectorized Implementation
3.6 Activation Functions
3.7 Why Use Non-linear Activation Functions
If the activation functions are linear, the network is still a linear function of its input no matter how many layers you stack, so the extra layers add nothing.
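A minimal numpy sketch of this point (the layer shapes here are arbitrary illustrations): two linear layers collapse into one.

import numpy as np

# Two "layers" with linear (identity) activations collapse into a single linear map:
# W2 @ (W1 @ x) == (W2 @ W1) @ x, so the extra layer adds no expressive power.
W1 = np.random.randn(4, 3)
W2 = np.random.randn(2, 4)
x = np.random.randn(3, 1)
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)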
3.8 Derivatives of Activation Functions
3.9 Gradient Descent for Neural Networks
3.11 Random Initialization
Why W cannot be initialized to a zero matrix when a layer has multiple neurons: every unit in the layer then computes the same function and receives the same gradient, so the units stay identical after every update and the symmetry is never broken.
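A minimal sketch of the initialization shown in the lecture (the layer sizes are illustrative):

import numpy as np

# Small random weights break the symmetry between hidden units; biases can
# safely start at zero once W is random.
n_x, n_h = 3, 4                          # input size, hidden-layer size (assumed)
W1 = np.random.randn(n_h, n_x) * 0.01    # small scale keeps tanh/sigmoid away from flat regions
b1 = np.zeros((n_h, 1))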
4. Deep Neural Networks
4.1 Deep Neural Networks
4.3 Getting the Matrix Dimensions Right
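The rule from the lecture: W[l] has shape (n[l], n[l-1]) and b[l] has shape (n[l], 1). A quick sketch with illustrative layer sizes:

import numpy as np

# Check that every layer's parameters have the expected dimensions.
layer_dims = [5, 4, 3, 1]                # n[0] .. n[3], illustrative
params = {}
for l in range(1, len(layer_dims)):
    params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    assert params["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
    assert params["b" + str(l)].shape == (layer_dims[l], 1)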
4.7 Parameters vs. Hyperparameters
Course 2: Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
1. Practical Aspects of Deep Learning
1.1 Train / Dev / Test Sets
1.2 Bias / Variance
1.4 Regularization
What happens when lambda is very large: the L2 penalty drives the weights toward zero, so z stays in the near-linear region of tanh and the network behaves like a much simpler, roughly linear model; variance drops, but bias can rise.
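A minimal sketch of the L2-regularized cost (function and variable names here are illustrative, not the course's code):

import numpy as np

# J_reg = J + (lambda / (2m)) * sum over layers of ||W[l]||_F^2; the gradient
# gains a (lambda / m) * W[l] term, which shrinks the weights every step
# ("weight decay").
def l2_cost(cross_entropy_cost, weights, lambd, m):
    l2_term = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)
    return cross_entropy_cost + l2_term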
1.6 Dropout Regularization
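The inverted-dropout implementation from the lecture, sketched for one layer (a3 stands in for that layer's activations):

import numpy as np

keep_prob = 0.8
a3 = np.random.randn(4, 5)                   # stand-in activations, for illustration
d3 = np.random.rand(*a3.shape) < keep_prob   # keep each unit with probability keep_prob
a3 = a3 * d3                                 # drop the masked units
a3 = a3 / keep_prob                          # scale up so the expected value is unchanged

Because of that final scaling, no dropout is applied at test time.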
1.8 Other Regularization Methods
early stopping
1.9 Normalizing inputs
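A minimal sketch, assuming X holds one training example per column:

import numpy as np

X = np.random.randn(3, 100) * 5 + 2          # illustrative training inputs, shape (n_x, m)
mu = np.mean(X, axis=1, keepdims=True)
sigma2 = np.var(X, axis=1, keepdims=True)
X_norm = (X - mu) / np.sqrt(sigma2 + 1e-8)   # zero mean, unit variance per feature
# Reuse the *training* mu and sigma2 when normalizing dev/test data.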
1.10 Vanishing/Exploding Gradients
1.11 Weight Initialization
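Variance-scaling initialization mitigates vanishing/exploding activations in deep networks; a sketch with illustrative layer sizes:

import numpy as np

n_prev, n_curr = 256, 128                                            # illustrative sizes
W_he = np.random.randn(n_curr, n_prev) * np.sqrt(2.0 / n_prev)       # He init, for ReLU
W_xavier = np.random.randn(n_curr, n_prev) * np.sqrt(1.0 / n_prev)   # Xavier-style, for tanh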
1.13 Gradient Check
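The core of gradient checking is the two-sided difference; a one-dimensional sketch (J and dJ are illustrative stand-ins for your cost and its analytic gradient):

# dtheta_approx = (J(theta + eps) - J(theta - eps)) / (2 * eps)
def grad_check_scalar(J, dJ, theta, eps=1e-7):
    approx = (J(theta + eps) - J(theta - eps)) / (2 * eps)
    rel_err = abs(approx - dJ(theta)) / max(abs(approx) + abs(dJ(theta)), 1e-12)
    return rel_err                     # ~1e-7 is great; ~1e-3 suggests a bug

print(grad_check_scalar(lambda t: t ** 2, lambda t: 2 * t, 3.0))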
1.14 Gradient Check Implementation Notes
2. Optimization Algorithms
2.1 Mini-batch gradient descent
The batch size should be chosen to fit the CPU/GPU memory (powers of two such as 64, 128, 256 are common).
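A minimal sketch of splitting the training set into shuffled mini-batches (X is (n_x, m), Y is (1, m); the function name is illustrative):

import numpy as np

def random_mini_batches(X, Y, batch_size=64, seed=0):
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    perm = rng.permutation(m)                      # shuffle examples
    X, Y = X[:, perm], Y[:, perm]
    return [(X[:, k:k + batch_size], Y[:, k:k + batch_size])
            for k in range(0, m, batch_size)]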
2.3 Exponentially weighted averages
A moving average smooths out short-term fluctuations and brings out longer-term trends or cycles; mathematically, a moving average can be viewed as a kind of convolution.
Bias correction: dividing v_t by (1 - beta^t) fixes the early estimates, which are biased toward the v_0 = 0 starting value.
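A sketch of the exponentially weighted average with bias correction (theta is an illustrative noisy series):

import numpy as np

beta = 0.9                               # averages over roughly 1 / (1 - beta) = 10 points
theta = np.random.randn(100) + 10.0      # illustrative noisy measurements
v, v_corrected = 0.0, []
for t, x in enumerate(theta, start=1):
    v = beta * v + (1 - beta) * x        # v_t = beta * v_{t-1} + (1 - beta) * theta_t
    v_corrected.append(v / (1 - beta ** t))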
2.6 Gradient Descent with Momentum
2.7 RMSprop
2.8 Adam Optimization Algorithm
Adam is essentially Momentum + RMSprop, with bias correction on both moving averages.
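One Adam step for a single parameter, sketched with the usual default hyperparameters (names are illustrative):

import numpy as np

def adam_step(W, dW, v, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    v = beta1 * v + (1 - beta1) * dW             # Momentum: 1st moment of the gradient
    s = beta2 * s + (1 - beta2) * dW ** 2        # RMSprop: 2nd moment of the gradient
    v_hat = v / (1 - beta1 ** t)                 # bias correction
    s_hat = s / (1 - beta2 ** t)
    return W - lr * v_hat / (np.sqrt(s_hat) + eps), v, s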
2.9 Learning rate decay
Ways to gradually decrease the learning rate:
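The schedules mentioned in the lecture, sketched (alpha0 and decay_rate are hyperparameters):

alpha0, decay_rate = 0.2, 1.0
for epoch in range(1, 6):
    alpha = alpha0 / (1 + decay_rate * epoch)    # 1/t decay
    # alternatives: alpha0 * 0.95 ** epoch (exponential), alpha0 * k / epoch ** 0.5
    print(epoch, alpha)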
2.10 The Problem of Local Optima
In high-dimensional spaces you run into saddle points easily, but genuine local optima are actually rare.
Plateaus are a real problem: learning becomes very slow on them, but methods like Adam can alleviate this.
3. Hyperparameter Tuning, Batch Normalization, and Programming Frameworks
3.1 Searching for Hyperparameters
Try random values: don't use a grid
Coarse to fine
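A sketch of random search on a log scale, which gives small learning rates as much coverage as large ones:

import numpy as np

rng = np.random.default_rng(0)
r = rng.uniform(-4, 0, size=10)   # exponents sampled uniformly in [-4, 0]
alphas = 10.0 ** r                # learning rates spread evenly across [1e-4, 1]
# Coarse to fine: after a first pass, re-sample more densely inside the
# best-performing region.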
3.4 Batch Normalization
A question: we normalize the inputs for (logistic) regression; can we do something similar for each layer of a neural network?
Via the learned parameters gamma and beta we can control the mean and variance of each layer's normalized pre-activations.
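A sketch of the batch-norm forward pass for one layer's pre-activations Z, shape (n, m) (the function name is illustrative):

import numpy as np

def batchnorm_forward(Z, gamma, beta, eps=1e-8):
    mu = np.mean(Z, axis=1, keepdims=True)
    var = np.var(Z, axis=1, keepdims=True)
    Z_norm = (Z - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * Z_norm + beta             # gamma/beta set the variance and mean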
3.6 Why Batch Normalization Works
By normalizing values to a similar range, it speeds up learning.
Batch normalization reduces the problem of each layer's input values shifting as the earlier layers' weights change.
It has a slight regularization effect (like dropout, it adds some noise to each hidden layer's activations).
3.7 Batch Normalization at test time
At test time, use the mean and variance estimated during training with an exponentially weighted average, rather than per-batch statistics.
3.8 Softmax regression
Multi-class rather than binary classification; a generalization of logistic regression.
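A numerically stable softmax sketch (Z is (n_classes, m)):

import numpy as np

def softmax(Z):
    Z_shift = Z - np.max(Z, axis=0, keepdims=True)   # subtract the max for stability
    expZ = np.exp(Z_shift)
    return expZ / np.sum(expZ, axis=0, keepdims=True)

print(softmax(np.array([[1.0], [2.0], [3.0]])).ravel())   # each column sums to 1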
3.10 Deep Learning Frameworks
Course 3: Structuring Machine Learning Projects
1. ML Strategy (1)
1.1 Why ML Strategy
1.2 Orthogonalization
Fit the training set well on the cost function
If it doesn't fit well, a bigger neural network or switching to a better optimization algorithm might help.
Fit the dev set well on the cost function
If it doesn't fit well, regularization or a bigger training set might help.
Fit the test set well on the cost function
If it doesn't fit well, a bigger dev set might help.
Perform well in the real world
If it doesn't perform well, the dev/test set is not set up correctly, or the cost function is not measuring the right thing.