Deriving Forward and Backward Propagation for a Neural Network

$x_{1}$, $x_{2}$: the inputs
$w_{ij}$: the weights
$b_{ij}$: the biases
$\sigma_{i}$: the activation function; the sigmoid function is used here
$out$: the output
$y$: the true (target) value
$\eta$: the learning rate

Forward propagation
$$h_{1}=w_{11}x_{1}+w_{13}x_{2}+b_{11},\qquad \alpha_{1}=\sigma(h_{1})=\frac{1}{1+e^{-h_{1}}}$$

$$h_{2}=w_{12}x_{1}+w_{14}x_{2}+b_{12},\qquad \alpha_{2}=\sigma(h_{2})=\frac{1}{1+e^{-h_{2}}}$$

$$z=w_{21}\alpha_{1}+w_{22}\alpha_{2}+b_{21},\qquad out=\sigma(z)=\frac{1}{1+e^{-z}}$$
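The forward pass above can be sketched in a few lines of Python. This is only a minimal illustration: the dictionary layout, the function names `sigmoid` and `forward`, and all numeric values are assumptions made for the example and do not come from the original post.

```python
import math


def sigmoid(v):
    """Logistic sigmoid 1 / (1 + e^(-v))."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x1, x2, p):
    """Forward pass of the 2-2-1 network; p maps parameter names to values."""
    h1 = p["w11"] * x1 + p["w13"] * x2 + p["b11"]
    a1 = sigmoid(h1)  # alpha_1
    h2 = p["w12"] * x1 + p["w14"] * x2 + p["b12"]
    a2 = sigmoid(h2)  # alpha_2
    z = p["w21"] * a1 + p["w22"] * a2 + p["b21"]
    out = sigmoid(z)
    return {"h1": h1, "a1": a1, "h2": h2, "a2": a2, "z": z, "out": out}


# Hypothetical parameter values, chosen only for illustration.
params = {"w11": 0.1, "w12": 0.2, "w13": 0.3, "w14": 0.4,
          "b11": 0.0, "b12": 0.0, "w21": 0.5, "w22": 0.6, "b21": 0.0}
cache = forward(1.0, 0.5, params)
print(cache["out"])
```

Returning the intermediate values `h1`, `a1`, `h2`, `a2`, and `z` is a deliberate choice in this sketch, since backpropagation reuses them.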

Loss function

$$E=\frac{1}{2}(out-y)^{2}$$
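As a small sketch, the loss and its derivative with respect to $out$ (which is simply $out-y$, thanks to the factor $\frac{1}{2}$) might be written as follows. The function names and the sample values are assumptions for illustration.

```python
def squared_error(out, y):
    """Squared-error loss E = 1/2 * (out - y)^2 for a single sample."""
    return 0.5 * (out - y) ** 2


def squared_error_grad(out, y):
    """Derivative dE/dout = out - y."""
    return out - y


# Hypothetical prediction and target, for illustration only.
out, y = 0.8, 1.0
print(squared_error(out, y))       # roughly 0.02
print(squared_error_grad(out, y))  # roughly -0.2
```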

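Backpropagation
Gradients for the output layer, using the sigmoid derivative $\sigma'(z)=\sigma(z)(1-\sigma(z))$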
$$\bigtriangleup w_{21}=\frac{\partial E}{\partial w_{21}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial w_{21}}=(out-y)\sigma(z)(1-\sigma(z))\alpha_{1}$$

$$\bigtriangleup w_{22}=\frac{\partial E}{\partial w_{22}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial w_{22}}=(out-y)\sigma(z)(1-\sigma(z))\alpha_{2}$$

$$\bigtriangleup b_{21}=\frac{\partial E}{\partial b_{21}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial b_{21}}=(out-y)\sigma(z)(1-\sigma(z))$$
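One way to gain confidence in the output-layer formulas is to compare them with a central finite-difference estimate of $\frac{\partial E}{\partial w_{21}}$. The sketch below does this; the helper names, the step size `eps`, and the numeric values are all assumptions chosen for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def loss(a1, a2, w21, w22, b21, y):
    """E as a function of the output-layer parameters, hidden activations fixed."""
    out = sigmoid(w21 * a1 + w22 * a2 + b21)
    return 0.5 * (out - y) ** 2


def output_layer_grads(a1, a2, w21, w22, b21, y):
    """Analytic gradients (dE/dw21, dE/dw22, dE/db21)."""
    out = sigmoid(w21 * a1 + w22 * a2 + b21)
    dz = (out - y) * out * (1.0 - out)  # (out - y) * sigma(z) * (1 - sigma(z))
    return dz * a1, dz * a2, dz


# Hypothetical hidden activations, parameters, and target.
a1, a2, w21, w22, b21, y = 0.6, 0.4, 0.5, -0.3, 0.1, 1.0
g_w21, g_w22, g_b21 = output_layer_grads(a1, a2, w21, w22, b21, y)

# Central finite difference for dE/dw21.
eps = 1e-6
numeric_w21 = (loss(a1, a2, w21 + eps, w22, b21, y)
               - loss(a1, a2, w21 - eps, w22, b21, y)) / (2 * eps)
print(g_w21, numeric_w21)
```

The two printed numbers should agree to many decimal places; a large gap would point to a mistake in the derivation or the code.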

Update $w_{21}$, $w_{22}$, $b_{21}$

$$w_{21}=w_{21}-\eta \bigtriangleup w_{21}$$

$$w_{22}=w_{22}-\eta \bigtriangleup w_{22}$$

$$b_{21}=b_{21}-\eta \bigtriangleup b_{21}$$

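Gradients for the hidden layer

The hidden-layer parameters affect $E$ only through $\alpha_{1}$, $\alpha_{2}$ and then $z$, so the chain rule must carry the output-layer factors back as well, with $\frac{\partial z}{\partial \alpha_{1}}=w_{21}$ and $\frac{\partial z}{\partial \alpha_{2}}=w_{22}$. These gradients should be evaluated with the values of $w_{21}$ and $w_{22}$ from before the output-layer update.

$$\bigtriangleup w_{12}=\frac{\partial E}{\partial w_{12}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial w_{12}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))\,x_{1}$$

$$\bigtriangleup w_{14}=\frac{\partial E}{\partial w_{14}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial w_{14}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))\,x_{2}$$

$$\bigtriangleup b_{12}=\frac{\partial E}{\partial b_{12}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{2}}\frac{\partial \alpha_{2}}{\partial h_{2}}\frac{\partial h_{2}}{\partial b_{12}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{22}\,\sigma(h_{2})(1-\sigma(h_{2}))$$

$$\bigtriangleup w_{11}=\frac{\partial E}{\partial w_{11}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial w_{11}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))\,x_{1}$$

$$\bigtriangleup w_{13}=\frac{\partial E}{\partial w_{13}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial w_{13}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))\,x_{2}$$

$$\bigtriangleup b_{11}=\frac{\partial E}{\partial b_{11}}=\frac{\partial E}{\partial out}\frac{\partial out}{\partial z}\frac{\partial z}{\partial \alpha_{1}}\frac{\partial \alpha_{1}}{\partial h_{1}}\frac{\partial h_{1}}{\partial b_{11}}=(out-y)\sigma(z)(1-\sigma(z))\,w_{21}\,\sigma(h_{1})(1-\sigma(h_{1}))$$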
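The sketch below computes the hidden-layer gradients of $E$, including the factor $(out-y)\sigma(z)(1-\sigma(z))$ and the output weight ($w_{21}$ or $w_{22}$) carried back from the output layer, and compares each one against a central finite-difference estimate. The function names, the dictionary layout, and every numeric value are assumptions made for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def loss(p, x1, x2, y):
    """Full forward pass followed by the squared-error loss."""
    a1 = sigmoid(p["w11"] * x1 + p["w13"] * x2 + p["b11"])
    a2 = sigmoid(p["w12"] * x1 + p["w14"] * x2 + p["b12"])
    out = sigmoid(p["w21"] * a1 + p["w22"] * a2 + p["b21"])
    return 0.5 * (out - y) ** 2


def hidden_grads(p, x1, x2, y):
    """Analytic dE/d(parameter) for the six hidden-layer parameters."""
    a1 = sigmoid(p["w11"] * x1 + p["w13"] * x2 + p["b11"])
    a2 = sigmoid(p["w12"] * x1 + p["w14"] * x2 + p["b12"])
    out = sigmoid(p["w21"] * a1 + p["w22"] * a2 + p["b21"])
    dz = (out - y) * out * (1.0 - out)     # dE/dz
    dh1 = dz * p["w21"] * a1 * (1.0 - a1)  # dE/dh1
    dh2 = dz * p["w22"] * a2 * (1.0 - a2)  # dE/dh2
    return {"w11": dh1 * x1, "w13": dh1 * x2, "b11": dh1,
            "w12": dh2 * x1, "w14": dh2 * x2, "b12": dh2}


def numeric_grad(p, name, x1, x2, y, eps=1e-6):
    """Central finite-difference estimate of dE/d(p[name])."""
    up, down = dict(p), dict(p)
    up[name] += eps
    down[name] -= eps
    return (loss(up, x1, x2, y) - loss(down, x1, x2, y)) / (2 * eps)


# Hypothetical parameters, input, and target.
params = {"w11": 0.1, "w12": 0.2, "w13": 0.3, "w14": 0.4,
          "b11": 0.05, "b12": -0.05, "w21": 0.5, "w22": 0.6, "b21": 0.1}
analytic = hidden_grads(params, 1.0, 0.5, 1.0)
max_gap = max(abs(analytic[k] - numeric_grad(params, k, 1.0, 0.5, 1.0))
              for k in analytic)
print(max_gap)
```

If the output-layer factors were dropped, the analytic values would no longer match the numeric estimates, which makes this check a quick way to catch that kind of slip.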

Update $w_{12}$, $w_{14}$, $b_{12}$

$$w_{12}=w_{12}-\eta \bigtriangleup w_{12}$$

$$w_{14}=w_{14}-\eta \bigtriangleup w_{14}$$

$$b_{12}=b_{12}-\eta \bigtriangleup b_{12}$$

Update $w_{11}$, $w_{13}$, $b_{11}$

$$w_{11}=w_{11}-\eta \bigtriangleup w_{11}$$

$$w_{13}=w_{13}-\eta \bigtriangleup w_{13}$$

$$b_{11}=b_{11}-\eta \bigtriangleup b_{11}$$
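Putting the pieces together, a full training step (forward pass, gradients, parameter update) could look like the sketch below. It is only an illustration under stated assumptions: the function name `train_step`, the single training sample, the learning rate of 0.5, and the initial parameter values are all hypothetical. All nine gradients are computed from the same forward pass before any parameter is changed, so the hidden-layer gradients see the pre-update $w_{21}$ and $w_{22}$.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def train_step(p, x1, x2, y, eta):
    """One gradient-descent step; returns (new_params, loss_before_update)."""
    # Forward pass.
    a1 = sigmoid(p["w11"] * x1 + p["w13"] * x2 + p["b11"])
    a2 = sigmoid(p["w12"] * x1 + p["w14"] * x2 + p["b12"])
    out = sigmoid(p["w21"] * a1 + p["w22"] * a2 + p["b21"])
    loss = 0.5 * (out - y) ** 2

    # Backward pass: every gradient uses the current (pre-update) parameters.
    dz = (out - y) * out * (1.0 - out)
    dh1 = dz * p["w21"] * a1 * (1.0 - a1)
    dh2 = dz * p["w22"] * a2 * (1.0 - a2)
    grads = {"w21": dz * a1, "w22": dz * a2, "b21": dz,
             "w11": dh1 * x1, "w13": dh1 * x2, "b11": dh1,
             "w12": dh2 * x1, "w14": dh2 * x2, "b12": dh2}

    # Update rule: parameter = parameter - eta * gradient.
    new_p = {name: value - eta * grads[name] for name, value in p.items()}
    return new_p, loss


# Hypothetical initial parameters and a single training sample.
params = {"w11": 0.1, "w12": 0.2, "w13": 0.3, "w14": 0.4,
          "b11": 0.0, "b12": 0.0, "w21": 0.5, "w22": 0.6, "b21": 0.0}
losses = []
for _ in range(200):
    params, step_loss = train_step(params, 1.0, 0.5, 1.0, eta=0.5)
    losses.append(step_loss)
print(losses[0], losses[-1])
```

On this single sample the recorded loss should shrink over the 200 steps as $out$ moves toward the target $y$.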