Deep Learning with Python, Chapter 2: The Mathematical Foundations of Neural Networks

Date: 2019-12-01


from tensorflow.keras.datasets import mnist  # Keras is the high-level API of TF 2.0; import it like this

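train_images and train_labels form the training set, the data the model will learn from. The model is then tested on the test set, test_images and test_labels. The images are encoded as NumPy arrays, and the labels are an array of digits ranging from 0 to 9. Images and labels have a one-to-one correspondence.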

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Let's look at the training data:

print(train_images.shape)
print(len(train_labels))
print(train_labels)

(60000, 28, 28)
60000
[5 0 4 ... 5 6 8]

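First, we feed the training data (train_images and train_labels) to the neural network; next, the network learns to associate images with labels; finally, the network produces predictions for test_images, and we verify whether these predictions match the labels in test_labels.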

from tensorflow.keras import models
from tensorflow.keras import layers  # a layer is a data-processing module that acts as a data filter, letting more useful data through

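A layer extracts representations from the data fed into it, representations that we hope are more useful for the problem at hand. Most of deep learning consists of chaining together simple layers that implement a form of progressive data distillation. A deep learning model is like a sieve for data processing, made of a succession of increasingly refined data filters (the layers).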
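
As a rough illustration of what a single layer computes, the sketch below applies the Dense-with-relu operation, relu(dot(x, W) + b), in plain Python. The input, weights, and bias are made-up values chosen only to keep the arithmetic small; a real network learns W and b during training.

```python
def relu(values):
    """Replace every negative value with 0."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, bias):
    """Compute relu(dot(inputs, weights) + bias).

    weights[i][j] connects input i to output unit j.
    """
    outputs = []
    for j in range(len(bias)):
        z = sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + bias[j]
        outputs.append(z)
    return relu(outputs)

# Hypothetical 3-value input and a layer with 2 output units.
x = [1.0, -2.0, 0.5]
W = [[0.2, 0.5], [0.1, -0.3], [-0.4, 0.8]]
b = [0.0, 0.1]
representation = dense(x, W, b)
print(representation)
```

The first unit's weighted sum is negative, so relu filters it out; the second unit passes through. This is the sense in which a layer acts as a filter on its input.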

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,))) 
network.add(layers.Dense(10, activation='softmax'))

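The network in this example consists of two Dense layers, which are densely connected (also called fully connected) neural layers. The second (and last) layer is a 10-way softmax layer, which returns an array of 10 probability scores that sum to 1. Each score is the probability that the current digit image belongs to one of the 10 digit classes.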
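
To make the softmax output concrete, here is a minimal plain-Python sketch of the softmax function. The raw scores are invented for illustration and do not come from the trained network.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the largest score keeps exp() from overflowing
    # and does not change the result.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of the last layer for one image (10 classes).
scores = [1.0, 2.0, 0.5, 0.1, 3.0, 0.0, -1.0, 0.2, 4.0, 0.3]
probs = softmax(scores)

# The predicted digit is the class with the highest probability.
predicted_digit = probs.index(max(probs))
print(predicted_digit, sum(probs))
```

The index of the largest probability is the digit the network would predict for that image.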

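To get the network ready for training, the compilation step needs three more things:
A loss function: how the network measures its performance on the training data, and thus how it steers itself in the right direction.
An optimizer: the mechanism through which the network updates itself based on the training data and the loss function.
Metrics to monitor during training and testing: this example only tracks accuracy, the fraction of images that are correctly classified.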

network.compile(optimizer='rmsprop',loss='categorical_crossentropy', metrics=['accuracy'])
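
The loss and the metric named above can be sketched in a few lines of plain Python. This is only an illustration of the formulas, not Keras's implementation: the `eps` clipping value, the 3-class toy labels, and the predictions are all assumptions made for the example. The optimizer is not shown here.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # eps guards against log(0); its value is an assumption of this sketch.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def accuracy(y_true_batch, y_pred_batch):
    """Fraction of samples whose most probable class matches the label."""
    hits = 0
    for y_true, y_pred in zip(y_true_batch, y_pred_batch):
        if y_pred.index(max(y_pred)) == y_true.index(max(y_true)):
            hits += 1
    return hits / len(y_true_batch)

# Toy batch with 3 classes instead of 10 (made-up numbers).
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.3, 0.2]]

losses = [categorical_crossentropy(t, p) for t, p in zip(labels, preds)]
mean_loss = sum(losses) / len(losses)
acc = accuracy(labels, preds)
print(mean_loss, acc)
```

The third sample is misclassified, so it contributes the largest loss and pulls accuracy down to two out of three.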

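# Flatten each 28 x 28 image into a vector of 784 values,
# then scale the pixel values from [0, 255] to [0, 1].
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255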

# Categorically (one-hot) encode the labels
from tensorflow.keras.utils import to_categorical
train_labels = to_categorical(train_labels) 
test_labels = to_categorical(test_labels)
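
For readers unfamiliar with categorical encoding, the following plain-Python sketch shows the idea behind it: each integer label becomes a row of zeros with a single 1 at the label's index. The helper name `one_hot` is hypothetical, and it returns nested lists, whereas `to_categorical` returns a NumPy array.

```python
def one_hot(labels, num_classes):
    """Encode integer labels as one-hot rows, e.g. 2 -> [0, 0, 1, 0, ...]."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded

# The first three training labels printed earlier are 5, 0 and 4.
encoded = one_hot([5, 0, 4], num_classes=10)
for row in encoded:
    print(row)
```

This layout matches what the 10-way softmax layer outputs, which is why the categorical_crossentropy loss can compare the two directly.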

# Train the network
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 74us/sample - loss: 0.2543 - accuracy: 0.9248
Epoch 2/5
60000/60000 [==============================] - 4s 62us/sample - loss: 0.1040 - accuracy: 0.9692
Epoch 3/5
60000/60000 [==============================] - 4s 62us/sample - loss: 0.0686 - accuracy: 0.9791
Epoch 4/5
60000/60000 [==============================] - 4s 64us/sample - loss: 0.0497 - accuracy: 0.9856
Epoch 5/5
60000/60000 [==============================] - 4s 61us/sample - loss: 0.0368 - accuracy: 0.9890

<tensorflow.python.keras.callbacks.History at 0x248802cfd88>

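Two quantities are displayed during training: the loss of the network on the training data (loss) and the accuracy of the network on the training data (accuracy).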

Now let's check how the model performs on the test set.

test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc:', test_acc)

The gap between training accuracy and test accuracy is caused by overfitting: machine learning models tend to perform worse on new data than on their training data.
