Code Walkthrough: Mastering Neural Network Hyperparameter Tuning in One Article

Date: 2019-12-01


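Full text: 7,002 characters; estimated reading time 14 minutes or more.

Neural networks are widely used in industry and research, but unfortunately a large share of these applications fail to produce networks that can outperform other algorithms.

When applied mathematicians develop a new optimization algorithm, they like to run it on test functions, sometimes called artificial landscapes. These landscapes help compare algorithms in terms of:

· Convergence (how fast they reach the answer)

· Precision (how close they get to the exact answer)

· Robustness (whether they perform well on all functions or only on a small subset)

· General performance (such as computational complexity)

Browse the Wikipedia entry on test functions for optimization and you will see that some of them are quite nasty. Many are widely used precisely because they expose problems in optimization algorithms. This article, however, looks at a seemingly trivial one: the Beale function.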

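The Beale function

The Beale function looks like the figure below:

The Beale function is used as a test function because it evaluates how well optimization algorithms perform in flat regions with very shallow gradients. In such regions, gradient-based optimizers cannot learn efficiently and therefore struggle to reach the minimum.

The rest of this article follows the Jupyter notebook tutorial in the GitHub repository to work out practical ways of tackling an artificial landscape. Such a landscape is analogous to the loss surface of a neural network. The goal of training a neural network is to find the minimum of the loss surface through some form of optimization, typically stochastic gradient descent.

After learning to work with a difficult optimization function, readers should be better prepared for the real-world problems that arise when implementing neural networks.

Before testing a neural network, we first need to define the function and find its minimum (otherwise we cannot tell whether an answer is correct). The first step (after importing the relevant packages) is to define the Beale function in the notebook: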

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

# define Beale's function which we want to minimize
def objective(X):
    x = X[0]; y = X[1]
    return (1.5 - x + x*y)**2 + (2.25 - x + x*y**2)**2 + (2.625 - x + x*y**3)**2

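In this example we know (by construction) roughly where the minimum lies and what step the mesh grid should use, so the second step is to set the function boundaries.

# function boundaries
xmin, xmax, xstep = -4.5, 4.5, .9
ymin, ymax, ystep = -4.5, 4.5, .9

From this information, build a mesh grid of points over which the minimum can be located.

# Let's create some points
x1, y1 = np.meshgrid(np.arange(xmin, xmax + xstep, xstep), np.arange(ymin, ymax + ystep, ystep))

Now make a (very) rough initial guess.

# initial guess
x0 = [4., 4.]
f0 = objective(x0)
print(f0)

Then use scipy.optimize to find the answer.

bnds = ((xmin, xmax), (ymin, ymax))
minimum = minimize(objective, x0, bounds=bnds)
print(minimum)

The result is as follows:

The answer appears to be (3, 0.5). Plugging these values into the equation confirms that this is indeed the minimum (Wikipedia says the same).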
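The claim that (3, 0.5) is the minimum can be checked by hand. The sketch below re-implements the function in plain Python (the name beale and the neighbour offsets are illustrative choices, not part of the notebook) and compares the value at (3, 0.5) with values at nearby points.

```python
# Beale's function, with the same three squared terms as objective() above.
def beale(x, y):
    return ((1.5 - x + x * y) ** 2
            + (2.25 - x + x * y ** 2) ** 2
            + (2.625 - x + x * y ** 3) ** 2)

# Value at the reported minimum: each of the three terms vanishes there.
value_at_min = beale(3.0, 0.5)

# Values at a few nearby points (the 0.1 offsets are arbitrary).
neighbours = [beale(3.0 + dx, 0.5 + dy)
              for dx in (-0.1, 0.0, 0.1)
              for dy in (-0.1, 0.0, 0.1)
              if (dx, dy) != (0.0, 0.0)]

print(value_at_min)
print(min(neighbours) > value_at_min)
```

Because the function is a sum of squares, a value of zero cannot be beaten, which is why this point is a global minimum.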

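Next comes the neural network part.

Optimization in neural networks

A neural network can be defined as a system that combines inputs and tries to guess the output. If we are lucky enough to have results known as the "ground truth", we can compare them with the network's outputs and compute an error. So the network makes a guess, computes an error function, guesses again to reduce the error, and repeats until the error can no longer be reduced. That is optimization.

The optimization algorithms most commonly used in neural networks are of the gradient descent (GD) type. The objective function used in gradient descent is the loss function we want to minimize.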
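To make the guess, measure, adjust loop concrete, here is a minimal sketch of plain gradient descent applied to the Beale function. The hand-derived gradient, the starting point (1, 1), the learning rate 0.01 and the step count are all illustrative assumptions rather than values from the notebook.

```python
def beale(x, y):
    return ((1.5 - x + x * y) ** 2
            + (2.25 - x + x * y ** 2) ** 2
            + (2.625 - x + x * y ** 3) ** 2)

def beale_grad(x, y):
    # Partial derivatives obtained by applying the chain rule to each squared term.
    t1 = 1.5 - x + x * y
    t2 = 2.25 - x + x * y ** 2
    t3 = 2.625 - x + x * y ** 3
    dx = 2 * t1 * (y - 1) + 2 * t2 * (y ** 2 - 1) + 2 * t3 * (y ** 3 - 1)
    dy = 2 * t1 * x + 2 * t2 * (2 * x * y) + 2 * t3 * (3 * x * y ** 2)
    return dx, dy

def gradient_descent(x, y, lr=0.01, steps=2000):
    # Repeatedly step against the gradient and record the loss after each step.
    history = [beale(x, y)]
    for _ in range(steps):
        dx, dy = beale_grad(x, y)
        x, y = x - lr * dx, y - lr * dy
        history.append(beale(x, y))
    return x, y, history

x_final, y_final, history = gradient_descent(1.0, 1.0)
print(history[0], history[-1])
print(x_final, y_final)
```

Progress tends to slow down sharply once the iterate enters the flat valley, which is the behaviour the Beale function is meant to expose.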

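This tutorial focuses on Keras, so here is a brief refresher.

Keras refresher

Keras is a Python library for deep learning that can run on top of both Theano and TensorFlow, two powerful Python libraries for fast numerical computing, developed at the Université de Montréal and at Google respectively.

Keras was designed to make building deep learning models as fast and easy as possible, for both research and practical applications. It runs on Python 2.7 or 3.5 and executes seamlessly on GPUs and CPUs.

Keras is built around the idea of a model. At its core is a linear stack of layers called the Sequential model. Keras also provides a functional API for defining complex models, such as multi-output models, directed acyclic graphs, and models with shared layers.

Building a deep learning model in Keras with the Sequential model can be summarized as follows:

1. Define the model: create a Sequential model and add layers.

2. Compile the model: specify the loss function and the optimizer, then call .compile().

3. Fit the model: train the model on data by calling .fit().

4. Make predictions: use the model on new data by calling .evaluate() or .predict().

You may wonder how to check the model's performance while it is running. Good question; the answer is callbacks.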

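Callbacks: monitoring the model during training

With callbacks you can monitor a model at any stage of training. A callback is a set of functions applied at given stages of the training procedure. Callbacks give you a view of the model's internal state and statistics during training. You can pass a list of callbacks (as the keyword argument callbacks) to the .fit() method of the Sequential or Model classes. The relevant methods of the callbacks are then called at each stage of training.

· A Keras callback you are already familiar with is keras.callbacks.History(). It is included in .fit() automatically.

· keras.callbacks.ModelCheckpoint is also useful; it saves the model weights at specific points during training. This helps when the model runs for a long time and the system fails, because not everything is lost. A good practice, for example, is to save the weights only when an improvement in the monitored metric (such as accuracy) is observed.

· keras.callbacks.EarlyStopping stops training when a monitored quantity has stopped improving.

· keras.callbacks.LearningRateScheduler changes the learning rate during training.

Some of these callbacks are applied later. See https://keras.io/callbacks/ for the full documentation.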
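The idea behind early stopping can be sketched without Keras. The function below is a hypothetical stand-in (its name, the patience and min_delta arguments, and the sample losses are assumptions for illustration); it is not the Keras implementation, only the stopping rule in its simplest form.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement: remember the new best value and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: stop once this has happened `patience` times in a row.
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses for illustration only.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.59]
print(early_stopping_epoch(losses, patience=2))
print(early_stopping_epoch(losses, patience=3))
```

A larger patience tolerates noisy plateaus at the cost of extra epochs; with the sample losses above, a patience of 3 lets training reach the later improvement.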

First, import a number of functions to make things easier.

import tensorflow as tf
import keras
from keras import layers
from keras import models
from keras import utils
from keras.layers import Dense
from keras.models import Sequential
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import Activation
from keras.regularizers import l2
from keras.optimizers import SGD
from keras.optimizers import RMSprop
from keras import datasets
from keras.callbacks import LearningRateScheduler
from keras.callbacks import History
from keras import losses
from sklearn.utils import shuffle

print(tf.__version__)
print(tf.keras.__version__)

If you want the network to use random numbers while keeping results repeatable, one more step is to set a random seed. A seeded generator produces the same sequence of numbers every time, even though they are pseudo-random (useful for comparing models and testing reproducibility).

# fix random seed for reproducibility
np.random.seed(5)

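Step 1: Decide on the network topology (not really optimization, but still very important)

This step uses the MNIST dataset, which contains grayscale images of handwritten digits (0 to 9), each 28×28 pixels. Each pixel is 8 bits, so its value ranges from 0 to 255.

Keras has this dataset built in, so it is easy to obtain.

mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train.shape, y_train.shape

The shapes of the X and Y data are (60000, 28, 28) and (60000,) respectively. It is worth printing some of the data to check the values (and the data type, if needed).

You can check the training data by looking at an image of each digit, to make sure nothing is missing from the data.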

plt.figure(figsize=(10,10))
for i in range(10):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.xlabel(y_train[i])

The last check is on the dimensions of the training and test sets, which is straightforward:

print(f'We have {x_train.shape[0]} train samples')
print(f'We have {x_test.shape[0]} test samples')

There are 60,000 training images and 10,000 test images. The next task is to preprocess the data.

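Preprocessing the data

Before running the neural network, the data must be preprocessed (the steps below can be done in any order):

· First, convert the 2D image arrays to 1D (flatten them). Either reshape the arrays with numpy.reshape() or use the Keras approach: a keras.layers.Flatten layer, which turns a 2D array (28×28 pixels) into a 1D array (28 * 28 = 784 pixels).

· Then normalize the pixel values (rescale them to between 0 and 1) with the transformation:

𝑥 := (𝑥 − 𝑥min) / (𝑥max − 𝑥min)

In this case the minimum is 0 and the maximum is 255, so the formula reduces to 𝑥 := 𝑥/255.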

# normalize the data
x_train, x_test = x_train / 255.0, x_test / 255.0
# reshape the data into 1D vectors
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
num_classes = 10
# Check the column length
x_train.shape[1]

The labels now need to be one-hot encoded.

# Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
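For readers unfamiliar with one-hot encoding, the following plain-Python sketch shows what the conversion does to a list of integer labels. The helper name one_hot is hypothetical, and the example uses 4 classes instead of 10 to keep the output short.

```python
def one_hot(labels, num_classes):
    # Each label becomes a row of zeros with a single 1.0 at the label's index.
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded

print(one_hot([3, 0], num_classes=4))
```

This representation matches the softmax output layer used below, which produces one probability per class.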

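Step 2: Tune the learning rate

One of the most common optimization algorithms is stochastic gradient descent (SGD). Its tunable hyperparameters are the learning rate, momentum, decay, and the nesterov flag.

The learning rate controls how much the weights are updated at the end of each batch, and momentum controls how much the previous update influences the current one. Decay indicates how much the learning rate is reduced at each update. nesterov takes the value True or False, depending on whether Nesterov momentum should be applied.

Typical values for these hyperparameters are lr = 0.01, decay = 1e-6, momentum = 0.9, and nesterov = True.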
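The roles of the learning rate, momentum and the nesterov flag can be seen in a small sketch of a single update step. This follows a common formulation of SGD with momentum and is an assumption about the update rule, not code taken from Keras; the function name, the toy loss f(w) = w² and the starting value are illustrative.

```python
def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9, nesterov=False):
    # The velocity blends the previous update with the new gradient step.
    v_new = momentum * v - lr * grad
    if nesterov:
        # Nesterov variant: apply the momentum term once more, as a look-ahead.
        w_new = w + momentum * v_new - lr * grad
    else:
        w_new = w + v_new
    return w_new, v_new

# Toy problem: minimize f(w) = w**2, whose gradient is 2*w.
w, v = 5.0, 0.0
for _ in range(200):
    grad = 2.0 * w
    w, v = sgd_momentum_step(w, v, grad)
print(abs(w))
```

Setting momentum to 0 recovers plain gradient descent, which makes it easy to compare the two on the same toy problem.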

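The learning rate hyperparameter goes into the optimizer function, as shown below. The SGD optimizer in Keras has a default learning rate schedule that lowers the learning rate during stochastic gradient descent. The learning rate decreases according to:

𝑙𝑟 = 𝑙𝑟 × 1 / (1 + 𝑑𝑒𝑐𝑎𝑦 × 𝑒𝑝𝑜𝑐ℎ)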

http://cs231n.github.io/neural-networks-3
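The decay formula above can be tabulated directly. This sketch treats the counter as the epoch index, following the formula in the text; the legacy Keras optimizer is understood to count parameter updates (batches) instead, so the numbers are illustrative only. The function name time_based_decay is hypothetical, and the values 0.1 and 0.1 / 60 are the ones used in the next step.

```python
def time_based_decay(initial_lr, decay, step):
    # lr = lr0 * 1 / (1 + decay * step)
    return initial_lr * 1.0 / (1.0 + decay * step)

initial_lr = 0.1
decay = initial_lr / 60  # same ratio as in the article's SGD setup
schedule = [time_based_decay(initial_lr, decay, epoch) for epoch in range(60)]

print(schedule[0], schedule[-1])
```

With such a small decay value the rate falls by under 10% across 60 epochs, so the schedule is very gentle when counted per epoch.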

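Next, implement a learning rate adaptation schedule in Keras. Start with SGD and a learning rate of 0.1. The model is then trained for 60 epochs with the decay argument set to about 0.0017 (0.1 / 60). A momentum value of 0.8 is also included, since it works well together with an adaptive learning rate.

epochs = 60
learning_rate = 0.1
decay_rate = learning_rate / epochs
momentum = 0.8
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=False)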

Next, build the neural network:

# build the model
input_dim = x_train.shape[1]
lr_model = Sequential()
lr_model.add(Dense(64, activation=tf.nn.relu, kernel_initializer='uniform',
                   input_dim=input_dim))
lr_model.add(Dropout(0.1))
lr_model.add(Dense(64, kernel_initializer='uniform', activation=tf.nn.relu))
lr_model.add(Dense(num_classes, kernel_initializer='uniform', activation=tf.nn.softmax))
# compile the model
lr_model.compile(loss='categorical_crossentropy',
                 optimizer=sgd,
                 metrics=['acc'])

Now run the model and see how it performs. It took about 20 minutes on the author's machine; run time will vary from machine to machine.

%%time
# Fit the model
batch_size = int(input_dim/100)
lr_model_history = lr_model.fit(x_train, y_train,
                                batch_size=batch_size,
                                epochs=epochs,
                                verbose=1,
                                validation_data=(x_test, y_test))

Once training finishes, plot the accuracy and the loss as functions of the epoch, for both the training and test sets, to see how the network performed.

# Plot the loss function
# (note: the notebook plots the square root of each recorded value)
fig, ax = plt.subplots(1, 1, figsize=(10,6))
ax.plot(np.sqrt(lr_model_history.history['loss']), 'r', label='train')
ax.plot(np.sqrt(lr_model_history.history['val_loss']), 'b', label='val')
ax.set_xlabel(r'Epoch', fontsize=20)
ax.set_ylabel(r'Loss', fontsize=20)
ax.legend()
ax.tick_params(labelsize=20)

# Plot the accuracy
fig, ax = plt.subplots(1, 1, figsize=(10,6))
ax.plot(np.sqrt(lr_model_history.history['acc']), 'r', label='train')
ax.plot(np.sqrt(lr_model_history.history['val_acc']), 'b', label='val')
ax.set_xlabel(r'Epoch', fontsize=20)
ax.set_ylabel(r'Accuracy', fontsize=20)
ax.legend()
ax.tick_params(labelsize=20)

The loss plot looks like this:

The accuracy plot looks like this:

Now apply a custom learning rate.

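Changing the learning rate with a custom LearningRateScheduler

Write a function that performs exponential learning rate decay, as given by:

𝑙𝑟 = 𝑙𝑟0 × 𝑒^(−𝑘𝑡)

This is very similar to the previous run, so it is done in a single code block, with the differences described afterwards.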

# solution
epochs = 60
learning_rate = 0.1  # initial learning rate
decay_rate = 0.1
momentum = 0.8

# define the optimizer function
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, nesterov=False)

input_dim = x_train.shape[1]
num_classes = 10
batch_size = 196

# build the model
exponential_decay_model = Sequential()
exponential_decay_model.add(Dense(64, activation=tf.nn.relu, kernel_initializer='uniform', input_dim=input_dim))
exponential_decay_model.add(Dropout(0.1))
exponential_decay_model.add(Dense(64, kernel_initializer='uniform', activation=tf.nn.relu))
exponential_decay_model.add(Dense(num_classes, kernel_initializer='uniform', activation=tf.nn.softmax))

# compile the model
exponential_decay_model.compile(loss='categorical_crossentropy',
                                optimizer=sgd,
                                metrics=['acc'])

# define the learning rate change
def exp_decay(epoch):
    lrate = learning_rate * np.exp(-decay_rate*epoch)
    return lrate

# learning schedule callback
loss_history = History()
lr_rate = LearningRateScheduler(exp_decay)
callbacks_list = [loss_history, lr_rate]

# you invoke the LearningRateScheduler during the .fit() phase
exponential_decay_model_history = exponential_decay_model.fit(x_train, y_train,
                                    batch_size=batch_size,
                                    epochs=epochs,
                                    callbacks=callbacks_list,
                                    verbose=1,
                                    validation_data=(x_test, y_test))

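The only changes here are the definition of the exp_decay function and its use in LearningRateScheduler. Note that this time some callbacks were also added to the model.

Now plot the learning rate and the loss as functions of the number of epochs. The learning rate curve is very smooth, because it follows the predefined exponential decay function.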
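To see why the learning rate curve is smooth, the exponential schedule can be evaluated on its own. The sketch below restates the decay formula with Python's math module instead of NumPy; the default arguments mirror the values 0.1 and 0.1 used in the code above, and the chosen epochs are arbitrary sample points.

```python
import math

def exp_decay(epoch, initial_lr=0.1, k=0.1):
    # lr = lr0 * exp(-k * t), with t measured in epochs
    return initial_lr * math.exp(-k * epoch)

rates = [exp_decay(epoch) for epoch in range(60)]
for epoch in (0, 10, 59):
    print(epoch, round(rates[epoch], 6))
```

After 10 epochs the rate has dropped by a factor of e, and by the final epoch it is a small fraction of its starting value, so late epochs only make fine adjustments.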

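The loss curve is also smoother than before.

This suggests that a learning rate scheduler can help improve the performance of a neural network.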

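Step 3: Choose an optimizer and a loss function

When building a model and using it to make predictions, for example assigning labels to images ("cat", "plane", and so on), we measure success or failure by defining a "loss" function (or objective function). The goal of optimization is to efficiently compute the parameters/weights that minimize this loss function. Keras provides several types of loss functions.

Sometimes the "loss" function measures a "distance", which can be defined between two data points in various ways to suit the problem or dataset. The distance used depends on the data type and the specific problem at hand. For example, in natural language processing (analyzing text data), the Hamming distance is more common.

Distances

· Euclidean

· Manhattan

· Others, such as Hamming, which measures the distance between strings. The Hamming distance between "carolin" and "cathrin" is 3.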
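The three distances listed above are short enough to write out. The following sketch uses hypothetical helper names and a made-up pair of 2D points; the string pair is the one from the text.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(hamming("carolin", "cathrin"))
```

The two strings differ in their third, fourth and fifth characters, which gives the Hamming distance of 3 quoted above.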

Loss functions

· MSE (for regression)

· Categorical cross-entropy (for classification)

· Binary cross-entropy (for classification)
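As a rough illustration of what these losses compute for a single example, here is a plain-Python sketch. The function names mirror the Keras loss names, but the implementations are simplified assumptions (no batching, and a small eps guard against log(0)); the sample values are made up.

```python
import math

def mean_squared_error(y_true, y_pred):
    # Average of squared differences; used for regression targets.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    # y_true is a one-hot vector, y_pred a vector of class probabilities.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def binary_crossentropy(y_true, y_pred, eps=1e-12):
    # y_true is 0 or 1, y_pred the predicted probability of class 1.
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))
print(categorical_crossentropy([0, 1, 0], [0.1, 0.8, 0.1]))
print(binary_crossentropy(1, 0.8))
```

For a one-hot target, categorical cross-entropy reduces to the negative log of the probability assigned to the correct class, so confident wrong answers are penalized heavily.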

# build the model
input_dim = x_train.shape[1]
model = Sequential()
# fully-connected layer with 64 hidden units
model.add(Dense(64, activation=tf.nn.relu, kernel_initializer='uniform',
                input_dim=input_dim))
model.add(Dropout(0.1))
model.add(Dense(64, kernel_initializer='uniform', activation=tf.nn.relu))
model.add(Dense(num_classes, kernel_initializer='uniform', activation=tf.nn.softmax))
# defining the parameters for RMSprop (I used the keras defaults here)
rms = RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)
model.compile(loss='categorical_crossentropy',
              optimizer=rms,
              metrics=['acc'])

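Step 4: Decide on the batch size and the number of epochs

The batch size defines the number of samples propagated through the network at a time.

For example, suppose there are 1000 training samples and batch_size is set to 100. The algorithm takes the first 100 samples (1st to 100th) from the training set and trains the network. It then takes the next 100 samples (101st to 200th) and trains the network again. This continues until all samples have been propagated through the network.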
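The batching described above amounts to slicing the training set into consecutive chunks. A minimal sketch follows; the generator name iterate_batches is hypothetical and the samples are simply the integers 1 to 1000.

```python
def iterate_batches(samples, batch_size):
    # Yield consecutive slices; the last one may be shorter than batch_size.
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(1, 1001))  # stand-in for 1000 training samples
batches = list(iterate_batches(samples, batch_size=100))

print(len(batches))
print(batches[0][0], batches[0][-1])
print(batches[1][0], batches[1][-1])
```

When the sample count is not a multiple of the batch size, the final batch is smaller; training frameworks generally accept this, though the details vary by library.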

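Advantages of using a batch size smaller than the total number of samples:

· It requires less memory. Because the network is trained on fewer samples at a time, the overall training procedure needs less memory. This matters especially when the whole dataset does not fit into the machine's memory.

· Networks typically train faster with mini-batches, because the weights are updated after each propagation.

Disadvantages of using a batch size smaller than the total number of samples:

· The smaller the batch, the less accurate the estimate of the gradient.

The number of epochs is a hyperparameter that defines how many times the learning algorithm works through the entire training dataset.

One epoch means that each sample in the training set has had a chance to update the internal model parameters. An epoch consists of one or more batches.

There are no hard and fast rules for choosing the batch size or the number of epochs, and more epochs do not necessarily give better results than fewer.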

%%time
batch_size = input_dim
epochs = 60
model_history = model.fit(x_train, y_train,
                          batch_size=batch_size,
                          epochs=epochs,
                          verbose=1,
                          validation_data=(x_test, y_test))

score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# accuracy plot (the notebook plots the square root of each recorded value)
fig, ax = plt.subplots(1, 1, figsize=(10,6))
ax.plot(np.sqrt(model_history.history['acc']), 'r', label='train_acc')
ax.plot(np.sqrt(model_history.history['val_acc']), 'b', label='val_acc')
ax.set_xlabel(r'Epoch', fontsize=20)
ax.set_ylabel(r'Accuracy', fontsize=20)
ax.legend()
ax.tick_params(labelsize=20)

# loss plot
fig, ax = plt.subplots(1, 1, figsize=(10,6))
ax.plot(np.sqrt(model_history.history['loss']), 'r', label='train')
ax.plot(np.sqrt(model_history.history['val_loss']), 'b', label='val')
ax.set_xlabel(r'Epoch', fontsize=20)
ax.set_ylabel(r'Loss', fontsize=20)
ax.legend()
ax.tick_params(labelsize=20)

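Step 5: Random restarts

This method does not appear to be implemented in Keras, but it can be done easily by adapting keras.callbacks.LearningRateScheduler. It is left as an exercise for the reader; essentially, it resets the learning rate after a fixed number of epochs.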
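One possible reading of this exercise is a schedule that decays the learning rate and then jumps back to its initial value every fixed number of epochs. The sketch below is only that: a hypothetical schedule function (the name restart_schedule, the exponential decay inside each cycle, and the cycle length of 20 are all assumptions). A function of this shape, taking an epoch index and returning a rate, is what LearningRateScheduler expects.

```python
import math

def restart_schedule(epoch, initial_lr=0.1, k=0.1, restart_every=20):
    # Count epochs since the most recent restart, then decay from initial_lr.
    epochs_since_restart = epoch % restart_every
    return initial_lr * math.exp(-k * epochs_since_restart)

rates = [restart_schedule(epoch) for epoch in range(60)]
for epoch in (0, 19, 20, 39, 40):
    print(epoch, round(rates[epoch], 6))
```

Under these assumptions the rate returns to 0.1 at epochs 20 and 40, giving the optimizer a chance to leave a poor region before settling again.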

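Tuning hyperparameters with cross-validation

Instead of trying different values by hand, use Scikit-Learn's GridSearchCV to try several values for the hyperparameters and compare the results.

To do cross-validation with Keras, use the wrappers for the Scikit-Learn API. They allow Sequential Keras models (single-input only) to be used as part of a Scikit-Learn workflow.

There are two wrappers:

keras.wrappers.scikit_learn.KerasClassifier(build_fn=None, **sk_params), which implements the Scikit-Learn classifier interface.

keras.wrappers.scikit_learn.KerasRegressor(build_fn=None, **sk_params), which implements the Scikit-Learn regressor interface.

import numpy
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier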

Trying different weight initializations

The first hyperparameter to optimize via cross-validation is the weight initialization.

# let's create a function that creates the model (required for KerasClassifier)
# while accepting the hyperparameters we want to tune
# we also pass some default values such as optimizer='rmsprop'
def create_model(init_mode='uniform'):
    # define model
    model = Sequential()
    model.add(Dense(64, kernel_initializer=init_mode,
                    activation=tf.nn.relu, input_dim=784))
    model.add(Dropout(0.1))
    model.add(Dense(64, kernel_initializer=init_mode, activation=tf.nn.relu))
    model.add(Dense(10, kernel_initializer=init_mode, activation=tf.nn.softmax))
    # compile model
    model.compile(loss='categorical_crossentropy',
                  optimizer=RMSprop(),
                  metrics=['accuracy'])
    return model

%%time
seed = 7
numpy.random.seed(seed)
batch_size = 128
epochs = 10
model_CV = KerasClassifier(build_fn=create_model, epochs=epochs,
                           batch_size=batch_size, verbose=1)
# define the grid search parameters
init_mode = ['uniform', 'lecun_uniform', 'normal', 'zero',
             'glorot_normal', 'glorot_uniform', 'he_normal', 'he_uniform']
param_grid = dict(init_mode=init_mode)
grid = GridSearchCV(estimator=model_CV, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(x_train, y_train)

# print results
print(f'Best Accuracy for {grid_result.best_score_} using {grid_result.best_params_}')
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f' mean={mean:.4}, std={stdev:.4} using {param}')

The GridSearch results are as follows:

The best results come from the models using lecun_uniform or glorot_uniform initialization, which reach close to 97% accuracy.

Saving the neural network model to JSON

The Hierarchical Data Format (HDF5) is a storage format for large arrays of data, including the weight values of a neural network.

The HDF5 Python module can be installed with: pip install h5py

Keras lets you describe and save any model using the JSON format.

from keras.models import model_from_json
# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# save weights to HDF5
model.save_weights("model.h5")
print("Model saved")

# when you want to retrieve the model: load json and create model
json_file = open('model.json', 'r')
saved_model = json_file.read()
# close the file as good practice
json_file.close()
# named loaded_model so the imported model_from_json function is not overwritten
loaded_model = model_from_json(saved_model)
# load weights into new model
loaded_model.load_weights("model.h5")
print("Model loaded")

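Cross-validation with multiple hyperparameters

Usually we are less interested in how a single parameter changes the result than in how several parameters together affect it. Cross-validation can be run over several parameters at once, effectively trying out their combinations.

Note: cross-validation in neural networks is computationally expensive. Think before you experiment! Multiply together the number of values for each parameter being validated to see how many combinations there are. Each combination is then evaluated with k-fold cross-validation (k is a parameter we choose).

For example, one can choose to search over different values of:

· Batch size

· Number of epochs

· Initialization mode

The choices are specified in a dictionary and passed to GridSearchCV.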
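Before launching a grid search it is worth counting how many models it will train. The sketch below enumerates a small grid with itertools.product; the grid values and the fold count of 3 are illustrative assumptions chosen to resemble a search over initializer, batch size and epochs.

```python
from itertools import product

# Illustrative grid: two candidate values for each of three hyperparameters.
param_grid = {
    "init": ["glorot_uniform", "uniform"],
    "batch_size": [128, 512],
    "epochs": [10, 20],
}
cv_folds = 3

# Every combination of one value per hyperparameter.
combinations = [dict(zip(param_grid, values))
                for values in product(*param_grid.values())]
total_fits = len(combinations) * cv_folds

print(len(combinations), total_fits)
print(combinations[0])
```

Even this tiny grid means 24 training runs, which is why the number of values per hyperparameter should be kept small.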

Now run a GridSearch over the combination of batch size, number of epochs, and initializer.

# repeat some of the initial values here so we make sure they were not changed
input_dim = x_train.shape[1]
num_classes = 10

# let's create a function that creates the model (required for KerasClassifier)
# while accepting the hyperparameters we want to tune
# we also pass some default values such as optimizer='rmsprop'
def create_model_2(optimizer='rmsprop', init='glorot_uniform'):
    model = Sequential()
    model.add(Dense(64, input_dim=input_dim, kernel_initializer=init, activation='relu'))
    model.add(Dropout(0.1))
    model.add(Dense(64, kernel_initializer=init, activation=tf.nn.relu))
    model.add(Dense(num_classes, kernel_initializer=init, activation=tf.nn.softmax))
    # compile model
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model

%%time
# fix random seed for reproducibility (this might work or might not work
# depending on each library's implementation)
seed = 7
numpy.random.seed(seed)

# create the sklearn model for the network
model_init_batch_epoch_CV = KerasClassifier(build_fn=create_model_2, verbose=1)

# we choose the initializers that came at the top in our previous cross-validation!!
init_mode = ['glorot_uniform', 'uniform']
batches = [128, 512]
epochs = [10, 20]

# grid search for initializer, batch size and number of epochs
param_grid = dict(epochs=epochs, batch_size=batches, init=init_mode)
grid = GridSearchCV(estimator=model_init_batch_epoch_CV,
                    param_grid=param_grid,
                    cv=3)
grid_result = grid.fit(x_train, y_train)

# print results
print(f'Best Accuracy for {grid_result.best_score_:.4} using {grid_result.best_params_}')
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f'mean={mean:.4}, std={stdev:.4} using {param}')

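One last question: what if the number of parameters and the number of values to loop over in GridSearchCV is particularly large?

This can be a tricky problem. Imagine 5 parameters with 10 candidate values each. The number of possible combinations is 10⁵, which means training an enormous number of networks. That would clearly be insane, so it is common to use RandomizedSearchCV instead.

RandomizedSearchCV lets you specify all the candidate parameter values. Rather than trying every combination, it samples a fixed number of parameter settings at random and evaluates each one with cross-validation. At the end, the best-scoring set of parameters can be used as an approximate solution.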
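The saving from random search is easy to quantify. The following sketch draws a fixed number of random configurations from a space of 5 parameters with 10 values each; the helper name, the parameter names and the budget of 20 are hypothetical, and the sampling here is with replacement, which is a simplification of what a library implementation may do.

```python
import random

def sample_configurations(param_space, n_iter, seed=0):
    # Draw n_iter configurations, choosing each parameter value independently.
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in param_space.items()}
            for _ in range(n_iter)]

# Hypothetical search space: 5 parameters, 10 candidate values each.
param_space = {f"param_{i}": list(range(10)) for i in range(5)}

total_combinations = 1
for values in param_space.values():
    total_combinations *= len(values)

sampled = sample_configurations(param_space, n_iter=20)
print(total_combinations, len(sampled))
```

With 3-fold cross-validation this budget costs 60 training runs instead of 300,000, at the price of possibly missing the best combination.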
