TFLearn
Installing TFLearn with pip3
pip3 install tflearn --user
Installing collected packages: tflearn
Successfully installed tflearn-0.3.2
In this example we predict how likely each passenger on the Titanic was to survive.
Each passenger in the dataset is described by the following fields:
VARIABLE DESCRIPTIONS:
survived        Survived (0 = No; 1 = Yes)
pclass          Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name            Name
sex             Sex
age             Age
sibsp           Number of Siblings/Spouses Aboard
parch           Number of Parents/Children Aboard
ticket          Ticket Number
fare            Passenger Fare
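There are 9 fields in total, which we split into the label and the input data. The label is whether the passenger survived (1 means survived), which leaves 8 input fields. Of these, we consider the name and the ticket number (whose information is already reflected in the fare) to be of little use for predicting survival, so we discard both during preprocessing.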
The dataset is stored as a csv file. csv stands for Comma-Separated Values: tabular data saved as plain text, which can be opened directly in a text editor or Excel. First we load the data into memory.
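The load_csv() function reads the data from the csv file and converts it into a Python list. Its target_column parameter gives the id (index) of the label column, and the function returns a tuple: (data, labels). Then, as described above, we drop the name and ticket-number fields from the input and convert the sex field to a number: 0 for male, 1 for female.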
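The loading and preprocessing steps described above can be sketched without TFLearn. The snippet below is only an illustration built on the standard-library csv module: the helper names split_labels and preprocess_rows and the two sample rows are made up for this example, and it does not reproduce load_csv() itself (for instance, it keeps the labels as plain integers).

```python
import csv
import io

# Two made-up rows in the column order described above (hypothetical sample data).
SAMPLE_CSV = """survived,pclass,name,sex,age,sibsp,parch,ticket,fare
1,1,"Doe, Miss. Jane",female,29,0,0,24160,211.3375
0,3,"Roe, Mr. John",male,22,1,0,A/5 21171,7.25
"""


def split_labels(rows, target_column=0):
    """Separate the label column from the remaining input columns."""
    data, labels = [], []
    for row in rows:
        row = list(row)
        labels.append(int(row.pop(target_column)))
        data.append(row)
    return data, labels


def preprocess_rows(data, columns_to_ignore):
    """Drop the ignored columns and encode sex as 0.0 (male) / 1.0 (female)."""
    # Delete from the highest index down so earlier indices stay valid.
    for column_id in sorted(columns_to_ignore, reverse=True):
        for row in data:
            row.pop(column_id)
    for row in data:
        # After 'name' (index 1) is removed, 'sex' sits at index 1.
        row[1] = 1.0 if row[1] == 'female' else 0.0
    return [[float(value) for value in row] for row in data]


reader = csv.reader(io.StringIO(SAMPLE_CSV))
next(reader)  # skip the header line of the sample
data, labels = split_labels(reader)
features = preprocess_rows(data, columns_to_ignore=[1, 6])
print(labels)
print(features)
```

Each row ends up with 6 numeric values, which is why the network's input layer in this example has shape [None, 6].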
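TFLearn computes with Tensors, so every net here is a Tensor. As in TensorFlow, we can write any part of the network ourselves with TensorFlow functions to get features that the TFLearn library lacks. The W (weights_init) and b (bias_init) of a fully connected layer can be specified; the defaults are W: 'truncated_normal' and b: 'zeros'. The activation parameter defaults to 'linear'.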
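As a rough illustration of what a fully connected layer with the default 'linear' activation computes, the sketch below evaluates y = xW + b in plain Python. It is not TFLearn code: the function name fully_connected_linear and the tiny weight matrix are invented for the example, and the zero bias merely mirrors the 'zeros' default mentioned above.

```python
def fully_connected_linear(inputs, weights, bias):
    """Compute y = xW + b for one sample; a 'linear' activation leaves y unchanged."""
    outputs = []
    for j in range(len(bias)):
        total = bias[j]
        for i, x in enumerate(inputs):
            total += x * weights[i][j]
        outputs.append(total)
    return outputs


# Hypothetical layer: 3 inputs -> 2 units, bias initialised to zeros.
weights = [[0.5, -1.0],
           [0.25, 0.0],
           [0.0, 2.0]]
bias = [0.0, 0.0]

result = fully_connected_linear([2.0, 4.0, 1.0], weights, bias)
print(result)
```

With a non-linear activation such as 'relu' or 'softmax', the same weighted sum would simply be passed through that function afterwards.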
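tflearn.DNN is a model wrapper provided by TFLearn that bundles many functions together: we hand it a net structure, get a model object back, and then call that object's training, prediction and saving methods. The DNN class has three attributes (member variables): trainer, predictor and session. In fit(), n_epoch=10 means the whole training set is passed through 10 times, and batch_size=16 means each parameter update is computed from 16 samples.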
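The relation between n_epoch, batch_size and the number of training steps can be checked with a little arithmetic. The sample count of 1309 is taken from the "Training samples" line of the training log in this example; the assumption here is that the last, smaller batch of each epoch still counts as one step.

```python
import math

n_samples = 1309   # "Training samples" reported in the training log
batch_size = 16
n_epoch = 10

# One step per batch; the final partial batch of an epoch is assumed to count too.
steps_per_epoch = math.ceil(n_samples / batch_size)
total_steps = steps_per_epoch * n_epoch

print(steps_per_epoch)
print(total_steps)
```

These values line up with the step counter in the log, which reads 82 at the end of epoch 001 and 820 at the end of epoch 010.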
Finally, use the trained model to make predictions:
import numpy as np
import tflearn

# Download the Titanic dataset
from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')

# Load CSV file, indicate that the first column represents labels
from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)


# Preprocessing function
def preprocess(data, columns_to_ignore):
    # Sort by descending id and delete columns
    for id in sorted(columns_to_ignore, reverse=True):
        [r.pop(id) for r in data]
    for i in range(len(data)):
        # Converting 'sex' field to float (id is 1 after removing labels column)
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)


# Ignore 'name' and 'ticket' columns (id 1 & 6 of data array)
to_ignore = [1, 6]

# Preprocess data
data = preprocess(data, to_ignore)

# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

# Let's create some data for DiCaprio and Winslet
dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
# Preprocess data
dicaprio, winslet = preprocess([dicaprio, winslet], to_ignore)
# Predict surviving chances (class 1 results)
pred = model.predict([dicaprio, winslet])
print("DiCaprio Surviving Rate:", pred[0][1])
print("Winslet Surviving Rate:", pred[1][1])
Training samples: 1309
Validation samples: 0
--
successfully opened CUDA library libcublas.so.10.0 locally
Training Step: 82  | total loss: 0.65318 | time: 3.584s
| Adam | epoch: 001 | loss: 0.65318 - acc: 0.6781 -- iter: 1309/1309
--
Training Step: 164  | total loss: 0.63713 | time: 1.298s
| Adam | epoch: 002 | loss: 0.63713 - acc: 0.6687 -- iter: 1309/1309
--
Training Step: 246  | total loss: 0.55357 | time: 1.354s
| Adam | epoch: 003 | loss: 0.55357 - acc: 0.7219 -- iter: 1309/1309
--
Training Step: 328  | total loss: 0.56566 | time: 1.312s
| Adam | epoch: 004 | loss: 0.56566 - acc: 0.7091 -- iter: 1309/1309
--
Training Step: 410  | total loss: 0.48417 | time: 1.311s
| Adam | epoch: 005 | loss: 0.48417 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 492  | total loss: 0.56114 | time: 1.300s
| Adam | epoch: 006 | loss: 0.56114 - acc: 0.7463 -- iter: 1309/1309
--
Training Step: 574  | total loss: 0.51057 | time: 1.289s
| Adam | epoch: 007 | loss: 0.51057 - acc: 0.7988 -- iter: 1309/1309
--
Training Step: 656  | total loss: 0.56562 | time: 1.312s
| Adam | epoch: 008 | loss: 0.56562 - acc: 0.7551 -- iter: 1309/1309
--
Training Step: 738  | total loss: 0.52883 | time: 1.324s
| Adam | epoch: 009 | loss: 0.52883 - acc: 0.7654 -- iter: 1309/1309
--
Training Step: 820  | total loss: 0.50510 | time: 1.340s
| Adam | epoch: 010 | loss: 0.50510 - acc: 0.7687 -- iter: 1309/1309
--
DiCaprio Surviving Rate: 0.17452878
Winslet Surviving Rate: 0.938663
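Once training finishes, the model's overall accuracy is 76.87%, measured on the training data (the log reports 0 validation samples). In other words, it predicts the correct outcome (survived or not) for about 76% of all passengers.
DiCaprio is the male lead and Winslet is the female lead, so the predictions look fairly plausible.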
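The two values returned for each passenger come from the final softmax layer, so they can be read as probabilities for class 0 (did not survive) and class 1 (survived) that sum to 1. Below is a minimal sketch of softmax in plain Python, using made-up raw scores rather than values from the trained model.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs for one passenger: [class 0, class 1].
probabilities = softmax([0.0, math.log(3.0)])
print(probabilities)  # the class-1 entry plays the role of the "surviving rate"
```

This is why the example prints pred[0][1] and pred[1][1]: index 1 selects the probability assigned to the "survived" class.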
Learning Keras can greatly improve both development efficiency and your understanding of network structures.
pip3 install keras --user
Successfully installed keras-2.2.4
After installation, start python3 and check the result: if "Using TensorFlow backend" is printed when you run import keras, Keras is installed and is using TensorFlow as its backend.
import keras
Using TensorFlow backend.
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import
There is a small problem here: the numpy package needs to be upgraded.
pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade numpy --user
Successfully installed numpy-1.16.3
After that, Keras imports successfully:
import keras
Using TensorFlow backend.
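The core data structure of Keras is the model, which is a way of organizing network layers. There are two kinds of model: the Sequential model and the more general Model class. A Sequential model is a stack of layers arranged in order, with a single input and a single output; each layer is connected only to its neighbours, which makes it the simplest kind of model.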
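The idea of a Sequential model as an ordered stack can be illustrated without Keras. The class below is a toy stand-in written for this example only (TinySequential, scale_by_two and relu are hypothetical names, not part of the Keras API); it shows nothing more than the "each layer feeds the next" structure.

```python
class TinySequential:
    """Toy stand-in for the idea of a Sequential model (not the Keras class)."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        # Each layer only ever sees the output of the layer added before it.
        self.layers.append(layer)

    def predict(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


def scale_by_two(values):
    return [2 * v for v in values]


def relu(values):
    return [max(0, v) for v in values]


model = TinySequential()
model.add(scale_by_two)
model.add(relu)
output = model.predict([-1, 3])
print(output)
```

A graph-style model, by contrast, lets one layer's output feed several later layers, which is what the residual shortcuts in the ResNet example rely on.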
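Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano. Keras was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Use Keras if you need a deep learning library in the following situations:
Keras is compatible with Python 2.7-3.6.
User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent and simple APIs, minimizes the number of user actions required for common use cases, and provides clear and actionable feedback upon user error.
Modularity. A model is understood as a sequence or a graph of standalone, fully configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions and regularization schemes are all standalone modules that you can combine to create new models.
Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules allows for greater expressiveness, which makes Keras well suited to advanced research.
Work with Python. Keras has no separate configuration files in a special format. Models are described in Python code, which is compact, easy to debug, and easy to extend.
In the examples directory, you can find example models for real datasets: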
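Datasets downloaded by Keras are stored under the user's home directory:
C:\Users\user_name\.keras\datasets
On Linux the home directory is /home/user_name; for the root user it is /root.
Pre-trained models downloaded by Keras are stored in the following directories: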
/root/.keras/models
/home/user_name/.keras/models
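To check where these folders are on the current machine, the paths can be built from the home directory. This is a small sketch that assumes the default location has not been changed; keras_cache_dir is a helper name made up for this example and is not a Keras function.

```python
import os


def keras_cache_dir(kind):
    """Return the per-user Keras cache folder for 'datasets' or 'models'."""
    if kind not in ('datasets', 'models'):
        raise ValueError("kind must be 'datasets' or 'models'")
    return os.path.join(os.path.expanduser('~'), '.keras', kind)


for kind in ('datasets', 'models'):
    path = keras_cache_dir(kind)
    status = 'exists' if os.path.isdir(path) else 'not created yet'
    print(path, '(' + status + ')')
```

A folder reported as "not created yet" usually just means that nothing of that kind has been downloaded so far.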
# https://github.com/keras-team/keras/tree/master/examples/cifar10_cnn.py
"""
#Trains a ResNet on the CIFAR10 dataset.

ResNet v1:
[Deep Residual Learning for Image Recognition
](https://arxiv.org/pdf/1512.03385.pdf)

ResNet v2:
[Identity Mappings in Deep Residual Networks
](https://arxiv.org/pdf/1603.05027.pdf)

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20   v1|  3| 92.16 %|                 91.25 %|35
ResNet32   v1|  5| 92.46 %|                 92.49 %|50
ResNet44   v1|  7| 92.50 %|                 92.83 %|70
ResNet56   v1|  9| 92.71 %|                 93.03 %|90
ResNet110  v1| 18| 92.65 %|            93.39+-.16 %|165
ResNet164  v1| 27|     - %|                 94.07 %|  -
ResNet1001 v1|N/A|     - %|                 92.39 %|  -

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20   v2|  2|     - %|                     - %|---
ResNet32   v2|N/A| NA    %|            NA         %| NA
ResNet44   v2|N/A| NA    %|            NA         %| NA
ResNet56   v2|  6| 93.01 %|            NA         %|100
ResNet110  v2| 12| 93.15 %|                 93.63 %|180
ResNet164  v2| 18|     - %|                 94.54 %|  -
ResNet1001 v2|111|     - %|            95.08+-.14 %|  -
"""

# %matplotlib inline
# %config InlineBackend.figure_format = 'svg'

# calculate time using
import timeit
start = timeit.default_timer()

import keras
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.callbacks import ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.datasets import cifar10
import numpy as np
import os

# Training parameters
batch_size = 32  # orig paper trained all networks with batch_size=128
epochs = 10
data_augmentation = True
num_classes = 10

# Subtracting pixel mean improves accuracy
subtract_pixel_mean = True

# Model parameter
# ----------------------------------------------------------------------------
#           |      | 200-epoch | Orig Paper| 200-epoch | Orig Paper| sec/epoch
# Model     |  n   | ResNet v1 | ResNet v1 | ResNet v2 | ResNet v2 | GTX1080Ti
#           |v1(v2)| %Accuracy | %Accuracy | %Accuracy | %Accuracy | v1 (v2)
# ----------------------------------------------------------------------------
# ResNet20  | 3 (2)| 92.16     | 91.25     | -----     | -----     | 35 (---)
# ResNet32  | 5(NA)| 92.46     | 92.49     | NA        | NA        | 50 ( NA)
# ResNet44  | 7(NA)| 92.50     | 92.83     | NA        | NA        | 70 ( NA)
# ResNet56  | 9 (6)| 92.71     | 93.03     | 93.01     | NA        | 90 (100)
# ResNet110 |18(12)| 92.65     | 93.39+-.16| 93.15     | 93.63     | 165(180)
# ResNet164 |27(18)| -----     | 94.07     | -----     | 94.54     | ---(---)
# ResNet1001| (111)| -----     | 92.39     | -----     | 95.08+-.14| ---(---)
# ---------------------------------------------------------------------------
n = 3

# Model version
# Orig paper: version = 1 (ResNet v1), Improved ResNet: version = 2 (ResNet v2)
version = 1

# Computed depth from supplied model parameter n
if version == 1:
    depth = n * 6 + 2
elif version == 2:
    depth = n * 9 + 2

# Model name, depth and version
model_type = 'ResNet%dv%d' % (depth, version)

# Load the CIFAR10 data.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Input image dimensions.
input_shape = x_train.shape[1:]

# Normalize data.
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# If subtract pixel mean is enabled
if subtract_pixel_mean:
    x_train_mean = np.mean(x_train, axis=0)
    x_train -= x_train_mean
    x_test -= x_train_mean

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
print('y_train shape:', y_train.shape)

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


def lr_schedule(epoch):
    """Learning Rate Schedule

    Learning rate is scheduled to be reduced after 80, 120, 160, 180 epochs.
    Called automatically every epoch as part of callbacks during training.

    # Arguments
        epoch (int): The number of epochs

    # Returns
        lr (float32): learning rate
    """
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    print('Learning rate: ', lr)
    return lr


def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='relu',
                 batch_normalization=True,
                 conv_first=True):
    """2D Convolution-Batch Normalization-Activation stack builder

    # Arguments
        inputs (tensor): input tensor from input image or previous layer
        num_filters (int): Conv2D number of filters
        kernel_size (int): Conv2D square kernel dimensions
        strides (int): Conv2D square stride dimensions
        activation (string): activation name
        batch_normalization (bool): whether to include batch normalization
        conv_first (bool): conv-bn-activation (True) or
            bn-activation-conv (False)

    # Returns
        x (tensor): tensor as input to the next layer
    """
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=l2(1e-4))

    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
        x = conv(x)
    return x


def resnet_v1(input_shape, depth, num_classes=10):
    """ResNet Version 1 Model builder [a]

    Stacks of 2 x (3 x 3) Conv2D-BN-ReLU
    Last ReLU is after the shortcut connection.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filters is
    doubled. Within each stage, the layers have the same number filters and the
    same number of filters.
    Features maps sizes:
    stage 0: 32x32, 16
    stage 1: 16x16, 32
    stage 2:  8x8,  64
    The Number of parameters is approx the same as Table 6 of [a]:
    ResNet20 0.27M
    ResNet32 0.46M
    ResNet44 0.66M
    ResNet56 0.85M
    ResNet110 1.7M

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n+2 (eg 20, 32, 44 in [a])')
    # Start model definition.
    num_filters = 16
    num_res_blocks = int((depth - 2) / 6)

    inputs = Input(shape=input_shape)
    x = resnet_layer(inputs=inputs)
    # Instantiate the stack of residual units
    for stack in range(3):
        for res_block in range(num_res_blocks):
            strides = 1
            if stack > 0 and res_block == 0:  # first layer but not first stack
                strides = 2  # downsample
            y = resnet_layer(inputs=x,
                             num_filters=num_filters,
                             strides=strides)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters,
                             activation=None)
            if stack > 0 and res_block == 0:  # first layer but not first stack
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])
            x = Activation('relu')(x)
        num_filters *= 2

    # Add classifier on top.
    # v1 does not use BN after last shortcut connection-ReLU
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


def resnet_v2(input_shape, depth, num_classes=10):
    """ResNet Version 2 Model builder [b]

    Stacks of (1 x 1)-(3 x 3)-(1 x 1) BN-ReLU-Conv2D or also known as
    bottleneck layer
    First shortcut connection per layer is 1 x 1 Conv2D.
    Second and onwards shortcut connection is identity.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filter maps is
    doubled. Within each stage, the layers have the same number filters and the
    same filter map sizes.
    Features maps sizes:
    conv1  : 32x32,  16
    stage 0: 32x32,  64
    stage 1: 16x16, 128
    stage 2:  8x8,  256

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 9 != 0:
        raise ValueError('depth should be 9n+2 (eg 56 or 110 in [b])')
    # Start model definition.
    num_filters_in = 16
    num_res_blocks = int((depth - 2) / 9)

    inputs = Input(shape=input_shape)
    # v2 performs Conv2D with BN-ReLU on input before splitting into 2 paths
    x = resnet_layer(inputs=inputs,
                     num_filters=num_filters_in,
                     conv_first=True)

    # Instantiate the stack of residual units
    for stage in range(3):
        for res_block in range(num_res_blocks):
            activation = 'relu'
            batch_normalization = True
            strides = 1
            if stage == 0:
                num_filters_out = num_filters_in * 4
                if res_block == 0:  # first layer and first stage
                    activation = None
                    batch_normalization = False
            else:
                num_filters_out = num_filters_in * 2
                if res_block == 0:  # first layer but not first stage
                    strides = 2    # downsample

            # bottleneck residual unit
            y = resnet_layer(inputs=x,
                             num_filters=num_filters_in,
                             kernel_size=1,
                             strides=strides,
                             activation=activation,
                             batch_normalization=batch_normalization,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_in,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_out,
                             kernel_size=1,
                             conv_first=False)
            if res_block == 0:
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters_out,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])

        num_filters_in = num_filters_out

    # Add classifier on top.
    # v2 has BN-ReLU before Pooling
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


if version == 2:
    model = resnet_v2(input_shape=input_shape, depth=depth)
else:
    model = resnet_v1(input_shape=input_shape, depth=depth)

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=lr_schedule(0)),
              metrics=['accuracy'])
model.summary()
print(model_type)

# Prepare model model saving directory.
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'cifar10_%s_model.{epoch:03d}.h5' % model_type
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
filepath = os.path.join(save_dir, model_name)

# Prepare callbacks for model saving and for learning rate adjustment.
checkpoint = ModelCheckpoint(filepath=filepath,
                             monitor='val_acc',
                             verbose=1,
                             save_best_only=True)

lr_scheduler = LearningRateScheduler(lr_schedule)

lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1),
                               cooldown=0,
                               patience=5,
                               min_lr=0.5e-6)

callbacks = [checkpoint, lr_reducer, lr_scheduler]

# Run training, with or without data augmentation.
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True,
              callbacks=callbacks)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        # set input mean to 0 over the dataset
        featurewise_center=False,
        # set each sample mean to 0
        samplewise_center=False,
        # divide inputs by std of dataset
        featurewise_std_normalization=False,
        # divide each input by its std
        samplewise_std_normalization=False,
        # apply ZCA whitening
        zca_whitening=False,
        # epsilon for ZCA whitening
        zca_epsilon=1e-06,
        # randomly rotate images in the range (deg 0 to 180)
        rotation_range=0,
        # randomly shift images horizontally
        width_shift_range=0.1,
        # randomly shift images vertically
        height_shift_range=0.1,
        # set range for random shear
        shear_range=0.,
        # set range for random zoom
        zoom_range=0.,
        # set range for random channel shifts
        channel_shift_range=0.,
        # set mode for filling points outside the input boundaries
        fill_mode='nearest',
        # value used for fill_mode = "constant"
        cval=0.,
        # randomly flip images
        horizontal_flip=True,
        # randomly flip images
        vertical_flip=False,
        # set rescaling factor (applied before any other transformation)
        rescale=None,
        # set function that will be applied on each input
        preprocessing_function=None,
        # image data format, either "channels_first" or "channels_last"
        data_format=None,
        # fraction of images reserved for validation (strictly between 0 and 1)
        validation_split=0.0)

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    # Note: each step consumes one batch, so steps_per_epoch=x_train.shape[0]
    # draws roughly batch_size times the dataset size in every epoch.
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0],
                        validation_data=(x_test, y_test),
                        epochs=epochs, verbose=1, workers=4,
                        callbacks=callbacks)

# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

# output time using
end = timeit.default_timer()
tdf = end - start
timeh = tdf // 3600
timem = (tdf % 3600) // 60  # minutes left over after the whole hours
times = tdf % 60
print("use time: ", int(timeh), "h", int(timem), "m", times, "s")
Running the script as-is produces an error:
python3 cifar10_cnn.py
ValueError: steps_per_epoch=None is only valid for a generator based on the keras.utils.Sequence class. Please specify steps_per_epoch or use the keras.utils.Sequence class.
This happens because the library has changed between versions and the parameters of some functions were modified.
To fix it, in the cifar10_resnet.py file change
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)
to
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    steps_per_epoch=x_train.shape[0] // batch_size,
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)
Using TensorFlow backend.
x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
y_train shape: (50000, 1)
Learning rate:  0.001
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 32, 32, 3)    0
conv2d_1 (Conv2D)               (None, 32, 32, 16)   448         input_1[0][0]
batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]
activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]
conv2d_2 (Conv2D)               (None, 32, 32, 16)   2320        activation_1[0][0]
batch_normalization_2 (BatchNor (None, 32, 32, 16)   64          conv2d_2[0][0]
activation_2 (Activation)       (None, 32, 32, 16)   0           batch_normalization_2[0][0]
conv2d_3 (Conv2D)               (None, 32, 32, 16)   2320        activation_2[0][0]
batch_normalization_3 (BatchNor (None, 32, 32, 16)   64          conv2d_3[0][0]
add_1 (Add)                     (None, 32, 32, 16)   0           activation_1[0][0]
                                                                 batch_normalization_3[0][0]
activation_3 (Activation)       (None, 32, 32, 16)   0           add_1[0][0]
conv2d_4 (Conv2D)               (None, 32, 32, 16)   2320        activation_3[0][0]
batch_normalization_4 (BatchNor (None, 32, 32, 16)   64          conv2d_4[0][0]
activation_4 (Activation)       (None, 32, 32, 16)   0           batch_normalization_4[0][0]
conv2d_5 (Conv2D)               (None, 32, 32, 16)   2320        activation_4[0][0]
batch_normalization_5 (BatchNor (None, 32, 32, 16)   64          conv2d_5[0][0]
add_2 (Add)                     (None, 32, 32, 16)   0           activation_3[0][0]
                                                                 batch_normalization_5[0][0]
activation_5 (Activation)       (None, 32, 32, 16)   0           add_2[0][0]
conv2d_6 (Conv2D)               (None, 32, 32, 16)   2320        activation_5[0][0]
batch_normalization_6 (BatchNor (None, 32, 32, 16)   64          conv2d_6[0][0]
activation_6 (Activation)       (None, 32, 32, 16)   0           batch_normalization_6[0][0]
conv2d_7 (Conv2D)               (None, 32, 32, 16)   2320        activation_6[0][0]
batch_normalization_7 (BatchNor (None, 32, 32, 16)   64          conv2d_7[0][0]
add_3 (Add)                     (None, 32, 32, 16)   0           activation_5[0][0]
                                                                 batch_normalization_7[0][0]
activation_7 (Activation)       (None, 32, 32, 16)   0           add_3[0][0]
conv2d_8 (Conv2D)               (None, 16, 16, 32)   4640        activation_7[0][0]
batch_normalization_8 (BatchNor (None, 16, 16, 32)   128         conv2d_8[0][0]
activation_8 (Activation)       (None, 16, 16, 32)   0           batch_normalization_8[0][0]
conv2d_9 (Conv2D)               (None, 16, 16, 32)   9248        activation_8[0][0]
conv2d_10 (Conv2D)              (None, 16, 16, 32)   544         activation_7[0][0]
batch_normalization_9 (BatchNor (None, 16, 16, 32)   128         conv2d_9[0][0]
add_4 (Add)                     (None, 16, 16, 32)   0           conv2d_10[0][0]
                                                                 batch_normalization_9[0][0]
activation_9 (Activation)       (None, 16, 16, 32)   0           add_4[0][0]
conv2d_11 (Conv2D)              (None, 16, 16, 32)   9248        activation_9[0][0]
batch_normalization_10 (BatchNo (None, 16, 16, 32)   128         conv2d_11[0][0]
activation_10 (Activation)      (None, 16, 16, 32)   0           batch_normalization_10[0][0]
conv2d_12 (Conv2D)              (None, 16, 16, 32)   9248        activation_10[0][0]
batch_normalization_11 (BatchNo (None, 16, 16, 32)   128         conv2d_12[0][0]
add_5 (Add)                     (None, 16, 16, 32)   0           activation_9[0][0]
                                                                 batch_normalization_11[0][0]
activation_11 (Activation)      (None, 16, 16, 32)   0           add_5[0][0]
conv2d_13 (Conv2D)              (None, 16, 16, 32)   9248        activation_11[0][0]
batch_normalization_12 (BatchNo (None, 16, 16, 32)   128         conv2d_13[0][0]
activation_12 (Activation)      (None, 16, 16, 32)   0           batch_normalization_12[0][0]
conv2d_14 (Conv2D)              (None, 16, 16, 32)   9248        activation_12[0][0]
batch_normalization_13 (BatchNo (None, 16, 16, 32)   128         conv2d_14[0][0]
add_6 (Add)                     (None, 16, 16, 32)   0           activation_11[0][0]
                                                                 batch_normalization_13[0][0]
activation_13 (Activation)      (None, 16, 16, 32)   0           add_6[0][0]
conv2d_15 (Conv2D)              (None, 8, 8, 64)     18496       activation_13[0][0]
batch_normalization_14 (BatchNo (None, 8, 8, 64)     256         conv2d_15[0][0]
activation_14 (Activation)      (None, 8, 8, 64)     0           batch_normalization_14[0][0]
conv2d_16 (Conv2D)              (None, 8, 8, 64)     36928       activation_14[0][0]
conv2d_17 (Conv2D)              (None, 8, 8, 64)     2112        activation_13[0][0]
batch_normalization_15 (BatchNo (None, 8, 8, 64)     256         conv2d_16[0][0]
add_7 (Add)                     (None, 8, 8, 64)     0           conv2d_17[0][0]
                                                                 batch_normalization_15[0][0]
activation_15 (Activation)      (None, 8, 8, 64)     0           add_7[0][0]
conv2d_18 (Conv2D)              (None, 8, 8, 64)     36928       activation_15[0][0]
batch_normalization_16 (BatchNo (None, 8, 8, 64)     256         conv2d_18[0][0]
activation_16 (Activation)      (None, 8, 8, 64)     0           batch_normalization_16[0][0]
conv2d_19 (Conv2D)              (None, 8, 8, 64)     36928       activation_16[0][0]
batch_normalization_17 (BatchNo (None, 8, 8, 64)     256         conv2d_19[0][0]
add_8 (Add)                     (None, 8, 8, 64)     0           activation_15[0][0]
                                                                 batch_normalization_17[0][0]
activation_17 (Activation)      (None, 8, 8, 64)     0           add_8[0][0]
conv2d_20 (Conv2D)              (None, 8, 8, 64)     36928       activation_17[0][0]
batch_normalization_18 (BatchNo (None, 8, 8, 64)     256         conv2d_20[0][0]
activation_18 (Activation)      (None, 8, 8, 64)     0           batch_normalization_18[0][0]
conv2d_21 (Conv2D)              (None, 8, 8, 64)     36928       activation_18[0][0]
batch_normalization_19 (BatchNo (None, 8, 8, 64)     256         conv2d_21[0][0]
add_9 (Add)                     (None, 8, 8, 64)     0           activation_17[0][0]
                                                                 batch_normalization_19[0][0]
activation_19 (Activation)      (None, 8, 8, 64)     0           add_9[0][0]
average_pooling2d_1 (AveragePoo (None, 1, 1, 64)     0           activation_19[0][0]
flatten_1 (Flatten)             (None, 64)           0           average_pooling2d_1[0][0]
dense_1 (Dense)                 (None, 10)           650         flatten_1[0][0]
==================================================================================================
Total params: 274,442
Trainable params: 273,066
Non-trainable params: 1,376
__________________________________________________________________________________________________
ResNet20v1
Using real-time data augmentation.
Epoch 1/10
Learning rate:  0.001
successfully opened CUDA library libcublas.so.10.0 locally
50000/50000 [==============================] - 11286s 226ms/step - loss: 0.7185 - acc: 0.8164 - val_loss: 0.7312 - val_acc: 0.8302
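Training one epoch takes about 3 hours, and the test-set accuracy after training is 83.03%. The example calls for 200 epochs of training. With only about 0.5 TFLOPS of compute, the Jetson Nano is too weak for training a model like this, so the run was abandoned because the workload is too large.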
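Part of the 3-hour epoch comes from the step count: the progress bar in the log reads 50000/50000, which corresponds to steps_per_epoch=x_train.shape[0] rather than x_train.shape[0] // batch_size. The back-of-the-envelope estimate below assumes the per-step time stays at the logged 226 ms and ignores validation time, so treat the result as a rough guide rather than a measurement.

```python
n_train = 50000            # "50000 train samples" in the log
batch_size = 32            # value set in the script
seconds_per_step = 0.226   # "226ms/step" in the log

steps_as_logged = n_train               # progress bar shows 50000/50000
steps_one_pass = n_train // batch_size  # steps_per_epoch from the suggested fix

estimate_logged = round(steps_as_logged * seconds_per_step)
estimate_one_pass = round(steps_one_pass * seconds_per_step)

print(steps_one_pass)
print(estimate_logged, 'seconds per epoch with', steps_as_logged, 'steps')
print(estimate_one_pass, 'seconds per epoch with', steps_one_pass, 'steps')
```

Under these assumptions a single pass over the data would take roughly 6 minutes per epoch on the same hardware, although each such epoch also sees about 32 times fewer augmented samples than the one in the log.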