With TensorFlow installed (see my TensorFlow installation notes), the next step is to follow the official website's tutorial and run the MNIST handwritten-digit recognition experiment.
After activating the tfgpu virtual environment, first cd into the directory /anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/examples/tutorials/mnist/, then start an IPython interactive session.
In [4]: from tensorflow.examples.tutorials.mnist import input_data
   ...: mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
   ...:
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

In [5]: import tensorflow as tf

In [6]: x = tf.placeholder(tf.float32, [None, 784])

In [7]: W = tf.Variable(tf.zeros([784, 10]))
   ...: b = tf.Variable(tf.zeros([10]))
   ...:

In [8]: y = tf.nn.softmax(tf.matmul(x, W) + b)

In [9]: y_ = tf.placeholder(tf.float32, [None, 10])

In [10]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

In [11]: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [12]: init = tf.initialize_all_variables()

In [13]: sess = tf.Session()
    ...: sess.run(init)
    ...:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0
memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)

In [14]: for i in range(1000):
    ...:     batch_xs, batch_ys = mnist.train.next_batch(100)
    ...:     sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    ...:

In [15]: correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

In [16]: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...:

In [17]: print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
0.9186
In [4]: mainly downloads the data.
In [6]: allocates the input x. None means the number of input images is left to be determined at run time; 784 = 28*28, since each 28x28 image is flattened into a one-dimensional vector (any flattening works, as long as every image is flattened the same way).
In [7]: allocates the weights W and the bias b.
In [8]: implements the softmax model, producing the predicted output y.
In [9]: allocates the placeholder y_ for the true labels.
In [10]: builds the cross-entropy cost function (see the note after this list).
In [11]: takes a gradient-descent step with learning rate (step size) 0.5 at each iteration.
In [12]: initializes all variables.
In [13]: opens a session and launches the model.
In [14]: runs 1000 iterations of stochastic gradient descent.
In [15]: compares the predicted labels y with the true labels y_.
In [16]: computes the accuracy.
In [17]: evaluates the accuracy on the test set: 91.86%.
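One caveat worth noting here (my own aside, not part of the tutorial transcript): computing the cross-entropy as -sum(y_ * log(y)) can produce NaN if a predicted probability underflows to 0, since log(0) is undefined. TensorFlow ships a numerically stabler fused op; a minimal sketch of the same step using it, assuming the x, W, b, y_ defined above and, as best I recall, the positional (logits, labels) argument order of this TF 0.x release:

# Hedged alternative to In [10]: feed raw logits to the fused op instead of
# taking tf.log of an explicit softmax, which can underflow to log(0).
logits = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, y_))  # (logits, labels) in TF 0.x
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)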
In [1]: from tensorflow.examples.tutorials.mnist import input_data
   ...: mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
   ...:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

In [2]: import tensorflow as tf
   ...: sess = tf.InteractiveSession()
   ...:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0
memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)

In [3]: def weight_variable(shape):
   ...:     initial = tf.truncated_normal(shape, stddev=0.1)
   ...:     return tf.Variable(initial)
   ...:
   ...: def bias_variable(shape):
   ...:     initial = tf.constant(0.1, shape=shape)
   ...:     return tf.Variable(initial)
   ...:

In [4]: def conv2d(x, W):
   ...:     return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
   ...:
   ...: def max_pool_2x2(x):
   ...:     return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
   ...:                           strides=[1, 2, 2, 1], padding='SAME')
   ...:

In [5]: W_conv1 = weight_variable([5, 5, 1, 32])
   ...: b_conv1 = bias_variable([32])
   ...:

In [7]: x = tf.placeholder(tf.float32, shape=[None, 784])
   ...: y_ = tf.placeholder(tf.float32, shape=[None, 10])
   ...:

In [8]: x_image = tf.reshape(x, [-1,28,28,1])

In [9]: h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
   ...: h_pool1 = max_pool_2x2(h_conv1)
   ...:

In [10]: W_conv2 = weight_variable([5, 5, 32, 64])
    ...: b_conv2 = bias_variable([64])
    ...:
    ...: h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    ...: h_pool2 = max_pool_2x2(h_conv2)
    ...:

In [11]: W_fc1 = weight_variable([7 * 7 * 64, 1024])
    ...: b_fc1 = bias_variable([1024])
    ...:
    ...: h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    ...: h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    ...:

In [12]: keep_prob = tf.placeholder(tf.float32)
    ...: h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    ...:

In [13]: W_fc2 = weight_variable([1024, 10])
    ...: b_fc2 = bias_variable([10])
    ...:
    ...: y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    ...:

In [14]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(2000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print("step %d, training accuracy %g"%(i, train_accuracy))
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print("test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
    ...:
step 0, training accuracy 0.04
step 100, training accuracy 0.86
step 200, training accuracy 0.92
step 300, training accuracy 0.88
step 400, training accuracy 0.96
step 500, training accuracy 0.9
step 600, training accuracy 1
step 700, training accuracy 0.98
step 800, training accuracy 0.92
step 900, training accuracy 0.98
step 1000, training accuracy 0.94
step 1100, training accuracy 0.96
step 1200, training accuracy 1
step 1300, training accuracy 0.98
step 1400, training accuracy 0.94
step 1500, training accuracy 0.96
step 1600, training accuracy 1
step 1700, training accuracy 0.92
step 1800, training accuracy 0.92
step 1900, training accuracy 0.96
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (256): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (512): Total Chunks: 1, Chunks in use: 0 768B allocated for chunks. 6.4KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1024): Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
... ... ... ...
Limit:        836280320
InUse:         83845120
MaxInUse:     117678336
NumAllocs:       246915
MaxAllocSize:  45883392
W tensorflow/core/common_runtime/bfc_allocator.cc:270] *****_******________________________________________________________________________________________
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 957.03MiB. See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:936] Resource exhausted: OOM when allocating tensor with shape[10000,28,28,32]
E tensorflow/core/client/tensor_c_api.cc:485] OOM when allocating tensor with shape[10000,28,28,32]
     [[Node: Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, Variable/read)]]

In [20]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print("step %d, training accuracy %g"%(i, train_accuracy))
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print("test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[0:200,:], y_: mnist.test.labels[0:200,:], keep_prob: 1.0}))
    ...:
step 0, training accuracy 0.12
step 100, training accuracy 0.78
step 200, training accuracy 0.88
step 300, training accuracy 0.96
step 400, training accuracy 0.9
step 500, training accuracy 0.96
step 600, training accuracy 0.94
step 700, training accuracy 0.92
step 800, training accuracy 0.92
step 900, training accuracy 0.96
step 1000, training accuracy 0.94
step 1100, training accuracy 0.98
step 1200, training accuracy 0.96
step 1300, training accuracy 1
step 1400, training accuracy 0.98
... ... ...
test accuracy 0.995

In [21]: cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[200:400,:], y_: mnist.test.labels[200:400,:], keep_prob: 1.0})
    ...:
step 0, training accuracy 0.12
step 100, training accuracy 0.94
step 200, training accuracy 0.86
step 300, training accuracy 0.96
step 400, training accuracy 0.9
step 500, training accuracy 1
step 600, training accuracy 0.96
step 700, training accuracy 0.88
step 800, training accuracy 1
step 900, training accuracy 0.98
step 1000, training accuracy 0.96
step 1100, training accuracy 0.94
step 1200, training accuracy 0.96
step 1300, training accuracy 0.96
step 1400, training accuracy 0.94
step 1500, training accuracy 0.98
step 1600, training accuracy 0.96
step 1700, training accuracy 0.98
... ... ...
test accuracy 0.975

In [22]: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[400:1000,:], y_: mnist.test.labels[400:1000,:], keep_prob: 1.0})
    ...:
step 0, training accuracy 1
step 100, training accuracy 1
step 200, training accuracy 0.98
step 300, training accuracy 0.98
step 400, training accuracy 0.98
step 500, training accuracy 1
step 600, training accuracy 0.96
step 700, training accuracy 0.96
... ... ...
test accuracy 0.983333
In [1]: imports the data, i.e. the training and test sets.
In [2]: imports tensorflow and starts an InteractiveSession (more flexible than a plain Session).
In [3]: defines two helper functions for initializing w and b, for convenience later.
In [4]: defines the convolution and pooling functions. The convolution uses SAME padding, so the output image stays the same size as the input; pooling is 2x2, so every 4 cells collapse into 1.
In [5]: defines w and b for the first convolutional layer.
In [7]: allocates the inputs x and y_.
In [8]: reshapes x.
In [9]: convolves x_image with w, adds b, applies the ReLU activation, and finally max-pools.
In [10]: second convolutional layer, analogous to the first.
In [11]: fully connected layer (see the shape check after this list).
In [12]: to reduce overfitting, dropout can be added before the output layer. (This example is simple enough that leaving it out makes little difference.)
In [13]: a softmax layer produces the output.
In [14]: defines the cost function and the training step, optimized with ADAM. As it turned out, the full test set was too large for my GPU memory.
In [20]: using only test images 1~200, the accuracy is 0.995.
In [21]: using only test images 201~400, the accuracy is 0.975.
In [22]: using only test images 401~1000, the accuracy is 0.983333.
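Where does the 7 * 7 * 64 in In [11] come from? SAME-padded convolution keeps the spatial size at 28x28, and each 2x2 max-pool halves it: 28 -> 14 after the first pool, 14 -> 7 after the second, with 64 feature maps coming out of the second convolution. A quick sanity check I added (not part of the original session; get_shape() is available on tensors in this TF version):

# Shape check: SAME conv keeps the spatial size, each 2x2 pool halves it.
print h_conv1.get_shape()   # (?, 28, 28, 32)
print h_pool1.get_shape()   # (?, 14, 14, 32)
print h_conv2.get_shape()   # (?, 14, 14, 64)
print h_pool2.get_shape()   # (?, 7, 7, 64)  -> flattened to 7*7*64 = 3136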
The structure of this CNN is shown in the figure below:
Now let's try modifying this CNN structure, increasing the number of feature maps in the hope of better results. The modified CNN structure is shown in this figure:
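To get a feel for how much bigger the modified network is before running it: doubling the feature maps roughly doubles the parameter count, which is where the extra memory goes. A back-of-the-envelope count I added (my own sketch, not from the original post), covering the weights and biases of the two conv layers and the two fully connected layers:

# Rough parameter counts (weights + biases) for both architectures.
def param_count(f1, f2):
    conv1 = 5*5*1*f1 + f1           # 5x5 kernels on 1 input channel
    conv2 = 5*5*f1*f2 + f2
    fc1 = 7*7*f2*1024 + 1024        # flattened pool output -> 1024
    fc2 = 1024*10 + 10
    return conv1 + conv2 + fc1 + fc2

print param_count(32, 64)    # original network: 3274634 (~3.3M)
print param_count(64, 128)   # modified network: 6640394 (~6.6M)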
The experiment proceeds as follows:
In [23]: def weight_variable(shape):
    ...:     initial = tf.truncated_normal(shape, stddev=0.1)
    ...:     return tf.Variable(initial)
    ...:
    ...: def bias_variable(shape):
    ...:     initial = tf.constant(0.1, shape=shape)
    ...:     return tf.Variable(initial)
    ...:
    ...: def conv2d(x, W):
    ...:     return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    ...:
    ...: def max_pool_2x2(x):
    ...:     return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
    ...:                           strides=[1, 2, 2, 1], padding='SAME')
    ...:
    ...: W_conv1 = weight_variable([5, 5, 1, 64])
    ...: b_conv1 = bias_variable([64])
    ...: h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    ...: h_pool1 = max_pool_2x2(h_conv1)
    ...:
    ...: W_conv2 = weight_variable([5, 5, 64, 128])
    ...: b_conv2 = bias_variable([128])
    ...:
    ...: h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    ...: h_pool2 = max_pool_2x2(h_conv2)
    ...:
    ...: W_fc1 = weight_variable([7 * 7 * 128, 1024])
    ...: b_fc1 = bias_variable([1024])
    ...:
    ...: h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*128])
    ...: h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    ...:
    ...: keep_prob = tf.placeholder("float")
    ...: h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    ...:
    ...: W_fc2 = weight_variable([1024, 10])
    ...: b_fc2 = bias_variable([10])
    ...:
    ...: y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    ...:
    ...: cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[0:200,:], y_: mnist.test.labels[0:200,:], keep_prob: 1.0})
    ...:
... ... ...
test accuracy 1

In [24]: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[200:400,:], y_: mnist.test.labels[200:400,:], keep_prob: 1.0})
    ...:
... ... ...
test accuracy 0.975

In [25]: for i in range(20000):
    ...:     batch = mnist.train.next_batch(50)
    ...:     if i%100 == 0:
    ...:         train_accuracy = accuracy.eval(feed_dict={
    ...:             x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:         print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:     train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[400:1000,:], y_: mnist.test.labels[400:1000,:], keep_prob: 1.0})
    ...:
... ... ...
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 717.77MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

In [26]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[400:600,:], y_: mnist.test.labels[400:600,:], keep_prob: 1.0})
    ...:
test accuracy 0.985

In [28]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[600:800,:], y_: mnist.test.labels[600:800,:], keep_prob: 1.0})
    ...:
test accuracy 0.985

In [29]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[800:1000,:], y_: mnist.test.labels[800:1000,:], keep_prob: 1.0})
    ...:
test accuracy 0.995
The average accuracy before the modification, weighting each test slice by its size (200, 200, and 600 images out of the first 1000), is:

(0.995*2 + 0.975*2 + 0.9833*6) / 10 = 0.98398
The average accuracy after the modification is:

(1*2 + 0.975*2 + 0.985*4 + 0.995*2) / 10 = 0.98800
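The weights 2, 2, 6 (and 2, 2, 4, 2) are just the slice sizes in tenths of the first 1000 test images. A one-line check of the arithmetic (my own addition):

# Each slice's accuracy weighted by its share of the first 1000 test images.
before = (0.995*200 + 0.975*200 + 0.983333*600) / 1000.0  # = 0.984 (0.98398 with 0.9833)
after = (1.0*200 + 0.975*200 + 0.985*400 + 0.995*200) / 1000.0  # = 0.988
print "before: %g  after: %g" % (before, after)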
As we can see, after increasing the number of feature maps the accuracy improved, but memory consumption also grew (evaluating on test images 400~1000 triggered an OOM error), and training took noticeably longer. How to trade these off depends on the specific requirements and the hardware at hand.
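Instead of hand-picking test slices, a common workaround for the evaluation OOM is to run the test set through in fixed-size batches and average the per-batch accuracies, which is exact here since 10000 divides evenly into the batch size. A sketch under the same session (my addition, not from the original run; the batch size of 200 is a guess that fits this ~1GB card):

import numpy as np

# Evaluate in batches so no single Conv2D has to hold all 10000 images.
batch_size = 200   # small enough for the ~1GB GeForce 940M
accs = []
for start in range(0, mnist.test.num_examples, batch_size):
    accs.append(accuracy.eval(feed_dict={
        x: mnist.test.images[start:start+batch_size],
        y_: mnist.test.labels[start:start+batch_size],
        keep_prob: 1.0}))
print "test accuracy %g" % np.mean(accs)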