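1. Start with the official documentation to learn how to call the caffe2 package from Python
The official Caffe2 tutorials are written mainly in Python. They cover, in order: the core classes and concepts, how to build a small network from the basic classes, how to preprocess data, how to use a pretrained model, and how to construct more complex networks. Beginners can start with the official caffe2-tutorials to understand how Caffe2 approaches network construction and training, and to see how its overall design differs from Caffe's.
2. Read the Caffe2 source alongside the tutorials to see which C++ classes Python actually calls
Most classes and functions in the caffe2 Python package wrap C++ source files under caffe2/caffe2/core, such as the basic data class Tensor and the operation class Operator. Tracing a Python class back to the construction and implementation of the corresponding C++ class or function prepares you for building and training networks directly in C++.
The summary below is based on the official documentation and some online material.
Start from our own point of view: if we had to write and train a simple multi-layer neural network ourselves, we would generally need to think about how the data is defined, how it flows, and how it is updated.
In Caffe, data is stored in instances of the Blob class. A blob can be thought of as a numpy array: its job is to store data. Input blobs are passed forward through successive layers to produce output blobs, so the layer is Caffe's basic unit of computation. Each layer defines its own computation, and data passing through a layer is transformed accordingly. The layers combined form a net, which is essentially a computation graph. Once the forward data flow is built, the way gradients are computed in the backward pass is determined as well. On top of that, Caffe uses the Solver class to specify the update rule: under the solver's control the network repeatedly runs a forward pass, a backward pass to compute gradients, and a weight update using those gradients.
So Caffe has four basic building blocks: Blob, Layer, Net, and Solver.
Now look at Caffe2:
The operator is Caffe2's distinguishing feature; it replaces Caffe's layer as the basic building unit of a net. For example, a single InnerProduct operator can do the work of Caffe's InnerProductLayer. The operator interface is defined in caffe2/proto/caffe2.proto. Generally speaking, an operator takes a list of inputs and produces a list of outputs.
Because the operator is defined at such a basic and abstract level, weight initialization, the forward pass, the backward pass, and gradient updates can all be implemented with operators, so Caffe2 needs neither a solver class nor a layer class. The corresponding building blocks in Caffe2 include the blob, the workspace, the operator, and the net, which are covered below.
The details follow, starting with Python:
Before using Caffe2, we import core and workspace from caffe2.python, which contain the basic classes and functions. We also import caffe2_pb2 from caffe2.proto to work with protobuf objects.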
# We'll also import a few standard python libraries
from matplotlib import pyplot
import numpy as np
import time

# These are the droids you are looking for.
from caffe2.python import core, workspace
from caffe2.proto import caffe2_pb2

# Let's show all plots inline.
%matplotlib inline
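1. Workspace
A workspace can be thought of as the variable workspace in MATLAB. The blobs and nets we define can all live in one workspace, or be separated into different workspaces.
Let's print the blobs in the current workspace. Blobs() lists the blobs, and HasBlob(name) checks whether a blob with the given name exists.
print("Current blobs in the workspace: {}".format(workspace.Blobs()))
print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X")))
At the start, of course, there is nothing in it.
We use FeedBlob to add a blob to the current workspace, then print again: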
X = np.random.randn(2, 3).astype(np.float32)
print("Generated X from numpy:\n{}".format(X))
workspace.FeedBlob("X", X)

Generated X from numpy:
[[-0.56927377 -1.28052795 -0.95808828]
 [-0.44225693 -0.0620895  -0.50509363]]

print("Current blobs in the workspace: {}".format(workspace.Blobs()))
print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X")))
print("Fetched X:\n{}".format(workspace.FetchBlob("X")))

Current blobs in the workspace: [u'X']
Workspace has blob 'X'? True
Fetched X:
[[-0.56927377 -1.28052795 -0.95808828]
 [-0.44225693 -0.0620895  -0.50509363]]
We can also define several workspaces under different names and switch between them. CurrentWorkspace() returns the name of the current workspace, and SwitchWorkspace(name) switches to another one.
print("Current workspace: {}".format(workspace.CurrentWorkspace()))
print("Current blobs in the workspace: {}".format(workspace.Blobs()))

# Switch the workspace. The second argument "True" means creating
# the workspace if it is missing.
workspace.SwitchWorkspace("gutentag", True)

# Let's print the current workspace. Note that there is nothing in the
# workspace yet.
print("Current workspace: {}".format(workspace.CurrentWorkspace()))
print("Current blobs in the workspace: {}".format(workspace.Blobs()))

Current workspace: default
Current blobs in the workspace: ['X']
Current workspace: gutentag
Current blobs in the workspace: []

To summarize, a workspace works like MATLAB's workspace: variables are stored in it, and we access nets and blobs through it.
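To make the workspace idea concrete, here is a toy pure-Python model of it. This is only a sketch under assumptions: the class `ToyWorkspaces` and its method names are hypothetical, chosen to mirror the calls used above, and say nothing about how Caffe2 implements workspaces internally.

```python
class ToyWorkspaces:
    """Hypothetical stand-in: named workspaces, each mapping blob names to values."""

    def __init__(self):
        self._spaces = {"default": {}}
        self._current = "default"

    def current_workspace(self):
        return self._current

    def switch_workspace(self, name, create_if_missing=False):
        # Mirrors the idea of SwitchWorkspace(name, True): create on demand.
        if name not in self._spaces:
            if not create_if_missing:
                raise KeyError("no such workspace: " + name)
            self._spaces[name] = {}
        self._current = name

    def feed_blob(self, name, value):
        self._spaces[self._current][name] = value

    def fetch_blob(self, name):
        return self._spaces[self._current][name]

    def has_blob(self, name):
        return name in self._spaces[self._current]

    def blobs(self):
        return sorted(self._spaces[self._current])


ws = ToyWorkspaces()
ws.feed_blob("X", [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
blobs_in_default = ws.blobs()

# A freshly created workspace starts empty; "X" stays behind in "default".
ws.switch_workspace("gutentag", create_if_missing=True)
blobs_in_gutentag = ws.blobs()
print(blobs_in_default, blobs_in_gutentag)
```

The point of the sketch is the isolation: a blob fed into one workspace is not visible after switching to another.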
2. Operators
In Python, an operator can be created directly with core.CreateOperator, through core.Net, or through ModelHelper. Here we use core.CreateOperator to get a basic understanding of operators. In practice we rarely create each operator by hand when building a network, because that is too tedious; ModelHelper usually does it for us.
# Create an operator.
op = core.CreateOperator(
    "Relu",  # The type of operator that we want to run
    ["X"],   # A list of input blobs by their names
    ["Y"],   # A list of output blobs by their names
)
# and we are done!
The code above creates a Relu operator. Note that creating an operator in Python only defines it; it does not run it. The op created above is actually a protobuf object.
print("Type of the created op is: {}".format(type(op)))
print("Content:\n")
print(str(op))

Type of the created op is: <class 'caffe2.proto.caffe2_pb2.OperatorDef'>
Content:

input: "X"
output: "Y"
name: ""
type: "Relu"
After creating the op, we feed the input X into the current workspace and run the operator with RunOperatorOnce. Then we compare the result with what we expect.
workspace.FeedBlob("X", np.random.randn(2, 3).astype(np.float32))
workspace.RunOperatorOnce(op)

print("Current blobs in the workspace: {}\n".format(workspace.Blobs()))
print("X:\n{}\n".format(workspace.FetchBlob("X")))
print("Y:\n{}\n".format(workspace.FetchBlob("Y")))
print("Expected:\n{}\n".format(np.maximum(workspace.FetchBlob("X"), 0)))

Current blobs in the workspace: ['X', 'Y']
X:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244 -0.54222912]]
Y:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244  0.        ]]
Expected:
[[ 1.03125858  1.0038228   0.0066975 ]
 [ 1.33142471  1.80271244  0.        ]]
In addition, an operator is more abstract than a layer. Besides replacing the layer class, an operator can take no inputs at all and still produce outputs, which makes it a data generator; this is commonly used to initialize weights. The following snippet could be used for weight initialization.
op = core.CreateOperator(
    "GaussianFill",
    [],  # GaussianFill does not need any parameters.
    ["W"],
    shape=[100, 100],  # shape argument as a list of ints.
    mean=1.0,  # mean as a single float
    std=1.0,  # std as a single float
)
print("Content of op:\n")
print(str(op))

Content of op:

output: "W"
name: ""
type: "GaussianFill"
arg { name: "std" f: 1.0 }
arg { name: "shape" ints: 100 ints: 100 }
arg { name: "mean" f: 1.0 }

workspace.RunOperatorOnce(op)
temp = workspace.FetchBlob("W")
pyplot.hist(temp.flatten(), bins=50)
pyplot.title("Distribution of W")
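3. Nets
A net is a collection of operators; in essence it is a computation graph made of operators. In Caffe2, core.Net wraps the NetDef protobuf from the source. As an example, let's create a net that implements the following formula.
X = np.random.randn(2, 3)
W = np.random.randn(5, 3)
b = np.ones(5)
Y = X * W^T + b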
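Before building the net, it may help to see the formula with concrete numbers. The following is a plain-Python sketch of Y = X * W^T + b, not Caffe2 code; the helper name `fc` is hypothetical and the input values are made up so the shapes are easy to follow.

```python
def fc(x, w, b):
    """Compute y = x * w^T + b for x of shape (N, D), w of shape (M, D), b of length M."""
    return [
        [sum(xi * wi for xi, wi in zip(row, w_row)) + b_j
         for w_row, b_j in zip(w, b)]
        for row in x
    ]


x = [[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]                   # shape (2, 3)
w = [[float(i == j) for j in range(3)] for i in range(5)]  # shape (5, 3)
b = [1.0] * 5                                              # shape (5,)

y = fc(x, w, b)                                            # shape (2, 5)
print(len(y), len(y[0]))  # 2 5
print(y[0])               # [2.0, 3.0, 4.0, 1.0, 1.0]
```

Each row of W holds the weights of one output unit, which is why a (5, 3) weight matrix maps a (2, 3) input to a (2, 5) output.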
First, create the net:
net = core.Net("my_first_net")
print("Current network proto:\n\n{}".format(net.Proto()))

Current network proto:

name: "my_first_net"

Next, generate the weights and the input, creating the operators through core.Net:
X = net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0)
print("New network proto:\n\n{}".format(net.Proto()))
W = net.GaussianFill([], ["W"], mean=0.0, std=1.0, shape=[5, 3], run_once=0)
b = net.ConstantFill([], ["b"], shape=[5,], value=1.0, run_once=0)
Produce the output:
Y = net.FC([X, W, b], ["Y"])
Print the current net:
print("Current network proto:\n\n{}".format(net.Proto()))

Current network proto:

name: "my_first_net"
op {
  output: "X"
  name: ""
  type: "GaussianFill"
  arg { name: "std" f: 1.0 }
  arg { name: "run_once" i: 0 }
  arg { name: "shape" ints: 2 ints: 3 }
  arg { name: "mean" f: 0.0 }
}
op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg { name: "std" f: 1.0 }
  arg { name: "run_once" i: 0 }
  arg { name: "shape" ints: 5 ints: 3 }
  arg { name: "mean" f: 0.0 }
}
op {
  output: "b"
  name: ""
  type: "ConstantFill"
  arg { name: "run_once" i: 0 }
  arg { name: "shape" ints: 5 }
  arg { name: "value" f: 1.0 }
}
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}
We can also draw the net we have defined:
from caffe2.python import net_drawer
from IPython import display
graph = net_drawer.GetPydotGraph(net, rankdir="LR")
display.Image(graph.create_png(), width=800)
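As with operators, so far we have only defined the net; nothing has been computed yet. When a net is run from Python, two things happen at the C++ level: a C++ net object is instantiated from the protobuf, and then the Run() function of that instantiated net is called.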
There are two ways to run a net in Python:
Method 1:
workspace.ResetWorkspace()
print("Current blobs in the workspace: {}".format(workspace.Blobs()))
workspace.RunNetOnce(net)
print("Blobs in the workspace after execution: {}".format(workspace.Blobs()))
# Let's dump the contents of the blobs
for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))

Current blobs in the workspace: []
Blobs in the workspace after execution: ['W', 'X', 'Y', 'b']
W:
[[-0.29295802  0.02897477 -1.25667715]
 [-1.82299471  0.92877913  0.33613944]
 [-0.64382178 -0.68545657 -0.44015241]
 [ 1.10232282  1.38060772 -2.29121733]
 [-0.55766547  1.97437167  0.39324901]]
X:
[[-0.47522315 -0.40166432  0.7179445 ]
 [-0.8363331  -0.82451206  1.54286408]]
Y:
[[ 0.22535783  1.73460138  1.2652775  -1.72335696  0.7543118 ]
 [-0.71776152  2.27745867  1.42452145 -4.59527397  0.4452306 ]]
b:
[ 1.  1.  1.  1.  1.]
Method 2:
workspace.ResetWorkspace()
print("Current blobs in the workspace: {}".format(workspace.Blobs()))
workspace.CreateNet(net)
workspace.RunNet(net.Proto().name)
print("Blobs in the workspace after execution: {}".format(workspace.Blobs()))
for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))

Current blobs in the workspace: []
Blobs in the workspace after execution: ['W', 'X', 'Y', 'b']
W:
[[-0.29295802  0.02897477 -1.25667715]
 [-1.82299471  0.92877913  0.33613944]
 [-0.64382178 -0.68545657 -0.44015241]
 [ 1.10232282  1.38060772 -2.29121733]
 [-0.55766547  1.97437167  0.39324901]]
X:
[[-0.47522315 -0.40166432  0.7179445 ]
 [-0.8363331  -0.82451206  1.54286408]]
Y:
[[ 0.22535783  1.73460138  1.2652775  -1.72335696  0.7543118 ]
 [-0.71776152  2.27745867  1.42452145 -4.59527397  0.4452306 ]]
b:
[ 1.  1.  1.  1.  1.]
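You may wonder why there are two ways to run a net. This becomes clear in practice later on; for now, just remember that both exist.
To summarize, in Caffe2:
With the basic concepts of workspace, operator, and net in place, we now walk through a few examples from the official Caffe2 documentation.
The first example illustrates the overall ideas behind the Caffe2 framework: building a net, initializing parameters, training, and graphs.
Suppose we want to train a simple network to fit the following regression function:
y = wx + b, where w = [2.0, 1.5] and b = 0.5
Training data is normally read from an external source. Here we generate it directly with Caffe2 operators; a later example shows how to read data from outside.
First, import the required packages: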
from caffe2.python import core, cnn, net_drawer, workspace, visualize
import numpy as np
from IPython import display
from matplotlib import pyplot
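Here we first need to build two net graphs: one for initialization (init_net) and one for training (train_net).
This is where Caffe2's approach differs from Caffe's. In Caffe, the training network definition specifies how parameters are initialized; when the network is loaded, the program initializes the weights automatically from that definition, and we only need to let the solver run forward and backward passes and update the parameters. In Caffe2, everything (building the network, initialization, gradient generation, gradient updates) is expressed with operators, and all generation and flow of data must appear in a graph. Initialization therefore needs its own operators, and the net made of those operators is kept separate and called the initialization net. The code makes this clearer.
First, we create a net that holds the ground-truth parameters used to generate the training data and initializes the weights to be learned.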
init_net = core.Net("init")
# The ground truth parameters.
W_gt = init_net.GivenTensorFill(
    [], "W_gt", shape=[1, 2], values=[2.0, 1.5])
B_gt = init_net.GivenTensorFill([], "B_gt", shape=[1], values=[0.5])
# Constant value ONE is used in weighted sum when updating parameters.
ONE = init_net.ConstantFill([], "ONE", shape=[1], value=1.)
# ITER is the iterator count.
ITER = init_net.ConstantFill([], "ITER", shape=[1], value=0, dtype=core.DataType.INT32)

# For the parameters to be learned: we randomly initialize weight
# from [-1, 1] and init bias with 0.0.
W = init_net.UniformFill([], "W", shape=[1, 2], min=-1., max=1.)
B = init_net.ConstantFill([], "B", shape=[1], value=0.0)
print('Created init net.')
Next, we define the net used for training.
train_net = core.Net("train")
# First, we generate random samples of X and create the ground truth.
X = train_net.GaussianFill([], "X", shape=[64, 2], mean=0.0, std=1.0, run_once=0)
Y_gt = X.FC([W_gt, B_gt], "Y_gt")
# We add Gaussian noise to the ground truth
noise = train_net.GaussianFill([], "noise", shape=[64, 1], mean=0.0, std=1.0, run_once=0)
Y_noise = Y_gt.Add(noise, "Y_noise")
# Note that we do not need to propagate the gradients back through Y_noise,
# so we mark StopGradient to notify the auto differentiating algorithm
# to ignore this path.
Y_noise = Y_noise.StopGradient([], "Y_noise")

# Now, for the normal linear regression prediction, this is all we need.
Y_pred = X.FC([W, B], "Y_pred")

# The loss function is computed by a squared L2 distance, and then averaged
# over all items in the minibatch.
dist = train_net.SquaredL2Distance([Y_noise, Y_pred], "dist")
loss = dist.AveragedLoss([], ["loss"])
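Let's draw the graph of the training net we defined:
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)
From this graph we can see that init_net provides the ground-truth parameters, the initial weights W and B, and the constants needed during computation, while train_net generates the samples and builds the forward computation.
We have not yet defined the backward pass. Like many other deep learning frameworks, Caffe2 supports automatic differentiation and generates the gradient operators for us.
Next, we add the gradient operators to train_net:
# Get gradients for all the computations above.
gradient_map = train_net.AddGradientOperators([loss])
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)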
The second half of the net now computes gradients and outputs the gradient of each learnable parameter. With these gradients and the learning rate for the current iteration, we can update the parameters by gradient descent.
Next, we add the SGD update to train_net:
# Increment the iteration by one.
train_net.Iter(ITER, ITER)
# Compute the learning rate that corresponds to the iteration.
LR = train_net.LearningRate(ITER, "LR", base_lr=-0.1, policy="step", stepsize=20, gamma=0.9)

# Weighted sum
train_net.WeightedSum([W, ONE, gradient_map[W], LR], W)
train_net.WeightedSum([B, ONE, gradient_map[B], LR], B)

# Let's show the graph again.
graph = net_drawer.GetPydotGraph(train_net.Proto().op, "train", rankdir="LR")
display.Image(graph.create_png(), width=800)
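The arithmetic behind the update can be checked with scalars. The sketch below is plain Python, not Caffe2: it assumes the "step" policy multiplies base_lr by gamma once every stepsize iterations, and it treats WeightedSum as a sum of value times weight over its input pairs. The helper names `step_learning_rate` and `weighted_sum` are hypothetical.

```python
def step_learning_rate(iteration, base_lr, stepsize, gamma):
    """Assumed "step" policy: scale base_lr by gamma once per `stepsize` iterations."""
    return base_lr * gamma ** (iteration // stepsize)


def weighted_sum(*pairs):
    """Sum of value * weight over (value, weight) pairs, like WeightedSum on scalars."""
    return sum(value * weight for value, weight in pairs)


ONE = 1.0
w, grad_w = 1.0, 0.5   # made-up parameter and gradient values

lr_start = step_learning_rate(1, base_lr=-0.1, stepsize=20, gamma=0.9)
lr_later = step_learning_rate(20, base_lr=-0.1, stepsize=20, gamma=0.9)

# Same shape as WeightedSum([W, ONE, gradient_map[W], LR], W):
# new_w = w * ONE + grad_w * lr. The rate is negative, so w moves against the gradient.
new_w = weighted_sum((w, ONE), (grad_w, lr_start))
print(lr_start, lr_later, new_w)
```

This also explains why base_lr is negative in the tutorial code: the update is written as an addition, so the sign of the learning rate carries the descent direction.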
At this point, the model's parameter initialization, forward pass, backward pass, and gradient update are all defined with operators. This is the power of operators in Caffe2, and it makes Caffe2 far more flexible than Caffe. Note that we have only defined the nets so far and have not run them. Let's run them now:
workspace.RunNetOnce(init_net)
workspace.CreateNet(train_net)
print("Before training, W is: {}".format(workspace.FetchBlob("W")))
print("Before training, B is: {}".format(workspace.FetchBlob("B")))

True
Before training, W is: [[-0.77634162 -0.88467366]]
Before training, B is: [ 0.]

# run the train net 100 times
for i in range(100):
    workspace.RunNet(train_net.Proto().name)

print("After training, W is: {}".format(workspace.FetchBlob("W")))
print("After training, B is: {}".format(workspace.FetchBlob("B")))
print("Ground truth W is: {}".format(workspace.FetchBlob("W_gt")))
print("Ground truth B is: {}".format(workspace.FetchBlob("B_gt")))
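Note that we used two different ways of running a net here, RunNetOnce and RunNet. Remember the two ways of running a net?
At first I did not understand why there are two ways, but with init_net and train_net side by side it becomes clear. RunNetOnce is for nets that generate weights and data, typically for initialization. Such a net runs once, its outputs stay in the current workspace, and the net itself is no longer needed, so it is destroyed right away. RunNet is for nets that run repeatedly, such as a training net: call CreateNet once at the start, then call RunNet in every iteration to keep updating the parameters.
The training result is:
After training, W is: [[ 1.95769441  1.47348857]]
After training, B is: [ 0.45236012]
Ground truth W is: [[ 2.   1.5]]
Ground truth B is: [ 0.5]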
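For readers without Caffe2 installed, the same experiment can be approximated in plain Python. This is a sketch under assumptions, not a port of the nets above: the function name `train_linear_regression` is hypothetical, the loss is taken to be the batch mean of half the squared error, the learning rate follows an assumed step schedule, and the random numbers differ from Caffe2's, so the learned values will not match the ones printed above digit for digit.

```python
import random


def train_linear_regression(steps=100, batch_size=64, seed=0):
    """Fit y = w.x + b with minibatch SGD; returns the learned (w, b)."""
    rng = random.Random(seed)
    w_gt, b_gt = [2.0, 1.5], 0.5                            # ground truth, as in init_net
    w = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]    # plays the role of UniformFill
    b = 0.0                                                 # plays the role of ConstantFill
    for it in range(1, steps + 1):
        lr = 0.1 * 0.9 ** (it // 20)                        # assumed step schedule
        grad_w, grad_b = [0.0, 0.0], 0.0
        for _ in range(batch_size):
            x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            y_noise = w_gt[0] * x[0] + w_gt[1] * x[1] + b_gt + rng.gauss(0.0, 1.0)
            err = (w[0] * x[0] + w[1] * x[1] + b) - y_noise
            # Accumulate the gradient of the batch mean of 0.5 * err**2.
            grad_w[0] += err * x[0] / batch_size
            grad_w[1] += err * x[1] / batch_size
            grad_b += err / batch_size
        # Gradient descent step on every parameter.
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


w, b = train_linear_regression()
print("learned W:", w, "learned B:", b)
```

As in the Caffe2 run, the learned parameters should land close to the ground truth but not exactly on it, because every batch carries Gaussian noise.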
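To summarize:
Finally, note that this example builds the net directly from operators. For common deep networks this becomes very tedious, so Caffe2 provides the ModelHelper class to simplify construction; for instance, the common layers of a convolutional network can be built directly through it. Later examples show this.
As is well known, training a network requires a series of data preprocessing steps. Caffe and Caffe2 handle this the same way; both go through XXX and similar steps. Since there is no difference, no example is given here. See the official tutorial Image Pre-Processing, which explains it very clearly. Thumbs up.
First, we use a download module defined in Caffe2 to fetch a pretrained model. Entering the following command on the command line downloads the pretrained squeezenet model:
python -m caffe2.python.models.download -i squeezenet
When the download finishes, there is a squeezenet folder under caffe2/python/models containing two files, init_net.pb and predict_net.pb, which hold the weights and the network definition respectively.
In Python, the model's network definition and weights live in a Caffe2 workspace, loaded as blobs, init_net, and predict_net. We pass the two protobufs to workspace.Predictor, and Caffe2 takes care of the rest.
So loading a model for prediction takes only a few steps:
1. Read the protobuf files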
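# The .pb files are binary protobufs, so open them in binary mode.
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()
2. Use the Predictor in workspace to load the blobs read from the protobufs:
p = workspace.Predictor(init_net, predict_net)
3. Run the net and get the results:
results = p.run([img])
Note that img here is an image that has already been preprocessed.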
Below is a complete example from the official documentation:
First, set up the file paths and import the usual packages:
# where you installed caffe2. Probably '~/caffe2' or '~/src/caffe2'.
CAFFE2_ROOT = "~/caffe2"
# assumes being a subdirectory of caffe2
CAFFE_MODELS = "~/caffe2/caffe2/python/models"
# if you have a mean file, place it in the same dir as the model

%matplotlib inline
from caffe2.proto import caffe2_pb2
import numpy as np
import skimage.io
import skimage.transform
from matplotlib import pyplot
import os
from caffe2.python import core, workspace
import urllib2
print("Required modules imported.")

IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"

# What model are we using? You should have already converted or downloaded one.
# format below is the model's:
# folder, INIT_NET, predict_net, mean, input image size
# you can switch the comments on MODEL to try out different model conversions
MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227

# codes - these help decypher the output and source from a list from AlexNet's
# object codes to provide an result like "tabby cat" or "lemon" depending on
# what's in the picture you submit to the neural network.
# The list of output codes for the AlexNet models (also squeezenet)
codes = "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes"
print("Config set!")
Define the preprocessing functions:
def crop_center(img, cropx, cropy):
    y, x, c = img.shape
    startx = x // 2 - (cropx // 2)
    starty = y // 2 - (cropy // 2)
    return img[starty:starty + cropy, startx:startx + cropx]

def rescale(img, input_height, input_width):
    print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
    print("Model's input shape is %dx%d" % (input_height, input_width))
    aspect = img.shape[1] / float(img.shape[0])
    print("Orginal aspect ratio: " + str(aspect))
    if aspect > 1:
        # landscape orientation - wide image
        res = int(aspect * input_height)
        imgScaled = skimage.transform.resize(img, (input_width, res))
    if aspect < 1:
        # portrait orientation - tall image
        res = int(input_width / aspect)
        imgScaled = skimage.transform.resize(img, (res, input_height))
    if aspect == 1:
        imgScaled = skimage.transform.resize(img, (input_width, input_height))
    pyplot.figure()
    pyplot.imshow(imgScaled)
    pyplot.axis('on')
    pyplot.title('Rescaled image')
    print("New image shape:" + str(imgScaled.shape) + " in HWC")
    return imgScaled
print("Functions set.")

# set paths and variables from model choice and prep image
CAFFE2_ROOT = os.path.expanduser(CAFFE2_ROOT)
CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS)

# mean can be 128 or custom based on the model
# gives better results to remove the colors found in all of the training images
MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3])
if not os.path.exists(MEAN_FILE):
    mean = 128
else:
    mean = np.load(MEAN_FILE).mean(1).mean(1)
    mean = mean[:, np.newaxis, np.newaxis]
print("mean was set to: {}".format(mean))

# some models were trained with different image sizes, this helps you calibrate your image
INPUT_IMAGE_SIZE = MODEL[4]

# make sure all of the files are around...
if not os.path.exists(CAFFE2_ROOT):
    print("Houston, you may have a problem.")
INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1])
print("INIT_NET = {}".format(INIT_NET))
PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2])
print("PREDICT_NET = {}".format(PREDICT_NET))
if not os.path.exists(INIT_NET):
    print(INIT_NET + " not found!")
else:
    print("Found {} ...Now looking for {}".format(INIT_NET, PREDICT_NET))
    if not os.path.exists(PREDICT_NET):
        print("Caffe model file, " + PREDICT_NET + " was not found!")
    else:
        print("All needed files found! Loading the model in the next block.")

# load and transform image
img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)
img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE)
print("After crop: {}".format(img.shape))
pyplot.figure()
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('Cropped')

# switch to CHW
img = img.swapaxes(1, 2).swapaxes(0, 1)
pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i+1)
    pyplot.imshow(img[i])
    pyplot.axis('off')
    pyplot.title('RGB channel %d' % (i+1))

# switch to BGR
img = img[(2, 1, 0), :, :]

# remove mean for better results
img = img * 255 - mean

# add batch size
img = img[np.newaxis, :, :, :].astype(np.float32)
print("NCHW: {}".format(img.shape))
Running it prints:
Functions set.
mean was set to: 128
INIT_NET = /home/aaron/models/squeezenet/init_net.pb
PREDICT_NET = /home/aaron/models/squeezenet/predict_net.pb
Found /home/aaron/models/squeezenet/init_net.pb ...Now looking for /home/aaron/models/squeezenet/predict_net.pb
All needed files found! Loading the model in the next block.
Original image shape:(751, 1280, 3) and remember it should be in H, W, C!
Model's input shape is 227x227
Orginal aspect ratio: 1.70439414115
New image shape:(227, 386, 3) in HWC
After crop: (227, 227, 3)
NCHW: (1, 3, 227, 227)
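The shapes in this log can be reproduced with integer arithmetic alone. The sketch below is plain Python that follows the same arithmetic as rescale and crop_center without touching any image data; the helper names `rescaled_shape` and `crop_start` are hypothetical.

```python
def rescaled_shape(height, width, target):
    """Shape (H, W) after scaling the shorter side to `target`, keeping the aspect ratio."""
    aspect = width / float(height)
    if aspect > 1:      # landscape: height becomes target, width scales with the aspect
        return target, int(aspect * target)
    if aspect < 1:      # portrait: width becomes target, height scales with the aspect
        return int(target / aspect), target
    return target, target


def crop_start(height, width, crop):
    """Top-left corner (row, col) of a centered crop of size crop x crop."""
    return height // 2 - crop // 2, width // 2 - crop // 2


new_h, new_w = rescaled_shape(751, 1280, 227)
start_row, start_col = crop_start(new_h, new_w, 227)
print(new_h, new_w)          # 227 386
print(start_row, start_col)  # 0 80
```

So the 751 x 1280 photo is first shrunk to 227 x 386, and the centered crop then keeps columns 80 to 306, which gives the 227 x 227 input the model expects.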
Once the image has been preprocessed, we can load and run the net following the steps described earlier.
# initialize the neural net
with open(INIT_NET, "rb") as f:
    init_net = f.read()
with open(PREDICT_NET, "rb") as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)

# run the net and return prediction
results = p.run([img])

# turn it into something we can play with and examine which is in a multi-dimensional array
results = np.asarray(results)
print("results shape: {}".format(results.shape))

results shape: (1, 1, 1000, 1, 1)
The output contains 1000 values, the probabilities that the image belongs to each of 1000 classes. We can take the highest probability and look up the label it corresponds to:
# the rest of this is digging through the results
results = np.delete(results, 1)
index = 0
highest = 0
arr = np.empty((0, 2), dtype=object)
arr[:, 0] = int(10)
arr[:, 1:] = float(10)
for i, r in enumerate(results):
    # imagenet index begins with 1!
    i = i + 1
    arr = np.append(arr, np.array([[i, r]]), axis=0)
    if (r > highest):
        highest = r
        index = i

print("{} :: {}".format(index, highest))

# lookup the code and return the result
# top 3 results
# sorted(arr, key=lambda x: x[1], reverse=True)[:3]

# now we can grab the code list
response = urllib2.urlopen(codes)

# and lookup our result from the list
for line in response:
    code, result = line.partition(":")[::2]
    if (code.strip() == str(index)):
        print(result.strip()[1:-2])

985 :: 0.979059
daisy
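The commented-out "top 3 results" line hints at a more compact way to rank predictions. Here is a plain-Python sketch of that idea; the helper name `top_k` is hypothetical and the scores are made-up numbers, not output of the model.

```python
def top_k(scores, k=3, first_index=0):
    """Return the k highest (index, score) pairs, best first."""
    ranked = sorted(enumerate(scores, first_index), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


scores = [0.01, 0.70, 0.05, 0.20, 0.04]   # made-up probabilities, not model output
best = top_k(scores, k=3)
print(best)  # [(1, 0.7), (3, 0.2), (2, 0.05)]
```

Passing first_index=1 would number the classes from 1, as the loop above does.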
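1. Models, helper functions, and brew
So far we have covered the basic Caffe2 operations in Python.
In this example we build a simple CNN model. One point needs to be made here:
Keep this distinction in mind, otherwise it is easy to get confused. For example, to build a model with a single FC layer, we use ModelHelper to represent the model and operators to construct the net. A model usually has a param_init_net and a net, used for parameter initialization and for training respectively: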
model = model_helper.ModelHelper(name="train")
# initialize your weight
weight = model.param_init_net.XavierFill(
    [],
    blob_out + '_w',
    shape=[dim_out, dim_in],
    **kwargs  # maybe indicating weight should be on GPU here
)
# initialize your bias
bias = model.param_init_net.ConstantFill(
    [],
    blob_out + '_b',
    shape=[dim_out, ],
    **kwargs
)
# finally building FC
model.net.FC([blob_in, weight, bias], blob_out, **kwargs)
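As mentioned earlier, in day-to-day work we usually do not build a network purely from operators, because every parameter must be initialized by hand and every operator constructed individually, which is too tedious. For commonly used layers it is natural to wrap the operators that build them into a function: we supply only the necessary arguments, and the function takes care of initializing the layer and creating its operators.
To make building networks easier, Caffe2 provides many helper functions under python/helpers. The FC layer in the example above can be built with python/helpers/fc.py in a single line:
fcLayer = fc(model, blob_in, blob_out, **kwargs)  # returns a blob reference
These helper functions automatically split weight initialization and computation into the two nets, which simplifies things a lot. For easier use and management, Caffe2 collects the helper functions in the brew module, which we import to call them. The FC layer above can be written as:
from caffe2.python import brew
brew.fc(model, blob_in, blob_out, ...)
Building a network with brew is very simple. The following code builds a LeNet model: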
from caffe2.python import brew

def AddLeNetModel(model, data):
    conv1 = brew.conv(model, data, 'conv1', 1, 20, 5)
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5)
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
Through brew, Caffe2 provides many helper functions for constructing networks, which greatly simplifies the process. They are only wrappers, though: the principle is the same as building the net from operators, as described earlier.
2. Build a CNN model for the MNIST handwritten digit dataset
First, import the required packages:
%matplotlib inline
from matplotlib import pyplot
import numpy as np
import os
import shutil

from caffe2.python import core, model_helper, net_drawer, workspace, visualize, brew

# If you would like to see some really detailed initializations,
# you can change --caffe2_log_level=0 to --caffe2_log_level=-1
core.GlobalInit(['caffe2', '--caffe2_log_level=0'])
print("Necessities imported!")
Download the MNIST dataset and convert it to leveldb:
./make_mnist_db --channel_first --db leveldb \
    --image_file ~/Downloads/train-images-idx3-ubyte \
    --label_file ~/Downloads/train-labels-idx1-ubyte \
    --output_file ~/caffe2_notebooks/tutorial_data/mnist/mnist-train-nchw-leveldb
./make_mnist_db --channel_first --db leveldb \
    --image_file ~/Downloads/t10k-images-idx3-ubyte \
    --label_file ~/Downloads/t10k-labels-idx1-ubyte \
    --output_file ~/caffe2_notebooks/tutorial_data/mnist/mnist-test-nchw-leveldb
# This section preps your image and test set in a leveldb
current_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks')

data_folder = os.path.join(current_folder, 'tutorial_data', 'mnist')
root_folder = os.path.join(current_folder, 'tutorial_files', 'tutorial_mnist')
image_file_train = os.path.join(data_folder, "train-images-idx3-ubyte")
label_file_train = os.path.join(data_folder, "train-labels-idx1-ubyte")
image_file_test = os.path.join(data_folder, "t10k-images-idx3-ubyte")
label_file_test = os.path.join(data_folder, "t10k-labels-idx1-ubyte")

# Get the dataset if it is missing
def DownloadDataset(url, path):
    import requests, zipfile, io
    print("Downloading... {} to {}".format(url, path))
    r = requests.get(url, stream=True)
    # The zip archive is binary data, so wrap it in a bytes buffer.
    z = zipfile.ZipFile(io.BytesIO(r.content))
    z.extractall(path)

def GenerateDB(image, label, name):
    name = os.path.join(data_folder, name)
    print("DB: {}".format(name))
    if not os.path.exists(name):
        syscall = "/usr/local/bin/make_mnist_db --channel_first --db leveldb --image_file " + image + " --label_file " + label + " --output_file " + name
        # print("Creating database with: " + syscall)
        os.system(syscall)
    else:
        print("Database exists already. Delete the folder if you have issues/corrupted DB, then rerun this.")
        if os.path.exists(os.path.join(name, "LOCK")):
            # print("Deleting the pre-existing lock file")
            os.remove(os.path.join(name, "LOCK"))

if not os.path.exists(data_folder):
    os.makedirs(data_folder)
if not os.path.exists(label_file_train):
    DownloadDataset("https://download.caffe2.ai/datasets/mnist/mnist.zip", data_folder)

if os.path.exists(root_folder):
    print("Looks like you ran this before, so we need to cleanup those old files...")
    shutil.rmtree(root_folder)

os.makedirs(root_folder)
workspace.ResetWorkspace(root_folder)

# (Re)generate the leveldb database (known to get corrupted...)
GenerateDB(image_file_train, label_file_train, "mnist-train-nchw-leveldb")
GenerateDB(image_file_test, label_file_test, "mnist-test-nchw-leveldb")

print("training data folder:" + data_folder)
print("workspace root folder:" + root_folder)
Here we use ModelHelper to represent our model and use brew and operators to build it. A ModelHelper contains two nets, param_init_net and net, which are the initialization net and the main training net.
Let's build the model step by step, one block at a time:
(1) Input (AddInput function)
(2) Network computation (AddLeNetModel function)
(3) Training: gradient operators, parameter updates, and so on (AddTrainingOperators function)
(4) Bookkeeping: printing some statistics for inspection (AddBookkeepingOperators function)
(1) Input (AddInput function)
AddInput loads data from the DB and yields data and label:
- data with shape `(batch_size, num_channels, width, height)`
  - in this case `[batch_size, 1, 28, 28]` of data type *uint8*
- label with shape `[batch_size]` of data type *int*
def AddInput(model, batch_size, db, db_type):
    # load the data
    data_uint8, label = model.TensorProtosDBInput(
        [], ["data_uint8", "label"], batch_size=batch_size,
        db=db, db_type=db_type)
    # cast the data to float
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    # scale data from [0,255] down to [0,1]
    data = model.Scale(data, data, scale=float(1./256))
    # don't need the gradient for the backward pass
    data = model.StopGradient(data, data)
    return data, label
A few notes on AddInput. The data is first cast to float because the computation is mostly floating point. For numerical stability the image is scaled from [0, 255] down to [0, 1]; this is done in place, so the unscaled values are not kept. This part needs no gradient in the backward pass, so StopGradient blocks gradient propagation: when gradients are generated automatically, this operator and the operators before it are left untouched.
On top of this we add the network with AddLeNetModel, plus AddAccuracy to track the model's accuracy:
def AddLeNetModel(model, data):
    # Image size: 28 x 28 -> 24 x 24
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
    # Image size: 24 x 24 -> 12 x 12
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    # Image size: 12 x 12 -> 8 x 8
    conv2 = brew.conv(model, pool1, 'conv2', dim_in=20, dim_out=50, kernel=5)
    # Image size: 8 x 8 -> 4 x 4
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    # 50 * 4 * 4 stands for dim_out from previous layer multiplied by the image size
    fc3 = brew.fc(model, pool2, 'fc3', dim_in=50 * 4 * 4, dim_out=500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, 10)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax

def AddAccuracy(model, softmax, label):
    accuracy = model.Accuracy([softmax, label], "accuracy")
    return accuracy
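The image sizes in the comments, and the 50 * 4 * 4 passed to fc3, follow from the usual output-size formulas for convolution and pooling without padding. A plain-Python sketch of that arithmetic follows; the helper names `conv_out` and `pool_out` are hypothetical, and stride 1 with zero padding for the convolutions is an assumption consistent with the sizes in the comments.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width of a convolution over an input of width `size`."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output width of a pooling window sliding over an input of width `size`."""
    return (size - kernel) // stride + 1


sizes = [28]                                           # MNIST images are 28 x 28
sizes.append(conv_out(sizes[-1], kernel=5))            # conv1
sizes.append(pool_out(sizes[-1], kernel=2, stride=2))  # pool1
sizes.append(conv_out(sizes[-1], kernel=5))            # conv2
sizes.append(pool_out(sizes[-1], kernel=2, stride=2))  # pool2

# 50 output channels from conv2, each a 4 x 4 map, flattened for fc3.
fc3_dim_in = 50 * sizes[-1] * sizes[-1]
print(sizes, fc3_dim_in)  # [28, 24, 12, 8, 4] 800
```

If the kernel size or the input resolution changes, dim_in of fc3 has to be recomputed the same way.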
Next we add gradient generation and the parameter update. This is done in AddTrainingOperators and works on the same principle as in the earlier example.
def AddTrainingOperators(model, softmax, label):
    # something very important happens here
    xent = model.LabelCrossEntropy([softmax, label], 'xent')
    # compute the expected loss
    loss = model.AveragedLoss(xent, "loss")
    # track the accuracy of the model
    AddAccuracy(model, softmax, label)
    # use the average loss we just computed to add gradient operators to the model
    model.AddGradientOperators([loss])
    # do a simple stochastic gradient descent
    ITER = model.Iter("iter")
    # set the learning rate schedule
    LR = model.LearningRate(
        ITER, "LR", base_lr=-0.1, policy="step", stepsize=1, gamma=0.999)
    # ONE is a constant value that is used in the gradient update. We only need
    # to create it once, so it is explicitly placed in param_init_net.
    ONE = model.param_init_net.ConstantFill([], "ONE", shape=[1], value=1.0)
    # Now, for each parameter, we do the gradient updates.
    for param in model.params:
        # Note how we get the gradient of each parameter - ModelHelper keeps
        # track of that.
        param_grad = model.param_to_grad[param]
        # The update is a simple weighted sum: param = param + param_grad * LR
        model.WeightedSum([param, ONE, param_grad, LR], param)
    # let's checkpoint every 20 iterations, which should probably be fine.
    # you may need to delete tutorial_files/tutorial-mnist to re-run the tutorial
    model.Checkpoint([ITER] + model.params, [],
                     db="mnist_lenet_checkpoint_%05d.leveldb",
                     db_type="leveldb", every=20)
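To see what the loss and accuracy blobs contain, here is a plain-Python sketch of the three quantities on a tiny made-up batch. It assumes that LabelCrossEntropy takes the negative log of the probability assigned to the true label and that AveragedLoss is the mean over the batch; the function names below are hypothetical and this is not Caffe2 code.

```python
import math


def label_cross_entropy(probs, labels):
    """Per-example -log(probability assigned to the true label)."""
    return [-math.log(row[label]) for row, label in zip(probs, labels)]


def averaged_loss(values):
    """Mean over the minibatch."""
    return sum(values) / len(values)


def accuracy(probs, labels):
    """Fraction of examples whose highest-probability class is the true label."""
    hits = sum(1 for row, label in zip(probs, labels) if row.index(max(row)) == label)
    return hits / float(len(labels))


# Made-up softmax outputs for a batch of 3 examples and 3 classes.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1],
         [0.3, 0.3, 0.4]]
labels = [0, 1, 0]

xent = label_cross_entropy(probs, labels)
loss = averaged_loss(xent)
acc = accuracy(probs, labels)
print(loss, acc)
```

The third example is misclassified, so it lowers the accuracy and contributes the largest term to the loss.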
Next, AddBookkeepingOperators prints some statistics for us to inspect later. This part does not affect training; it only collects statistics and writes logs.
def AddBookkeepingOperators(model):
    # Print basically prints out the content of the blob. to_file=1 routes the
    # printed output to a file. The file is going to be stored under
    #     root_folder/[blob name]
    model.Print('accuracy', [], to_file=1)
    model.Print('loss', [], to_file=1)
    # Summarizes the parameters. Different from Print, Summarize gives some
    # statistics of the parameter, such as mean, std, min and max.
    for param in model.params:
        model.Summarize(param, [], to_file=1)
        model.Summarize(model.param_to_grad[param], [], to_file=1)
    # Now, if we really want to be verbose, we can summarize EVERY blob
    # that the model produces; it is probably not a good idea, because that
    # is going to take time - summarization do not come for free. For this
    # demo, we will only show how to summarize the parameters and their
    # gradients.
print("Bookkeeping function created")
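To recap, we did four things:
(1) Input (AddInput function)
(2) Network computation (AddLeNetModel function)
(3) Training: gradient operators, parameter updates, and so on (AddTrainingOperators function)
(4) Bookkeeping: printing some statistics for inspection (AddBookkeepingOperators function)
With the basic pieces defined, we now create the models: a training model for training, a test model for evaluation, and a deploy model for deployment: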
arg_scope = {"order": "NCHW"}
train_model = model_helper.ModelHelper(name="mnist_train", arg_scope=arg_scope)
data, label = AddInput(
    train_model, batch_size=64,
    db=os.path.join(data_folder, 'mnist-train-nchw-leveldb'),
    db_type='leveldb')
softmax = AddLeNetModel(train_model, data)
AddTrainingOperators(train_model, softmax, label)
AddBookkeepingOperators(train_model)

# Testing model. We will set the batch size to 100, so that the testing
# pass is 100 iterations (10,000 images in total).
# For the testing model, we need the data input part, the main LeNetModel
# part, and an accuracy part. Note that init_params is set False because
# we will be using the parameters obtained from the train model.
test_model = model_helper.ModelHelper(
    name="mnist_test", arg_scope=arg_scope, init_params=False)
data, label = AddInput(
    test_model, batch_size=100,
    db=os.path.join(data_folder, 'mnist-test-nchw-leveldb'),
    db_type='leveldb')
softmax = AddLeNetModel(test_model, data)
AddAccuracy(test_model, softmax, label)

# Deployment model. We simply need the main LeNetModel part.
deploy_model = model_helper.ModelHelper(
    name="mnist_deploy", arg_scope=arg_scope, init_params=False)
AddLeNetModel(deploy_model, "data")
# You may wonder what happens with the param_init_net part of the deploy_model.
# No, we will not use them, since during deployment time we will not randomly
# initialize the parameters, but load the parameters from the db.
Run the net and plot the loss curve:
# The parameter initialization network only needs to be run once.
workspace.RunNetOnce(train_model.param_init_net)
# creating the network
workspace.CreateNet(train_model.net)
# set the number of iterations and track the accuracy & loss
total_iters = 200
accuracy = np.zeros(total_iters)
loss = np.zeros(total_iters)
# Now, we will manually run the network for 200 iterations.
for i in range(total_iters):
    workspace.RunNet(train_model.net.Proto().name)
    accuracy[i] = workspace.FetchBlob('accuracy')
    loss[i] = workspace.FetchBlob('loss')
# After the execution is done, let's plot the values.
pyplot.plot(loss, 'b')
pyplot.plot(accuracy, 'r')
pyplot.legend(('Loss', 'Accuracy'), loc='upper right')
We can also look at the predictions:
# Let's look at some of the data.
pyplot.figure()
data = workspace.FetchBlob('data')
_ = visualize.NCHW.ShowMultiple(data)
pyplot.figure()
softmax = workspace.FetchBlob('softmax')
_ = pyplot.plot(softmax[0], 'ro')
pyplot.title('Prediction for the first image')
Recall that we also defined a test_model. We can run it to get the accuracy on the test set. Although test_model takes its weights from train_model, its param_init_net still has to be run to set up the test data input.
# run a test pass on the test net
workspace.RunNetOnce(test_model.param_init_net)
workspace.CreateNet(test_model.net)
test_accuracy = np.zeros(100)
for i in range(100):
    workspace.RunNet(test_model.net.Proto().name)
    test_accuracy[i] = workspace.FetchBlob('accuracy')
# After the execution is done, let's plot the values.
pyplot.plot(test_accuracy, 'r')
pyplot.title('Accuracy over test batches.')
print('test_accuracy: %f' % test_accuracy.mean())

test_accuracy: 0.946700
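With that, we have built, trained, and deployed a simple model.
This tutorial covers the Caffe2 Python interface. The examples are essentially the official ones, with my own understanding added and some brief comparisons to Caffe. There may be oversights or misunderstandings; corrections are welcome.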
2017.07.07 cskenken