TRAINING A CLASSIFIER
Reference: the PyTorch tutorial "Deep Learning with PyTorch: A 60 Minute Blitz"
So far we have learned how to:
- Define a neural network
- Compute the loss function
- Update the weights
What about data?
Generally, when you have to deal with image, text, audio or video data, you can use standard Python packages that load data into a numpy array. Then you can convert this array into a torch.*Tensor.
- For images, packages such as Pillow and OpenCV are useful
- For audio, packages such as scipy and librosa
- For text, either raw Python or Cython based loading, or NLTK and SpaCy are useful
Specifically for vision, we have created a package called torchvision, that has data loaders for common datasets such as Imagenet, CIFAR10, MNIST, etc. and data transformers for images, viz., torchvision.datasets and torch.utils.data.DataLoader.
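In short: when handling image, text, audio or video data, you can load the data with standard Python packages into a numpy array and then convert it to a torch.Tensor.
- Images: Pillow and OpenCV are commonly used
- Audio: scipy, librosa
- Text: raw Python or Cython loading, or the commonly used NLTK and SpaCy
For computer vision, PyTorch provides the convenient torchvision package.
It includes data loaders for common datasets such as Imagenet, CIFAR10 and MNIST,
as well as transforms (which can be used for data augmentation). The relevant entry points are:
- torchvision.datasets
- torch.utils.data.DataLoader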
This experiment uses the CIFAR10 dataset.
It contains the classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'
Every image in CIFAR10 has size 3x32x32 (3 RGB channels, 32x32 pixels).
Training an image classifier
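Steps:
- Load and normalize the training and test datasets using torchvision
- Define a convolutional neural network (convnet)
- Define a loss function
- Train the network on the training set
- Test the network's performance on the test set
Step1: Load and normalize the training and test datasets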
import torch
import torchvision
import torchvision.transforms as transforms
torchvision datasets yield PILImage images with values in the range [0, 1]. They need to be converted to Tensors and normalized to [-1, 1].
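The arithmetic behind this normalization can be checked by hand. The sketch below is plain Python and does not call torchvision; it assumes the per-channel formula (value - mean) / std with mean = std = 0.5, and the helper names `normalize` and `denormalize` are hypothetical, chosen only for illustration.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization: (value - mean) / std."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping; for mean = std = 0.5 this equals value / 2 + 0.5."""
    return value * std + mean


# Pixel values in [0, 1] are mapped onto [-1, 1]
for pixel in (0.0, 0.25, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

The inverse mapping is the same `img / 2 + 0.5` step that is applied before displaying images.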
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])  # Compose chains several transforms together

# './' is the current directory, '../' the parent directory, '/' the root directory
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)  # not downloaded again if already present
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)  # num_workers: number of loader processes

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Files already downloaded and verified
Files already downloaded and verified
print(trainset)
print("----" * 10)
print(testset)

Dataset CIFAR10
    Number of datapoints: 50000
    Split: train
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
----------------------------------------
Dataset CIFAR10
    Number of datapoints: 10000
    Split: test
    Root Location: ./data
    Transforms (if any): Compose(
                             ToTensor()
                             Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
                         )
    Target Transforms (if any): None
# Show some images, just for fun
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5  # undo the normalization: [-1, 1] -> [0, 1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # back to the usual layout: CHW -> HWC

dataiter = iter(trainloader)  # iterator over mini-batches
images, labels = next(dataiter)  # the batch size is 4, so each step yields 4 images
print(labels)
imshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

tensor([2, 8, 1, 5])
bird ship car dog
labels
tensor([2, 8, 1, 5])
Step2: Define a convolutional neural network
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    # __init__ only declares the layers that may be used; in the computation
    # a layer can be used several times, or not at all
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)  # (input channels, output channels, kernel size)
        self.pool = nn.MaxPool2d(2, 2)   # one pooling layer, used twice
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    # How the network is actually wired up is determined by forward
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
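The 16*5*5 input size of fc1 (in_features=400) follows from the layer shapes. A small sketch of that arithmetic in plain Python, assuming the standard output-size formula for convolution and pooling without dilation; the helper `conv_out` is a hypothetical name, not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 32                                  # CIFAR10 images are 32x32
side = conv_out(side, kernel=5)            # conv1: 32 -> 28
side = conv_out(side, kernel=2, stride=2)  # pool:  28 -> 14
side = conv_out(side, kernel=5)            # conv2: 14 -> 10
side = conv_out(side, kernel=2, stride=2)  # pool:  10 -> 5
flat_features = 16 * side * side           # conv2 has 16 output channels
print(side, flat_features)
```

Changing the kernel size or the input resolution changes this number, so fc1 and the `view` call in forward have to be adjusted together.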
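Define the loss function and optimizer (used to update the weights)
Note ⚠️: the network outputs a 10-dimensional vector per sample, while each label is a single integer (the class index). This is correct: when computing the loss, the entry for the true class is picked out by indexing, x[label], so the result is the same.
Elsewhere, labels are treated as 10-dimensional one-hot vectors. The two forms are equivalent; the conversion is handled internally, so there is no need to worry about the details.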
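import torch.optim as optim

# CrossEntropyLoss already includes the softmax, so no extra softmax layer is needed.
# The loss pushes the score of the correct class up and the scores of the wrong classes down.
criterion = nn.CrossEntropyLoss()  # cross-entropy
# The cross-entropy here is computed directly from the class index,
# not from an n-class one-hot vector with a 1 at the true class
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)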
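The equivalence between integer labels and one-hot vectors can be checked numerically. This is a minimal sketch on made-up scores (the `logits` tensor is random, not output of the network above); it compares nn.CrossEntropyLoss with the same quantity computed by indexing and via one-hot vectors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # made-up scores: batch of 4, 10 classes
labels = torch.tensor([2, 8, 1, 5])   # integer class indices, one per sample

# Built-in loss: takes raw scores and integer labels
builtin = nn.CrossEntropyLoss()(logits, labels)

# Same value by indexing: pick the log-probability of the true class
log_probs = F.log_softmax(logits, dim=1)
by_index = -log_probs[torch.arange(4), labels].mean()

# Same value via one-hot label vectors
one_hot = F.one_hot(labels, num_classes=10).float()
by_one_hot = -(one_hot * log_probs).sum(dim=1).mean()

print(builtin.item(), by_index.item(), by_one_hot.item())
```

All three numbers agree up to floating-point error, which is why no explicit softmax layer and no one-hot conversion are needed.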
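Train the network
for epoch in range(2):  # number of training epochs
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):  # 0 is the start index, which is also the default
        # Get the data
        inputs, labels = data
        # Reset the gradients to zero
        optimizer.zero_grad()
        # Forward pass
        outputs = net(inputs)
        # Compute the loss
        loss = criterion(outputs, labels)
        # Backward pass (compute the gradients)
        loss.backward()
        # Update the weights
        optimizer.step()
        # Accumulate the loss for reporting
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            # Note: this prints the loss summed over the last 2000 mini-batches;
            # divide by 2000 to get the mean loss per mini-batch
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss))
            running_loss = 0.0  # reset after printing
print('Finished Training')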
[1,  2000] loss: 4505.347
[1,  4000] loss: 3816.202
[1,  6000] loss: 3448.905
[1,  8000] loss: 3221.118
[1, 10000] loss: 3091.055
[1, 12000] loss: 2993.834
[2,  2000] loss: 2793.536
[2,  4000] loss: 2777.763
[2,  6000] loss: 2710.222
[2,  8000] loss: 2668.854
[2, 10000] loss: 2622.627
[2, 12000] loss: 2571.615
Finished Training
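The training loop prints running_loss without dividing it, so each logged value is a sum over 2000 mini-batches. A quick sketch of the conversion to a mean loss, using a few values copied from the log; the comparison with ln(10), the cross-entropy of a uniform guess over 10 classes, is added here only as a reference point.

```python
import math

# Summed losses copied from the log (first and last entry of each epoch)
logged_sums = {(1, 2000): 4505.347, (1, 12000): 2993.834,
               (2, 2000): 2793.536, (2, 12000): 2571.615}
batches_per_print = 2000

mean_losses = {key: total / batches_per_print for key, total in logged_sums.items()}
chance_loss = math.log(10)  # cross-entropy of a uniform guess over 10 classes

for (epoch, step), mean in mean_losses.items():
    print('[%d, %5d] mean loss: %.3f' % (epoch, step, mean))
print('uniform-guess loss: %.3f' % chance_loss)
```

The first logged value corresponds to a mean of about 2.25, close to ln(10) ≈ 2.30, and the last one to about 1.29, so the loss does fall steadily over the two epochs.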
Test the network on the test data
We check it by predicting class labels and comparing them with the ground truth.
# First display some test images
dataiter = iter(testloader)
images, labels = next(dataiter)
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
GroundTruth: cat ship ship plane
outputs = net(images)  # run the images through the network
# Along dim 1 (the 10 class scores of each row), torch.max returns the largest
# value (discarded here) and its index (predicted)
_, predicted = torch.max(outputs, 1)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
Predicted: deer cat deer horse
print(outputs)
print(predicted)

tensor([[-3.4898, -3.6106,  1.2521,  3.3437,  3.3692,  3.2635,  2.6993,  2.0445, -4.8485, -3.5421],
        [-1.9592, -2.6239,  1.1073,  3.4853,  1.0128,  3.2079, -0.2431,  1.9412, -2.4887, -2.2249],
        [-0.2035,  1.3960,  0.6715, -0.1788,  3.5923, -1.4808,  0.4605, -0.0833, -2.6476, -1.5091],
        [-1.7742, -2.5306,  1.0426,  0.2753,  3.6487,  0.9355,  0.2774,  4.9753, -4.7646, -2.7965]],
       grad_fn=<ThAddmmBackward>)
tensor([4, 3, 4, 7])
Compute the overall accuracy
Performance on the whole test set
correct = 0
total = 0
with torch.no_grad():  # no need to track gradients for any tensor during evaluation
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images:%d %%' % (100 * correct / total))
Accuracy of the network on the 10000 test images:54 %
The network seems to have learned something (guessing at random among 10 classes would give 10%). Let's see which classes perform better.
class_correct = [0.0 for _ in range(10)]  # lists of floats, one entry per class
class_total = [0.0 for _ in range(10)]
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()  # drop size-1 dimensions so that c[i] indexes one sample
        for i in range(4):  # 4 is the batch size
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))
Accuracy of plane : 57 %
Accuracy of   car : 80 %
Accuracy of  bird : 37 %
Accuracy of   cat : 45 %
Accuracy of  deer : 45 %
Accuracy of   dog : 43 %
Accuracy of  frog : 61 %
Accuracy of horse : 54 %
Accuracy of  ship : 64 %
Accuracy of truck : 54 %
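As a consistency check, the per-class numbers can be related to the overall accuracy. The CIFAR10 test set has 1000 images per class, so the overall accuracy is the plain mean of the per-class accuracies. The sketch below uses the printed percentages, which are truncated to integers, so the result is only approximate.

```python
# Per-class accuracies (in percent) copied from the printed output
per_class = {'plane': 57, 'car': 80, 'bird': 37, 'cat': 45, 'deer': 45,
             'dog': 43, 'frog': 61, 'horse': 54, 'ship': 64, 'truck': 54}

# With equally sized classes, overall accuracy is the mean of the per-class values
mean_accuracy = sum(per_class.values()) / len(per_class)
best = max(per_class, key=per_class.get)
worst = min(per_class, key=per_class.get)
print(mean_accuracy, best, worst)
```

The mean agrees with the 54 % reported for the whole test set, and the spread between the best and worst class (car versus bird) is much larger than the overall figure suggests.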
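How do we run this on a GPU?
Just as you move a tensor to the GPU, you move the whole neural net to the GPU. First define device as the first visible CUDA device (if there is one; without one, the GPU cannot be used).
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# On a machine with CUDA, this prints a CUDA device
print(device)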
cpu
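net.to(device)
# Remember: at every step, the inputs and targets must be moved to the device as well
inputs, labels = inputs.to(device), labels.to(device)
Why is there no significant speedup? Because the network is very small, so the difference is not noticeable.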
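Put together, one training step with explicit device placement might look like the sketch below. It is self-contained and therefore uses a small nn.Linear model and a made-up batch as hypothetical stand-ins for `net` and for the data coming from `trainloader`; it falls back to the CPU when no CUDA device is available.

```python
import torch
import torch.nn as nn

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(8, 10).to(device)   # hypothetical stand-in for `net`
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Made-up mini-batch; in the tutorial these come from `trainloader`
inputs = torch.randn(4, 8)
labels = torch.tensor([2, 8, 1, 5])

# Move the batch to the same device as the model before the forward pass
inputs, labels = inputs.to(device), labels.to(device)

optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
print(outputs.device, loss.item())
```

If the model and the batch end up on different devices, the forward pass raises a runtime error, which is why the `.to(device)` call belongs inside the loop.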
How do we use all GPUs (when there are several)? Data Parallelism
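A minimal sketch of data parallelism with nn.DataParallel, assuming a single machine. The nn.Linear model and the random batch are hypothetical stand-ins; on a machine with zero or one GPU the wrapper is skipped and the code runs unchanged.

```python
import torch
import torch.nn as nn

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = nn.Linear(8, 10)              # hypothetical stand-in for `net`

# Wrap the model only when more than one GPU is visible
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)    # splits each batch across the GPUs
model = model.to(device)

batch = torch.randn(4, 8).to(device)  # made-up input batch
outputs = model(batch)
print(outputs.shape)
```

With a batch size as small as 4 there is little to split, so a larger batch size is usually needed before several GPUs pay off.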
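Useful functions
- torch.from_numpy(): converts a numpy array directly to a tensor; the dimensions stay unchanged
- transforms.ToTensor(): converts a numpy array to a tensor; the third dimension becomes the first and the other two shift back (HWC -> CHW)
- x.numpy(): converts back to numpy, where x is a tensor
- x.transpose((2, 0, 1)): x is a numpy array whose axis order is wrong; this permutes the axes so that the last one comes first; (0, 1, 2) would leave the order unchanged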
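The axis conventions in this list can be tried out on a tiny array. The sketch below uses a made-up 2x3x4 array as an "image" in HWC layout; it avoids torchvision and uses `permute` as the tensor counterpart of the numpy transpose.

```python
import numpy as np
import torch

# Made-up "image" in HWC layout: H=2, W=3, C=4
hwc = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)

t = torch.from_numpy(hwc)                       # same shape as the array
chw = hwc.transpose((2, 0, 1))                  # numpy: HWC -> CHW
t_chw = t.permute(2, 0, 1)                      # tensor equivalent of the transpose
back = np.transpose(t_chw.numpy(), (1, 2, 0))   # CHW -> HWC again

print(t.shape, chw.shape, back.shape)
```

The round trip returns the original array, which is the same pair of conversions used when loading images as tensors and when displaying them with matplotlib.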