Comparing Image Read/Write Methods in Python

Date: 2020-11-15

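Training vision-related neural network models involves frequent image reading and writing. Several libraries can do this, such as matplotlib, cv2, and PIL. This post compares a few read/write approaches to find the fastest one and so speed up training.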

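Experimental setup

Because the training framework is PyTorch, the read benchmark is defined as follows:

1. Read five 1920x1080 images (one PNG and four JPG) and store them in an array.

2. Convert the array into a PyTorch tensor with dimension order CxHxW and place it in GPU memory (I train on a GPU), with the three channels in RGB order.

3. Record the time each method takes for the steps above. A PNG file is roughly 10 times the size of a JPG of only slightly different quality, so datasets are usually not stored as PNG, and the read-time difference between the two formats is not compared here.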
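The layout conversion in step 2 can be illustrated without a GPU. The following is a minimal numpy sketch, assuming a batch that is already loaded in cv2's BGR, height-width-channel layout. The helper name `bgr_hwc_to_rgb_chw` is made up for this illustration; the benchmark code in this post performs the same reordering on PyTorch tensors with channel indexing and `permute`.

```python
import numpy as np


def bgr_hwc_to_rgb_chw(batch):
    """Convert an (N, H, W, 3) BGR uint8 batch to (N, 3, H, W) RGB floats in [0, 1]."""
    rgb = batch[..., ::-1]            # reverse the channel axis: BGR -> RGB
    chw = rgb.transpose(0, 3, 1, 2)   # NHWC -> NCHW
    # Make the result contiguous in memory and scale it to [0, 1].
    return np.ascontiguousarray(chw, dtype=np.float32) / 255.0


# Two tiny "images" that are pure blue in BGR order.
batch = np.zeros((2, 4, 6, 3), dtype=np.uint8)
batch[..., 0] = 255
out = bgr_hwc_to_rgb_chw(batch)
print(out.shape)  # (2, 3, 4, 6)
```

After the conversion, blue ends up in the last of the three channels, which is where RGB order puts it.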

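The write benchmark is defined as follows:

1. Convert the PyTorch tensor holding the five 1920x1080 images into the data type that each method accepts.

2. Save the five images in JPG format.

3. Record the time each method takes to save the images.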
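One pitfall when timing GPU code: CUDA operations are launched asynchronously, so a timer can stop before the GPU has actually finished. The sketch below is one way to wrap such measurements. It is not the method used for the numbers in this post, and `timed` and its `sync` parameter are hypothetical names. Passing `torch.cuda.synchronize` as `sync` would make the timer wait for pending GPU work.

```python
from contextlib import contextmanager
from time import perf_counter


@contextmanager
def timed(label, results, sync=None):
    """Store the elapsed seconds of the with-block in results[label].

    sync is an optional zero-argument callable run before the timer starts
    and before it stops (for example torch.cuda.synchronize on a GPU).
    """
    if sync is not None:
        sync()
    start = perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()
        results[label] = perf_counter() - start


# CPU-only demonstration; no synchronization is needed here.
results = {}
with timed("sum", results):
    total = sum(range(1_000_000))
print(results)
```

`perf_counter` is used instead of `time` because it is a monotonic, high-resolution clock intended for measuring short intervals.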

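Experiments

cv2

Because a GPU is available, there are two ways to read images with cv2:

1. Read all images into a single numpy array first, then convert it to a PyTorch tensor stored on the GPU.

2. Allocate a PyTorch tensor on the GPU, then copy each image directly into it.

Code for the first approach: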

import os, torch
import cv2 as cv
import numpy as np
from time import time

read_path = 'D:test'
write_path = 'D:test\\write\\'

# cv2 read, approach 1: fill a numpy array on the CPU, then move it to the GPU
start_t = time()
imgs = np.zeros([5, 1080, 1920, 3])
for name, i in zip(os.listdir(read_path), range(5)):
  img = cv.imread(filename=os.path.join(read_path, name))  # HxWx3, BGR order
  imgs[i] = img
# BGR -> RGB, NHWC -> NCHW, scale to [0, 1]
imgs = torch.tensor(imgs).to('cuda')[...,[2,1,0]].permute([0,3,1,2])/255
print('cv2 read time 1:', time() - start_t)
# cv2 save
start_t = time()
# NCHW -> NHWC, RGB -> BGR, scale back to [0, 255], copy to the CPU
imgs = (imgs.permute([0,2,3,1])[...,[2,1,0]]*255).cpu().numpy()
for i in range(imgs.shape[0]):
  cv.imwrite(write_path + str(i) + '.jpg', imgs[i])
print('cv2 save time:', time() - start_t)

Results:

cv2 read time 1: 0.39693760871887207
cv2 save time: 0.3560612201690674

Code for the second approach:

import os, torch
import cv2 as cv
import numpy as np
from time import time

read_path = 'D:test'
write_path = 'D:test\\write\\'

# cv2 read, approach 2: allocate the batch on the GPU and copy each image into it
start_t = time()
imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
for name, i in zip(os.listdir(read_path), range(5)):
  img = torch.tensor(cv.imread(filename=os.path.join(read_path, name)), device='cuda')
  imgs[i] = img
# BGR -> RGB, NHWC -> NCHW, scale to [0, 1]
imgs = imgs[...,[2,1,0]].permute([0,3,1,2])/255
print('cv2 read time 2:', time() - start_t)
# cv2 save
start_t = time()
# NCHW -> NHWC, RGB -> BGR, scale back to [0, 255], copy to the CPU
imgs = (imgs.permute([0,2,3,1])[...,[2,1,0]]*255).cpu().numpy()
for i in range(imgs.shape[0]):
  cv.imwrite(write_path + str(i) + '.jpg', imgs[i])
print('cv2 save time:', time() - start_t)

Results:

cv2 read time 2: 0.23636841773986816
cv2 save time: 0.3066873550415039

matplotlib

There are again two read approaches. Code for the first:

import os, torch
import numpy as np
import matplotlib.pyplot as plt
from time import time

read_path = 'D:test'
write_path = 'D:test\\write\\'

# matplotlib read, approach 1: fill a numpy array on the CPU, then move it to the GPU
start_t = time()
imgs = np.zeros([5, 1080, 1920, 3])
for name, i in zip(os.listdir(read_path), range(5)):
  img = plt.imread(os.path.join(read_path, name))  # HxWx3, already RGB
  imgs[i] = img
# NHWC -> NCHW, scale to [0, 1]
imgs = torch.tensor(imgs).to('cuda').permute([0,3,1,2])/255
print('matplotlib read time 1:', time() - start_t)
# matplotlib save
start_t = time()
# NCHW -> NHWC, copy to the CPU; imsave accepts floats in [0, 1]
imgs = (imgs.permute([0,2,3,1])).cpu().numpy()
for i in range(imgs.shape[0]):
  plt.imsave(write_path + str(i) + '.jpg', imgs[i])
print('matplotlib save time:', time() - start_t)

Results:

matplotlib read time 1: 0.45380306243896484
matplotlib save time: 0.768944263458252

Code for the second approach:

import os, torch
import numpy as np
import matplotlib.pyplot as plt
from time import time

read_path = 'D:test'
write_path = 'D:test\\write\\'

# matplotlib read, approach 2: allocate the batch on the GPU and copy each image into it
start_t = time()
imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
for name, i in zip(os.listdir(read_path), range(5)):
  img = torch.tensor(plt.imread(os.path.join(read_path, name)), device='cuda')
  imgs[i] = img
# NHWC -> NCHW, scale to [0, 1]
imgs = imgs.permute([0,3,1,2])/255
print('matplotlib read time 2:', time() - start_t)
# matplotlib save
start_t = time()
# NCHW -> NHWC, copy to the CPU; imsave accepts floats in [0, 1]
imgs = (imgs.permute([0,2,3,1])).cpu().numpy()
for i in range(imgs.shape[0]):
  plt.imsave(write_path + str(i) + '.jpg', imgs[i])
print('matplotlib save time:', time() - start_t)

Results:

matplotlib read time 2: 0.2044532299041748
matplotlib save time: 0.4737534523010254

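Note that matplotlib returns PNG images as arrays of floats in the range $[0, 1]$, whereas JPG images come back as integers in the range $[0, 255]$. If the dataset mixes formats, convert everything to one format before reading; otherwise preprocessing becomes messy.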
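When converting the files themselves is not practical, the mismatch can also be handled after loading by branching on the array dtype. This is a minimal sketch under that assumption; `to_unit_float` is a hypothetical helper name, and it only covers the two cases described above (uint8 in $[0, 255]$ and floats already in $[0, 1]$).

```python
import numpy as np


def to_unit_float(img):
    """Return the image as float32 in [0, 1], whatever the loader produced."""
    if img.dtype == np.uint8:
        # JPG-style data: integers in [0, 255].
        return img.astype(np.float32) / 255.0
    if np.issubdtype(img.dtype, np.floating):
        # PNG-style data from matplotlib: already in [0, 1].
        return img.astype(np.float32)
    raise TypeError(f"unsupported image dtype: {img.dtype}")


jpg_like = np.array([[[0, 255, 51]]], dtype=np.uint8)
png_like = np.array([[[0.0, 1.0, 0.2]]], dtype=np.float32)
print(np.allclose(to_unit_float(jpg_like), to_unit_float(png_like)))  # True
```

With such a helper applied per image, the later division by 255 in the benchmark code would no longer be needed.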

PIL

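PIL cannot read or write PyTorch tensors or numpy arrays directly; data must first be converted to or from its Image type. That extra step is cumbersome and is expected to cost time as well, so PIL was not benchmarked here.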
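For reference, the conversion step looks roughly like this. It is a small sketch, assuming Pillow and numpy are installed and using an in-memory buffer instead of a file; the function names are invented for the example, and no timing claim is attached to it.

```python
import io

import numpy as np
from PIL import Image


def array_to_jpeg_bytes(arr, quality=95):
    """Encode an HxWx3 uint8 RGB array as JPEG and return the raw bytes."""
    img = Image.fromarray(arr)  # numpy array -> PIL Image
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


def jpeg_bytes_to_array(data):
    """Decode JPEG bytes into an HxWx3 uint8 RGB array."""
    with Image.open(io.BytesIO(data)) as img:
        return np.asarray(img.convert("RGB"))  # PIL Image -> numpy array


gray = np.full((4, 6, 3), 128, dtype=np.uint8)
restored = jpeg_bytes_to_array(array_to_jpeg_bytes(gray))
print(restored.shape, restored.dtype)
```

JPEG is lossy, so the decoded values are close to, but not guaranteed to equal, the original ones.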

torchvision

torchvision can save images directly from PyTorch tensors. Combined with the fastest read approach above (matplotlib, approach 2), the code is:

import os, torch
import matplotlib.pyplot as plt
from time import time
from torchvision import utils

read_path = 'D:test'
write_path = 'D:test\\write\\'

# matplotlib read, approach 2: allocate the batch on the GPU and copy each image into it
start_t = time()
imgs = torch.zeros([5, 1080, 1920, 3], device='cuda')
for name, i in zip(os.listdir(read_path), range(5)):
  img = torch.tensor(plt.imread(os.path.join(read_path, name)), device='cuda')
  imgs[i] = img
# NHWC -> NCHW, scale to [0, 1]
imgs = imgs.permute([0,3,1,2])/255
print('matplotlib read time 2:', time() - start_t)
# torchvision save: takes CxHxW tensors in [0, 1] directly
start_t = time()
for i in range(imgs.shape[0]):
  utils.save_image(imgs[i], write_path + str(i) + '.jpg')
print('torchvision save time:', time() - start_t)

Results:

matplotlib read time 2: 0.15358829498291016
torchvision save time: 0.14760661125183105

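In these runs, these two are the fastest read and write methods. To keep image I/O from slowing training down, the two steps can also run in parallel with training. In addition, utils.save_image can tile several images into a single file, used as follows: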

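utils.save_image(tensor = imgs,          # batch of images to save, shape = [n, C, H, W]
                 fp = 'test.jpg',        # output path
                 nrow = 5,               # number of images per row in the grid
                 padding = 1,            # spacing between images in the grid
                 normalize = True,       # rescale to [0, 1]; needed when the network output uses tanh
                 value_range = (-1,1))   # input value range for normalization (named `range` in older torchvision releases)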
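
The idea of running image I/O in parallel with training can be sketched with the standard library alone. The example below is a simplified illustration, not code from the experiments above: `start_writer` is a hypothetical helper, and a stand-in function takes the place of the real save call so the sketch runs without a GPU.

```python
import queue
import threading


def start_writer(save_fn, maxsize=8):
    """Start a background thread that calls save_fn(item, path) for each queued job."""
    jobs = queue.Queue(maxsize=maxsize)  # bounded, so memory use stays limited

    def worker():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more work
                jobs.task_done()
                break
            try:
                save_fn(*job)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, thread


# Stand-in for a real save function: it only records the paths it was given.
saved = []
jobs, thread = start_writer(lambda item, path: saved.append(path))
for step in range(3):
    # In a training loop this call returns quickly and training continues.
    jobs.put((f"image-{step}", f"{step}.jpg"))
jobs.put(None)   # ask the worker to stop
thread.join()
print(saved)     # ['0.jpg', '1.jpg', '2.jpg']
```

In practice `save_fn` could be `utils.save_image`, with each tensor detached and copied to the CPU before it is queued. For the read side, PyTorch's `torch.utils.data.DataLoader` with `num_workers` greater than zero serves the same purpose by loading batches in worker processes.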