Please credit the original source when reposting: http://blog.csdn.net/ouyangfushu/article/details/79543575
Author: SyGoing
QQ: 2446799425
To train the SSD object detector on MSCOCO: SSD's training pipeline expects the VOC dataset format by default, so to train on MSCOCO you can first convert it to the VOC layout and then train as usual.
1. Dataset introduction
1.1 The MSCOCO dataset (for object detection)
COCO is a dataset collected by a Microsoft team for image recognition, segmentation, and captioning. Its main characteristics:
(1) Object segmentation
(2) Recognition in Context
(3) Multiple objects per image
(4)More than 300,000 images
(5)More than 2 Million instances
(6)80 object categories
(7)5 captions per image
(8)Keypoints on 100,000 people
The 2014 release of COCO contains roughly 20 GB of images and about 500 MB of label files. The labels record each segmentation's pixel-accurate outline plus the exact bounding-box coordinates, both given to two decimal places. Directory layout:
The train and val folders hold the images, and the annotations folder holds the JSON-format annotation files.
A single object's annotation looks like this (inside an annotations .json file):
{
  "segmentation": [
    [392.87, 275.77, 402.24, 284.2, 382.54, 342.36, 375.99, 356.43, 372.23, 357.37, 372.23, 397.7, 383.48, 419.27, 407.87, 439.91, 427.57, 389.25, 447.26, 346.11, 447.26, 328.29, 468.84, 290.77, 472.59, 266.38],
    [429.44, 465.23, 453.83, 473.67, 636.73, 474.61, 636.73, 392.07, 571.07, 364.88, 546.69, 363.0]
  ],
  "area": 28458.996150000003,
  "iscrowd": 0,
  "image_id": 503837,
  "bbox": [372.23, 266.38, 264.5, 208.23],
  "category_id": 4,
  "id": 151109
}
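Note that COCO's "bbox" field is [x, y, width, height], while VOC's <bndbox> stores the two corner coordinates. The conversion used later in Step 3 boils down to this sketch (the helper name is mine):

```python
def coco_bbox_to_voc(bbox):
    """Convert a COCO [x, y, width, height] box to VOC [xmin, ymin, xmax, ymax] (integer pixels)."""
    x, y, w, h = bbox
    return [int(x), int(y), int(x + w), int(y + h)]

# The bbox from the sample annotation above:
print(coco_bbox_to_voc([372.23, 266.38, 264.5, 208.23]))  # [372, 266, 636, 474]
```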
For details, see the MSCOCO site: http://mscoco.org/
1.2 The PASCAL VOC dataset (for object detection)
The PASCAL VOC challenge is a benchmark for visual object classification and detection, providing a standard annotated image set and a standard evaluation protocol for detection algorithms. The VOC images cover 20 object classes in four groups: person; animals (bird, cat, cow, dog, horse, sheep); vehicles (aeroplane, bicycle, boat, bus, car, motorbike, train); and indoor objects (bottle, chair, dining table, potted plant, sofa, TV/monitor). The challenge has not been held since 2012, but the dataset's high image quality and complete annotations make it well suited to benchmarking detection algorithms.
Official site: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
The official SSD training demo uses the VOC dataset format.
For SSD object detection we mainly care about three folders: Annotations, ImageSets, and JPEGImages.
Annotations: one XML file per image, recording the image name, the class of each object, its bounding-box coordinates, and so on. A file looks like this:
<annotation>
  <folder>VOC2012</folder>
  <filename>2007_000392.jpg</filename>   <!-- file name -->
  <source>                               <!-- image source (not important) -->
    <database>The VOC2007 Database</database>
    <annotation>PASCAL VOC2007</annotation>
    <image>flickr</image>
  </source>
  <size>                                 <!-- image width, height, and channel count -->
    <width>500</width>
    <height>332</height>
    <depth>3</depth>
  </size>
  <segmented>1</segmented>               <!-- used for segmentation? (irrelevant for detection) -->
  <object>                               <!-- a detected object -->
    <name>horse</name>                   <!-- object class -->
    <pose>Right</pose>                   <!-- shooting angle -->
    <truncated>0</truncated>             <!-- truncated? (0 = complete) -->
    <difficult>0</difficult>             <!-- hard to recognize? (0 = easy) -->
    <bndbox>                             <!-- bounding box: top-left and bottom-right corners -->
      <xmin>100</xmin>
      <ymin>96</ymin>
      <xmax>355</xmax>
      <ymax>324</ymax>
    </bndbox>
  </object>
  <object>                               <!-- an image may contain several objects -->
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>198</xmin>
      <ymin>58</ymin>
      <xmax>286</xmax>
      <ymax>197</ymax>
    </bndbox>
  </object>
</annotation>
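For reference, such a file can be read back with the standard library's ElementTree. The minimal sample below is hypothetical but follows the layout shown above:

```python
import xml.etree.ElementTree as ET

# A hypothetical minimal annotation in the VOC layout shown above.
xml_text = """<annotation>
  <filename>2007_000392.jpg</filename>
  <size><width>500</width><height>332</height><depth>3</depth></size>
  <object>
    <name>horse</name>
    <bndbox><xmin>100</xmin><ymin>96</ymin><xmax>355</xmax><ymax>324</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(xml_text)
for obj in root.iter('object'):
    name = obj.find('name').text
    box = [int(obj.find('bndbox').find(t).text) for t in ('xmin', 'ymin', 'xmax', 'ymax')]
    print(name, box)  # horse [100, 96, 355, 324]
```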
ImageSets: holds the image lists for each type of challenge.
There are four subfolders under ImageSets:
For object detection we mainly care about the Main folder, which contains four text files: test.txt, train.txt, trainval.txt, and val.txt.
In practice the files actually used are test.txt and trainval.txt (trainval is the union of train.txt and val.txt).
JPEGImages: the folder holding all the images.
2. Dataset conversion (COCO → VOC)
With a basic understanding of both datasets, we can start the conversion. The main job is generating VOC-style XML; once that is done, the rest is straightforward. Here I only keep four classes: car, bus, truck, and person.
Preparation: create a folder for the COCO → VOC output, here called VOC2007_g, containing three subfolders: Annotations, ImageSets, and JPEGImages.
OK, let's begin:
Step 1: extract the wanted categories (the ones shared with VOC) — getclassNum.py
# -*- coding: utf-8 -*-
import json

# COCO category ids for the classes we keep
className = {
    1: 'person',
    3: 'car',
    6: 'bus',
    8: 'truck'
}
classNum = [1, 3, 6, 8]

cocojson = "E:/coco/COCO/annotations/instances_train2014.json"


def writeNum(Num):
    with open("COCO_train.json", "a+") as f:
        f.write(str(Num))


inputfile = []
inner = {}
with open(cocojson, "r+") as f:
    allData = json.load(f)
data = allData["annotations"]
print(data[1])
print("read ready")

# keep only the annotations whose category is in classNum
for i in data:
    if i['category_id'] in classNum:
        inner = {
            "filename": str(i["image_id"]).zfill(12),
            "name": className[i["category_id"]],
            "bndbox": i["bbox"]
        }
        inputfile.append(inner)

inputfile = json.dumps(inputfile)
writeNum(inputfile)
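As a quick sanity check on Step 1's output, you can count how many boxes were kept per class. The sample data below is hypothetical; in practice you would json.load the real COCO_train.json:

```python
import json
from collections import Counter

# Hypothetical stand-in for the intermediate COCO_train.json produced by Step 1.
sample = json.dumps([
    {"filename": "000000503837", "name": "car",    "bndbox": [372.23, 266.38, 264.5, 208.23]},
    {"filename": "000000503837", "name": "person", "bndbox": [10.0, 20.0, 30.0, 40.0]},
    {"filename": "000000000042", "name": "car",    "bndbox": [1.0, 2.0, 3.0, 4.0]},
])

# Count kept boxes per class name.
counts = Counter(entry["name"] for entry in json.loads(sample))
print(counts)  # Counter({'car': 2, 'person': 1})
```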
Step 2: copy the images referenced by the selected annotations into the target folder — chooseImagesbyID.py
# -*- coding: utf-8 -*-
# @Time    : 2018/03/09 10:46
# @Author  : SyGoing
# @Site    :
# @File    : getimagesbyID.py
# @Software: PyCharm
import json
import os
import cv2

nameStr = []
with open("COCO_train.json", "r+") as f:
    data = json.load(f)
    print("read ready")
    for i in data:
        imgName = "COCO_train2014_" + str(i["filename"]) + ".jpg"
        nameStr.append(imgName)
    nameStr = set(nameStr)  # de-duplicate: several boxes can share one image
    print(nameStr)
    print(len(nameStr))

path = 'E:/coco/COCO/train/'
savePath = "E:/coco/COCO/VOC2007_/JPEGImages/"
count = 0
for file in nameStr:
    img = cv2.imread(path + file)
    cv2.imwrite(savePath + file, img)
    count = count + 1
    print('num: ' + str(count) + ' ' + file + '\n')
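One caveat with the loop above: cv2.imread/imwrite decodes and re-encodes every JPEG, which is slower and slightly lossy. A byte-for-byte copy with shutil does the same job; this sketch uses temporary directories in place of the real paths:

```python
import os
import shutil
import tempfile

# Hypothetical stand-ins for the real source and JPEGImages directories.
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
name = "COCO_train2014_000000503837.jpg"
with open(os.path.join(src_dir, name), "wb") as f:
    f.write(b"\xff\xd8fake jpeg bytes\xff\xd9")

# Copy the file byte-for-byte instead of decoding and re-encoding it.
shutil.copy(os.path.join(src_dir, name), os.path.join(dst_dir, name))
print(os.path.exists(os.path.join(dst_dir, name)))  # True
```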
Step 3: generate the VOC XML files in the Annotations folder from the selected images — CreateXML.py
# -*- coding: utf-8 -*-
import xml.dom
import xml.dom.minidom
import os
import cv2
import json

# Constants for the XML layout
_IMAGE_PATH = 'E:/coco/COCO/train'
_INDENT = ' ' * 4
_NEW_LINE = '\n'
_FOLDER_NODE = 'COCO2014'
_ROOT_NODE = 'annotation'
_DATABASE_NAME = 'LOGODection'
_ANNOTATION = 'COCO2014'
_AUTHOR = 'SyGoing_CSDN'
_SEGMENTED = '0'
_DIFFICULT = '0'
_TRUNCATED = '0'
_POSE = 'Unspecified'
_ANNOTATION_SAVE_PATH = 'E:/coco/COCO/VOC2007_/Annotations'


def createElementNode(doc, tag, attr):
    """Create an element node with a text child."""
    element_node = doc.createElement(tag)
    text_node = doc.createTextNode(attr)
    element_node.appendChild(text_node)
    return element_node


def createChildNode(doc, tag, attr, parent_node):
    """Create an element node and attach it to parent_node."""
    child_node = createElementNode(doc, tag, attr)
    parent_node.appendChild(child_node)


def createObjectNode(doc, attrs):
    """Build one <object> node; COCO's [x, y, w, h] box becomes VOC corner coordinates."""
    object_node = doc.createElement('object')
    midname = attrs['name']
    if midname != 'person':
        midname = 'car'  # merge car/bus/truck into a single 'car' class
    createChildNode(doc, 'name', midname, object_node)
    createChildNode(doc, 'pose', _POSE, object_node)
    createChildNode(doc, 'truncated', _TRUNCATED, object_node)
    createChildNode(doc, 'difficult', _DIFFICULT, object_node)
    bndbox_node = doc.createElement('bndbox')
    createChildNode(doc, 'xmin', str(int(attrs['bndbox'][0])), bndbox_node)
    createChildNode(doc, 'ymin', str(int(attrs['bndbox'][1])), bndbox_node)
    createChildNode(doc, 'xmax', str(int(attrs['bndbox'][0] + attrs['bndbox'][2])), bndbox_node)
    createChildNode(doc, 'ymax', str(int(attrs['bndbox'][1] + attrs['bndbox'][3])), bndbox_node)
    object_node.appendChild(bndbox_node)
    return object_node


def writeXMLFile(doc, filename):
    """Write the DOM to disk, dropping the default XML declaration line."""
    tmpfile = open('tmp.xml', 'w')
    doc.writexml(tmpfile, addindent=' ' * 4, newl='\n', encoding='utf-8')
    tmpfile.close()
    fin = open('tmp.xml')
    fout = open(filename, 'w')
    lines = fin.readlines()
    for line in lines[1:]:  # skip the <?xml ...?> declaration
        if line.split():
            fout.writelines(line)
    fin.close()
    fout.close()


if __name__ == "__main__":
    # list the images copied in Step 2
    img_path = "E:/coco/COCO/VOC2007_/JPEGImages/"
    fileList = os.listdir(img_path)
    if len(fileList) == 0:
        os._exit(-1)

    with open("COCO_train.json", "r") as f:
        ann_data = json.load(f)

    current_dirpath = os.path.dirname(os.path.abspath('__file__'))
    if not os.path.exists(_ANNOTATION_SAVE_PATH):
        os.mkdir(_ANNOTATION_SAVE_PATH)

    for imageName in fileList:
        saveName = os.path.splitext(imageName)[0]  # image name without the .jpg extension
        print(saveName)
        xml_file_name = os.path.join(_ANNOTATION_SAVE_PATH, saveName + '.xml')

        img = cv2.imread(os.path.join(img_path, imageName))
        print(os.path.join(img_path, imageName))
        height, width, channel = img.shape
        print(height, width, channel)

        my_dom = xml.dom.getDOMImplementation()
        doc = my_dom.createDocument(None, _ROOT_NODE, None)
        root_node = doc.documentElement  # the <annotation> root

        # folder and filename nodes
        createChildNode(doc, 'folder', _FOLDER_NODE, root_node)
        createChildNode(doc, 'filename', saveName + '.jpg', root_node)

        # source node and its children
        source_node = doc.createElement('source')
        createChildNode(doc, 'database', _DATABASE_NAME, source_node)
        createChildNode(doc, 'annotation', _ANNOTATION, source_node)
        createChildNode(doc, 'image', 'flickr', source_node)
        createChildNode(doc, 'flickrid', 'NULL', source_node)
        root_node.appendChild(source_node)

        # owner node and its children
        owner_node = doc.createElement('owner')
        createChildNode(doc, 'flickrid', 'NULL', owner_node)
        createChildNode(doc, 'name', _AUTHOR, owner_node)
        root_node.appendChild(owner_node)

        # size node
        size_node = doc.createElement('size')
        createChildNode(doc, 'width', str(width), size_node)
        createChildNode(doc, 'height', str(height), size_node)
        createChildNode(doc, 'depth', str(channel), size_node)
        root_node.appendChild(size_node)

        # segmented node
        createChildNode(doc, 'segmented', _SEGMENTED, root_node)

        # one <object> node per annotation belonging to this image
        for ann in ann_data:
            imgName = "COCO_train2014_" + str(ann["filename"])
            if saveName == imgName:
                object_node = createObjectNode(doc, ann)
                root_node.appendChild(object_node)

        # write the XML file
        print(xml_file_name)
        writeXMLFile(doc, xml_file_name)
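After Step 3 it is worth validating a few generated files: they should parse, and every box should lie inside the image. A small checker (the function name and sample string are mine):

```python
import xml.etree.ElementTree as ET

def check_voc_xml(xml_text):
    """Sanity-check a generated annotation: every bndbox must lie inside the image."""
    root = ET.fromstring(xml_text)
    w = int(root.find('size/width').text)
    h = int(root.find('size/height').text)
    ok = True
    for box in root.iter('bndbox'):
        xmin, ymin = int(box.find('xmin').text), int(box.find('ymin').text)
        xmax, ymax = int(box.find('xmax').text), int(box.find('ymax').text)
        ok &= 0 <= xmin < xmax <= w and 0 <= ymin < ymax <= h
    return ok

# Hypothetical sample in the layout the script produces.
sample = ("<annotation><size><width>640</width><height>480</height><depth>3</depth></size>"
          "<object><name>car</name><bndbox><xmin>372</xmin><ymin>266</ymin>"
          "<xmax>636</xmax><ymax>474</ymax></bndbox></object></annotation>")
print(check_voc_xml(sample))  # True
```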
Step 4: generate trainval.txt, train.txt, val.txt, and test.txt under the Main folder.
Here this is done with a MATLAB .m script:
clc;
clear;
xmlfilepath = 'F:\ObjectDetection\object_mark\datamake\VOCdevkit\VOC2007\Annotations';
txtsavepath = 'F:\ObjectDetection\object_mark\datamake\VOCdevkit\VOC2007\ImageSets\Main\';
trainval_percent = 0.5;
train_percent = 0.5;

xmlfile = dir(xmlfilepath);
numOfxml = length(xmlfile) - 2;  % skip the '.' and '..' entries

trainval = sort(randperm(numOfxml, floor(numOfxml * trainval_percent)));
test = sort(setdiff(1:numOfxml, trainval));
trainvalsize = length(trainval);
train = sort(trainval(randperm(trainvalsize, floor(trainvalsize * train_percent))));
val = sort(setdiff(trainval, train));

ftrainval = fopen([txtsavepath 'trainval.txt'], 'w');
ftest = fopen([txtsavepath 'test.txt'], 'w');
ftrain = fopen([txtsavepath 'train.txt'], 'w');
fval = fopen([txtsavepath 'val.txt'], 'w');

for i = 1:numOfxml
    if ismember(i, trainval)
        fprintf(ftrainval, '%s\n', xmlfile(i + 2).name(1:end - 4));
        if ismember(i, train)
            fprintf(ftrain, '%s\n', xmlfile(i + 2).name(1:end - 4));
        else
            fprintf(fval, '%s\n', xmlfile(i + 2).name(1:end - 4));
        end
    else
        fprintf(ftest, '%s\n', xmlfile(i + 2).name(1:end - 4));
    end
end

fclose(ftrainval);
fclose(ftrain);
fclose(fval);
fclose(ftest);
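If you would rather stay in Python, the MATLAB split above can be sketched as follows (the function name and seed are my choices; the percentages match the script):

```python
import random

def split_voc(names, trainval_percent=0.5, train_percent=0.5, seed=0):
    """Split annotation names into trainval/train/val/test lists, like the MATLAB script."""
    rng = random.Random(seed)
    names = sorted(names)
    trainval = sorted(rng.sample(names, int(len(names) * trainval_percent)))
    test = sorted(set(names) - set(trainval))
    train = sorted(rng.sample(trainval, int(len(trainval) * train_percent)))
    val = sorted(set(trainval) - set(train))
    return trainval, train, val, test

# Hypothetical usage with 100 dummy names:
tv, tr, va, te = split_voc([f"{i:06d}" for i in range(100)])
print(len(tv), len(tr), len(va), len(te))  # 50 25 25 50
```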
That's basically it. From here, adapt it to your own needs!
Reference blogs:
http://blog.csdn.net/zhangjunbob/article/details/52769381
http://blog.csdn.net/yjl9122/article/details/56842098