COCO Dataset User Guide

Date: 2019-11-24

The code below is adapted from the official COCO API. The adapted module, cocoz.py, lives in Xinering/cocoapi. My main changes are:

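Added support for Windows.
Replaced defaultdict with dict.get(), which fixes an encoding problem on Windows.
Skipped the extraction step (whether direct or indirect): the API works directly on the image data (images) and the annotation data (annotations).
Because nothing has to be extracted, the API is more convenient and efficient to use.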
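
To illustrate the defaultdict replacement listed above, here is a minimal sketch. It is not the code from cocoz.py: the toy `annotations` list and both function names are hypothetical, and the sketch only shows that grouping with dict.get() builds the same mapping as defaultdict(list).

```python
from collections import defaultdict

# Hypothetical toy annotations; real COCO annotations carry many more fields.
annotations = [
    {"id": 1, "image_id": 10},
    {"id": 2, "image_id": 10},
    {"id": 3, "image_id": 20},
]


def group_with_defaultdict(anns):
    """Group annotation ids by image id using defaultdict(list)."""
    groups = defaultdict(list)
    for ann in anns:
        groups[ann["image_id"]].append(ann["id"])
    return dict(groups)


def group_with_get(anns):
    """Build the same grouping with a plain dict and dict.get()."""
    groups = {}
    for ann in anns:
        key = ann["image_id"]
        groups[key] = groups.get(key, []) + [ann["id"]]
    return groups


print(group_with_get(annotations))
print(group_with_get(annotations) == group_with_defaultdict(annotations))
```

A plain dict also raises KeyError on a missing key instead of silently creating an empty entry, which can make lookup mistakes easier to spot.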

Detailed usage of the API follows.

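0 Preparation

Introduction to COCOZ

To use cocoz, download Xinering/cocoapi. Then either place it in the root directory of the project or program you want to run, or add it to Python's module search path for the current session (a temporary change) with the following code: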

import sys
sys.path.append(r'D:\API\cocoapi\PythonAPI')  # path to your downloaded cocoapi (raw string keeps the backslashes literal)

from pycocotools.cocoz import AnnZ, ImageZ, COCOZ   # load cocoz

We can now use the classes cocoz.AnnZ, cocoz.ImageZ, and cocoz.COCOZ to work with COCO images and annotations. The examples below use Windows; Linux is similar.

1 cocoz.AnnZ and cocoz.ImageZ

root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # COCO annotation data type

annZ = AnnZ(root, annType)

Let's check which annotation files this annotation data contains:

annZ.names

['annotations/instances_train2017.json',
 'annotations/instances_val2017.json',
 'annotations/captions_train2017.json',
 'annotations/captions_val2017.json',
 'annotations/person_keypoints_train2017.json',
 'annotations/person_keypoints_val2017.json']
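
For readers curious how such a listing can be produced without extracting anything, here is a minimal sketch using the standard zipfile module. It is an assumption about the general approach, not the AnnZ implementation: the in-memory archive and the helper names `make_demo_archive` and `list_json_members` are hypothetical stand-ins.

```python
import io
import zipfile


def make_demo_archive():
    """Build a tiny in-memory zip that mimics the annotation archive layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("annotations/instances_val2017.json", "{}")
        zf.writestr("annotations/captions_val2017.json", "{}")
        zf.writestr("annotations/README.txt", "not an annotation file")
    buffer.seek(0)
    return buffer


def list_json_members(archive):
    """Return the names of the JSON members without extracting anything."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if name.endswith(".json")]


names = list_json_members(make_demo_archive())
print(names)
```

With a real download, the path to the zip file on disk could be passed in place of the in-memory buffer.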

Load the contents of 'annotations/instances_val2017.json' as a dict:

annFile = 'annotations/instances_val2017.json'
dataset = annZ.json2dict(annFile)

Loading json in memory ...
used time: 0.890035 s

dataset.keys()

dict_keys(['info', 'licenses', 'images', 'annotations', 'categories'])
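
Loading a JSON member straight from a zip archive can be sketched with the standard library alone. This is an illustration under assumptions, not the json2dict source: `json_from_zip` is a hypothetical helper, and the toy archive only imitates the five top-level keys shown above.

```python
import io
import json
import zipfile


def json_from_zip(archive, member):
    """Parse one JSON member of a zip archive straight into a Python object."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fp:
            return json.load(fp)


# Hypothetical stand-in for the real archive: same top-level keys, toy values.
toy = {"info": {}, "licenses": [], "images": [{"id": 1}],
       "annotations": [], "categories": []}
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("annotations/instances_val2017.json", json.dumps(toy))

dataset_demo = json_from_zip(demo, "annotations/instances_val2017.json")
print(sorted(dataset_demo.keys()))
```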

dataset['images'][0]  # metadata recorded for one image

{'license': 4,
 'file_name': '000000397133.jpg',
 'coco_url': 'http://images.cocodataset.org/val2017/000000397133.jpg',
 'height': 427,
 'width': 640,
 'date_captured': '2013-11-14 17:02:52',
 'flickr_url': 'http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg',
 'id': 397133}

1.1 Loading an image from the web

%pylab inline
import skimage.io as sio

coco_url = dataset['images'][0]['coco_url']
# use url to load image
I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()

Populating the interactive namespace from numpy and matplotlib

1.2 Reading an image from local disk

To avoid extracting the dataset, I use the zipfile module:

imgType = 'val2017'
imgZ = ImageZ(root, imgType)

I = imgZ.buffer2array(imgZ.names[0])

plt.axis('off')
plt.imshow(I)
plt.show()
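
The zipfile idea can be sketched as follows. This is only an assumption about how reading without extraction might look, not the ImageZ code: `read_member_bytes` is a hypothetical helper, and the "image" is a fake byte string so the example runs without any image library. Turning the bytes into an array would need a decoder, for instance cv2.imdecode applied to np.frombuffer(data, np.uint8).

```python
import io
import zipfile

JPEG_SOI = b"\xff\xd8"  # a JPEG stream starts with the Start Of Image marker


def read_member_bytes(archive, member):
    """Return the raw bytes of one archive member without extracting to disk."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member)


# Hypothetical stand-in for val2017.zip holding one fake "image".
fake_jpeg = JPEG_SOI + b"payload" + b"\xff\xd9"
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("val2017/000000397133.jpg", fake_jpeg)

data = read_member_bytes(demo, "val2017/000000397133.jpg")
print(data.startswith(JPEG_SOI), len(data))
```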

2 cocoz.COCOZ

root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # COCO annotation data type
annFile = 'annotations/instances_val2017.json'

annZ = AnnZ(root, annType)
coco = COCOZ(annZ, annFile)

Loading json in memory ...
used time: 1.02004 s
Loading json in memory ...
creating index...
index created!
used time: 0.431003 s

To preview the COCO dataset you have loaded, use print():

print(coco)

description: COCO 2017 Dataset
url: http://cocodataset.org
version: 1.0
year: 2017
contributor: COCO Consortium
date_created: 2017/09/01

coco.keys()

dict_keys(['dataset', 'anns', 'imgToAnns', 'catToImgs', 'imgs', 'cats'])
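
The keys above suggest a set of lookup tables built once from the raw dataset. Below is a minimal sketch of how such an index could be built with plain dicts. The function `create_index` and the toy dataset are hypothetical, and the actual cocoz implementation may differ.

```python
def create_index(dataset):
    """Build id-based lookup tables from a COCO-style dataset dict."""
    anns, imgs, cats = {}, {}, {}
    img_to_anns, cat_to_imgs = {}, {}
    for ann in dataset.get("annotations", []):
        anns[ann["id"]] = ann
        img_to_anns.setdefault(ann["image_id"], []).append(ann)
        if "category_id" in ann:  # caption annotations have no category
            cat_to_imgs.setdefault(ann["category_id"], []).append(ann["image_id"])
    for img in dataset.get("images", []):
        imgs[img["id"]] = img
    for cat in dataset.get("categories", []):
        cats[cat["id"]] = cat
    return {"dataset": dataset, "anns": anns, "imgToAnns": img_to_anns,
            "catToImgs": cat_to_imgs, "imgs": imgs, "cats": cats}


# Toy data for illustration only.
toy_dataset = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 100, "image_id": 1, "category_id": 17},
        {"id": 101, "image_id": 1, "category_id": 18},
        {"id": 102, "image_id": 2, "category_id": 17},
    ],
    "categories": [{"id": 17, "name": "cat"}, {"id": 18, "name": "dog"}],
}
index = create_index(toy_dataset)
print(index["catToImgs"])
```

Building the tables once trades a little memory for constant-time lookups by id in every later query.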

2.1 Displaying COCO categories and supercategories

cats = coco.loadCats(coco.getCatIds())
nms = set([cat['name'] for cat in cats])  # collect the category names
print('COCO categories: \n{}\n'.format(' '.join(nms)))
# ============================================================
snms = set([cat['supercategory'] for cat in cats])  # collect the supercategory names
print('COCO supercategories: \n{}'.format(' '.join(snms)))

COCO categories: 
kite potted plant handbag clock umbrella sports ball bird frisbee toilet toaster spoon car snowboard banana fire hydrant skis chair tv skateboard wine glass tie cell phone cake zebra baseball glove stop sign airplane bed surfboard cup knife apple broccoli bicycle train carrot remote cat bear teddy bear person bench horse dog couch orange hair drier backpack giraffe sandwich book donut sink oven refrigerator boat mouse laptop toothbrush keyboard truck motorcycle bottle pizza traffic light cow microwave scissors bus baseball bat elephant fork bowl tennis racket suitcase vase sheep parking meter dining table hot dog

COCO supercategories: 
accessory furniture sports vehicle appliance electronic animal indoor outdoor person kitchen food
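
Because the names are collected into a set, their printed order is arbitrary and can change between runs. The sketch below groups category names under their supercategory and sorts the result for a stable listing. The helper `names_by_supercategory` and the four toy records are illustrative only; the real list comes from coco.loadCats.

```python
def names_by_supercategory(cats):
    """Map each supercategory to the sorted names of its categories."""
    groups = {}
    for cat in cats:
        groups.setdefault(cat["supercategory"], []).append(cat["name"])
    return {key: sorted(names) for key, names in sorted(groups.items())}


# Toy category records in the COCO layout (id, name, supercategory).
toy_cats = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 2, "name": "bicycle", "supercategory": "vehicle"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
    {"id": 17, "name": "cat", "supercategory": "animal"},
]
grouped = names_by_supercategory(toy_cats)
for supercategory, names in grouped.items():
    print(supercategory, ":", ", ".join(names))
```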

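2.2 Retrieving images by given criteria

Get all images that contain the given categories:

# get all images containing given categories, select one at random
# note: 'snowboar' matches no COCO category name (the official API ignores unmatched names)
catIds = coco.getCatIds(catNms=['cat', 'dog', 'snowboar'])  # get the category IDs
imgIds = coco.getImgIds(catIds=catIds)  # IDs of the images matching those categories
img = coco.loadImgs(imgIds)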
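
In the official pycocotools, passing several category IDs to getImgIds returns the images that contain all of them, that is, an intersection. The sketch below shows that logic on a toy index, on the assumption that cocoz behaves the same way. `get_img_ids` and `toy_cat_to_imgs` are hypothetical names.

```python
def get_img_ids(cat_to_imgs, cat_ids):
    """Return sorted ids of images that contain every category in cat_ids."""
    result = None
    for cat_id in cat_ids:
        members = set(cat_to_imgs.get(cat_id, []))
        result = members if result is None else result & members
    return sorted(result or [])


# Toy index: category id -> ids of images containing that category.
toy_cat_to_imgs = {17: [1, 2, 3], 18: [2, 3, 4]}

print(get_img_ids(toy_cat_to_imgs, [17, 18]))
print(get_img_ids(toy_cat_to_imgs, [17, 99]))
```

An empty result is worth checking for before picking a random element, because np.random.randint(0, 0) raises an error.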

Pick the record of one image at random:

img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]

img

{'license': 4,
 'file_name': '000000318238.jpg',
 'coco_url': 'http://images.cocodataset.org/val2017/000000318238.jpg',
 'height': 640,
 'width': 478,
 'date_captured': '2013-11-21 00:01:06',
 'flickr_url': 'http://farm8.staticflickr.com/7402/9964003514_84ce7550c9_z.jpg',
 'id': 318238}

2.2.1 Loading the image

From the web:

coco_url = img['coco_url']

I = sio.imread(coco_url)
plt.axis('off')
plt.imshow(I)
plt.show()

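From local disk:

One pitfall: cv2 stores images in BGR channel order by default, not RGB, so passing an array decoded by cv2 straight to plt displays the wrong colors. To correct this, convert it with cv2.cvtColor(I, cv2.COLOR_BGR2RGB).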
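
The channel swap itself is just a reversal of each pixel's three values. The sketch below shows it on a nested-list "image" in pure Python so that it runs without any dependency. `bgr_to_rgb` is a hypothetical helper for illustration only; on a NumPy array the same effect is obtained with I[..., ::-1] or cv2.cvtColor(I, cv2.COLOR_BGR2RGB).

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image."""
    return [[pixel[::-1] for pixel in row] for row in image]


# One row, two pixels, written in BGR order: pure blue, then pure red.
bgr = [[(255, 0, 0), (0, 0, 255)]]
rgb = bgr_to_rgb(bgr)
print(rgb)
```

Applying the function twice returns the original image, which is a quick sanity check for any channel swap.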

imgType = 'val2017'
imgZ = ImageZ(root, imgType)

I = imgZ.buffer2array(img['file_name'])

plt.axis('off')
plt.imshow(I)
plt.show()

2.3 Drawing the annotations (anns) on the image

# load and display instance annotations
plt.imshow(I)
plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)

2.4 Keypoint detection

# initialize COCO api for person keypoints annotations
root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # COCO annotation data type
annFile = 'annotations/person_keypoints_val2017.json'

annZ = AnnZ(root, annType)
coco_kps = COCOZ(annZ, annFile)

Loading json in memory ...
used time: 0.882997 s
Loading json in memory ...
creating index...
index created!
used time: 0.368036 s

First pick an image that contains a person:

catIds = coco.getCatIds(catNms=['person'])  # get the category IDs
imgIds = coco.getImgIds(catIds=catIds)
img = coco.loadImgs(imgIds)[77]

# use url to load image
I = sio.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()

# load and display keypoints annotations
plt.imshow(I); plt.axis('off')
ax = plt.gca()
annIds = coco_kps.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco_kps.loadAnns(annIds)
coco_kps.showAnns(anns)
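
In the COCO format, each person annotation stores its keypoints as one flat list [x1, y1, v1, x2, y2, v2, ...], where v is 0 for not labeled, 1 for labeled but not visible, and 2 for labeled and visible. The sketch below unpacks such a list. The helper names and the three-keypoint toy list are hypothetical; real person annotations carry 17 keypoints.

```python
def split_keypoints(flat):
    """Turn [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) triples."""
    if len(flat) % 3:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_labeled(flat):
    """Count the keypoints whose visibility flag is greater than zero."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)


# Toy list with three keypoints: unlabeled, visible, labeled but hidden.
toy_keypoints = [0, 0, 0, 120, 80, 2, 130, 85, 1]
triples = split_keypoints(toy_keypoints)
print(triples)
print(count_labeled(toy_keypoints))
```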

2.5 Image captioning

# initialize COCO api for caption annotations
root = r'E:\Data\coco'   # root directory of the COCO dataset
annType = 'annotations_trainval2017'   # COCO annotation data type
annFile = 'annotations/captions_val2017.json'

annZ = AnnZ(root, annType)
coco_caps = COCOZ(annZ, annFile)

Loading json in memory ...
used time: 0.435748 s
Loading json in memory ...
creating index...
index created!
used time: 0.0139964 s

# load and display caption annotations
annIds = coco_caps.getAnnIds(imgIds=img['id'])
anns = coco_caps.loadAnns(annIds)
coco_caps.showAnns(anns)
plt.imshow(I)
plt.axis('off')
plt.show()

Output:

A brown horse standing next to a woman in front of a house.
a person standing next to a horse next to a building
A woman stands beside a large brown horse.
The woman stands next to the large brown horse.
A woman hold a brown horse while a woman watches.
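
Each caption annotation is a small record with an image_id and a caption string, so collecting the captions of one image is a simple filter. The sketch below assumes that layout. `captions_for_image` is a hypothetical helper, and the ids in the toy records are made up.

```python
def captions_for_image(anns, image_id):
    """Return the caption strings attached to one image id."""
    return [ann["caption"] for ann in anns if ann["image_id"] == image_id]


# Toy caption annotations; the ids are invented for this example.
toy_caption_anns = [
    {"id": 1, "image_id": 7, "caption": "A woman stands beside a large brown horse."},
    {"id": 2, "image_id": 7, "caption": "a person standing next to a horse next to a building"},
    {"id": 3, "image_id": 8, "caption": "A cat sleeping on a couch."},
]
for caption in captions_for_image(toy_caption_anns, 7):
    print(caption)
```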

If you need the official API, see "Using the COCO dataset".

If you found this helpful, please give the project a star on GitHub: datasetsome. The code for this tutorial is on GitHub: COCOZ User Guide.