Person Transfer GAN to Bridge Domain Gap for Person Re-identification


Note: Original content takes effort; if you repost, please credit the original author and source. Thanks for your support!

Background

Person re-identification (Person ReID) is the task of, given an image or video of a pedestrian (the probe), identifying that pedestrian in a gallery of images/videos captured by a surveillance camera network. It can be viewed as a sub-problem of content-based image retrieval (CBIR).

Paper: Person Transfer GAN to Bridge Domain Gap for Person Re-identification

Venue: CVPR 2018

Abstract: Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network. To facilitate the research towards conquering those issues, this paper contributes a new dataset called MSMT17 with many important features, e.g., 1) the raw videos are taken by a 15-camera network deployed in both indoor and outdoor scenes, 2) the videos cover a long period of time and present complex lighting variations, and 3) it contains currently the largest number of annotated identities, i.e., 4101 identities and 126441 bounding boxes. We also observe that a domain gap commonly exists between datasets, which essentially causes a severe performance drop when training and testing on different datasets. As a result, available training data cannot be effectively leveraged for new testing domains. To relieve the expensive costs of annotating new training samples, we propose a Person Transfer Generative Adversarial Network (PTGAN) to bridge the domain gap. Comprehensive experiments show that the domain gap can be substantially narrowed down by PTGAN.

Main Content

MSMT17

Dataset website: http://www.pkuvmc.com

Existing Person ReID datasets suffer from the following shortcomings:

  • Small data scale
  • Limited, homogeneous scenes
  • Short data-collection time span, so lighting variation is minimal
  • Unreasonable annotation methodology

To address these issues, this paper releases a new Person ReID dataset, MSMT17, the largest Person ReID dataset to date. It contains 126,441 bounding boxes and 4,101 identities captured by 15 cameras, covering both indoor and outdoor scenes, and the detector used is the more advanced Faster R-CNN.

Person Transfer GAN(PTGAN)

The Domain Gap Phenomenon

For example, a model trained on the CUHK03 dataset achieves only 2.0% rank-1 accuracy when tested on the PRID dataset. Training and testing a ReID algorithm on different Person ReID datasets causes a drastic performance drop, and this drop is pervasive. It means that a model trained on existing labeled data cannot be applied directly to a new dataset, so it is well worth studying how to reduce the impact of the domain gap in order to make good use of existing annotations. To this end, the paper proposes the PTGAN model.

The causes of the domain gap are complex; it may arise from factors such as lighting, image resolution, the ethnicity of the subjects, the season, and the background.

For example, when performing a Person ReID task on dataset B, in order to make better use of the training data of an existing dataset A, we can try to transfer the pedestrian images of dataset A into the target dataset B. Because of the domain gap, the transfer algorithm must satisfy the following two requirements:

  1. The transferred pedestrian images should have a style consistent with the images of the target dataset, so as to minimize the performance drop caused by the style-induced domain gap.
  2. The appearance and identity cues that distinguish different pedestrians should remain unchanged after the transfer, because a pedestrian carries the same label before and after the transfer, i.e., it should still be the same person.

Since person transfer is similar to the unpaired image-to-image translation task, the paper builds its Person Transfer GAN on top of Cycle-GAN, which performs very well on unpaired image-to-image translation. The PTGAN loss function \(L_{PTGAN}\) is designed as follows:
\[ L_{PTGAN} = L_{Style} + \lambda_1L_{ID} \]
where:
\(L_{Style}\):the style loss
\(L_{ID}\):the identity loss
\(\lambda_1\):the parameter for the trade-off between the two losses above
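As a minimal sketch of this combination (the default value of \(\lambda_1\) below is illustrative, not taken from the paper):

```python
def ptgan_loss(style_loss, id_loss, lambda1=1.0):
    """L_PTGAN = L_Style + lambda1 * L_ID.

    lambda1 trades off style transfer against identity preservation;
    its default value here is illustrative, not taken from the paper.
    """
    return style_loss + lambda1 * id_loss
```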

With the following notation, \(L_{Style}\) can be expressed as:
\(G\):the style mapping function from dataset A to dataset B
\(\overline{G}\):the style mapping function from dataset B to dataset A
\(D_A\):the style discriminator for dataset A
\(D_B\):the style discriminator for dataset B

\[ L_{Style} = L_{GAN}(G, D_B, A, B) + L_{GAN}(\overline{G}, D_A, B, A) + \lambda_2L_{cyc}(G, \overline{G}) \]
where:
\(L_{GAN}\):the standard adversarial loss
\(L_{cyc}\):the cycle consistency loss
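Following CycleGAN, the two \(L_{GAN}\) terms and \(L_{cyc}\) can be sketched in NumPy as below. This assumes the least-squares adversarial loss and the L1 cycle-consistency loss of the original CycleGAN; the paper may differ in such details, and the function names are illustrative:

```python
import numpy as np

def lsgan_loss(d_real, d_fake):
    """Least-squares adversarial loss as used by CycleGAN: the discriminator
    should output 1 on real images and 0 on generated ones."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def cycle_loss(a, a_rec, b, b_rec):
    """Cycle-consistency loss: L1 distance between each image and its
    round-trip reconstruction, e.g. a_rec = G_bar(G(a))."""
    return np.mean(np.abs(a - a_rec)) + np.mean(np.abs(b - b_rec))

def style_loss(db_real, db_fake, da_real, da_fake,
               a, a_rec, b, b_rec, lambda2=10.0):
    """L_Style = L_GAN(G, D_B, A, B) + L_GAN(G_bar, D_A, B, A)
               + lambda2 * L_cyc(G, G_bar)."""
    return (lsgan_loss(db_real, db_fake)
            + lsgan_loss(da_real, da_fake)
            + lambda2 * cycle_loss(a, a_rec, b, b_rec))
```

When the discriminators are perfectly fooled and reconstructions are exact, every term vanishes, which matches the intuition that the style loss measures how far the transfer is from the target style.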

With the following notation, \(L_{ID}\) can be expressed as:
\(a\), \(b\): original images from datasets A and B
\(G(a)\), \(\overline{G}(b)\): transferred images from images a and b
\(M(a)\), \(M(b)\): foreground masks of images a and b

\[ L_{ID} = \mathbb{E}_{a \sim p_{data}(a)}\left[\left\| (G(a) - a) \odot M(a)\right \|_2\right] + \mathbb{E}_{b \sim p_{data}(b)}\left[\left\| (\overline{G}(b) - b) \odot M(b)\right \|_2\right] \]
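A NumPy sketch of \(L_{ID}\) over a batch: the foreground masks \(M(\cdot)\) are assumed to come from an external segmentation model (the paper uses PSPNet for this), and the per-image L2 norm is taken over the masked pixel-wise difference, then averaged to approximate the expectation:

```python
import numpy as np

def identity_loss(a, g_a, mask_a, b, gbar_b, mask_b):
    """L_ID: the transfer should leave the pedestrian's appearance on the
    foreground unchanged, so that the identity label stays valid.

    a, b           : batches of original images,  shape (N, H, W, C)
    g_a, gbar_b    : their transferred versions,  same shape
    mask_a, mask_b : foreground masks in [0, 1],  shape (N, H, W, 1)
    """
    def masked_l2(x, y, m):
        # Elementwise (Hadamard) product with the mask, then a per-image
        # L2 norm, averaged over the batch (the expectation in the formula).
        diff = ((x - y) * m).reshape(len(x), -1)
        return np.mean(np.linalg.norm(diff, axis=1))
    return masked_l2(g_a, a, mask_a) + masked_l2(gbar_b, b, mask_b)
```

Restricting the penalty to the foreground lets the background change freely to match the target style while the pedestrian's appearance is preserved.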

Transfer Results

Summary

  • This paper releases MSMT17, a new dataset closer to real application scenarios; its more realistic and complex scenes make it more challenging and more valuable for research.
  • This paper proposes PTGAN, a model that reduces the impact of the domain gap, and validates its effectiveness through experiments.