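Introduction
Object detection identifies multiple objects in a single image, localizing each one (with a bounding box) and assigning it a class label. Put simply, it answers both the "what" and the "where" questions. As the saying goes, it is better to teach someone to fish than to hand them a fish: this article does not walk through any particular family of algorithms (one-stage or two-stage), but instead points to an extensive, continuously updated collection of object detection papers. To follow the trend (and show off a bit), Amusi named the GitHub repository for this collection awesome-object-detection.
CVer
Editor: Amusi
Proofreader: Amusi
Object Detection Wiki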
Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.
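To make the "what" and "where" idea concrete, here is a minimal Python sketch of how a detector's output for one image might be represented: each detection pairs a bounding box (where) with a class label and a confidence score (what). The `Detection` class, its field names, the `filter_by_score` helper, and the sample values are hypothetical illustrations, not taken from any particular library or paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Detection:
    """One detected object: a box (where) plus a label and score (what)."""
    label: str      # class name, e.g. "person"
    score: float    # detector confidence in [0, 1]
    x_min: float    # box corners in pixel coordinates
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Box area in square pixels; degenerate boxes count as zero."""
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def filter_by_score(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.score >= threshold]


# Made-up detections for a single image (illustrative values only)
detections = [
    Detection("person", 0.92, 10, 20, 110, 220),
    Detection("car", 0.35, 150, 80, 300, 200),
]

for d in filter_by_score(detections, 0.5):
    print(d.label, d.score, d.area())
```

A real detector typically returns many such candidates per image, and a score threshold like the one above is one common way to discard the low-confidence ones.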
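Object Detection
First, Amusi would like to recommend a website. Open the link below and you will see a genuinely exciting page.
link: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html
When I first saw this URL I was surprised: the link says 2015/10/09, so I assumed it was an old resource, but the content truly amazed me. The collection lives on handong's personal homepage, but there is no standalone GitHub repository for Object Detection. Inspired by this, I took the liberty (I have not yet obtained his consent) of trimming and supplementing the Object Detection material that handong compiled (presumptuous of me, I know), and created a GitHub repository named awesome-object-detection.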
Awesome-Object-Detection
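Next, let me introduce this rather "copy-heavy" repository. awesome-object-detection aims to provide a platform for learning object detection. Its distinguishing feature is that it lists the latest papers and the latest code (updated as often as possible). Because Amusi is still a beginner, it is not yet possible to introduce every paper, but in-depth paper walkthroughs will follow. You are welcome to star and fork the repository, and to open pull requests with the latest object detection work you come across.
So let's see what awesome-object-detection currently contains.
To save space, only the more important works are listed here:
The R-CNN trio (R-CNN, Fast R-CNN, and Faster R-CNN)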
Light-Head R-CNN
Cascade R-CNN
The YOLO trio (YOLOv1, YOLOv2, YOLOv3)
SSD (SSD, DSSD, FSSD, ESSD, Pelee)
R-FCN
FPN
DSOD
RetinaNet
DetNet
...
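You are surely already familiar with the well-known R-CNN and YOLO families, so Amusi will not repeat them here, since that would hardly be impressive. Instead, here are brief recommendations of two papers to show what awesome-object-detection is good for.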
Pelee
《Pelee: A Real-Time Object Detection System on Mobile Devices》
intro: (ICLR 2018 workshop track)
arxiv: https://arxiv.org/abs/1804.06882
github: https://github.com/Robert-JunWang/Pelee
Abstract:An increasing need of running Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resource encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and NASNet-A. However, all these models are heavily dependent on depthwise separable convolution which lacks efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves a higher accuracy by 0.6% (71.3% vs. 70.7%) and 11% lower computational cost than MobileNet, the state-of-the-art efficient architecture. Meanwhile, PeleeNet is only 66% of the model size of MobileNet. We then propose a real-time object detection system by combining PeleeNet with Single Shot MultiBox Detector (SSD) method and optimizing the architecture for fast speed. Our proposed detection system, named Pelee, achieves 76.4% mAP (mean average precision) on PASCAL VOC2007 and 22.4 mAP on MS COCO dataset at the speed of 17.1 FPS on iPhone 6s and 23.6 FPS on iPhone 8. The result on COCO outperforms YOLOv2 in consideration of a higher precision, 13.6 times lower computational cost and 11.3 times smaller model size. The code and models are open sourced.
Quantization Mimic
《Quantization Mimic: Towards Very Tiny CNN for Object Detection》
Tsinghua University & The Chinese University of Hong Kong & SenseTime
arxiv: https://arxiv.org/abs/1805.02152
Note: take a look at the institutions behind this paper... It was posted to arXiv on 2018-05-06 (hot off the press).
Abstract:In this paper, we propose a simple and general framework for training very tiny CNNs for object detection. Due to limited representation ability, it is challenging to train very tiny networks for complicated tasks like detection. To the best of our knowledge, our method, called Quantization Mimic, is the first one focusing on very tiny networks. We utilize two types of acceleration methods: mimic and quantization. Mimic improves the performance of a student network by transferring knowledge from a teacher network. Quantization converts a full-precision network to a quantized one without large degradation of performance. If the teacher network is quantized, the search scope of the student network will be smaller. Using this property of quantization, we propose Quantization Mimic. It first quantizes the large network, then mimic a quantized small network. We suggest the operation of quantization can help student network to match the feature maps from teacher network. To evaluate the generalization of our hypothesis, we carry out experiments on various popular CNNs including VGG and Resnet, as well as different detection frameworks including Faster R-CNN and R-FCN. Experiments on Pascal VOC and WIDER FACE verify our Quantization Mimic algorithm can be applied on various settings and outperforms state-of-the-art model acceleration methods given limited computing resources.
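The idea in the abstract (quantize the teacher's feature maps, then have a quantized student match them) can be illustrated roughly as follows. This is a pure-Python sketch under assumptions: the uniform quantizer, the step size, the mean-squared-error mimic loss, and the function names `quantize` and `mimic_loss` are illustrative choices, not the paper's implementation.

```python
from typing import List


def quantize(features: List[float], step: float) -> List[float]:
    """Uniformly quantize each value to the nearest multiple of `step` (assumed scheme)."""
    return [round(v / step) * step for v in features]


def mimic_loss(student: List[float], teacher: List[float]) -> float:
    """Mean squared error between student and teacher feature values (assumed loss)."""
    if len(student) != len(teacher):
        raise ValueError("feature maps must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


# Made-up, flattened feature maps (illustrative values only)
teacher_features = [0.12, 0.49, 0.88, 1.31]
student_features = [0.10, 0.55, 0.80, 1.20]
step = 0.5

# Quantize both networks' features onto the same discrete grid, then compare
q_teacher = quantize(teacher_features, step)
q_student = quantize(student_features, step)
print(q_teacher, q_student, mimic_loss(q_student, q_teacher))
```

Quantizing both sides maps the feature values onto a shared discrete grid, which is one way to read the abstract's remark that a quantized teacher shrinks the student's search scope.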
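Summary
The goal of the awesome-object-detection repository is to cover, as far as possible, the latest work (papers and code) on object detection. Because Amusi is still a beginner, please point out anything that is poorly organized or inconsistent. Since the repository directly copies handong's content, I will delete or revise it immediately if there is any copyright infringement (I am currently trying to get in touch with handong).
If you find awesome-object-detection even a little bit helpful, you are welcome to star and fork it, and pull requests are even more welcome.
Open "Read the original" to go directly to awesome-object-detection.
link: https://github.com/amusi/awesome-object-detection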