Project repository: Mask_RCNN
Language/frameworks: Python 3, Keras, and TensorFlow
Python 3.4, TensorFlow 1.3, Keras 2.0.8; the remaining dependencies are listed in requirements.txt
Backbone: Feature Pyramid Network (FPN) and a ResNet101 backbone
The main model files are as follows:
demo.ipynb is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images. It includes code to run object detection and instance segmentation on arbitrary images. (A minimal inference sketch follows this list.)
train_shapes.ipynb shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
(model.py, utils.py, config.py): These files contain the main Mask RCNN implementation.
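As a quick orientation, here is a minimal inference sketch along the lines of demo.ipynb. MaskRCNN, load_weights, and detect are the repo's actual API in mrcnn/model.py; the InferenceConfig subclass and the image path are illustrative assumptions:

import skimage.io
import mrcnn.model as modellib
from mrcnn.config import Config

class InferenceConfig(Config):
    # Illustrative subclass: detect() expects a batch of GPU_COUNT * IMAGES_PER_GPU images.
    NAME = "coco_inference"
    NUM_CLASSES = 1 + 80      # COCO: background + 80 classes
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

model = modellib.MaskRCNN(mode="inference", config=InferenceConfig(), model_dir="logs")
model.load_weights("mask_rcnn_coco.h5", by_name=True)   # pre-trained COCO weights

image = skimage.io.imread("your_image.jpg")             # hypothetical input image
r = model.detect([image], verbose=0)[0]                 # one result dict per image
print(r["rois"], r["class_ids"], r["scores"], r["masks"].shape)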
The following files are for inspecting and understanding the model:
inspect_data.ipynb. This notebook visualizes the different pre-processing steps to prepare the training data.
inspect_model.ipynb This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
inspect_weights.ipynb This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.
Visualizations:
The image below shows the output of the RPN, i.e. proposals containing both positive and negative anchors:
The next image shows the box coordinates before the regressor refines them (the proposals from the first step) and the final output boxes:
Mask visualization:
Activations of different layers:
Histograms of the weight distributions:
The authors provide weights pre-trained on COCO so we can bootstrap a training task quickly; the relevant code lives in samples/coco/coco.py. We can either import it as a package into our own code or invoke it directly from the command line, as follows:
# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco

# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet

# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5

# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
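For the import route, a minimal sketch is shown below. CocoConfig and CocoDataset are defined in samples/coco/coco.py, but the exact load_coco signature and subset names should be checked against that file:

import mrcnn.model as modellib
from samples.coco import coco

config = coco.CocoConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="logs")
model.load_weights("mask_rcnn_coco.h5", by_name=True)   # start from COCO weights

dataset_train = coco.CocoDataset()
dataset_train.load_coco("/path/to/coco/", "train")
dataset_train.prepare()
dataset_val = coco.CocoDataset()
dataset_val.load_coco("/path/to/coco/", "val")
dataset_val.prepare()

# Fine-tune only the head layers first, as the samples typically do.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE, epochs=40, layers="heads")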
To evaluate the last saved model on COCO data:
# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
The relevant parameters are also in samples/coco/coco.py.
The authors recommend a blog post about the balloon color splash sample, which walks through the whole pipeline, from annotating images, to training on them, to applying the result in a small demo. Its source code is also included in this project under samples/balloon.
To train on your own data, you need to subclass and modify two classes:
Config
This class contains the default configuration. Subclass it and modify the attributes you need to change.
Dataset
This class provides a consistent way to work with any dataset: one class can handle different datasets. It allows you to use new datasets for training without having to change the code of the model, so we can largely avoid modifying the model files themselves. It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset (several datasets can be combined in a single training run to meet special needs).
For usage examples, see these four files: samples/shapes/train_shapes.ipynb, samples/coco/coco.py, samples/balloon/balloon.py, samples/nucleus/nucleus.py.
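As a starting point, here is a minimal subclassing sketch for a hypothetical single-class dataset annotated with per-instance polygons. MyConfig, MyDataset, and the annotation format are illustrative assumptions; Config and utils.Dataset (with add_class, add_image, and the load_mask override) are the repo's actual classes:

import numpy as np
import skimage.draw
from mrcnn.config import Config
from mrcnn import utils

class MyConfig(Config):
    # Override only what differs from the defaults in config.py.
    NAME = "my_dataset"
    NUM_CLASSES = 1 + 1      # background + 1 object class
    IMAGES_PER_GPU = 1       # small batch for a modest GPU

class MyDataset(utils.Dataset):
    def load_my_dataset(self, annotations):
        # `annotations` is assumed to be a list of dicts:
        # {"path", "height", "width", "polygons": [(ys, xs), ...]}
        self.add_class("my_dataset", 1, "object")
        for i, a in enumerate(annotations):
            self.add_image("my_dataset", image_id=i, path=a["path"],
                           height=a["height"], width=a["width"],
                           polygons=a["polygons"])

    def load_mask(self, image_id):
        # Return masks [H, W, instance_count] and the matching class IDs.
        info = self.image_info[image_id]
        mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                        dtype=np.uint8)
        for i, (ys, xs) in enumerate(info["polygons"]):
            rr, cc = skimage.draw.polygon(ys, xs)
            mask[rr, cc, i] = 1
        return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)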
For simplicity and extensibility, the authors made a few small departures from the paper, summarized as follows:
Image Resizing: the image resizing in this project differs from the original paper's. Taking COCO as an example, the authors resize images to 1024x1024. To preserve the aspect ratio (so object geometry and semantics are not distorted), they pad the image with zeros into a 1:1 aspect ratio rather than cropping or resampling it directly.
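The idea can be sketched as below; this is a simplified stand-in for the repo's utils.resize_image with its "square" mode, and the function name and skimage-based resize are our assumptions:

import numpy as np
import skimage.transform

def resize_square(image, size=1024):
    # Scale the long side to `size` while keeping the aspect ratio.
    h, w = image.shape[:2]
    scale = size / max(h, w)
    resized = skimage.transform.resize(
        image, (round(h * scale), round(w * scale)),
        preserve_range=True).astype(image.dtype)
    # Zero-pad the short side so the output is size x size.
    top = (size - resized.shape[0]) // 2
    left = (size - resized.shape[1]) // 2
    padded = np.zeros((size, size) + resized.shape[2:], dtype=resized.dtype)
    padded[top:top + resized.shape[0], left:left + resized.shape[1]] = resized
    return padded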
Bounding Boxes: besides masks, the dataset labels also include gt boxes. To unify and simplify processing, the authors discard the box labels and rely entirely on the masks, generating a box that fully covers all labeled pixels of an instance to serve as the gt box. This greatly simplifies image augmentation: many preprocessing operations, such as rotation, become messy once gt boxes are involved, whereas generating boxes from masks keeps the preprocessing pipeline simple.
To verify how the computed gt boxes differ from the original gt boxes, the authors compared them:
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more, and only 0.01% differed by 10px or more.
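A minimal sketch of deriving a box from an instance mask (the repo provides utils.extract_bboxes for the same idea; the helper name below is ours):

import numpy as np

def bbox_from_mask(mask):
    # mask: [H, W] boolean array for one instance -> (y1, x1, y2, x2),
    # the tightest box covering every labeled pixel (exclusive ends).
    ys, xs = np.where(mask)
    return ys.min(), xs.min(), ys.max() + 1, xs.max() + 1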
Learning Rate: the paper uses a learning rate of 0.02, but the authors found it too high in this implementation, often causing the weights to explode, especially with small batch sizes. They offer two guesses: it may stem from Caffe and TensorFlow aggregating gradient updates differently across batches and GPUs (sum vs. mean); or the paper's team may have relied on gradient clipping to avoid the explosion, although the authors note that the clipping they applied did not help noticeably.
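For reference, lowering the rate and clipping gradients can be expressed in Keras 2 as below; the values mirror the defaults the repo appears to use in config.py (LEARNING_RATE = 0.001, GRADIENT_CLIP_NORM = 5.0), treated here as assumptions:

import keras

# SGD with a reduced learning rate and per-tensor L2-norm gradient clipping.
optimizer = keras.optimizers.SGD(
    lr=0.001,       # instead of the paper's 0.02, to avoid exploding weights
    momentum=0.9,
    clipnorm=5.0)   # clip each gradient's L2 norm at 5.0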
1. Install dependencies
pip3 install -r requirements.txt
2. Clone this repository
3. Run setup from the repository root directory
python3 setup.py install
4. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page.
5. (Optional) To train or test on MS COCO, install pycocotools from one of these repos. They are forks of the original pycocotools with fixes for Python 3 and Windows (the official repo doesn't seem to be active anymore).
Finally, the authors showcase several projects built on this framework; they are not reproduced here.