Mask_RCNN Study Notes (matterport Version)

### Resources

Mask R-CNN paper
matterport's GitHub repository: built on Keras and TensorFlow; the README explains installation and usage clearly
Facebook's official implementations are also on GitHub: Detectron and maskrcnn-benchmark

### Installation

See the requirements section of README.md in the matterport GitHub repository.
To train or test on the MS COCO dataset, pycocotools is also required.

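### Related Blog Posts

Learning the Mask RCNN network structure and building a color splash application (English Version, Chinese version)
This version uses ResNet101 + FPN as the backbone. The heads consist of a detection part and a mask prediction part, where the detection part covers class prediction and bbox regression.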

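### Network Overview

Mask R-CNN performs instance segmentation and object detection. It adds a mask prediction branch on top of the Faster R-CNN object detection network. (Image source: http://www.javashuo.com/article/p-oodjvowl-gg.html)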

### Improvements to Mask RCNN

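Mask Scoring R-CNN: assigns a score to the mask as well.
Faster Training of Mask R-CNN by Focusing on Instance Boundaries: uses instance edge information to speed up training. During training, edge detection is applied to both the predicted mask and the ground-truth (GT) mask, and the mean squared error (MSE) between the two edge maps is added as one term of the loss function (I wrote a simple implementation of the paper's Edge Agreement Head). My personal understanding is that the paper mainly speeds up training, and that is why accuracy improves somewhat: the edge information points out a path during training, so gradient descent gets there a bit faster.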

<details> <summary>Code is here: click to view details</summary> <p>Add the edge loss function to mrcnn/model.py</p> <pre><code>def mrcnn_edge_loss_graph(target_masks, target_class_ids, pred_masks):
    """Edge L2 loss for the mask edge head.

    target_masks: [batch, num_rois, height, width].
        A float32 tensor of values 0 or 1. Uses zero padding to fill array.
    target_class_ids: [batch, num_rois]. Integer class IDs. Zero padded.
    pred_masks: [batch, proposals, height, width, num_classes] float32 tensor
        with values from 0 to 1 (soft masks, which carry more information).
    """
    # Relies on the tf and K (keras.backend) imports already in model.py.
    # Reshape for simplicity: merge batch and num_rois into one dimension.
    target_class_ids = K.reshape(target_class_ids, (-1,))
    mask_shape = tf.shape(target_masks)
    target_masks = K.reshape(target_masks, (-1, mask_shape[2], mask_shape[3]))
    pred_shape = tf.shape(pred_masks)
    pred_masks = K.reshape(pred_masks,
                           (-1, pred_shape[2], pred_shape[3], pred_shape[4]))
    # Permute predicted masks to [N, num_classes, height, width]
    pred_masks = tf.transpose(pred_masks, [0, 3, 1, 2])

    # Only positive ROIs (class ID > 0, i.e. not background) contribute to
    # the loss, and only the class-specific mask of each ROI.
    # tf.where returns the indices of the positive ROIs,
    # tf.gather picks the elements at those indices,
    # tf.cast converts the dtype,
    # tf.stack pairs each ROI index with its class ID.
    positive_ix = tf.where(target_class_ids > 0)[:, 0]
    positive_class_ids = tf.cast(
        tf.gather(target_class_ids, positive_ix), tf.int64)
    indices = tf.stack([positive_ix, positive_class_ids], axis=1)

    # Gather the masks (predicted and true) that contribute to the loss
    y_true = tf.gather(target_masks, positive_ix)
    y_pred = tf.gather_nd(pred_masks, indices)

    # shape: [num_positive_rois, height, width, 1]
    y_true = tf.expand_dims(y_true, -1)
    y_pred = tf.expand_dims(y_pred, -1)

    # Scale to the 0..255 range before filtering
    y_true = 255 * y_true
    y_pred = 255 * y_pred

    # Sobel x and y filters stacked as two output channels.
    # shape: [3, 3, 1, 2]
    sobel_kernel = tf.constant([[[[1, 1]], [[0, 2]], [[-1, 1]]],
                                [[[2, 0]], [[0, 0]], [[-2, 0]]],
                                [[[1, -1]], [[0, -2]], [[-1, -1]]]], dtype=tf.float32)

    # Conv2D with the Sobel kernel
    edge_true = tf.nn.conv2d(y_true, sobel_kernel, strides=[1, 1, 1, 1], padding="SAME")
    edge_pred = tf.nn.conv2d(y_pred, sobel_kernel, strides=[1, 1, 1, 1], padding="SAME")

    # Absolute value, then clip to 0..255
    edge_true = tf.clip_by_value(tf.abs(edge_true), 0, 255)
    edge_pred = tf.clip_by_value(tf.abs(edge_pred), 0, 255)

    # Mean Square Error (MSE) loss. If there are no positive ROIs, return 0
    # instead of the NaN a mean over an empty tensor would give
    # (same guard as mrcnn_mask_loss_graph).
    loss = K.switch(tf.size(y_true) > 0,
                    tf.square(edge_true / 255. - edge_pred / 255.),
                    tf.constant(0.0))
    return K.mean(loss)</code>  </pre>

</details>
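To get a feel for what the edge loss measures without building a TensorFlow graph, here is a minimal pure-Python sketch of the same steps (Sobel filtering with zero padding, absolute value, clipping, MSE) on a single pair of masks. The names `filter_same`, `edge_maps`, and `edge_mse` are made up for this illustration and are not part of mrcnn/model.py. The sketch skips the batching and class selection done in the graph version.

```python
# Illustrative sketch only: hypothetical helper names, single mask pair,
# plain nested lists instead of tensors.

SOBEL_X = [[1, 0, -1],
           [2, 0, -2],
           [1, 0, -1]]
SOBEL_Y = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]


def filter_same(mask, kernel):
    """Cross-correlate a 2D list with a 3x3 kernel, zero padded ("SAME")."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in range(3):
                for dc in range(3):
                    rr, cc = r + dr - 1, c + dc - 1
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[dr][dc] * mask[rr][cc]
            out[r][c] = acc
    return out


def edge_maps(mask):
    """Return the x and y edge maps: |Sobel(255 * mask)| clipped to 255, then / 255."""
    maps = []
    for kernel in (SOBEL_X, SOBEL_Y):
        response = filter_same(mask, kernel)
        maps.append([[min(abs(255 * v), 255) / 255 for v in row]
                     for row in response])
    return maps


def edge_mse(true_mask, pred_mask):
    """Mean squared error between the edge maps of two masks."""
    total, count = 0.0, 0
    for t_map, p_map in zip(edge_maps(true_mask), edge_maps(pred_mask)):
        for t_row, p_row in zip(t_map, p_map):
            for t, p in zip(t_row, p_row):
                total += (t - p) ** 2
                count += 1
    return total / count


square = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
empty = [[0] * 4 for _ in range(4)]

print(edge_mse(square, square))  # identical masks: no edge disagreement
print(edge_mse(square, empty))   # a missed object: edges disagree
```

Because the edge maps are clipped after scaling by 255, any nonzero Sobel response on a binary mask saturates to 1, so for hard masks the loss effectively counts the pixels where the two edge maps disagree.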

### Notes

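In the Keras fit function, the number of samples trained in each epoch is batch_size × steps_per_epoch, so one epoch does not necessarily cover the whole training set. (Original post)
Tutorial on installing tensorflow-gpu with the conda install command
If you are not comfortable with .ipynb files, you can use Jupyter Notebook to save them as .py files: run jupyter notebook on the command line to start it, open the file, then choose Download as Python (.py).
Also, I feel that having an idea is not enough on its own; it needs thorough experiments to validate it (for example, why Sobel is used for edge detection rather than Laplace or Canny), which again has a bit of a "practice guiding theory" flavor.
Finally, criticism and corrections are welcome!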
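The note above about Keras epochs comes down to simple arithmetic. The sketch below shows how many samples one epoch touches and how many steps a full pass over the data would need. The function names and the numbers are hypothetical, chosen only for illustration.

```python
import math


def samples_per_epoch(batch_size, steps_per_epoch):
    """Number of samples drawn in one epoch."""
    return batch_size * steps_per_epoch


def epoch_coverage(train_set_size, batch_size, steps_per_epoch):
    """Fraction of the training set seen in one epoch (can exceed 1.0)."""
    return samples_per_epoch(batch_size, steps_per_epoch) / train_set_size


def steps_for_full_pass(train_set_size, batch_size):
    """Smallest steps_per_epoch that covers every sample at least once."""
    return math.ceil(train_set_size / batch_size)


# Made-up example: 1000 training images, batch size 2, 100 steps per epoch.
print(samples_per_epoch(2, 100))
print(epoch_coverage(1000, 2, 100))
print(steps_for_full_pass(1000, 2))
```

If one epoch should correspond to one pass over the data, setting steps per epoch to the value from `steps_for_full_pass` is one way to line the two up.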