PaddleHub Pneumonia CT Image Analysis

Date: 2020-04-09

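The pneumonia CT image analysis model (Pneumonia-CT-LKM-PP) efficiently detects and identifies lesions in patient CT images and delineates their contours. With some post-processing code, it can output a full set of quantitative indicators, including the number of lung lesions, their volume, and the proportion of the lung they occupy. Notably, the deep learning model was trained on both the high-resolution and the low-resolution CT data that were collected, so it adapts well to examination data acquired by CT equipment of different grades. It is expected to provide an effective auxiliary tool for pneumonia diagnosis in primary-care hospitals with limited medical resources and lower levels of care.

NOTE: If you run this example locally, first install PaddleHub. If you run it online, first fork this example project. Then follow the steps in the example.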

In[2]

!pip install paddlehub==1.6.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install pydicom -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install nibabel -i https://pypi.tuna.tsinghua.edu.cn/simple
!pip install scikit-image==0.15.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

1. Define the data to predict

This example uses the demo.dcm medical image in the dcm_data folder.

For background on DCM (DICOM) medical images, see: https://baike.baidu.com/item/DICOM/2171358?fr=aladdin

In[3]

# Load the data
import os
import json
import numpy as np
from mate.load_input_data import load_input_data
from mate.preprocess_lung_part import preprocess_lung_part
from lib.threshold_function_module import windowlize_image
from lib.png_rw import npy_to_png
from lib.judge_mkdir import judge_mkdir
import cv2

image_raw, info_dict = load_input_data('./dcm_data')

Begin loading data
     searching ./dcm_data
     LKM 3 contains 1 slices
     Valid imaging slices: 1
Done loading data, runtime: 0.008

In[5]

# 1. Display the medical image
import matplotlib.pyplot as plt 
import matplotlib.image as mpimg 

image = windowlize_image(image_raw, 1500, -500)[0]
image = npy_to_png(image)
image = (image - float(np.min(image))) / float(np.max(image)) * 255.

image = image[np.newaxis, :, :]
image = image.transpose((1, 2, 0)).astype('float32')
image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
cv2.imwrite('demo.png', image)
img = mpimg.imread('demo.png') 
plt.figure(figsize=(10,10))
plt.imshow(img) 
plt.axis('off') 
plt.show()

In[6]

# 2. Preprocessing required for lung segmentation
lung_part, info_dict = preprocess_lung_part(image_raw, info_dict)

Begin to preprocess
data.min(), data.max() -1024 250
data.min(), data.max() 0.0 0.4212963
----------     begin crop2.5D     --------
----------     end crop2.5D     --------

In[7]

# 3. Preprocessing required for lesion segmentation
ww, wc = (1500, -500)
lesion_part = windowlize_image(image_raw.copy(), ww, wc)
lesion_part = np.squeeze(lesion_part, 0)
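
The window width and center used above (ww = 1500, wc = -500) select the range of Hounsfield units kept for the lesion model. The implementation of windowlize_image is not shown in this tutorial, so the following is only a minimal sketch of conventional CT windowing. It assumes the values are clipped to [wc - ww/2, wc + ww/2] and rescaled to [0, 1]; the name apply_window is hypothetical, and the project's helper may behave differently.

```python
import numpy as np

def apply_window(hu, ww, wc):
    """Clip HU values to [wc - ww/2, wc + ww/2] and rescale to [0, 1].

    Hypothetical stand-in for windowlize_image; the real helper may differ.
    """
    low = wc - ww / 2.0
    high = wc + ww / 2.0
    clipped = np.clip(hu.astype(np.float32), low, high)
    return (clipped - low) / (high - low)

# Toy 2x3 slice of HU values lying below, inside, and above the window
hu_slice = np.array([[-2000, -1250, -500],
                     [0, 250, 1000]])
windowed = apply_window(hu_slice, ww=1500, wc=-500)
print(windowed)
```

With this window, everything at or below -1250 HU maps to 0 and everything at or above 250 HU maps to 1, so contrast is concentrated on the range where lung tissue and lesions lie.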

In[8]

lesion_np_path = "lesion_part.npy"
lung_np_path = "lung_part.npy"
np.save(lung_np_path, lung_part)
np.save(lesion_np_path, lesion_part)
print('Lung segmentation input:', lung_part.shape)
print('Lesion segmentation input:', lesion_part.shape)

Lung segmentation input: (1, 320, 320, 3)
Lesion segmentation input: (512, 512)

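2. Load the pretrained model

PaddleHub provides a Module for lesion analysis and lung segmentation, Pneumonia_CT_LKM_PP. It contains two sub-modules, one for lesion segmentation and one for lung segmentation, both based on UNet with a series of optimizations.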

In[9]

import paddlehub as hub

pneumonia = hub.Module(name="Pneumonia_CT_LKM_PP")

[2020-03-23 11:49:36,360] [    INFO] - Installing Pneumonia_CT_LKM_PP module

Downloading Pneumonia_CT_LKM_PP
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpiad_bch0/Pneumonia_CT_LKM_PP
[==================================================] 100.00%

[2020-03-23 11:49:40,364] [    INFO] - Successfully installed Pneumonia_CT_LKM_PP-1.0.0

3. Prediction

For modules that support one-step prediction, PaddleHub lets you call the module's prediction API directly to run inference.

In[10]

input_dict = {"image_np_path": [[lesion_np_path, lung_np_path]] }

results = pneumonia.segmentation(data=input_dict)

[2020-03-23 11:49:45,049] [    INFO] - 0 pretrained paramaters loaded by PaddleHub
[2020-03-23 11:49:45,055] [    INFO] - Installing Pneumonia_CT_LKM_PP_lung module

Downloading Pneumonia_CT_LKM_PP_lung
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmp3pohjqws/Pneumonia_CT_LKM_PP_lung
[==================================================] 100.00%

[2020-03-23 11:49:48,510] [    INFO] - Successfully installed Pneumonia_CT_LKM_PP_lung-1.0.0
[2020-03-23 11:49:48,723] [    INFO] - 0 pretrained paramaters loaded by PaddleHub

In[11]

# The result contains input_lesion_np_path and output_lesion_np, plus their lung counterparts
print(results[0])

{'input_lesion_np_path': 'lesion_part.npy', 'output_lesion_np': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int64), 'output_lung_np': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]]), 'input_lung_np_path': 'lung_part.npy'}

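In the output above:

input_lesion_np_path: path of the numpy array used for lesion analysis

output_lesion_np: the lesion analysis result

input_lung_np_path: path of the numpy array used for lung segmentation

output_lung_np: the lung segmentation result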
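
To make the structure of a result entry concrete, here is a small sketch that counts the foreground pixels in each mask. It uses a stand-in dictionary with the same four keys as the output above, because the real arrays come from the model; summarize_result is a hypothetical helper name, and the array shapes and values are illustrative only.

```python
import numpy as np

def summarize_result(result):
    """Count the non-zero pixels in the lesion and lung masks of one result entry."""
    lesion_mask = np.asarray(result['output_lesion_np'])
    lung_mask = np.asarray(result['output_lung_np'])
    return {
        'lesion_pixels': int(np.count_nonzero(lesion_mask)),
        'lung_pixels': int(np.count_nonzero(lung_mask)),
    }

# Stand-in for results[0]; values are made up for illustration
fake_result = {
    'input_lesion_np_path': 'lesion_part.npy',
    'output_lesion_np': np.array([[0, 1], [0, 0]], dtype=np.int64),
    'input_lung_np_path': 'lung_part.npy',
    'output_lung_np': np.array([[1, 1], [2, 0]]),
}
summary = summarize_result(fake_result)
print(summary)
```

A quick check like this shows whether the model found any lesion pixels at all before any post-processing is run.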

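4. Post-processing

After some post-processing, the lung segmentation result is mapped back onto the original image, and the lesion and lung segmentations are then merged into a single image for visualization.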

In[12]

from PIL import Image as PILImage
from mate.postprocess_lung_part import postprocess_lung_part
from mate.merge_process import merge_process
from lib.remove_small_obj_module import remove_small_obj

# Convert class indices into pixel values for visualization
def get_color_map_list(num_classes):
    """Return the color map for visualizing the segmentation mask.

    Supports an arbitrary number of classes.

    Args:
        num_classes: Number of classes.

    Returns:
        The color map as a flat list [R0, G0, B0, R1, G1, B1, ...].
    """
    color_map = num_classes * [0, 0, 0]
    for i in range(0, num_classes):
        j = 0
        lab = i
        while lab:
            color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
            j += 1
            lab >>= 3

    return color_map
    
color_map_lesion = get_color_map_list(num_classes=2)
color_map_lung = get_color_map_list(num_classes=3)

lung_part = postprocess_lung_part(results[0]['output_lung_np'], info_dict)
        
lesion_part = results[0]['output_lesion_np'].astype(np.uint8)
for i in range(len(lesion_part)):
    lesion_part[i] = remove_small_obj(lesion_part[i], 10)
    
# Post-process the lung and lesion segmentation results
lung_part, lesion_part = merge_process(image_raw, lung_part, lesion_part)

process lung part post process:   0%|          | 0/1 [00:00<?, ?it/s]/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/skimage/morphology/misc.py:211: UserWarning: the min_size argument is deprecated and will be removed in 0.16. Use area_threshold instead.
  warn("the min_size argument is deprecated and will be removed in " +
process lung part post process: 100%|██████████| 1/1 [00:00<00:00, 68.80it/s]

Begin to postprocess
Done to postprocess
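
For reference, the palette function defined above follows the bit-interleaving scheme commonly used for PASCAL VOC label colors. The sketch below repeats the function so that it runs on its own and groups the flat list into RGB triples, which makes the palette entries for the three lung labels easier to read. The helper as_rgb_triples is a hypothetical name added for illustration.

```python
def get_color_map_list(num_classes):
    """Build a flat [R0, G0, B0, R1, G1, B1, ...] palette, as in the tutorial."""
    color_map = num_classes * [0, 0, 0]
    for i in range(0, num_classes):
        j = 0
        lab = i
        while lab:
            # Spread the bits of the label index over the R, G, B channels
            color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
            j += 1
            lab >>= 3
    return color_map

def as_rgb_triples(flat):
    """Group a flat palette list into (R, G, B) tuples, one per class index."""
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

palette = as_rgb_triples(get_color_map_list(3))
print(palette)  # [(0, 0, 0), (128, 0, 0), (0, 128, 0)]
```

Index 0 (background) is black, so background pixels fall through to the underlying CT image when the mask is overlaid with np.where.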

In[13]

# Display the lung segmentation; yellow marks the right lung, green the left lung
pred_mask = PILImage.fromarray(np.argmax(lung_part, -1)[0].astype(np.uint8), mode='P')
pred_mask.putpalette(color_map_lung)
pred_mask = pred_mask.convert('RGB')

lung_merge_img = np.where(pred_mask, pred_mask, img)
fig, axarr = plt.subplots(1, 1, figsize=(10, 10))

axarr.axis('off')
axarr.imshow(lung_merge_img)

2020-03-23 11:50:09,206-WARNING: Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

<matplotlib.image.AxesImage at 0x7f73a0086210>

In[14]

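# Display the lesion segmentation; note the false positive in the non-lung region at the lower left.
# We will later combine this with the lung segmentation to remove the false positive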
resmap = results[0]['output_lesion_np']
pred_mask = PILImage.fromarray(resmap.astype(np.uint8), mode='P')
pred_mask.putpalette(color_map_lesion)

pred_mask = pred_mask.convert('RGB')

lesion_merge_img = np.where(pred_mask, pred_mask, img)

fig, axarr = plt.subplots(1, 1, figsize=(10, 10))

axarr.axis('off')
axarr.imshow(lesion_merge_img)

2020-03-23 11:50:12,670-WARNING: Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

<matplotlib.image.AxesImage at 0x7f73a0063a90>

In[15]

# Merge the two results and exclude lesion segmentations outside the lungs
import json
import numpy as np
from lib.info_dict_module import InfoDict
from mate.save_merged_png_cv2 import save_merged_png_cv2
from lib.judge_mkdir import judge_mkdir

In[16]

# Merge the lung segmentation and lesion segmentation results

image = windowlize_image(image_raw, 1500, -500)[0]
image = npy_to_png(image)
image = (image - float(np.min(image))) / float(np.max(image)) * 255.

lung = lung_part[0,..., 1] + lung_part[0,..., 2]
binary = lung * 255
binary = binary.astype(np.uint8)
# OpenCV 3.x returns three values from findContours; other versions return two
try:
    _, lung_contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
except ValueError:
    lung_contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

binary = lesion_part[0] * 255
binary = binary.astype(np.uint8)

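# Same version handling as above: unpacking fails with ValueError when only two values are returned
try:
    _, lesion_contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
except ValueError:
    lesion_contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)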

image = image[np.newaxis, :, :]
image = image.transpose((1, 2, 0)).astype('float32')
image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)

cv2.drawContours(image, lesion_contours, -1, (0, 0, 255), 2)
cv2.drawContours(image, lung_contours, -1, (0, 255, 0), 2)

cv2.imwrite('merged.png', image)

True

In[17]

# The false positive outside the lungs has now been removed
img = mpimg.imread('merged.png') 
plt.figure(figsize=(10,10))
plt.imshow(img) 
plt.axis('off') 
plt.show()
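
The effect of combining the two masks can be illustrated with a toy example. This is a sketch of the general idea only: a lesion pixel is kept where the lung mask is non-zero and dropped elsewhere. It is an assumption about what merge_process achieves, not its actual implementation, and restrict_to_lung is a hypothetical name.

```python
import numpy as np

def restrict_to_lung(lesion_mask, lung_mask):
    """Zero out lesion pixels that fall outside the lung mask."""
    inside_lung = (lung_mask > 0).astype(lesion_mask.dtype)
    return lesion_mask * inside_lung

# Lung labels: 0 = background, 1 and 2 = the two lungs
lung = np.array([[0, 1, 1],
                 [0, 2, 2],
                 [0, 0, 0]], dtype=np.uint8)
# Lesion mask with one false positive at the lower left, outside the lungs
lesion = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 0, 0]], dtype=np.uint8)

cleaned = restrict_to_lung(lesion, lung)
print(cleaned)
```

The two lesion pixels inside the lungs survive, while the one at the lower left is removed.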

In[18]

# Finally, compute the lesion percentage, lesion volume, and lesion count from the predictions
from lib.c1_cal_lesion_percent import cal_lesion_percent
from lib.c2_cal_lesion_volume import cal_lesion_volume
from lib.c3_cal_lesion_num import cal_lesion_num
from lib.c4_cal_histogram import cal_histogram
from lib.c5_normal_statistics import normal_statistics

def cal_metrics(image_raw, lung_part, lesion_part, spacing_list):
    """Compute the metrics.

    Overall flow:
    1. Split into left/right lungs and left/right lesions
    2. Compute the lesion percentage
    3. Compute the lesion volume
    4. Compute the lesion count
    5. Compute the histogram
    """
    print('cal the statistics metrics')
    # 1. Split into left/right lungs and left/right lesions
    lung_l = lung_part[..., 1]
    lung_r = lung_part[..., 2]
    lesion_l = lesion_part.copy() * lung_l
    lesion_r = lesion_part.copy() * lung_r

    lung_tuple = (lung_l, lung_r, lung_part)
    lesion_tuple = (lesion_l, lesion_r, lesion_part)

    # 2. Compute the lesion percentage
    lesion_percent_dict = cal_lesion_percent(lung_tuple, lesion_tuple)

    # 3. Compute the lesion volume
    lesion_volume_dict = cal_lesion_volume(lesion_tuple, spacing_list)

    # 4. Compute the lesion count
    lesion_num_dict = cal_lesion_num(lesion_tuple)

    # 5. Compute the histogram
    hu_statistics_dict = cal_histogram(image_raw, lung_tuple)

    metrics_dict = {
        'lesion_num': lesion_num_dict,
        'lesion_volume': lesion_volume_dict,
        'lesion_percent': lesion_percent_dict,
        'hu_statistics': hu_statistics_dict,
        'normal_statistics': normal_statistics
    }

    return metrics_dict
    
# Compute the metrics
metrics_dict = cal_metrics(image_raw, lung_part, lesion_part, info_dict.spacing_list)

cal the statistics metrics

In[19]

# Print the metrics: 'lung_l' is the left lung, 'lung_r' the right lung, and 'lung_all' both lungs.
print('Lesion count', metrics_dict['lesion_num'])
print('Lesion volume', metrics_dict['lesion_volume'])
print('Lesion percentage', metrics_dict['lesion_percent'])

Lesion count {'lung_l': 0, 'lung_r': 0, 'lung_all': 0}
Lesion volume {'lung_l': 0.0, 'lung_r': 0.0, 'lung_all': 0.0}
Lesion percentage {'lung_l': 0.0, 'lung_r': 0.0, 'lung_all': 0.0}
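
The helpers imported from lib are not listed in this tutorial. As a rough sketch of how such metrics are conventionally computed from binary masks, one could write the following. It assumes that the lesion percentage is the number of lesion voxels inside the lung divided by the number of lung voxels, and that the volume is the voxel count multiplied by the voxel size derived from the spacing. The function names and the spacing values are illustrative and may not match the project's code.

```python
import numpy as np

def lesion_percent(lung_mask, lesion_mask):
    """Fraction of lung voxels that are also lesion voxels (0.0 if the lung is empty)."""
    lung_voxels = np.count_nonzero(lung_mask)
    if lung_voxels == 0:
        return 0.0
    lesion_in_lung = np.count_nonzero(np.logical_and(lesion_mask, lung_mask))
    return lesion_in_lung / lung_voxels

def lesion_volume_mm3(lesion_mask, spacing):
    """Lesion volume in cubic millimetres, given per-axis voxel spacing in mm."""
    voxel_volume = float(np.prod(spacing))
    return np.count_nonzero(lesion_mask) * voxel_volume

# Toy volume: one 4x4 slice with 8 lung voxels, 2 of which are lesion
lung = np.zeros((1, 4, 4), dtype=np.uint8)
lung[0, 1:3, :] = 1
lesion = np.zeros_like(lung)
lesion[0, 1, 0:2] = 1

spacing = (5.0, 0.5, 0.5)  # assumed (slice thickness, row, column) spacing in mm
percent = lesion_percent(lung, lesion)
volume = lesion_volume_mm3(lesion, spacing)
print(percent, volume)
```

Guarding against an empty lung mask avoids a division by zero on slices where no lung tissue was segmented.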

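If you have any questions while running this tutorial, you can ask through either of the following channels:

PaddlePaddle official technical QQ group: 703252161
PaddleHub issues https://github.com/PaddlePaddle/PaddleHub/issues

Try the project hands-on with one click in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/312508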
