The Most Complete Summary | Python Office Automation: PPT (Part 3)

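1. Preface

As the final article in the PPT part of this office-automation series, this post covers the advanced features and commonly used techniques for working with PPT files.

This article covers:

Preset shapes (Shape)
Charts (Chart)
Reading text content
Saving all images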

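2. Preset Shapes (Shape)

In fact, the content area of a PPT document is made up of various shapes (Shape), including pictures, text boxes, videos, tables, and preset shapes.

The preset autoshapes are quite varied; the MsoAutoShapeType link later in this section lists them.

The following method inserts a shape into a slide: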

slide.shapes.add_shape(autoshape_type_id, left, top, width, height)

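The parameters are:

autoshape_type_id: the shape type
left: distance from the left edge of the slide
top: distance from the top edge of the slide
width: the shape width
height: the shape height

Let's take inserting a simple rounded rectangle as an example.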

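2-1 Inserting a shape

2-2 Setting shape properties

For the shape object returned by the method above, we can go on to set its background color and border properties.

For example: set the background color to white, and the border color to red with a width of 0.5 cm.

For more shapes, see the link below: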

https://python-pptx.readthedocs.io/en/latest/api/enum/MsoAutoShapeType.html

from pptx.enum.shapes import MSO_SHAPE, MSO_SHAPE_TYPE
from pptx.util import Cm, Inches

def insert_shape(slide, left, top, width, height, autoshape_type_id=MSO_SHAPE.CHEVRON, unit=Inches):
    """
    Add a shape to a slide
    :param unit: unit of length, Inches by default
    :param autoshape_type_id: shape type
    :param slide: slide
    :param left: distance from the left edge
    :param top: distance from the top edge
    :param width: width
    :param height: height
    :return: the newly added shape
    """
    # Add a shape
    # add_shape(self, autoshape_type_id, left, top, width, height)
    # Arguments: shape type, left, top, width, height
    shape = slide.shapes.add_shape(autoshape_type_id=autoshape_type_id,
                                   left=unit(left),
                                   top=unit(top),
                                   width=unit(width),
                                   height=unit(height))
    return shape

# 1. Add a rounded rectangle
rectangle = insert_shape(slide, 2, 2, 16, 8, autoshape_type_id=MSO_SHAPE.ROUNDED_RECTANGLE, unit=Cm)
# 2. Set the shape properties
# (set_widget_bg and set_widget_frame are helpers not defined in this article)
# 2.1 Background color
set_widget_bg(rectangle, bg_rgb_color=[255, 255, 255])

# 2.2 Border properties
set_widget_frame(rectangle, frame_rgb_color=[255, 0, 0], frame_width=0.5)

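3. Charts (Chart)

Charts are used very frequently in PPT. python-pptx can create many types of charts, including column charts, pie charts, line charts, scatter charts, and 3D charts.

A chart is created as follows:

slide.shapes.add_chart(chart_type, x, y, cx, cy, chart_data)

The parameters are:

chart_type: the chart type
x: distance from the left edge of the slide
y: distance from the top edge of the slide
cx: the displayed width of the chart
cy: the displayed height of the chart
chart_data: the chart data object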

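3-1 Creating a line chart

First, create a chart data object, ChartData:

from pptx.chart.data import ChartData

# add_slide is a helper not defined in this article
slide = add_slide(self.presentation, 6)

# Create a chart data object
chart_data = ChartData()

Next, prepare the chart data: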

# Categories (x-axis data)
chart_data.categories = [2000, 2005, 2010, 2015, 2020]

# Data for each dimension by year (3 dimensions)
# Economy
chart_data.add_series("Economy", [60, 65, 75, 90, 95])

# Environment
chart_data.add_series("Environment", [95, 88, 84, 70, 54])

# Military
chart_data.add_series("Military", [40, 65, 80, 95, 98])

Finally, set the chart type to a line chart, XL_CHART_TYPE.LINE, and draw the chart from the chart data.

To draw other chart types, see the link below:

https://python-pptx.readthedocs.io/en/latest/api/enum/XlChartType.html

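from pptx.enum.chart import XL_CHART_TYPE
from pptx.util import Cm, Inches

def insert_chart(slide, left, top, width, height, data, unit=Inches, chart_type=XL_CHART_TYPE.COLUMN_CLUSTERED):
    """
    Insert a chart
    :param slide: slide
    :param left: distance from the left edge
    :param top: distance from the top edge
    :param width: width
    :param height: height
    :param data: chart data
    :param unit: unit of length, Inches by default
    :param chart_type: chart type, clustered column chart by default
    :return: the chart object
    """
    # add_chart returns a graphic frame that contains the chart
    chart_result = slide.shapes.add_chart(chart_type=chart_type,
                                          x=unit(left), y=unit(top),
                                          cx=unit(width), cy=unit(height),
                                          chart_data=data)
    # Return the chart
    return chart_result.chart

# Add the chart
chart = insert_chart(slide, 4, 5, 20, 9, chart_data, unit=Cm, chart_type=XL_CHART_TYPE.LINE)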

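3-2 Setting chart display properties

As examples, we set the chart legend, line smoothing, and the chart's text style:

# Set chart display properties
# Show the legend
chart.has_legend = True

# Place the legend outside the plot area (False means it does not overlap the plot)
chart.legend.include_in_layout = False

# Draw every series as a smooth line
for series in chart.series:
    series.smooth = True

# Set the style of the text in the chart
# (set_font_style is a helper not defined in this article)
set_font_style(chart.font, font_size=12, font_color=[255, 0, 0])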

[Figure: the final line chart]

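4. Reading Content

The content area of a PPT document is made up of various shapes, and shape.has_text_frame tells us whether a shape contains a text frame.

So we only need to iterate over all shapes to get the text content in the PPT.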

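from pptx import Presentation

def read_ppt_content(presentation):
    """
    Read all text content in the PPT
    :param presentation: the Presentation object
    :return: a list with the text content of each shape
    """
    # All content
    results = []

    # Iterate over all slides and get the values in the text frames
    for slide in presentation.slides:
        for shape in slide.shapes:
            # Check whether the shape contains a text frame
            if shape.has_text_frame:
                # get_shape_content is a helper not defined in this article
                content = get_shape_content(shape)
                if content:
                    results.append(content)

    return results

presentation = Presentation("./raw.pptx")

# 1. All text content of ordinary shapes
contents = read_ppt_content(presentation)
print(contents)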
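
The get_shape_content helper is not defined in this article. One plausible version, sketched here as an assumption, joins the non-empty paragraphs of the shape's text frame. The demo uses stand-in objects built with SimpleNamespace so that it runs without a .pptx file; they only mimic the text_frame.paragraphs, paragraph.runs, and run.text attributes of real python-pptx shapes.

```python
from types import SimpleNamespace


def get_shape_content(shape):
    """Join the non-empty paragraph texts of a shape's text frame with newlines."""
    lines = []
    for paragraph in shape.text_frame.paragraphs:
        # A paragraph's text is the concatenation of its runs
        text = "".join(run.text for run in paragraph.runs).strip()
        if text:
            lines.append(text)
    return "\n".join(lines)


def _paragraph(*texts):
    """Build a stand-in paragraph whose runs carry the given texts (demo only)."""
    return SimpleNamespace(runs=[SimpleNamespace(text=t) for t in texts])


# Stand-in shape: two real paragraphs and one that holds only whitespace
demo_shape = SimpleNamespace(
    text_frame=SimpleNamespace(
        paragraphs=[_paragraph("Hello, ", "world"), _paragraph("  "), _paragraph("Second line")]
    )
)

content = get_shape_content(demo_shape)
print(content)
```

With a real presentation, the same function can be passed any shape for which shape.has_text_frame is true.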

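However, the text in the cells of a table (Table) cannot be obtained this way, because the graphic frame that holds a table has no text frame of its own.

We have to filter out the shapes whose type is TABLE, then iterate over every row and cell in the table to get the text.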

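from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def read_ppt_file_table(self):
    """
    Read the table data in the PPT
    :return:
    """
    # Open the PPT to read
    presentation = Presentation("./raw.pptx")

    for slide in presentation.slides:
        # Iterate over all shapes
        # Shapes: some have content, some do not
        for shape in slide.shapes:
            # print('Current shape type:', shape.shape_type)
            # Only take the data in tables, reading it row by row
            if shape.shape_type == MSO_SHAPE_TYPE.TABLE:
                # Get the table rows (shape.table.rows)
                for row in shape.table.rows:
                    # All cells in a row (row.cells)
                    for cell in row.cells:
                        # Text in the cell's text frame (cell.text_frame.text)
                        print(cell.text_frame.text)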

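5. Saving Images

Sometimes we need to save all the images in a PPT document to local disk.

It only takes the following 3 steps:

Iterate over all shapes in the content area of each slide
Filter out the picture shapes, whose type is MSO_SHAPE_TYPE.PICTURE, and get each picture's binary byte stream
Write the image bytes to a file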

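import os

def save_ppt_images(presentation, output_path):
    """
    Save all images in the PPT
    Reference: [Batch-exporting image assets from PPT with Python](https://www.pythonf.cn/read/49552)
    :param presentation: the Presentation object
    :param output_path: output directory
    :return:
    """

    print('Number of slides:', len(presentation.slides))

    # Iterate over all slides
    for index_slide, slide in enumerate(presentation.slides):
        # Iterate over all shapes
        for index_shape, shape in enumerate(slide.shapes):
            # Shapes include text shapes, pictures, ordinary shapes, etc.

            # Keep only picture shapes
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                # Get the image's binary byte stream
                image_data = shape.image.blob

                # image/jpeg, image/png, etc.
                image_type_pre = shape.image.content_type

                # Image file extension
                image_suffix = image_type_pre.split('/')[1]

                # Create the output directory for the extracted images
                os.makedirs(output_path, exist_ok=True)

                # Image save path (random_str is a helper not defined in this article);
                # os.path.join supplies the separator that plain concatenation would miss
                output_image_path = os.path.join(output_path, random_str(10) + "." + image_suffix)

                print(output_image_path)

                # Write to a new file
                with open(output_image_path, 'wb') as file:
                    file.write(image_data)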
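
The random_str helper is not shown in this article. If reproducible file names are preferred, the slide and shape indices that the loops already provide can be used instead. The hypothetical build_image_path below is a sketch of that alternative, not the author's approach:

```python
import os


def build_image_path(output_dir, slide_index, shape_index, content_type):
    """Build a path such as <output_dir>/slide1_shape3.png from a MIME type like image/png."""
    # Same rule as above: the part after the slash is used as the extension
    suffix = content_type.split("/")[1]
    # Indices from enumerate() start at 0; file names count from 1
    filename = f"slide{slide_index + 1}_shape{shape_index + 1}.{suffix}"
    return os.path.join(output_dir, filename)


example_path = build_image_path("output", 0, 2, "image/png")
print(example_path)
```

Inside save_ppt_images, this would be called with index_slide, index_shape, and shape.image.content_type in place of the random_str line.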

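6. Conclusion

That wraps up the PPT part of the Python office-automation series! If you run into other problems in real projects, feel free to leave a message in the comments.

I have uploaded all the source code to the public account's backend; follow the public account and reply "ppt" there to get the full source code.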

This article was shared from the WeChat public account AirPython (AirPython).