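Getting Started with Matplotlib (Part 7): Using It Effectively

Original article: http://www.bugingcode.com/blog/Matplotlib_7_Effectively_Using.html

This is an article about how to use Matplotlib effectively. The source article is linked above. Rather than translating it line by line, this post records its main steps and ideas so that they can serve as a reference when plotting in a real project.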

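Getting the data and its format

Data to be plotted is usually stored in files such as CSV or Excel, and pandas is used to read it. pandas is covered separately, so it is not described in detail here.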
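
As a minimal sketch of the pandas step mentioned above, the snippet below reads CSV text from an in-memory buffer instead of a real file. The column names and values are made up for illustration. pd.read_excel works in a similar way for Excel files, but it needs an Excel engine such as openpyxl to be installed.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real project would pass a file path instead.
csv_text = """name,quantity,unit price
Kulas Inc,3,10.50
Barton LLC,5,2.00
"""

sample = pd.read_csv(io.StringIO(csv_text))
print(sample.dtypes)   # column types inferred by pandas
print(sample.head())   # first rows (up to five by default)
```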

The data comes from https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=true. Here is how to read it and what it looks like:

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import  pandas as pd
from matplotlib.ticker import  FuncFormatter

df = pd.read_excel("https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=true")
print(df.head())

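This prints the first five rows of sample-salesv3.xlsx (head() returns five rows by default). Each row has the following fields:

an ID, a name, an SKU, the quantity sold, the unit price, the sales amount (the ext price column used below), and a date.

Ranking and displaying the data

We want to know which customers have the highest sales, so we need to sort the data and take the top 10: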

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import  pandas as pd
from matplotlib.ticker import  FuncFormatter

df = pd.read_excel("https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=true")
# Sum ext price per name, sort in descending order, and keep the top 10
top_10 = (df.groupby('name')[['ext price', 'quantity']].agg({'ext price': 'sum', 'quantity': 'count'})
          .sort_values(by='ext price', ascending=False))[:10].reset_index()
# Rename ext price to Sales and quantity to Purchases
top_10.rename(columns={'name': 'Name', 'ext price': 'Sales', 'quantity': 'Purchases'}, inplace=True)

print(top_10)

The result is as follows:

                           Name      Sales  Purchases
0                     Kulas Inc  137351.96         94
1                 White-Trantow  135841.99         86
2               Trantow-Barrows  123381.38         94
3                 Jerde-Hilpert  112591.43         89
4  Fritsch, Russel and Anderson  112214.71         81
5                    Barton LLC  109438.50         82
6                      Will LLC  104437.60         74
7                     Koepp Ltd  103660.54         82
8      Frami, Hills and Schmidt  103569.59         72
9                   Keeling LLC  100934.30         74

This gives the top 10 by sales: Kulas Inc is first, White-Trantow is second, and so on.
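
The group, aggregate, sort, and rename steps above can also be sketched with pandas named aggregation, which sets the output column names in the same call. The toy DataFrame below is made up for illustration and only reuses the column names from this article. Note that 'count' counts rows (purchases), not units sold.

```python
import pandas as pd

# Tiny made-up dataset with the same column names used in this article.
toy = pd.DataFrame({
    "name": ["A", "B", "A", "C", "B", "A"],
    "ext price": [100.0, 250.0, 50.0, 75.0, 25.0, 10.0],
    "quantity": [1, 2, 3, 4, 5, 6],
})

top = (
    toy.groupby("name")
    .agg(Sales=("ext price", "sum"), Purchases=("quantity", "count"))
    .sort_values(by="Sales", ascending=False)
    .head(2)                      # keep the two largest totals
    .reset_index()
    .rename(columns={"name": "Name"})
)
print(top)
```

Named aggregation avoids the separate rename call for the aggregated columns, at the cost of requiring pandas 0.25 or later.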

Now plot this data on a chart:

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import  pandas as pd
from matplotlib.ticker import  FuncFormatter



df = pd.read_excel("https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=true")
# Sum ext price per name, sort in descending order, and keep the top 10
top_10 = (df.groupby('name')[['ext price', 'quantity']].agg({'ext price': 'sum', 'quantity': 'count'})
          .sort_values(by='ext price', ascending=False))[:10].reset_index()
# Rename ext price to Sales and quantity to Purchases
top_10.rename(columns={'name': 'Name', 'ext price': 'Sales', 'quantity': 'Purchases'}, inplace=True)

plt.style.use('ggplot')

top_10.plot(kind='barh', y="Sales", x="Name")

plt.show()

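This gives a chart of the sales for each customer:

Customizing the chart

Customization improves a chart's appearance and readability. Annotating the chart and changing its style can make it look much more professional.

Annotating the chart and limiting the axis range:

Replace the top_10.plot(kind='barh', y="Sales", x="Name") line above with the following code. Its purpose is straightforward: limit the x-axis range, set the title, and label the x-axis: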

fig, ax = plt.subplots(figsize=(5, 6))
top_10.plot(kind='barh', y="Sales", x="Name", ax=ax)
ax.set_xlim([-10000, 140000])
ax.set(title='2014 Revenue', xlabel='Total Revenue')
ax.legend().set_visible(False)

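The resulting chart looks like this:

Polishing this figure may take several more steps. The most distracting part is probably the format of the total revenue values. Matplotlib can solve this with FuncFormatter. It is versatile: it applies a user-defined function to each tick value and returns a nicely formatted string.

The following currency-formatting function handles values in the range of hundreds of thousands of dollars: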

def currency(x, pos):
    'The two args are the value and tick position'
    if x >= 1000000:
        return '${:1.1f}M'.format(x*1e-6)
    return '${:1.0f}K'.format(x*1e-3)

Apply the formatter to the x-axis:

formatter = FuncFormatter(currency)
ax.xaxis.set_major_formatter(formatter)
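
Because the formatter is an ordinary Python function, it can be checked without drawing anything. The sketch below repeats the currency function and calls it on a few sample values chosen only for illustration. Passing None for pos is an assumption that works here because the function never uses that argument.

```python
def currency(x, pos):
    """Format a tick value as dollars; pos is the tick position (unused)."""
    if x >= 1000000:
        return '${:1.1f}M'.format(x * 1e-6)
    return '${:1.0f}K'.format(x * 1e-3)


# Sample values chosen for illustration only.
for value in (0, 140000, 137351.96, 2500000):
    print(value, '->', currency(value, None))
```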

As before, the complete code replaces the plot code with the following:

fig, ax = plt.subplots()
top_10.plot(kind='barh', y="Sales", x="Name", ax=ax)
ax.set_xlim([-10000, 140000])
ax.set(title='2014 Revenue', xlabel='Total Revenue', ylabel='Customer')
formatter = FuncFormatter(currency)
ax.xaxis.set_major_formatter(formatter)
ax.legend().set_visible(False)

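The chart after formatting the x-axis:

Comparing multiple charts

How the individual sales figures compare, and where each customer stands overall, are usually the questions of interest. Drawing the average on the chart makes this comparison easy.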

# Create the figure and the axes
fig, ax = plt.subplots()

# Plot the data and get the averaged
top_10.plot(kind='barh', y="Sales", x="Name", ax=ax)
avg = top_10['Sales'].mean()

# Set limits and labels
ax.set_xlim([-10000, 140000])
ax.set(title='2014 Revenue', xlabel='Total Revenue', ylabel='Customer')

# Add a line for the average
ax.axvline(x=avg, color='b', label='Average', linestyle='--', linewidth=1)

# Format the currency
formatter = FuncFormatter(currency)
ax.xaxis.set_major_formatter(formatter)

# Hide the legend
ax.legend().set_visible(False)

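The chart is as follows:

So far, all of our changes have applied to a single chart. We can also place several plots on one figure and save the whole figure with different options.

In this example I use nrows and ncols to specify the layout, which is fairly clear for new users. I also use sharey=True so that the y-axes share the same labels.

This example is flexible because the axes can be unpacked into ax0 and ax1. With these axes in hand, we can plot as in the examples above, putting one plot on ax0 and the other on ax1.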

# Get the figure and the axes
fig, (ax0, ax1) = plt.subplots(nrows=1,ncols=2, sharey=True, figsize=(7, 4))
top_10.plot(kind='barh', y="Sales", x="Name", ax=ax0)
ax0.set_xlim([-10000, 140000])
ax0.set(title='Revenue', xlabel='Total Revenue', ylabel='Customers')

# Plot the average as a vertical line
avg = top_10['Sales'].mean()
ax0.axvline(x=avg, color='b', label='Average', linestyle='--', linewidth=1)

# Repeat for the unit plot
top_10.plot(kind='barh', y="Purchases", x="Name", ax=ax1)
avg = top_10['Purchases'].mean()
ax1.set(title='Units', xlabel='Total Units', ylabel='')
ax1.axvline(x=avg, color='b', label='Average', linestyle='--', linewidth=1)

# Title the figure
fig.suptitle('2014 Sales Analysis', fontsize=14, fontweight='bold');

# Hide the legends
ax1.legend().set_visible(False)
ax0.legend().set_visible(False)
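
To see what sharey=True does in isolation, here is a minimal sketch with made-up data. It assumes the non-interactive Agg backend so that it runs without a display. The names left and right are illustrative and play the roles of ax0 and ax1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# With nrows=1 and ncols=2, subplots returns a 1-D array of two Axes,
# which can be unpacked directly.
fig, (left, right) = plt.subplots(nrows=1, ncols=2, sharey=True, figsize=(7, 4))

labels = ["a", "b", "c"]          # made-up categories
left.barh(labels, [3, 1, 2])
right.barh(labels, [30, 20, 10])

# Because the y-axis is shared, changing limits on one Axes changes both.
left.set_ylim(-1, 3)
print(right.get_ylim())
```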

Saving the chart

Matplotlib supports many file formats for saving. You can use fig.canvas.get_supported_filetypes() to see which formats your system supports:

fig.canvas.get_supported_filetypes()

{'eps': 'Encapsulated Postscript',
 'jpeg': 'Joint Photographic Experts Group',
 'jpg': 'Joint Photographic Experts Group',
 'pdf': 'Portable Document Format',
 'pgf': 'PGF code for LaTeX',
 'png': 'Portable Network Graphics',
 'ps': 'Postscript',
 'raw': 'Raw RGBA bitmap',
 'rgba': 'Raw RGBA bitmap',
 'svg': 'Scalable Vector Graphics',
 'svgz': 'Scalable Vector Graphics',
 'tif': 'Tagged Image File Format',
 'tiff': 'Tagged Image File Format'}

Since we have the fig object, we can save the figure in any of these formats:

fig.savefig('sales.png', transparent=False, dpi=80, bbox_inches="tight")
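
As a sketch of saving one figure in several formats, the loop below checks each extension against get_supported_filetypes() before calling savefig. The placeholder data, the temporary output directory, and the choice of extensions are assumptions for illustration. The Agg backend is selected so the code runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.barh(["a", "b"], [1, 2])       # placeholder data

supported = fig.canvas.get_supported_filetypes()
out_dir = tempfile.mkdtemp()      # hypothetical output location
saved = []
for ext in ("png", "pdf", "svg"):
    if ext in supported:          # skip formats this backend cannot write
        path = os.path.join(out_dir, "sales." + ext)
        fig.savefig(path, dpi=80, bbox_inches="tight")
        saved.append(path)
print(saved)
```

savefig infers the output format from the file extension, so the same call covers both raster and vector formats.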

When reposting, please credit the source: http://www.bugingcode.com/

More tutorials: 阿猫学编程