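Data representation -> data cleaning -> data statistics -> data visualization -> data mining -> artificial intelligence
Artificial intelligence: in-depth analysis and decision-making over data, language, images, vision, and similar domains
Python libraries for machine learning
NumPy: the fundamental library for representing N-dimensional arrays, http://www.numpy.org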
import numpy as np

def np_sum():
    a = np.array([0, 1, 2, 3, 4])
    b = np.array([9, 8, 7, 6, 5])
    c = a**2 + b**3  # element-wise, no explicit loop
    return c

print(np_sum())
[729 513 347 225 141]
def py_sum():
    a = [0, 1, 2, 3, 4]
    b = [9, 8, 7, 6, 5]
    c = []
    # Plain lists need an explicit loop over the indices.
    for i in range(len(a)):
        c.append(a[i]**2 + b[i]**3)
    return c

print(py_sum())
[729, 513, 347, 225, 141]
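The same element-wise expression is not limited to one dimension. The following is a minimal sketch, using made-up values and hypothetical names (`elementwise_sum`, `grid_a`, `grid_b`), that applies it to 2x3 arrays and compares the result with a nested-list version.

```python
import numpy as np

def elementwise_sum(a, b):
    # Same expression as in np_sum, applied to arrays of any matching shape.
    return a**2 + b**3

# Hypothetical 2x3 sample data.
grid_a = np.array([[0, 1, 2], [3, 4, 5]])
grid_b = np.array([[5, 4, 3], [2, 1, 0]])
grid_c = elementwise_sum(grid_a, grid_b)

# Pure-Python equivalent built from nested lists, for comparison.
expected = [
    [x**2 + y**3 for x, y in zip(row_a, row_b)]
    for row_a, row_b in zip(grid_a.tolist(), grid_b.tolist())
]

print(grid_c.ndim, grid_c.shape)
print(grid_c.tolist() == expected)
```

The NumPy version keeps the expression unchanged as the number of dimensions grows, whereas the list version needs one more level of looping per dimension.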
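Pandas: a high-level Python library for data analysis, http://pandas.pydata.org
It reads and writes SQL, JSON, pickle, CSV, and Excel data, among other formats; INI files have no dedicated pandas reader and need a parser such as configparser first
DataFrame = row and column indexes + two-dimensional data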
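To make the "indexes + two-dimensional data" description concrete, here is a small sketch with made-up sample data. The column names, row labels, and the in-memory CSV round trip are illustrative assumptions, not part of the original notes.

```python
import io
import pandas as pd

# Made-up sample data: two columns, three labeled rows.
df = pd.DataFrame(
    {"price": [1.5, 2.0, 3.25], "qty": [10, 20, 30]},
    index=["a", "b", "c"],
)

print(df.index.tolist())    # row index (labels)
print(df.columns.tolist())  # column index (labels)
print(df.values.shape)      # the underlying two-dimensional data

# Round-trip through CSV text held in memory instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)
print(restored)
```

Passing `index_col=0` tells `read_csv` to treat the first CSV column as the row index again, so the restored frame has the same labels as the original.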
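SciPy: a library of mathematical, scientific, and engineering computing functions, http://www.scipy.org
Matplotlib: a high-quality two-dimensional data visualization library, http://matplotlib.org
Seaborn: a statistical data visualization library, http://seaborn.pydata.org/
Mayavi: a three-dimensional scientific data visualization library, http://docs.enthought.com/mayavi/mayavi/
PyPDF2: a toolkit for working with PDF files, http://mstamy2.github.io/PyPDF2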
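# Written against PyPDF2 1.x/2.x; PyPDF2 3.0 renamed PdfFileMerger to PdfMerger.
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()
input1 = open("document1.pdf", "rb")
input2 = open("document2.pdf", "rb")

# Append the first three pages of document1.pdf.
merger.append(fileobj=input1, pages=(0, 3))
# Insert the first page of document2.pdf at position 2 of the output.
merger.merge(position=2, fileobj=input2, pages=(0, 1))

with open("document-output.pdf", "wb") as output:
    merger.write(output)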
NLTK: a third-party library for natural language text processing, http://www.nltk.org/
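# Requires the sample corpus: run nltk.download('treebank') once beforehand.
from nltk.corpus import treebank

t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()  # opens a window showing the parse tree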
Python-docx: a third-party library for creating or updating Microsoft Word files, http://python-docx.readthedocs.io/en/latest/index.html
from docx import Document

document = Document()
document.add_heading('Document Title', 0)
p = document.add_paragraph('A plain paragraph having some ')
document.add_page_break()
document.save('demo.docx')
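Scikit-learn: a toolkit of machine learning methods, a third-party library for data processing tasks, http://scikit-learn.org/
TensorFlow: the machine learning computation framework behind AlphaGo, https://www.tensorflow.org/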
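# TensorFlow 1.x graph-mode API (tf.Session was removed in TensorFlow 2.x).
import tensorflow as tf

# The original snippet never defines `result`; this small graph is an
# illustrative stand-in so that the session has something to evaluate.
x = tf.Variable(3, name='x')
y = tf.constant(4, name='y')
result = x + y

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
res = sess.run(result)
print('result:', res)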
MXNet: a neural-network-based deep learning computation framework, https://mxnet.incubator.apache.org/