The Road to Machine Learning: Handwritten Digit Recognition with a Python Support Vector Machine (LinearSVC)

Date: 2019-11-16



This post uses Python 3 to learn the support vector machine API in sklearn.

The source code can be downloaded from my GitHub: https://github.com/linyi0604/MachineLearning

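# Import the handwritten digits loader
from sklearn.datasets import load_digits
# train_test_split lives in sklearn.model_selection
# (the old sklearn.cross_validation module was removed in sklearn 0.20)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

'''
Support vector machine
Based on the distribution of the training samples, it searches among all
possible linear classifiers for the best one.
It picks out the small number of most informative training samples from
high-dimensional data.
This saves memory and improves prediction performance,
but costs more CPU and computation time.
'''
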
'''
1 Load the data
'''
# Use the data loader to get the handwritten digit image data and store it in digits
digits = load_digits()
# Check the size and feature dimension of the data
# print(digits.data.shape)  # (1797, 64)

'''
2 Split into a training set and a test set
'''
x_train, x_test, y_train, y_test = train_test_split(digits.data,
                                                    digits.target,
                                                    test_size=0.25,
                                                    random_state=33)

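'''
3 Recognize the digit images with a support vector machine classifier
'''
# Standardize the training data and the test data
ss = StandardScaler()
x_train = ss.fit_transform(x_train)
# Reuse the mean and variance fitted on the training set;
# refitting on the test set would scale the two sets inconsistently
x_test = ss.transform(x_test)

# Initialize a support vector machine classifier with a linear decision function
lsvc = LinearSVC()
# Train the model
lsvc.fit(x_train, y_train)
# Predict on the test set with the trained model; the results are stored in y_predict
y_predict = lsvc.predict(x_test)
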
'''
4 Evaluate the support vector machine classifier
'''
print("Accuracy:", lsvc.score(x_test, y_test))
print("Other metrics:\n", classification_report(y_test, y_predict, target_names=digits.target_names.astype(str)))
'''
Output from the author's original run; exact figures can vary with the sklearn version.

Accuracy: 0.9488888888888889
Other metrics (columns: precision, recall, F1 score, number of samples):
             precision    recall  f1-score   support

          0       0.92      0.97      0.94        35
          1       0.95      0.98      0.96        54
          2       0.98      1.00      0.99        44
          3       0.93      0.93      0.93        46
          4       0.97      1.00      0.99        35
          5       0.94      0.94      0.94        48
          6       0.96      0.98      0.97        51
          7       0.90      1.00      0.95        35
          8       0.98      0.83      0.90        58
          9       0.95      0.91      0.93        44

avg / total       0.95      0.95      0.95       450
'''
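
The notes at the top of the script say that a support vector machine keeps only a small number of the most informative training samples. The sketch below is one way to see this on the same digits data. It is not part of the original script and rests on one assumption: LinearSVC does not expose its support vectors, so SVC with kernel="linear" is swapped in as a stand-in. That is a related but different solver, so its accuracy will not match the figures above exactly.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data and the same split parameters as the script above
digits = load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=33)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# Assumption: SVC(kernel="linear") stands in for LinearSVC here,
# because only SVC records which training samples are support vectors
svc = SVC(kernel="linear")
svc.fit(x_train, y_train)

n_train = x_train.shape[0]
n_support = svc.support_vectors_.shape[0]

print("training samples:", n_train)
print("support vectors:", n_support)
print("support vectors per class:", svc.n_support_)
print("test accuracy:", svc.score(x_test, y_test))
```

Comparing the two counts shows how many training samples actually define the decision boundaries. The rest could be dropped without changing the fitted model.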