[Machine Learning] Model Evaluation Metrics: Confusion Matrix, ROC Curve, and AUC

Date: 2020-05-05


1. Introduction

A trained model has to be evaluated against well-defined metrics. This article summarizes the most common ones.

2. Details

I. Confusion Matrix

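The confusion matrix tabulates actual classes against predicted classes, giving four counts: TP (true positives), FP (false positives), FN (false negatives), and TN (true negatives).

The first word, True or False, indicates whether the prediction was correct.

The second word, Positive or Negative, indicates what the model predicted.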
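To make the naming concrete, here is a minimal sketch that counts the four cells by hand. The labels in `y_true` and `y_pred` are made-up toy data and `confusion_counts` is a hypothetical helper, not part of any library.

```python
# Hypothetical toy labels for illustration: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # prediction correct (True), model said Positive
        elif predicted == 1 and actual == 0:
            fp += 1  # prediction wrong (False), model said Positive
        elif predicted == 0 and actual == 1:
            fn += 1  # prediction wrong (False), model said Negative
        else:
            tn += 1  # prediction correct (True), model said Negative
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn)
```

The four counts always add up to the number of samples, which is a quick sanity check on any confusion matrix.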

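Related formulas:

The formulas below measure how well the model detects positive examples (tp_rate, recall, precision) and how it handles negative examples (fp_rate).

Formula explanations:

fp_rate = FP / (FP + TN): the fraction of actual negatives that the model wrongly predicts as positive.

tp_rate = TP / (TP + FN): the fraction of actual positives that the model correctly predicts as positive.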

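recall = TP / (TP + FN)

The larger the value, the better.

precision = TP / (TP + FP)

TP: examples that are actually positive and that the model predicts as positive.

TP + FP: all examples that the model predicts as positive (including examples that are actually negative but are predicted as positive).

The larger the value, the better.

F1_Score = 2 × precision × recall / (precision + recall), the harmonic mean of precision and recall.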

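Precision and recall trade off against each other: raising the decision threshold typically raises precision and lowers recall, and lowering it does the opposite.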
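The trade-off can be seen on a small example. The sketch below uses made-up scores and labels and a hypothetical helper `precision_recall_at`; on this toy data precision rises and recall falls as the threshold increases, although on real data precision does not have to increase at every step.

```python
# Hypothetical scores and true labels (1 = positive), for illustration only.
scores = [0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y_true = [0, 0, 1, 0, 1, 0, 1, 1]


def precision_recall_at(threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, y_true) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, y_true) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, y_true) if s < threshold and y == 1)
    return tp / (tp + fp), tp / (tp + fn)


# Sweep three thresholds from low to high.
results = {t: precision_recall_at(t) for t in (0.2, 0.55, 0.75)}
for t, (p, r) in results.items():
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```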

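In plain terms:

It is actually quite simple. Precision is defined with respect to our predictions: it is the fraction of samples predicted as positive that really are positive. A sample predicted as positive is either a positive predicted as positive (TP) or a negative predicted as positive (FP), so precision = TP / (TP + FP).

Recall is defined with respect to the original samples: it is the fraction of actual positives that were predicted correctly. An actual positive is either predicted as positive (TP) or predicted as negative (FN), so recall = TP / (TP + FN).

The only difference is the denominator: one is the number of samples predicted as positive, the other is the number of samples that are actually positive.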
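The difference between the two denominators can be checked with a few lines of arithmetic. This is a minimal sketch using the standard definitions; the counts passed in are invented for illustration and `rates` is a hypothetical helper name.

```python
def rates(tp, fp, fn, tn):
    """Compute precision, recall (tp_rate), fp_rate and F1 from the four counts."""
    precision = tp / (tp + fp)  # denominator: everything predicted positive
    recall = tp / (tp + fn)     # denominator: everything actually positive
    fp_rate = fp / (fp + tn)    # denominator: everything actually negative
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, fp_rate, f1


# Hypothetical counts: 60 actual positives, 40 actual negatives.
precision, recall, fp_rate, f1 = rates(tp=40, fp=10, fn=20, tn=30)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"fp_rate={fp_rate:.3f} f1={f1:.3f}")
```

With these counts the model is right about 80% of what it flags as positive, but it only finds about 67% of the actual positives.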

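II. ROC Curve

Procedure: sort the samples by predicted score from high to low and use each score in turn as the threshold. The first sample (threshold 0.9) is actually positive, so it counts as a true positive and the curve moves up one step; every other positive sample is handled the same way.

The third sample (threshold 0.7) is actually negative, so it counts as a false positive and the curve moves right one step; every other negative sample is handled the same way.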
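The stepping procedure can be sketched in a few lines. The scores 0.9 and 0.7 match the samples mentioned above; the remaining scores and all labels are assumptions made up for illustration, `roc_points` is a hypothetical helper, and the sketch assumes there are no tied scores.

```python
def roc_points(scores, labels):
    """Walk the samples from highest to lowest score and trace the ROC curve.

    Returns a list of (fp_rate, tp_rate) points starting at (0, 0).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            tp += 1  # actual positive: the curve moves up
        else:
            fp += 1  # actual negative: the curve moves right
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical data: the third sample (score 0.7) is a negative.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"fp_rate={fpr:.2f}, tp_rate={tpr:.2f}")
```

The curve always ends at (1, 1), because with the lowest threshold every sample is predicted as positive.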

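Several cases are possible: the curve can lie above, on, or below the diagonal.

The diagonal corresponds to random guessing, so a curve above the diagonal means the model does better than random, and the closer the curve gets to the top-left corner, the better the model.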

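III. AUC

AUC = (sum of the ranks of the positive examples − M × (M + 1) / 2) / (M × N)

M is the number of positive examples in the sample.

N is the number of negative examples in the sample.

The sum is computed by sorting all predicted probabilities in ascending order and then adding up the index numbers (ranks, starting from 1) of the positive examples.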

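AUC tells us how good a good model is and how bad a bad one is, rather than only whether it is good or bad. The larger the indices of the positive examples in the sorted list, that is, the higher the model scores them relative to the negatives, the larger the AUC.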
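As a sanity check on the rank-based computation, the sketch below compares it with the pairwise definition of AUC (the fraction of positive/negative pairs in which the positive has the higher score). Both helper functions are hypothetical names, the data is made up, and tied scores are assumed not to occur.

```python
def auc_by_rank(scores, labels):
    """AUC from the sum of the ranks of the positive examples (ascending sort)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    m = sum(labels)        # number of positive examples
    n = len(labels) - m    # number of negative examples
    return (rank_sum - m * (m + 1) / 2) / (m * n)


def auc_by_pairs(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1 for p in pos for q in neg if p > q)
    return wins / (len(pos) * len(neg))


# Hypothetical scores and labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
rank_auc = auc_by_rank(scores, labels)
pair_auc = auc_by_pairs(scores, labels)
print(rank_auc, pair_auc)
```

Both functions agree on this data: the positives hold ranks 2, 4 and 5, so the rank sum is 11 and the AUC is (11 − 6) / 6.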


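IV. Cross-Validation

In practice, a trained model usually fits the training set quite well (the result is sensitive to initial conditions), but it often fits data outside the training set less satisfactorily. For this reason we usually do not train on the whole dataset. Instead we hold out one part, which takes no part in training, and use it to test the parameters learned from the rest, which gives a more objective estimate of how well those parameters fit unseen data. Repeating this so that each part is held out once is called cross-validation.

In general, 3-fold or 5-fold cross-validation is enough.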
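The splitting itself is simple to sketch. The helper `k_fold_indices` below is a hypothetical illustration of plain k-fold splitting on contiguous blocks; it does not shuffle and does not stratify by class, both of which a library implementation may do.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


# 7 samples split into 3 folds: each sample is held out exactly once.
folds = list(k_fold_indices(7, 3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Each round trains on the samples outside the fold and evaluates on the fold itself, so every sample is used for validation exactly once.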

3. Code

#!/usr/bin/python
# -*- coding: UTF-8 -*-
# File name: mnist_k_cross_validate.py

# Note: fetch_mldata has been removed from recent scikit-learn releases, so it is not imported here.
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.base import clone
from sklearn.model_selection import cross_val_score
from sklearn.base import BaseEstimator
# Evaluation metrics
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score
from sklearn.ensemble import RandomForestClassifier

# Alternative method to load MNIST, if mldata.org is down
from scipy.io import loadmat  # load the local MATLAB-format data file
mnist_raw = loadmat("mnist-original.mat")
mnist = {
    "data": mnist_raw["data"].T,
    "target": mnist_raw["label"][0],
    "COL_NAMES": ["label", "data"],
    "DESCR": "mldata.org dataset: mnist_k_cross_validate-original",
}
print("Success!")
# mnist_k_cross_validate = fetch_mldata('MNIST_original', data_home='test_data_home')
print(mnist)
X, y = mnist['data'], mnist['target']  # X has 70000 rows of 784 features; y has 70000 labels
print(X.shape, y.shape)

some_digit = X[36000]
print(some_digit)
some_digit_image = some_digit.reshape(28, 28)  # reshape the 784 pixels into a 28x28 image; each pixel is a grayscale intensity
print(some_digit_image)

plt.imshow(some_digit_image, cmap=matplotlib.cm.binary, interpolation='nearest')
plt.axis('off')
plt.show()

# 6/7 of the data for training, 1/7 for testing (y_test must use y[60000:], not y[:60000])
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
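shuffle_index = np.random.permutation(60000)  # a random permutation of the 60000 training indices
X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]  # training data in shuffled order

y_train_5 = (y_train == 5)  # True if the digit is 5, False otherwise
y_test_5 = (y_test == 5)
print(y_test_5)

# LogisticRegression could be used here directly instead.
# 'log_loss' means logistic regression (the value was named 'log' in older scikit-learn releases).
# A fixed random_state makes the generated random numbers the same on every run.
sgd_clf = SGDClassifier(loss='log_loss', random_state=42)
sgd_clf.fit(X_train, y_train_5)  # train the model
print(sgd_clf.predict([some_digit]))  # test the model; the expected answer is 5

# K-fold cross-validation: runs 3 times in total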
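# 3-fold cross-validation: each third of the training set is used once for validation.
# random_state is omitted because recent scikit-learn releases only accept it together with shuffle=True.
skfolds = StratifiedKFold(n_splits=3)
for train_index, test_index in skfolds.split(X_train, y_train_5):
    # The line sgd_clf = SGDClassifier(...) could be moved in here to try
    # different hyperparameters; cloning would then be unnecessary.
    clone_clf = clone(sgd_clf)  # unfitted copy with the same hyperparameters and random seed, so every fold starts from the same initial weights
    X_train_folds = X_train[train_index]  # training X for this fold (the unshaded part)
    y_train_folds = y_train_5[train_index]  # training y for this fold (the unshaded part)
    X_test_folds = X_train[test_index]  # validation X for this fold (the shaded part)
    y_test_folds = y_train_5[test_index]  # validation y for this fold (the shaded part)
    clone_clf.fit(X_train_folds, y_train_folds)  # train the model
    y_pred = clone_clf.predict(X_test_folds)  # validate
    print(y_pred)
    n_correct = sum(y_pred == y_test_folds)  # count the correct predictions (True = 1, False = 0)
    print(n_correct / len(y_pred))  # accuracy: correct predictions / total predictions

# PS: the model could be constructed inside the cross-validation loop with different
# hyperparameters (for example alpha); the resulting accuracy shows which value works best.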

# scikit-learn's built-in equivalent: cv is the number of folds and scoring is the metric
# to evaluate (accuracy here). The result is similar to the loop above.
print(cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring='accuracy'))


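class Never5Classifier(BaseEstimator):
    # A classifier that never predicts the class 5. Because the positive and negative
    # samples are imbalanced, it still reaches about 90% accuracy, so accuracy alone
    # is not a reliable metric.
    def fit(self, X, y=None):
        return self  # nothing to learn; returning self follows the estimator convention

    def predict(self, X):
        return np.zeros(len(X), dtype=bool)  # one False per sample


never_5_clf = Never5Classifier()
print(cross_val_score(never_5_clf, X_train, y_train_5, cv=3, scoring='accuracy'))  # one accuracy value per fold

# Confusion matrix: shows exactly which class is being misjudged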
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)  # out-of-fold predicted class for every training sample
print(confusion_matrix(y_train_5, y_train_pred))

y_train_perfect_prediction = y_train_5
print(confusion_matrix(y_train_5, y_train_perfect_prediction))

# Precision, recall, F1 score
print(precision_score(y_train_5, y_train_pred))
print(recall_score(y_train_5, y_train_pred))
print(sum(y_train_pred))
print(f1_score(y_train_5, y_train_pred))

sgd_clf.fit(X_train, y_train_5)
y_scores = sgd_clf.decision_function([some_digit])
print(y_scores)

threshold = 0  # threshold on z, the value of w^T x
y_some_digit_pred = (y_scores > threshold)
print(y_some_digit_pred)

threshold = 200000
y_some_digit_pred = (y_scores > threshold)
print(y_some_digit_pred)

y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3, method='decision_function')
print(y_scores)  # the decision scores themselves
precisions, recalls, thresholds = precision_recall_curve(y_train_5, y_scores)
print(precisions, recalls, thresholds)


def plot_precision_recall_vs_threshold(precisions, recalls, thresholds):
    plt.plot(thresholds, precisions[:-1], 'b--', label='Precision')
    plt.plot(thresholds, recalls[:-1], 'r--', label='Recall')
    plt.xlabel("Threshold")
    plt.legend(loc='upper left')
    plt.ylim([0, 1])


# plot_precision_recall_vs_threshold(precisions, recalls, thresholds)
# plt.savefig('./temp_precision_recall')

y_train_pred_90 = (y_scores > 70000)
print(precision_score(y_train_5, y_train_pred_90))
print(recall_score(y_train_5, y_train_pred_90))

fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)


def plot_roc_curve(fpr, tpr, label=None):
    plt.plot(fpr, tpr, linewidth=2, label=label)
    plt.plot([0, 1], [0, 1], 'k--')
    plt.axis([0, 1, 0, 1])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True positive Rate')


plot_roc_curve(fpr, tpr)
plt.show()
# plt.savefig('img_roc_sgd')

print(roc_auc_score(y_train_5, y_scores))

forest_clf = RandomForestClassifier(random_state=42)
y_probas_forest = cross_val_predict(forest_clf, X_train, y_train_5, cv=3, method='predict_proba')
y_scores_forest = y_probas_forest[:, 1]
fpr_forest, tpr_forest, thresholds_forest = roc_curve(y_train_5, y_scores_forest)

plt.plot(fpr, tpr, 'b:', label='SGD')
plt.plot(fpr_forest, tpr_forest, label='Random Forest')
plt.legend(loc='lower right')
plt.show()
# plt.savefig('./img_roc_forest')

print(roc_auc_score(y_train_5, y_scores_forest))

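Accuracy (acc) looks at overall correctness across all samples.

AUC looks at the positive examples: how well they are ranked above the negative ones.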
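The difference shows up clearly on imbalanced data. In the sketch below, which uses made-up labels and scores and two hypothetical helpers, two scorers reach the same accuracy at a 0.5 threshold, yet only one of them ranks the single positive above the negatives. Tied scores are counted as half a win, which is a common convention and an assumption here.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Pairwise AUC; a tie between a positive and a negative counts as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                total += 1.0
            elif p == q:
                total += 0.5
    return total / (len(pos) * len(neg))


# Hypothetical imbalanced data: 1 positive, 9 negatives.
labels = [1] + [0] * 9
constant_scores = [0.0] * 10        # never predicts positive, no ranking at all
ranked_scores = [0.4] + [0.1] * 9   # positive ranked first, but below the threshold

print(accuracy(labels, constant_scores), auc(labels, constant_scores))
print(accuracy(labels, ranked_scores), auc(labels, ranked_scores))
```

Both scorers are 90% accurate because they label everything as negative, but the AUC separates the one that carries no ranking information from the one that ranks the positive first.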