AUC and Computing AUC in TensorFlow

Date: 2019-11-11

Tags: auc, tensorflow, computation

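I have recently been taking part in a Tianchi competition that uses AUC to evaluate model performance, so this post introduces the concepts behind AUC and the function TensorFlow provides for computing it.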

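First, some prerequisite concepts. In a binary classification problem, if a sample is actually a positive example and is correctly predicted as positive, it is called a true positive (TP); if it is wrongly predicted as negative, it is a false negative (FN). If a sample is actually a negative example and is correctly predicted as negative, it is a true negative (TN); if it is wrongly predicted as positive, it is a false positive (FP). Precision, recall, and the F1 score are all computed from these four values and are not covered further here.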

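The true positive rate (TPR) is computed as TPR = TP / (TP + FN), the same formula as recall. It is the share of all actually positive samples that are predicted as positive. The false positive rate (FPR) is computed as FPR = FP / (TN + FP). It is the share of all actually negative samples that are predicted as positive.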
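The two formulas can be sketched in a few lines of plain Python. This is only an illustration: the function names and the toy labels and predictions are made up for this example and do not come from the competition data.

```python
def confusion_counts(labels, predicted):
    """Count TP, FN, TN, FP for boolean labels and boolean predictions."""
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, p in pairs if y and p)          # positive, predicted positive
    fn = sum(1 for y, p in pairs if y and not p)      # positive, predicted negative
    tn = sum(1 for y, p in pairs if not y and not p)  # negative, predicted negative
    fp = sum(1 for y, p in pairs if not y and p)      # negative, predicted positive
    return tp, fn, tn, fp


def rates(labels, predicted):
    """Return (TPR, FPR) using TPR = TP / (TP + FN) and FPR = FP / (TN + FP)."""
    tp, fn, tn, fp = confusion_counts(labels, predicted)
    return tp / (tp + fn), fp / (tn + fp)


# Toy data (hypothetical): 3 actual positives, 4 actual negatives.
labels = [True, True, True, False, False, False, False]
predicted = [True, True, False, True, False, False, False]

print(confusion_counts(labels, predicted))  # (2, 1, 3, 1)
print(rates(labels, predicted))
```

Here two of the three positives are caught, so TPR is 2/3, and one of the four negatives is wrongly flagged, so FPR is 1/4.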

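ROC stands for the Receiver Operating Characteristic curve, which is used to study a model's generalization performance in the general case. First, sort the samples by the model's predictions: the sample most likely to be positive, that is, the one with the highest predicted probability of being positive, comes first, the probabilities decrease from there, and the sample least likely to be positive comes last. The ROC curve uses the true positive rate as the vertical axis and the false positive rate as the horizontal axis. Going through the samples in order, predict them as positive one at a time. After each sample the values of TPR and FPR change, which adds a new point to the plot, until every sample has been predicted as positive.

Consider an extreme case as an example. If the model is perfect, then after sorting all the actual positives come first and all the actual negatives come last. At the start every sample is predicted as negative, so TP and FP are both 0 and the curve starts at the origin (0,0). Now predict the first sample as positive. It is actually positive, so TP is 1 and TPR is 1 / (number of positive samples), while FP is still 0, so the next point moves up along the y-axis. Continuing in this way up to the last positive, TP equals the number of positive samples and TPR is 1, so the curve reaches (0,1). Then the first negative is predicted as positive. TP is unchanged, FP becomes 1, and FPR is 1 / (number of negative samples), so the curve extends one point in the positive x direction with y still at 1. This continues until all negatives have been predicted as positive, at which point FP equals the number of negative samples, FPR is 1, and the curve ends at (1,1).

That was a perfect model. If the ranking contains mistakes, that is, a negative is reached before all the positives have been predicted as positive, then FP increases while some positives are still not predicted as positive. TPR is therefore still below 1 while FPR is already above 0, so the curve turns to the right before it reaches (0,1).

[Figure: an example ROC curve]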
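The walk described above can be sketched directly in Python. This is a minimal illustration under one stated assumption: no two samples share the same score (tied scores would have to be grouped into a single step). The function name `roc_points` and the toy data are hypothetical.

```python
def roc_points(labels, scores):
    """Return the ROC curve as a list of (FPR, TPR) points.

    Assumes all scores are distinct; ties are not grouped in this sketch.
    """
    num_pos = sum(1 for y in labels if y)
    num_neg = len(labels) - num_pos
    # Sort samples from most likely positive to least likely positive.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    tp = fp = 0
    points = [(0.0, 0.0)]  # everything predicted negative
    for _, is_positive in ranked:
        # Predict one more sample as positive and record the new point.
        if is_positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / num_neg, tp / num_pos))
    return points


labels = [True, True, False, False]

# Perfect ranking: the curve climbs to (0, 1) before moving right.
perfect = roc_points(labels, [0.9, 0.8, 0.3, 0.1])
print(perfect)  # [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]

# One negative ranked above a positive: the curve turns right early.
imperfect = roc_points(labels, [0.9, 0.3, 0.8, 0.1])
print(imperfect)  # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```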

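Directly comparing two ROC curves that cross each other still does not give a good assessment of model performance, so the area under the curve is used to represent performance instead. This is the AUC (Area Under ROC Curve) that this post is about. From the perfect-model example above, the upper bound of AUC is 1. Random guessing gives an AUC of 0.5, so in deep learning a model's AUC is normally above 0.5. If the value is far below 0.5, your labels may be reversed. In one Tianchi competition my AUC was initially only 0.24, lower than the 0.5 of random guessing. I was baffled at first, then discovered that the task asked for the probability of the negative class while I had uploaded the probability of the positive class, so the actual AUC of my model was 0.76. Watch out for this.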
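The reversed-label effect can be checked with a small sketch. Instead of integrating the ROC curve, it uses the equivalent pairwise definition of AUC: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as half. The function name `pairwise_auc` and the toy data are assumptions made for this example.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): scores are the predicted probability of the positive class.
labels = [True, True, True, False, False, False]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2]

auc = pairwise_auc(labels, scores)
# Submitting the probability of the wrong class is the same as scoring with 1 - p.
flipped_auc = pairwise_auc(labels, [1.0 - s for s in scores])

print(auc, flipped_auc, auc + flipped_auc)
```

When there are no tied scores, the two values sum to 1, which is why an AUC of 0.24 corresponds to 0.76 once the probabilities are flipped. The quadratic loop is fine for a sanity check but would be slow on large datasets.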

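Because my model was built with TensorFlow code, I naturally use the function TensorFlow provides to compute AUC. Many online resources use tf.contrib.metrics.streaming_auc, but the official documentation notes that this function is deprecated and will be removed in a future version, and that tf.metrics.auc should be used instead. Its signature is as follows: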

tf.metrics.auc(
    labels,
    predictions,
    weights=None,
    num_thresholds=200,
    metrics_collections=None,
    updates_collections=None,
    curve='ROC',
    name=None,
    summation_method='trapezoidal'
)

Args:

labels: A Tensor whose shape matches predictions. Will be cast to bool.
predictions: A floating point Tensor of arbitrary shape and whose values are in the range [0, 1].
weights: Optional Tensor whose rank is either 0, or the same rank as labels, and must be broadcastable to labels (i.e., all dimensions must be either 1, or the same as the corresponding labels dimension).
num_thresholds: The number of thresholds to use when discretizing the roc curve.
metrics_collections: An optional list of collections that auc should be added to.
updates_collections: An optional list of collections that update_op should be added to.
curve: Specifies the name of the curve to be computed, 'ROC' [default] or 'PR' for the Precision-Recall-curve.
name: An optional variable_scope name.
summation_method: Specifies the Riemann summation method used (https://en.wikipedia.org/wiki/Riemann_sum): 'trapezoidal' [default] that applies the trapezoidal rule; 'careful_interpolation', a variant of it differing only by a more correct interpolation scheme for PR-AUC - interpolating (true/false) positives but not the ratio that is precision; 'minoring' that applies left summation for increasing intervals and right summation for decreasing intervals; 'majoring' that does the opposite. Note that 'careful_interpolation' is strictly preferred to 'trapezoidal' (to be deprecated soon) as it applies the same method for ROC, and a better one (see Davis & Goadrich 2006 for details) for the PR curve.
Returns:

auc: A scalar Tensor representing the current area-under-curve.
update_op: An operation that increments the true_positives, true_negatives, false_positives and false_negatives variables appropriately and whose value matches auc.
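To get a feel for what num_thresholds does, here is a simplified sketch of a threshold-based AUC estimate in plain Python. It is not TensorFlow's actual implementation. It only assumes the general idea stated in the documentation above: the ROC curve is discretized at a fixed number of thresholds and the area is summed with the trapezoidal rule. The function name `thresholded_auc`, the evenly spaced threshold grid, and the toy data are assumptions made for this example.

```python
def thresholded_auc(labels, scores, num_thresholds):
    """Approximate ROC AUC from evenly spaced thresholds in (0, 1).

    Simplified sketch: needs num_thresholds >= 2 and scores in [0, 1].
    """
    num_pos = sum(1 for y in labels if y)
    num_neg = len(labels) - num_pos
    # Interior thresholds; the two end points of the curve are added explicitly.
    thresholds = [(i + 1) / (num_thresholds - 1) for i in range(num_thresholds - 2)]

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(thresholds, reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y and s > t)
        fp = sum(1 for y, s in zip(labels, scores) if not y and s > t)
        points.append((fp / num_neg, tp / num_pos))
    points.append((1.0, 1.0))  # threshold below every score: everything positive

    # Trapezoidal rule over consecutive (FPR, TPR) points.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data (hypothetical): four scores packed closely around 0.5.
labels = [True, False, True, False]
scores = [0.523, 0.512, 0.491, 0.483]

coarse = thresholded_auc(labels, scores, num_thresholds=3)    # single cut at 0.5
fine = thresholded_auc(labels, scores, num_thresholds=201)    # cuts every 0.005
print(coarse, fine)
```

With only one interior threshold the four scores are split into two groups and the estimate collapses to 0.5. With a grid fine enough to separate every pair of distinct scores, the estimate reaches the exact value of 0.75, and adding more thresholds no longer changes it.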

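So the simplest usage is to pass just two arguments, labels and predictions, that is, the sample labels and the predicted probabilities, and read the returned auc value. num_thresholds defaults to 200. A larger value discretizes the ROC curve more finely and generally gives a more accurate AUC, until the thresholds already separate all distinct prediction values, after which increasing it further changes nothing. With more than 200 samples it is therefore worth passing a larger num_thresholds. In practice, however, you run into a few problems. First, after writing this call, running the auc tensor with sess.run raises the following error even though sess.run(tf.global_variables_initializer()) has already been run: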

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value auc/true_negatives

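After looking up related material online, I found that the following line has to be added before running:

sess.run(tf.local_variables_initializer()) or sess.run(tf.initialize_local_variables())

When run, the second form prints a message recommending that you use the first.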

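With this change the program runs without errors, but the AUC value is always 0.0 no matter how the parameters are adjusted. For a while I suspected a problem in my own code. Later I found the answer on Stack Overflow: tf.metrics.auc returns two values. The first, auc_value, is the AUC value, and the second, auc_op, is the update operation for the AUC. You have to run sess.run(auc_op) first and only then run sess.run(auc_value) for the AUC to be reported correctly. My code example is as follows: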

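import tensorflow as tf  # TensorFlow 1.x API

# prediction_list and label_list are Python lists prepared beforehand.
prediction_tensor = tf.convert_to_tensor(prediction_list)
label_tensor = tf.convert_to_tensor(label_list)
auc_value, auc_op = tf.metrics.auc(label_tensor, prediction_tensor, num_thresholds=2000)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # tf.metrics.auc keeps its counters in local variables.
    sess.run(tf.local_variables_initializer())
    sess.run(auc_op)             # update the counters first
    value = sess.run(auc_value)  # then read the AUC

print(prediction_tensor)
print(label_tensor)
print("AUC:" + str(value))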

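Here prediction_list and label_list are both Python lists. Every element of prediction_list is a probability between 0 and 1, and every element of label_list is True or False. After converting them to tensors, the corresponding AUC can be computed. The output is as follows.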

Tensor("Const:0", shape=(1544,), dtype=float32)
Tensor("Const_1:0", shape=(1544,), dtype=bool)
AUC:0.97267157

With that, the AUC is computed and displayed successfully.
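Why the value stays at 0.0 until the update operation has run is easier to see with a toy model of a streaming metric. The class below is a hypothetical stand-in written in plain Python, not TensorFlow code. It only mirrors the pattern described above: counters that start at zero, an update step that accumulates into them, and a value that is computed from whatever the counters currently hold. For brevity it tracks a true positive rate rather than a full AUC.

```python
class ToyStreamingMetric:
    """Hypothetical stand-in for a streaming metric (here: true positive rate)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Plays the role of initializing the metric's local variables.
        self.true_positives = 0
        self.false_negatives = 0

    def update(self, labels, predicted):
        # Plays the role of the update op: accumulate counts, then report.
        for y, p in zip(labels, predicted):
            if y and p:
                self.true_positives += 1
            elif y and not p:
                self.false_negatives += 1
        return self.value()

    def value(self):
        # Plays the role of the value tensor: reads the counters only.
        total = self.true_positives + self.false_negatives
        return self.true_positives / total if total else 0.0


metric = ToyStreamingMetric()
before = metric.value()  # counters are still zero
after = metric.update([True, True, False, True], [True, False, False, True])
print(before, after, metric.value())
```

Reading the value before any update returns 0.0 because nothing has been counted yet, which matches the symptom above. After one update, the value and the update result agree.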