Logistic Regression: Yes or No

Date: 2019-12-08

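Logistic regression solves classification problems: we need a piece of code that answers YES or NO. Spam filtering is a typical example: when an email arrives, the model has to decide whether or not it is spam.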

A simple example

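I borrow Andrew Ng's example of student exam scores and university admission to walk through an implementation of the logistic regression algorithm. Suppose you have historical records of students' scores in two exams, together with whether each student was admitted, and you need to predict whether a new batch of students will be admitted. Part of the data looks like this: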

exam1	exam2	admitted (0: not admitted; 1: admitted)
34.62365962451697	78.0246928153624	0
30.28671076822607	43.89499752400101	0
35.84740876993872	72.90219802708364	0
60.18259938620976	86.30855209546826	1
79.0327360507101	75.3443764369103	1
45.08327747668339	56.3163717815305	0
61.10666453684766	96.51142588489624	1

Here the exam1 and exam2 scores are the model's input, X. Whether the student was admitted is the model's output, Y, where $Y \in \{0, 1\}$.
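As a minimal sketch of how these rows might be loaded, the snippet below builds X and y from the seven sample rows shown above. The leading column of ones (the intercept term) is an assumption here, chosen because the plotting code later in the article indexes the scores as columns 1 and 2; the names `rows` and `scores` are purely illustrative.

```python
import numpy as np

# The seven sample rows from the table: exam1, exam2, admitted
rows = np.array([
    [34.62365962451697, 78.0246928153624, 0],
    [30.28671076822607, 43.89499752400101, 0],
    [35.84740876993872, 72.90219802708364, 0],
    [60.18259938620976, 86.30855209546826, 1],
    [79.0327360507101, 75.3443764369103, 1],
    [45.08327747668339, 56.3163717815305, 0],
    [61.10666453684766, 96.51142588489624, 1],
])

scores = rows[:, :2]   # exam1 and exam2
y = rows[:, 2]         # labels: 0 or 1

# Assumption: prepend a column of ones so that theta[0] acts as the intercept
X = np.column_stack([np.ones(len(scores)), scores])

print(X.shape)  # (7, 3)
```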

Hypothesis function

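We define a hypothesis function $h_{\theta}(x)$ that predicts the probability of being admitted, so we want its values to lie in [0, 1]. The sigmoid function is a very good match for this requirement.

[Figure: graph of the sigmoid function]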

We can implement this function in Python:

import numpy as np


def sigmoid(z):
    # Element-wise logistic function; works for scalars and NumPy arrays
    return 1 / (1 + np.exp(-z))

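We define $g(\theta)$ as the linear combination of the inputs (taking $X_0 = 1$, so that $\theta_0$ acts as the intercept):

$$g(\theta) = \theta_0 + \theta_1 X_1 + \theta_2 X_2 = \theta^T X$$

We can then define $h_{\theta}(x)$ by applying the sigmoid to $g(\theta)$:

$$h_{\theta}(x) = \mathrm{sigmoid}(g(\theta)) = \frac{1}{1 + e^{-\theta^T X}}$$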
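To make the two formulas concrete, here is a small sketch that evaluates the hypothesis for one student. The `hypothesis` helper and the example scores are illustrative, not part of the original code. With every parameter at zero the linear part is 0, so the predicted probability is exactly 0.5.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def hypothesis(theta, X):
    # Linear part theta^T x for every row, then squashed into (0, 1)
    return sigmoid(X.dot(theta))


# One student: intercept term 1, then the two exam scores (made-up values)
X = np.array([[1.0, 45.0, 85.0]])

print(hypothesis(np.zeros(3), X))             # [0.5]
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # near 0, exactly 0.5, near 1
```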

Cost function $J(\theta)$

With $h_{\theta}(x)$ defined, we can now define $J(\theta)$ as the average cost over the $m$ training examples:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_{\theta}(x^{(i)}), y^{(i)})$$

For gradient descent to reach the global optimum, $J(\theta)$ must be a convex function. We can therefore define Cost as follows:

$$Cost(h_{\theta}(x), y) = \begin{cases} -\log(h_{\theta}(x)) & \text{if } y = 1 \\ -\log(1 - h_{\theta}(x)) & \text{if } y = 0 \end{cases}$$
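The two cases can also be written as a single expression, $-y\log(h_{\theta}(x)) - (1 - y)\log(1 - h_{\theta}(x))$, because exactly one of the two terms survives for each label. The sketch below shows how the penalty grows as a prediction moves away from the true label; the helper name `example_cost` is made up for illustration.

```python
import numpy as np


def example_cost(h, y):
    # Combined form of the piecewise definition; y is 0 or 1
    return -y * np.log(h) - (1 - y) * np.log(1 - h)


# Confident-and-right is cheap, confident-and-wrong is expensive
for h in (0.99, 0.5, 0.01):
    print(h, example_cost(h, 1), example_cost(h, 0))
```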

Finally, we work out the partial derivatives that the gradient descent algorithm needs:

$$\frac{\partial}{\partial \theta_j}J(\theta) = \frac{1}{m} \sum_{i=1}^{m} (h_{\theta}(x^{(i)}) - y^{(i)}) x_{j}^{(i)}$$
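To show where this derivative is used, here is a minimal sketch of plain batch gradient descent on a tiny made-up dataset. The data, the learning rate `alpha`, the iteration count, and the helper name `cost_and_grad` are all assumptions chosen for illustration; the article itself goes on to use a scipy optimizer instead of a hand-written loop.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_and_grad(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))
    cost = np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / m
    grad = (h - y).dot(X) / m   # one partial derivative per theta_j
    return cost, grad


# Made-up 1-D data with overlapping classes (not the admissions data)
x = np.arange(1.0, 9.0)
y = np.array([0, 0, 1, 0, 1, 0, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])

theta = np.zeros(2)
alpha = 0.1   # assumed learning rate
costs = []
for _ in range(2000):
    cost, grad = cost_and_grad(theta, X, y)
    costs.append(cost)
    theta = theta - alpha * grad   # update every theta_j simultaneously

print(costs[0], costs[-1])
print(theta)
```

The cost starts at log(2), the value for an all-zero `theta`, and decreases from there.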

The Python implementation is as follows:

import numpy as np
from sigmoid import sigmoid


def cost_function(theta, X, y):
    m = y.size

    # Predicted probabilities h_theta(x) for every training example
    h = sigmoid(X.dot(theta))

    # Average log-loss over the m examples
    cost = (1 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))

    # Partial derivative of the cost with respect to each theta_j
    grad = (1 / m) * (h - y).dot(X)

    return cost, grad

Rather than hand-coding the descent loop, we use scipy's BFGS optimizer to solve for the parameters:

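import numpy as np
import scipy.optimize as opt

# Start from all zeros; assumes X already has a leading column of ones
initial_theta = np.zeros(X.shape[1])


def cost_func(t):
    return cost_function(t, X, y)[0]


def grad_func(t):
    return cost_function(t, X, y)[1]


# With full_output=True the first two return values are the optimal theta and its cost
theta, cost, *unused = opt.fmin_bfgs(f=cost_func, fprime=grad_func,
                                     x0=initial_theta, maxiter=400, full_output=True, disp=False)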

This gives us the $\theta$ that the model needs, and with that the model is trained.
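As an alternative sketch, `scipy.optimize.minimize` accepts `jac=True`, which tells it that the objective returns both the cost and the gradient, so `cost_function` is evaluated once per step instead of once for each wrapper. The scores below are invented for illustration, and standardizing them before fitting is an assumption added here to keep `np.exp` and `np.log` well-behaved; neither is part of the original code.

```python
import numpy as np
import scipy.optimize as opt


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_function(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))
    cost = (1 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))
    grad = (1 / m) * (h - y).dot(X)
    return cost, grad


# Hypothetical exam scores with overlapping classes (not the article's data)
scores = np.array([[50, 70], [70, 50], [80, 80], [65, 85], [85, 60],
                   [70, 70], [35, 45], [45, 40], [40, 65], [55, 35]], dtype=float)
y = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)

# Assumption: standardize each score column, then prepend the intercept column
scaled = (scores - scores.mean(axis=0)) / scores.std(axis=0)
X = np.column_stack([np.ones(len(scaled)), scaled])

# jac=True: the objective returns the tuple (cost, gradient)
result = opt.minimize(cost_function, x0=np.zeros(X.shape[1]), args=(X, y),
                      jac=True, method='BFGS', options={'maxiter': 400})
theta = result.x
print(theta, result.fun)
```

Because the features were standardized, the fitted `theta` applies to scaled scores; new inputs would need the same scaling before prediction.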

Visualization

To make the patterns in the data easier to see, we can plot it:

import matplotlib.pyplot as plt
import numpy as np


def plot_data(X, y):
    x1 = X[y == 1]
    x2 = X[y == 0]

    plt.scatter(x1[:, 0], x1[:, 1], marker='+', label='Admitted')
    plt.scatter(x2[:, 0], x2[:, 1], marker='.', label='Not admitted')
    plt.legend()


def plot_decision_boundary(theta, X, y):
    # X has a leading intercept column, so columns 1 and 2 hold the scores
    plot_data(X[:, 1:3], y)

    # Only need two points to define a line, so choose two endpoints
    plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2])

    # Calculate the decision boundary line
    plot_y = (-1/theta[2]) * (theta[1]*plot_x + theta[0])

    # Label the line itself so legend entries cannot be matched to the wrong artist
    plt.plot(plot_x, plot_y, label='Decision Boundary')

    plt.legend(loc=1)
    plt.axis([30, 100, 30, 100])
    plt.show()

The result looks like this:

[Figure: the training data plotted with the decision boundary]
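The expression for `plot_y` in `plot_decision_boundary` follows from the 0.5 threshold used for prediction: the sigmoid equals 0.5 exactly when its argument is 0, so the boundary is the set of points where $\theta^T X = 0$. Writing $x_1$ and $x_2$ for the two exam scores and assuming $\theta_2 \neq 0$, solving for $x_2$ gives the plotted line:

```latex
\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0
\quad\Longrightarrow\quad
x_2 = -\frac{1}{\theta_2}\left(\theta_1 x_1 + \theta_0\right)
```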

Prediction

Once $\theta$ has been found, we can of course use it to make predictions, so we write a predict function:

import numpy as np
from sigmoid import sigmoid


def predict(theta, X):
    # Probability of admission for each row of X
    prob = sigmoid(X.dot(theta))

    # Boolean array: True where the model predicts admission
    return prob > 0.5

For each row of X, we predict that the student will be admitted when the probability is greater than 0.5.
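As a usage sketch, the snippet below scores one new student and measures accuracy on the seven sample rows from the table. The parameter vector `theta_example` is a hypothetical stand-in for whatever the optimizer actually returns, so the numbers are illustrative only.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def predict(theta, X):
    return sigmoid(X.dot(theta)) > 0.5


# Hypothetical parameters: intercept, exam1 weight, exam2 weight
theta_example = np.array([-25.0, 0.2, 0.2])

# A new student scoring 45 and 85; the leading 1 is the intercept term
new_student = np.array([1.0, 45.0, 85.0])
prob = sigmoid(new_student.dot(theta_example))
print(prob)   # about 0.73, so this student would be predicted as admitted

# The seven sample rows from the table, with an intercept column prepended
X = np.array([
    [1, 34.62365962451697, 78.0246928153624],
    [1, 30.28671076822607, 43.89499752400101],
    [1, 35.84740876993872, 72.90219802708364],
    [1, 60.18259938620976, 86.30855209546826],
    [1, 79.0327360507101, 75.3443764369103],
    [1, 45.08327747668339, 56.3163717815305],
    [1, 61.10666453684766, 96.51142588489624],
])
y = np.array([0, 0, 0, 1, 1, 0, 1])

# Fraction of rows where the prediction matches the recorded outcome
accuracy = np.mean(predict(theta_example, X) == y)
print(accuracy)
```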

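That covers a basic implementation of the logistic regression algorithm. Logistic regression is a fundamental yet powerful algorithm in ML, and I hope this article has been helpful.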