[Study Notes] Neural Networks and Deep Learning - Andrew Ng - Week 1

1. Supervised Learning with Neural Networks

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

There are different types of neural networks. For example, Convolutional Neural Networks (CNNs) are often used for image applications, and Recurrent Neural Networks (RNNs) are used for one-dimensional sequence data, such as translating English to Chinese, or data with a temporal component, such as a text transcript. Autonomous driving typically uses a hybrid neural network architecture.

2. Why is deep learning taking off?

Deep learning is taking off due to the large amount of data made available by the digitization of society, faster computation, and innovation in the development of neural network algorithms.

Two things have to be considered to reach a high level of performance:
1. Being able to train a big enough neural network
2. Having a huge amount of labeled data

The process of training a neural network is iterative.

It can take a good amount of time to train a neural network, which affects your productivity. Faster computation helps you iterate on and improve new algorithms.

3. Binary Classification

In a binary classification problem, the result is a discrete value output. For example:
- an account hacked (1) or not hacked (0)
- a tumor malignant (1) or benign (0)

Example: Cat vs Non-Cat
The goal is to train a classifier whose input is an image represented by a feature vector x, and which predicts whether the corresponding label y is 1 (cat) or 0 (non-cat).

An image is stored in the computer as three separate matrices corresponding to the red, green, and blue color channels of the image. The three matrices have the same size as the image; for example, if the resolution of the cat image is 64 × 64 pixels, the three matrices (RGB) are each 64 × 64.

The value in each cell represents a pixel intensity, and these values are used to create a feature vector of dimension n_x. In pattern recognition and machine learning, a feature vector represents an object; in this case, a cat or not a cat.

To create the feature vector, the pixel intensity values are "unrolled" or "reshaped" for each color channel. The dimension of the input feature vector x is n_x = 64 × 64 × 3 = 12288.
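
A minimal sketch of this unrolling with NumPy; the randomly generated 64 × 64 RGB image below is just a hypothetical stand-in for a real cat picture:

```python
import numpy as np

# Hypothetical 64 x 64 RGB image with pixel intensities in [0, 255]
image = np.random.randint(0, 256, size=(64, 64, 3))

# Unroll ("reshape") the pixel intensities of the three color channels
# into a single column feature vector of dimension n_x = 64 * 64 * 3 = 12288
x = image.reshape(64 * 64 * 3, 1)

print(x.shape)  # (12288, 1)
```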

4. Logistic Regression

Logistic regression is a learning algorithm used in supervised learning problems where the output labels y are all either zero or one. The goal of logistic regression is to minimize the error between its predictions and the training data.

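Logistic regression computes the prediction ŷ = σ(wᵀx + b), where σ(z) = 1 / (1 + e^(−z)) is the sigmoid function. A minimal sketch with NumPy, using zero-initialized parameters and a hypothetical input vector:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into the interval (0, 1), interpreted as a probability
    return 1 / (1 + np.exp(-z))

n_x = 12288                 # dimension of the input feature vector (64 * 64 * 3)
w = np.zeros((n_x, 1))      # weight vector, one weight per input feature
b = 0.0                     # bias term
x = np.random.rand(n_x, 1)  # hypothetical input feature vector (an unrolled image)

# Prediction: the estimated probability that the label y is 1 (e.g. "cat")
y_hat = sigmoid(np.dot(w.T, x) + b)
print(y_hat.item())         # 0.5 with all-zero parameters
```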

5. Logistic Regression: Cost Function

To train the parameters w and b, we need to define a cost function.

Loss (error) function:

The loss function measures the discrepancy between the prediction ŷ and the desired output y. In other words, it measures how well the algorithm does on a single training example.
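
For logistic regression, the course defines the loss on a single training example as the cross-entropy loss:

L(\hat{y}, y) = -\big( y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \big)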

Cost function

The cost function is the average of the loss function of the entire training set. We are going to find the parameters w and b that minimize the overall cost function.

The cost function measures how well the parameters w and b are doing on the entire training set.
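
Written out, the cost function is the average of the loss over the m training examples:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L\big( \hat{y}^{(i)}, y^{(i)} \big)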

We use the gradient descent algorithm to train, or learn, the parameters w and b on the training set.
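
A minimal vectorized sketch of this training loop with NumPy; the vectorized gradient computations here anticipate material from the following weeks, and the function and parameter names are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, Y, learning_rate=0.01, num_iterations=1000):
    """Gradient descent for logistic regression.
    X has shape (n_x, m): m training examples stored as columns.
    Y has shape (1, m): labels in {0, 1}."""
    n_x, m = X.shape
    w = np.zeros((n_x, 1))
    b = 0.0
    for _ in range(num_iterations):
        A = sigmoid(np.dot(w.T, X) + b)   # predictions for all m examples, shape (1, m)
        dZ = A - Y                        # derivative of the cost w.r.t. z = w.T x + b
        dw = np.dot(X, dZ.T) / m          # gradient of the cost w.r.t. w, shape (n_x, 1)
        db = np.sum(dZ) / m               # gradient of the cost w.r.t. b
        w -= learning_rate * dw           # gradient descent update step
        b -= learning_rate * db
    return w, b
```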