Machine Learning Notes - Introduction

Introduction


I am taking my machine learning notes in English, which also lets me practice my English.


What is Machine Learning?

There isn’t a well-accepted definition of what is and what isn’t machine learning. One commonly quoted definition is:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

— Tom


Example

Suppose your email program watches which emails you do or do not mark as spam. In such an email client, you might click the Spam button to report some emails as spam but not others. Based on which emails you mark as spam, the program learns to filter spam better.

  • Classifying emails is the task T.
  • Watching you label emails as spam or not spam is the experience E.
  • The fraction of emails correctly classified might be the performance measure P.
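
To make T, E and P concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the word-counting rule, the function names, and the tiny hand-made emails are not from the course, and real spam filters work very differently. It only shows P (accuracy) on T (classifying emails) going up as E (marked emails) grows.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam vs. non-spam."""
    weights = Counter()
    for text, is_spam in labeled_emails:
        for word in text.split():
            weights[word] += 1 if is_spam else -1
    return weights

def classify(weights, text):
    """Task T: flag an email as spam if its words lean toward spam."""
    return sum(weights[word] for word in text.split()) > 0

def accuracy(weights, labeled_emails):
    """Performance P: fraction of emails classified correctly."""
    correct = sum(classify(weights, text) == is_spam
                  for text, is_spam in labeled_emails)
    return correct / len(labeled_emails)

# Hypothetical emails the user has marked (True = spam).
marked = [
    ("win cash now", True),
    ("meeting at noon", False),
    ("cheap pills online", True),
    ("lunch tomorrow", False),
]

# Hypothetical emails used only to measure performance.
held_out = [
    ("win cash prize", True),
    ("cheap pills today", True),
    ("meeting tomorrow", False),
    ("lunch at noon", False),
]

# Measure P after the program has seen 0, 1, 2, ... marked emails.
for n in range(len(marked) + 1):
    print(n, accuracy(train(marked[:n]), held_out))
```

With no experience the sketch calls nothing spam, so it gets only the two non-spam emails right. On this toy data the accuracy never drops as more marked emails are seen.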

There are several different types of learning algorithms.

The two main types are supervised learning and unsupervised learning.

I hope to make you one of the best people in knowing how to design and build serious machine learning and AI systems.


Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, on the assumption that there is a relationship between the input and the output.

Supervised learning problems are categorized into “regression” (回归) and “classification” (分类) problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

An example of a regression problem.

One option is to put a straight line through the data, that is, to fit a straight line to the data (purple).

There might be a better fit, though. Instead of fitting a straight line, we might decide that it’s better to fit a quadratic function, that is, a second-order polynomial, to the data (blue).

[Figure: the data with a straight-line fit (purple) and a quadratic fit (blue)]
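
The two fits described above (a straight line and a quadratic) can be sketched as a least-squares polynomial fit. This is only an illustration under stated assumptions: the data points are made up, the helper names are hypothetical, and the normal-equations approach is one simple way to do the fit, not necessarily what the course uses.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x (a continuous output)."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def mean_squared_error(coeffs, xs, ys):
    """Average squared gap between predictions and the known outputs."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up data that bends slightly, so a curve can follow it more closely.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.7, 3.9, 4.6, 4.9]

line = fit_polynomial(xs, ys, degree=1)       # straight line
quadratic = fit_polynomial(xs, ys, degree=2)  # second-order polynomial

print("line error:     ", mean_squared_error(line, xs, ys))
print("quadratic error:", mean_squared_error(quadratic, xs, ys))
```

On the data it was fitted to, the quadratic can never have a larger error than the line, because a line is just a quadratic whose x^2 coefficient is zero. A lower error on the fitted data does not by itself mean better predictions on new data.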

An example of a classification problem.

[Figure: classification example]

We’re trying to predict a discrete-valued output: zero or one. Sometimes you can have more than two possible values for the output.
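
As a rough illustration of predicting a discrete value, the sketch below uses a 1-nearest-neighbour rule: a new input gets the label of the closest labelled example. The rule, the function name, and the two-feature data points are all assumptions chosen for brevity, not the method from the course.

```python
import math

def nearest_neighbor(labeled_points, query):
    """Return the label of the labelled point closest to `query`."""
    closest = min(labeled_points, key=lambda item: math.dist(item[0], query))
    return closest[1]

# Hypothetical training data: (features, label), where the label is 0 or 1.
labeled = [
    ((1.0, 1.0), 0),
    ((1.5, 2.0), 0),
    ((5.0, 5.0), 1),
    ((6.0, 5.5), 1),
]

print(nearest_neighbor(labeled, (1.2, 1.4)))  # near the label-0 points
print(nearest_neighbor(labeled, (5.5, 5.0)))  # near the label-1 points

# The same rule works unchanged with more than two possible output values.
three_classes = labeled + [((9.0, 1.0), 2)]
print(nearest_neighbor(three_classes, (8.5, 1.5)))
```

The output is always one of the labels seen in the training data, never a value in between, which is what separates classification from regression.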

Another classification problem.

[Figure: a second classification example]

Some learning algorithms can deal with a very large, even infinite, number of features.

Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables.

We can derive this structure by clustering (聚集) the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically sort these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

[Figure: clustering example]
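
To show what grouping without labels looks like, here is a small k-means sketch. K-means is just one common clustering method, picked here as an assumption. The points are made-up two-variable measurements rather than real gene data, and the starting centroids are chosen by hand to keep the run deterministic.

```python
import math

def k_means(points, initial_centroids, iterations=10):
    """Group points around centroids; no labels are used anywhere."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    # `clusters` holds the groups from the last assignment step.
    return centroids, clusters

# Made-up measurements with two variables each and no labels.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
]

centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The algorithm is never told which points belong together. The two groups come only from how close the points are to each other.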

Non-clustering: The “Cocktail Party Algorithm” allows you to find structure in a chaotic environment (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

[Figure: cocktail party example]

Words and Phrases

English 中文 English 中文
practitioner 实践者;实习者 the field of 领域
claim 宣称 remarkable 卓越的;非凡的
occasionally 偶尔;间或 the fraction of xx的比例
make sense 有意义;言之有理 properly 适当地;正确地
regression 回归 classification 分类
discrete 离散的 categorize 分类
lifespan 寿命