ML Note Course 1 Week 1-1: Machine Learning

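Machine learning = finding a function

Define the scope: choose the set of candidate functions (Deep Learning (CNN, Transformer…), Decision Tree, etc.)
Define the criterion: choose a standard for judging how good a function is (Supervised Learning, Semi-supervised Learning, RL, etc.)
Goal: find the best function (using Gradient Descent (Adam, AdamW…), Genetic Algorithm, Backpropagation…)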
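
The three steps can be sketched on a toy problem. This is only an illustration: the three candidate functions, the made-up data, and the helper name `mean_squared_error` are assumptions, and the exhaustive `min` search stands in for a real optimizer such as gradient descent.

```python
# Step 1: scope. A tiny, hand-picked set of candidate functions (hypothetical).
candidates = {
    "f(x) = x": lambda x: x,
    "f(x) = 2x + 1": lambda x: 2 * x + 1,
    "f(x) = x^2": lambda x: x ** 2,
}

# Step 2: criterion. Mean squared error between predictions and "right answers".
def mean_squared_error(f, xs, ys):
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Step 3: search. With only three candidates we can simply try them all.
best_name = min(candidates, key=lambda name: mean_squared_error(candidates[name], xs, ys))
print(best_name)
```

Real candidate sets are infinite (every possible value of the parameters), which is why an iterative search method is needed instead of trying every function.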

Supervised learning

The most widely used type of machine learning, and the one progressing fastest.

Learning algorithms learn from "right answers".

Mapping

x => y

input => output

Regression: predict a number out of infinitely many possible numbers

f(x) = wx + b

w, b: parameters (also called coefficients or weights)

The cost function most commonly used for linear regression is the squared error cost function:

$J(w,b)=\frac{1}{2m} \sum_{i=1}^{m} (\hat{y}^{(i)}-y^{(i)})^2$

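where

$\hat{y}^{(i)}-y^{(i)}$ is called the error;

$m$ is the number of training examples;

the extra factor of 2 in the denominator makes later calculations neater.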
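
As a short worked step showing why the 2 helps (assuming the linear model $\hat{y}^{(i)} = wx^{(i)} + b$ from above), differentiating the squared term brings down a factor of 2 that cancels the one in the denominator:

$\frac{\partial J}{\partial w}=\frac{1}{2m} \sum_{i=1}^{m} 2(\hat{y}^{(i)}-y^{(i)})x^{(i)}=\frac{1}{m} \sum_{i=1}^{m} (\hat{y}^{(i)}-y^{(i)})x^{(i)}$

$\frac{\partial J}{\partial b}=\frac{1}{2m} \sum_{i=1}^{m} 2(\hat{y}^{(i)}-y^{(i)})=\frac{1}{m} \sum_{i=1}^{m} (\hat{y}^{(i)}-y^{(i)})$

Scaling the cost by a constant does not change which $w, b$ minimize it, so the factor is purely a convenience.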

Replacing $\hat{y}^{(i)}$ with $f_{w,b}(x^{(i)})$ gives the equivalent form:

$J(w,b)=\frac{1}{2m} \sum_{i=1}^{m} (f_{w,b}(x^{(i)})-y^{(i)})^2$
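
A minimal sketch of this cost function in Python, followed by plain gradient descent to minimize it. The function names (`predict`, `compute_cost`, `gradient_descent`), the learning rate, the step count, and the tiny dataset are assumptions chosen for illustration, not values from the course.

```python
def predict(x, w, b):
    # The linear model f_{w,b}(x) = w*x + b.
    return w * x + b

def compute_cost(xs, ys, w, b):
    # Squared error cost J(w, b) with the 1/(2m) convention.
    m = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_descent(xs, ys, lr=0.1, steps=5000):
    # Repeatedly step w and b against the gradient of J, starting from zero.
    w, b = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
        dw = sum(e * x for e, x in zip(errors, xs)) / m  # dJ/dw
        db = sum(errors) / m                             # dJ/db
        w -= lr * dw
        b -= lr * db
    return w, b

# Made-up training set where y = x exactly, so w = 1, b = 0 gives zero cost.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

print(compute_cost(xs, ys, 1.0, 0.0))  # perfect fit
print(compute_cost(xs, ys, 0.5, 0.0))  # worse fit, larger cost
w, b = gradient_descent(xs, ys)
print(w, b)
```

A lower cost means the line fits the training data better; gradient descent here should approach w = 1 and b = 0.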

Classification: predict categories/classes out of a small number of possible outputs
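
To make the contrast with predicting a number concrete, here is a small sketch of a classifier whose output is one of a fixed set of labels. The 1-nearest-neighbor rule, the function name `nearest_neighbor_predict`, and the data are assumptions for illustration; the notes do not name a specific classification algorithm.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    # Return the label of the training example whose input is closest to x.
    best = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[best]

# Made-up labeled data: one numeric input, two possible classes.
sizes = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0]
labels = ["small", "small", "small", "large", "large", "large"]

print(nearest_neighbor_predict(sizes, labels, 1.2))
print(nearest_neighbor_predict(sizes, labels, 4.8))
```

Whatever the input, the prediction is always one of the labels seen in training, never an arbitrary number.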

In unsupervised learning, the data comes only with inputs x, not output labels y.

The algorithm has to find something interesting (a pattern or structure) in unlabeled data.

Clustering: group similar data points together.

The algorithm takes data without labels and tries to automatically group them into clusters.
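
One possible way to do this grouping is k-means, sketched below on one-dimensional data. The choice of k-means, the function name `kmeans_1d`, the starting centers, and the data points are all assumptions for illustration; the notes only describe grouping in general.

```python
def kmeans_1d(points, centers, iterations=10):
    # Alternate between assigning points to the nearest center
    # and moving each center to the mean of its assigned points.
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)]
    return centers, clusters

# Made-up unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

No labels are supplied; the two groups emerge only from how close the points are to each other.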

Examples:

Google News;

DNA microarray types;

Grouping customers.

Anomaly detection: find unusual data points.
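
A very simple way to flag unusual points is to measure how many standard deviations each value lies from the mean. This z-score rule, the function name `find_outliers`, the threshold of 2.0, and the data are assumptions for illustration, not a method prescribed by the notes.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    # Flag values whose distance from the mean exceeds
    # `threshold` population standard deviations.
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing is unusual
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Made-up readings: six typical values and one that stands out.
values = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0]
print(find_outliers(values))
```

With small samples a single extreme value also inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.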

Dimensionality reduction: compress a big dataset using fewer numbers, losing as little information as possible.
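
As a sketch of the idea, the code below compresses 2D points to one number each by projecting them onto the direction of greatest variance (the first principal component, computed in closed form for the 2D case). The function names `project_to_1d` and `reconstruct` and the data are assumptions for illustration; the notes do not name a specific technique.

```python
import math

def project_to_1d(points):
    # Center the data, find the direction of greatest variance,
    # and describe each point by a single coordinate along it.
    m = len(points)
    mx = sum(p[0] for p in points) / m
    my = sum(p[1] for p in points) / m
    sxx = sum((p[0] - mx) ** 2 for p in points) / m
    syy = sum((p[1] - my) ** 2 for p in points) / m
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / m
    # Orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    codes = [(p[0] - mx) * ux + (p[1] - my) * uy for p in points]
    return (mx, my), (ux, uy), codes

def reconstruct(center, direction, codes):
    # Map each 1D code back to an approximate 2D point.
    return [(center[0] + c * direction[0], center[1] + c * direction[1]) for c in codes]

# Made-up points lying exactly on the line y = 2x.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
center, direction, codes = project_to_1d(points)
restored = reconstruct(center, direction, codes)
print(codes)
print(restored)
```

Because these points lie exactly on a line, one number per point loses essentially nothing; with noisy data the reconstruction would only be approximate.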