ML Note Course 1 Week 1-1: Machine Learning
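Machine learning = finding a function
- Define the scope: choose the set of candidate functions (Deep Learning (CNN, Transformer…), Decision Tree, etc.)
- Define the criterion: choose a standard for judging how good a function is (Supervised Learning, Semi-supervised Learning, RL, etc.)
- Goal: find the best function (using Gradient Descent (Adam, AdamW…), Genetic Algorithm, Backpropagation…)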
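The three steps above can be sketched in a few lines of Python. This is only an illustration, not a method taken from the course: the toy data, the candidate family f(x) = w * x, the mean squared error criterion, and the plain gradient descent loop with its learning rate and step count are all assumptions chosen for brevity.

```python
# Toy data generated by y = 2x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Step 1: candidate functions, here every line through the origin f(x) = w * x.
def f(x, w):
    return w * x

# Step 2: criterion, here the mean squared error over the data.
def loss(w):
    return sum((f(x, w) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Step 3: search for the best candidate with plain gradient descent.
def fit(w=0.0, lr=0.01, steps=500):
    for _ in range(steps):
        # d(loss)/dw = (2/m) * sum((w*x - y) * x)
        grad = 2 * sum((f(x, w) - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

best_w = fit()  # approaches 2.0, the slope used to generate the data
```

Swapping any one step (a different function family, a different criterion, or a different optimizer) leaves the other two unchanged, which is the point of separating them.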
Supervised learning
The most widely used type of machine learning, and the one progressing fastest.
The learning algorithm learns from “right answers” (labeled examples).
Mapping:
x => y
input => output
Regression
Predict a number out of infinitely many possible numbers.
f(x) = wx + b
w, b: parameters (also called coefficients or weights)
Cost Function
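The cost function most commonly used for linear regression is the squared error cost function:
$$J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2$$
where
$\hat{y}^{(i)} - y^{(i)}$ is called the error;
$m$ is the size of the training set;
the extra 2 in the denominator makes later calculations cleaner.
Replacing $\hat{y}^{(i)}$ with $f(x^{(i)})$ gives the equivalent form:
$$J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f(x^{(i)}) - y^{(i)} \right)^2$$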
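A minimal sketch of the squared error cost for the linear model f(x) = wx + b: the sum of squared errors over the training set, divided by 2m. The function names and the toy training set are assumptions made for illustration.

```python
def predict(x, w, b):
    """Linear model f(x) = w * x + b."""
    return w * x + b

def squared_error_cost(xs, ys, w, b):
    """Sum of squared errors over the training set, divided by 2m."""
    m = len(xs)  # size of the training set
    total = 0.0
    for x, y in zip(xs, ys):
        error = predict(x, w, b) - y  # prediction minus target
        total += error ** 2
    return total / (2 * m)

# Toy training set lying exactly on y = 2x + 1 (made up for the example).
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

cost_perfect = squared_error_cost(xs, ys, w=2.0, b=1.0)  # every error is 0
cost_off = squared_error_cost(xs, ys, w=2.0, b=0.0)      # every error is -1
```

With the matching parameters the cost is 0; shifting b by 1 makes each squared error 1, so the cost becomes 3 / (2 * 3) = 0.5.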
Classification
Predict categories/classes out of a small number of possible outputs.
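For contrast with regression, a classifier returns one of a few labels rather than an arbitrary number. The sketch below is a deliberately simple assumption, not an algorithm named in these notes: it learns a single threshold as the midpoint between the two class means, and it assumes class 1 tends to have larger inputs than class 0. The data and function names are hypothetical.

```python
def fit_threshold(xs, labels):
    """Learn a decision threshold as the midpoint between the two class means."""
    class0 = [x for x, y in zip(xs, labels) if y == 0]
    class1 = [x for x, y in zip(xs, labels) if y == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2

def classify(x, threshold):
    """Return class 1 for inputs at or above the threshold, else class 0."""
    return 1 if x >= threshold else 0

# Labeled training data: the "right answers" are the 0/1 labels.
xs = [1.0, 2.0, 8.0, 9.0]
labels = [0, 0, 1, 1]

threshold = fit_threshold(xs, labels)  # (1.5 + 8.5) / 2 = 5.0
predictions = [classify(x, threshold) for x in [0.5, 4.0, 6.0, 10.0]]
```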
Unsupervised learning
Data only comes with inputs x, but not output labels y.
The algorithm has to find something interesting (a pattern or structure) in unlabeled data.
Clustering
Group similar data points together.
The algorithm takes data without labels and tries to automatically group it into clusters.
Examples:
Google News;
DNA microarray types;
Grouping customers
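As one way to make clustering concrete, here is a small one-dimensional k-means sketch. The notes do not name a specific clustering algorithm, so k-means is an assumption, as are the toy points, the initial centers, and the fixed iteration count.

```python
def kmeans_1d(points, centers, iterations=10):
    """Cluster 1-D points by alternating assignment and center updates."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else old
            for c, old in zip(clusters, centers)
        ]
    return centers, clusters

# Unlabeled data: no y values, only inputs.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

No labels are passed in; the two groups emerge only from the distances between points.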
Anomaly detection
Find unusual data points.
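One simple way to flag unusual points is a z-score rule: mark values that lie more than a chosen number of standard deviations from the mean. This rule, the threshold of 2.0, and the sample readings are assumptions for illustration; the notes do not specify a method. The sketch also assumes the values are not all identical, since the standard deviation would then be zero.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) / std > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0]
anomalies = find_anomalies(readings)
```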
Dimensionality reduction
Compress a big dataset using fewer numbers, losing as little information as possible.
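A small sketch of the idea, assuming principal component analysis (PCA) as the technique, which these notes do not name: each 2-D point is replaced by a single number, its position along the direction of greatest variance. The closed-form angle used here only applies to the 2-D case, and the sample points are made up.

```python
import math

def pca_2d_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    # Angle of the principal axis of a 2x2 symmetric matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # One coordinate per point: its position along the principal axis.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected

# Points lying exactly on the line y = 2x, so one number per point loses nothing.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projected = pca_2d_to_1d(points)
```

Because the sample points lie on a single line, the recovered direction has slope 2 and the one-dimensional coordinates preserve all the information; on noisy data some information would be lost.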