Perceptron
Perceptron Model
Suppose our model has $N$ features; then the size of the input vector is $N$. The perceptron is a **binary classification** model, meaning that it can distinguish between two classes of input data. We will assume that for each input vector $\mathbf{x}$ the output of our perceptron is either $+1$ or $-1$, depending on the class. The output is computed using the following formula:
$$y(\mathbf{x}) = f(\mathbf{w}^{\mathrm{T}}\mathbf{x})$$
where $f$ is a step activation function, which returns $+1$ when its argument is non-negative and $-1$ otherwise.
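For reference, here is a minimal sketch of such a step function in Python (the name `step` is ours; the thresholding at zero matches the classification rule used in the training code below):

```python
def step(z):
    # Return +1 for non-negative input and -1 otherwise
    return 1 if z >= 0 else -1
```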
Training the Perceptron
To train the perceptron, we need to find a weight vector $\mathbf{w}$ that classifies most of the values correctly, i.e., results in the smallest error $E$. This error $E$ is defined by the perceptron criterion in the following manner:
$$E(\mathbf{w}) = -\sum_i \mathbf{w}^{\mathrm{T}}\mathbf{x}_i t_i$$
where:
- the sum is taken over those training data points $i$ that result in wrong classification
- $\mathbf{x}_i$ is the input data, and $t_i$ is either $-1$ or $+1$ for negative and positive examples respectively.
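As an illustration, the criterion can be computed as follows. This is a sketch under our own assumptions: the inputs are stacked into an $(n, d)$ array `X` with a corresponding label array `t` of values in $\{-1, +1\}$, and the function name is hypothetical:

```python
import numpy as np

def perceptron_criterion(weights, X, t):
    # E(w) = -sum over misclassified points of (w^T x_i) * t_i
    z = X @ weights                  # w^T x_i for every data point
    misclassified = z * t < 0        # prediction sign disagrees with the label
    return -np.sum(z[misclassified] * t[misclassified])
```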
This criterion is considered as a function of the weights $\mathbf{w}$, and we need to minimize it. Often, a method called gradient descent is used, in which we start with some initial weights $\mathbf{w}^{(0)}$, and then at each step update the weights according to the formula:
$$\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} - \eta\nabla E(\mathbf{w})$$
Here $\eta$ is the so-called learning rate, and $\nabla E(\mathbf{w})$ denotes the gradient of $E$. After we calculate the gradient, we end up with
$$\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} + \sum_i \eta\,\mathbf{x}_i t_i$$
In this formula, if $\eta$ is constant, the common factor can be pulled out in front, giving $\mathbf{w}^{(t+1)} = \mathbf{w}^{(t)} + \eta\sum_i \mathbf{x}_i t_i$. In practice, however, an optimizer is often used to adjust the learning rate dynamically, so $\eta$ is not necessarily constant.
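To make the batch form of this update concrete, here is a sketch of a single gradient-descent step over all misclassified points (array names as in the criterion sketch above; this is our own illustration, whereas the training code below updates on one example at a time):

```python
def gradient_step(weights, X, t, eta=1.0):
    # One batch update: w <- w + eta * sum over misclassified i of x_i * t_i
    z = X @ weights
    mis = z * t < 0                  # mask of misclassified points
    return weights + eta * (X[mis].T @ t[mis])
```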
The algorithm in Python looks like this:
```python
import random
import numpy as np

def train(positive_examples, negative_examples, num_iterations=100, eta=1):
    # positive_examples, negative_examples: (n, d) NumPy arrays of input vectors
    num_dims = positive_examples.shape[1]
    weights = np.zeros(num_dims)  # initialize weights (to zeros)

    for i in range(num_iterations):
        pos = random.choice(positive_examples)  # sample one example of each class
        neg = random.choice(negative_examples)

        z = np.dot(pos, weights)  # compute perceptron output
        if z < 0:  # positive example classified as negative
            weights = weights + eta * pos  # w <- w + eta * x_i * t_i, with t_i = +1

        z = np.dot(neg, weights)
        if z >= 0:  # negative example classified as positive
            weights = weights - eta * neg  # w <- w + eta * x_i * t_i, with t_i = -1

    return weights
```
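A hypothetical usage example, with made-up data purely for illustration; the first component of each vector acts as a constant bias feature:

```python
import numpy as np

positive_examples = np.array([[1.0, 2.0, 3.0], [1.0, 3.0, 4.0], [1.0, 2.5, 2.0]])
negative_examples = np.array([[1.0, -2.0, -1.0], [1.0, -3.0, -2.0], [1.0, -1.5, -2.5]])

w = train(positive_examples, negative_examples)
print(np.dot([1.0, 2.0, 2.5], w) >= 0)  # expected: True (classified as positive)
```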