Update weight-decay.md
astonzhang authored Jul 8, 2020
1 parent eb8bea8 commit a574b7b
Showing 1 changed file with 1 addition and 1 deletion.
chapter_deep-learning-basics/weight-decay.md (2 changes: 1 addition & 1 deletion)
@@ -13,7 +13,7 @@ $$\ell(w_1, w_2, b) = \frac{1}{n} \sum_{i=1}^n \frac{1}{2}\left(x_1^{(i)} w_1 +

as an example, where $w_1, w_2$ are the weight parameters, $b$ is the bias parameter, the input of sample $i$ is $x_1^{(i)}, x_2^{(i)}$, its label is $y^{(i)}$, and the number of samples is $n$. Writing the weight parameters as the vector $\boldsymbol{w} = [w_1, w_2]$, the new loss function with the $L_2$ norm penalty term is

- $$\ell(w_1, w_2, b) + \frac{\lambda}{2n} \|\boldsymbol{w}\|^2,$$
+ $$\ell(w_1, w_2, b) + \frac{\lambda}{2} \|\boldsymbol{w}\|^2,$$

where the hyperparameter $\lambda > 0$. The penalty term is minimized when all the weight parameters are 0. When $\lambda$ is large, the penalty term accounts for a larger share of the loss function, which usually drives the elements of the learned weight parameters closer to 0. When $\lambda$ is set to 0, the penalty term has no effect at all. Expanding the squared $L_2$ norm $\|\boldsymbol{w}\|^2$ above gives $w_1^2 + w_2^2$. With the $L_2$ norm penalty term, in mini-batch stochastic gradient descent we change the update rule for the weights $w_1$ and $w_2$ from the ["Linear Regression"](linear-regression.md) section to
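To illustrate what the penalized loss and the resulting update look like, here is a minimal NumPy sketch that adds the $\frac{\lambda}{2}\|\boldsymbol{w}\|^2$ term to a squared loss and applies the corresponding gradient step. This is an illustration under assumed names (`penalized_loss`, `sgd_step`) and toy data, not the book's implementation.

```python
import numpy as np

def penalized_loss(X, y, w, b, lam):
    """Squared loss plus the L2 penalty term lam / 2 * ||w||^2."""
    err = X @ w + b - y
    return 0.5 * np.mean(err ** 2) + 0.5 * lam * np.sum(w ** 2)

def sgd_step(X, y, w, b, lam, lr):
    """One gradient step on the penalized loss; lam * w is the weight-decay term."""
    n = X.shape[0]
    err = X @ w + b - y                 # prediction errors, shape (n,)
    grad_w = X.T @ err / n + lam * w    # gradient of the penalized loss w.r.t. w
    grad_b = np.mean(err)               # the bias b is not decayed
    return w - lr * grad_w, b - lr * grad_b

# Toy usage on synthetic data: a larger lam shrinks the learned weights toward 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
y = X @ np.array([3.0, -2.0]) + 1.0
w, b = np.zeros(2), 0.0
for _ in range(200):
    w, b = sgd_step(X, y, w, b, lam=1.0, lr=0.1)
print(w, b)
```

Since the gradient of the penalty with respect to $\boldsymbol{w}$ is $\lambda \boldsymbol{w}$, each step shrinks the weights toward 0, while the bias $b$ is left undecayed, consistent with the text above.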

