Commit 486c458
2024-02-08 17:54:16
wizardforcel committed Feb 8, 2024
1 parent 0101269 commit 486c458
Showing 2 changed files with 287 additions and 0 deletions.
20 changes: 20 additions & 0 deletions totrans/dl-scr_2.yaml
@@ -1410,6 +1410,9 @@
id: totrans-160
prefs: []
type: TYPE_NORMAL
zh: What does it mean to do "a bunch of linear regressions"? Doing one linear regression involves a matrix multiplication with a set of parameters: if our data *X* has dimension `[batch_size, num_features]`, then we multiply it by a weight matrix *W* of dimension `[num_features, 1]` to get an output of dimension `[batch_size, 1]`; for each observation in the batch, that output is just a *weighted sum* of the original features. To do multiple linear regressions, we simply multiply our input by a weight matrix of dimension `[num_features, num_outputs]`, getting an output of dimension `[batch_size, num_outputs]`; now, *for each observation*, we have `num_outputs` different weighted sums of the original features.
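To make those shapes concrete, here is a minimal NumPy sketch of the batch of linear regressions (the array names and the sizes are illustrative, not the book's code):

```python
import numpy as np

batch_size, num_features, num_outputs = 32, 13, 13

X = np.random.randn(batch_size, num_features)   # one row per observation
W = np.random.randn(num_features, num_outputs)  # one column of weights per regression

M = np.dot(X, W)  # shape: (batch_size, num_outputs)
# Each column of M is a different weighted sum of the original features,
# computed for every observation in the batch at once.
assert M.shape == (batch_size, num_outputs)
```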
- en: What are these weighted sums? We should think of each of them as a “learned
feature”—a combination of the original features that, once the network is trained,
will represent its attempt to learn combinations of features that help it accurately
@@ -1418,26 +1421,31 @@
id: totrans-161
prefs: []
type: TYPE_NORMAL
zh: What are these weighted sums? We should think of each of them as a "learned feature": a combination of the original features that, once the network is trained, will represent the network's attempt to learn combinations of features that help it accurately predict house prices. How many learned features should we create? Let's create 13 of them, since we created 13 original features.
- en: 'Step 2: A Nonlinear Function'
id: totrans-162
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 'Step 2: A Nonlinear Function'
- en: Next, we’ll feed each of these weighted sums through a *non*linear function;
the first function we’ll try is the `sigmoid` function that was mentioned in [Chapter 1](ch01.html#foundations).
As a refresher, [Figure 2-9](#fig_02-09) plots the `sigmoid` function.
id: totrans-163
prefs: []
type: TYPE_NORMAL
zh: Next, we'll feed each of these weighted sums through a *non*linear function; the first function we'll try is the `sigmoid` function mentioned in [Chapter 1](ch01.html#foundations). As a refresher, [Figure 2-9](#fig_02-09) plots the `sigmoid` function.
- en: '![Sigmoid](assets/dlfs_0209.png)'
id: totrans-164
prefs: []
type: TYPE_IMG
zh: '![Sigmoid](assets/dlfs_0209.png)'
- en: Figure 2-9\. Sigmoid function plotted from x = –5 to x = 5
id: totrans-165
prefs:
- PREF_H6
type: TYPE_NORMAL
zh: Figure 2-9\. Sigmoid function plotted from x = –5 to x = 5
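For reference, `sigmoid` is a one-liner in NumPy; this sketch assumes the usual definition σ(x) = 1 / (1 + e^(−x)) rather than quoting the book's own code:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    # sigma(x) = 1 / (1 + e^(-x)), applied elementwise
    return 1.0 / (1.0 + np.exp(-x))
```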
- en: Why is using this nonlinear function a good idea? Why not the `square` function
*f*(*x*) = *x*², for example? There are a couple of reasons. First, we want the
function we use here to be *monotonic* so that it “preserves” information about
@@ -1450,17 +1458,20 @@
id: totrans-166
prefs: []
type: TYPE_NORMAL
zh: Why is using this nonlinear function a good idea? Why not the `square` function *f*(*x*) = *x*², for example? There are a couple of reasons. First, we want the function we use here to be *monotonic* so that it "preserves" information about the numbers that were fed in. Suppose that, given the data fed in, two of our linear regressions produced the values -3 and 3, respectively. Passing these through the `square` function would then produce a value of 9 for each, so any function receiving these numbers as input after they were passed through the `square` function would "lose" the information that one was originally -3 and the other 3.
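A two-line demonstration of that point, using the values -3 and 3 from the paragraph above (a sketch; the printed values are rounded):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a = np.array([-3.0, 3.0])
print(np.square(a))  # [9. 9.]         -> both inputs map to 9; the sign is lost
print(sigmoid(a))    # [0.0474 0.9526] -> distinct outputs, order preserved
```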
- en: The second reason, of course, is that the function is nonlinear; this nonlinearity
will enable our neural network to model the inherently nonlinear relationship
between the features and the target.
id: totrans-167
prefs: []
type: TYPE_NORMAL
zh: The second reason, of course, is that the function is nonlinear; this nonlinearity will enable our neural network to model the inherently nonlinear relationship between the features and the target.
- en: 'Finally, the `sigmoid` function has the nice property that its derivative can
be expressed in terms of the function itself:'
id: totrans-168
prefs: []
type: TYPE_NORMAL
zh: Finally, the `sigmoid` function has the nice property that its derivative can be expressed in terms of the function itself:
- en: '$$\frac{\partial \sigma}{\partial u}(x) = \sigma(x) \times (1 - \sigma(x))$$'
@@ -1477,18 +1488,21 @@
id: totrans-170
prefs: []
type: TYPE_NORMAL
zh: We'll use this shortly, when we use the `sigmoid` function during the backward pass of our neural network.
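A quick finite-difference check of that identity (a sketch; the step size `h` and the test points are arbitrary choices, not from the book):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)  # central difference
analytic = sigmoid(x) * (1.0 - sigmoid(x))                 # sigma(x) * (1 - sigma(x))
assert np.allclose(numerical, analytic, atol=1e-8)
```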
- en: 'Step 3: Another Linear Regression'
id: totrans-171
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 'Step 3: Another Linear Regression'
- en: Finally, we’ll take the resulting 13 elements—each of which is a combination
of the original features, fed through the `sigmoid` function so that they all
have values between 0 and 1—and feed them into a regular linear regression, using
them the same way we used our original features previously.
id: totrans-172
prefs: []
type: TYPE_NORMAL
zh: Finally, we'll take the resulting 13 elements (each of which is a combination of the original features, fed through the `sigmoid` function so that they all have values between 0 and 1) and feed them into a regular linear regression, using them the same way we used our original features previously.
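Putting Steps 1 through 3 together, the whole forward pass is two matrix multiplications (each with a bias addition) and a `sigmoid` in between. This is a sketch under the shapes described above; `W1`, `B1`, `W2`, and `B2` are illustrative parameter names, not the book's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, W1, B1, W2, B2):
    M1 = np.dot(X, W1) + B1   # Step 1: 13 linear regressions -> (batch_size, 13)
    O1 = sigmoid(M1)          # Step 2: nonlinearity; every value lands in (0, 1)
    P = np.dot(O1, W2) + B2   # Step 3: one more linear regression -> (batch_size, 1)
    return P
```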
- en: 'Then, we’ll try training the *entire* resulting function in the same way we
trained the standard linear regression earlier in this chapter: we’ll feed data
through the model, use the chain rule to figure out how much increasing the weights
@@ -1499,31 +1513,37 @@
id: totrans-173
prefs: []
type: TYPE_NORMAL
zh: Then, we'll try training the *entire* resulting function the same way we trained the standard linear regression earlier in this chapter: we'll feed data through the model, use the chain rule to figure out how much increasing the weights would increase (or decrease) the loss, and then update the weights at each iteration so that the loss decreases. Over time (we hope), we'll end up with a more accurate model than before, one that has "learned" the inherently nonlinear relationship between the features and the target.
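One way to picture that training loop is the sketch below: mean squared error for the loss, the chain rule applied layer by layer, and a plain gradient-descent update. The parameter names, dictionary layout, and learning rate are illustrative assumptions, not the book's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(X, y, params, lr=0.001):
    # Forward pass (Steps 1-3), keeping intermediates for the backward pass.
    M1 = np.dot(X, params["W1"]) + params["B1"]
    O1 = sigmoid(M1)
    P = np.dot(O1, params["W2"]) + params["B2"]
    loss = np.mean((y - P) ** 2)

    # Backward pass: chain rule, from the loss back to each weight.
    dP = -2.0 * (y - P) / X.shape[0]        # dLoss/dP
    dW2 = np.dot(O1.T, dP)                  # dLoss/dW2
    dB2 = dP.sum(axis=0, keepdims=True)     # dLoss/dB2
    dO1 = np.dot(dP, params["W2"].T)        # dLoss/dO1
    dM1 = dO1 * O1 * (1.0 - O1)             # uses sigma'(x) = sigma(x)(1 - sigma(x))
    dW1 = np.dot(X.T, dM1)                  # dLoss/dW1
    dB1 = dM1.sum(axis=0, keepdims=True)    # dLoss/dB1

    # Update every weight in the direction that decreases the loss.
    for name, grad in [("W1", dW1), ("B1", dB1), ("W2", dW2), ("B2", dB2)]:
        params[name] -= lr * grad
    return loss
```

Called in a loop over batches, this is exactly the cycle the paragraph describes: feed data through, compute how each weight moves the loss, nudge the weights, repeat.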
- en: It might be tough to wrap your mind around what’s going on based on this description,
so let’s look at an illustration.
id: totrans-174
prefs: []
type: TYPE_NORMAL
zh: It might be tough to wrap your mind around what's going on based on this description, so let's look at an illustration.
- en: Diagrams
id: totrans-175
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: Diagrams
- en: '[Figure 2-10](#fig_02-10) is a diagram of what our more complicated model now
looks like.'
id: totrans-176
prefs: []
type: TYPE_NORMAL
zh: '[Figure 2-10](#fig_02-10) is a diagram of what our more complicated model now looks like.'
- en: '![Neural network forward pass](assets/dlfs_0210.png)'
id: totrans-177
prefs: []
type: TYPE_IMG
zh: '![Neural network forward pass](assets/dlfs_0210.png)'
- en: Figure 2-10\. Steps 1–3 translated into a computational graph of the kind we
saw in [Chapter 1](ch01.html#foundations)
id: totrans-178
prefs:
- PREF_H6
type: TYPE_NORMAL
zh: Figure 2-10\. Steps 1–3 translated into a computational graph of the kind we saw in [Chapter 1](ch01.html#foundations)
- en: 'You’ll see that we start with matrix multiplication and matrix addition, as
before. Now let’s formalize some terminology that was mentioned previously: when
we apply these operations in the course of a nested function, we’ll call the first