Commit 486c458
2024-02-08 17:54:16
wizardforcel committed Feb 8, 2024
1 parent 0101269 commit 486c458
Showing 2 changed files with 287 additions and 0 deletions.
20 changes: 20 additions & 0 deletions totrans/dl-scr_2.yaml
@@ -1410,6 +1410,9 @@
id: totrans-160
prefs: []
type: TYPE_NORMAL
zh: What does it mean to do "a bunch of linear regressions"? Doing one linear regression involves a matrix multiplication with a set of parameters: if our data *X* has dimension `[batch_size, num_features]`, then we multiply it by a weight matrix *W* of dimension `[num_features, 1]` to get an output of dimension `[batch_size, 1]`; for each observation in the batch, that output is just a *weighted sum* of the original features. To do multiple linear regressions, we simply multiply our input by a weight matrix of dimension `[num_features, num_outputs]`, getting an output of dimension `[batch_size, num_outputs]`; now, *for each observation*, we have `num_outputs` different weighted sums of the original features.
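To make those shapes concrete, here is a minimal NumPy sketch of the batch of linear regressions (the array names and the sizes are illustrative, not the book's code):

```python
import numpy as np

batch_size, num_features, num_outputs = 32, 13, 13

X = np.random.randn(batch_size, num_features)   # one row per observation
W = np.random.randn(num_features, num_outputs)  # one column of weights per regression

M = np.dot(X, W)  # shape: (batch_size, num_outputs)
# Each column of M is a different weighted sum of the original features,
# computed for every observation in the batch at once.
assert M.shape == (batch_size, num_outputs)
```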
- en: What are these weighted sums? We should think of each of them as a “learned
feature”—a combination of the original features that, once the network is trained,
will represent its attempt to learn combinations of features that help it accurately
@@ -1418,26 +1421,31 @@
id: totrans-161
prefs: []
type: TYPE_NORMAL
zh: What are these weighted sums? We should think of each of them as a "learned feature": a combination of the original features that, once the network is trained, will represent the network's attempt to learn combinations of features that help it accurately predict house prices. How many learned features should we create? Let's create 13 of them, since we created 13 original features.
- en: 'Step 2: A Nonlinear Function'
id: totrans-162
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 'Step 2: A Nonlinear Function'
- en: Next, we’ll feed each of these weighted sums through a *non*linear function;
the first function we’ll try is the `sigmoid` function that was mentioned in [Chapter 1](ch01.html#foundations).
As a refresher, [Figure 2-9](#fig_02-09) plots the `sigmoid` function.
id: totrans-163
prefs: []
type: TYPE_NORMAL
zh: Next, we'll feed each of these weighted sums through a *non*linear function; the first function we'll try is the `sigmoid` function mentioned in [Chapter 1](ch01.html#foundations). As a refresher, [Figure 2-9](#fig_02-09) plots the `sigmoid` function.
- en: '![Sigmoid](assets/dlfs_0209.png)'
id: totrans-164
prefs: []
type: TYPE_IMG
zh: '![Sigmoid](assets/dlfs_0209.png)'
- en: Figure 2-9\. Sigmoid function plotted from x = –5 to x = 5
id: totrans-165
prefs:
- PREF_H6
type: TYPE_NORMAL
zh: Figure 2-9\. Sigmoid function plotted from x = –5 to x = 5
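For reference, `sigmoid` is a one-liner in NumPy; this sketch assumes the usual definition σ(x) = 1 / (1 + e^(−x)) rather than quoting the book's own code:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    # sigma(x) = 1 / (1 + e^(-x)), applied elementwise
    return 1.0 / (1.0 + np.exp(-x))
```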
- en: Why is using this nonlinear function a good idea? Why not the `square` function
*f*(*x*) = *x*², for example? There are a couple of reasons. First, we want the
function we use here to be *monotonic* so that it “preserves” information about
@@ -1450,17 +1458,20 @@
id: totrans-166
prefs: []
type: TYPE_NORMAL
zh: Why is using this nonlinear function a good idea? Why not the `square` function *f*(*x*) = *x*², for example? There are a couple of reasons. First, we want the function we use here to be *monotonic* so that it "preserves" information about the numbers that were fed in. Suppose that, given the data fed in, two of our linear regressions produced the values -3 and 3, respectively. Passing these through the `square` function would then produce a value of 9 for each, so any function receiving these numbers as input after they were passed through the `square` function would "lose" the information that one was originally -3 and the other 3.
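A two-line demonstration of that point, using the values -3 and 3 from the paragraph above (a sketch; the printed values are rounded):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a = np.array([-3.0, 3.0])
print(np.square(a))  # [9. 9.]         -> both inputs map to 9; the sign is lost
print(sigmoid(a))    # [0.0474 0.9526] -> distinct outputs, order preserved
```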
- en: The second reason, of course, is that the function is nonlinear; this nonlinearity
will enable our neural network to model the inherently nonlinear relationship
between the features and the target.
id: totrans-167
prefs: []
type: TYPE_NORMAL
zh: The second reason, of course, is that the function is nonlinear; this nonlinearity will enable our neural network to model the inherently nonlinear relationship between the features and the target.
- en: 'Finally, the `sigmoid` function has the nice property that its derivative can
be expressed in terms of the function itself:'
id: totrans-168
prefs: []
type: TYPE_NORMAL
zh: Finally, the `sigmoid` function has the nice property that its derivative can be expressed in terms of the function itself:
- en: '$$\frac{\partial \sigma}{\partial u}(x) = \sigma(x) \times (1 - \sigma(x))$$'
@@ -1477,18 +1488,21 @@
id: totrans-170
prefs: []
type: TYPE_NORMAL
zh: We'll use this shortly, when we use the `sigmoid` function during the backward pass of our neural network.
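A quick finite-difference check of that identity (a sketch; the step size `h` and the test points are arbitrary choices, not from the book):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numerical = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)  # central difference
analytic = sigmoid(x) * (1.0 - sigmoid(x))                 # sigma(x) * (1 - sigma(x))
assert np.allclose(numerical, analytic, atol=1e-8)
```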
- en: 'Step 3: Another Linear Regression'
id: totrans-171
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: 'Step 3: Another Linear Regression'
- en: Finally, we’ll take the resulting 13 elements—each of which is a combination
of the original features, fed through the `sigmoid` function so that they all
have values between 0 and 1—and feed them into a regular linear regression, using
them the same way we used our original features previously.
id: totrans-172
prefs: []
type: TYPE_NORMAL
zh: Finally, we'll take the resulting 13 elements (each of which is a combination of the original features, fed through the `sigmoid` function so that they all have values between 0 and 1) and feed them into a regular linear regression, using them the same way we used our original features previously.
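Putting Steps 1 through 3 together, the whole forward pass is two matrix multiplications (each with a bias addition) and a `sigmoid` in between. This is a sketch under the shapes described above; `W1`, `B1`, `W2`, and `B2` are illustrative parameter names, not the book's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, W1, B1, W2, B2):
    M1 = np.dot(X, W1) + B1   # Step 1: 13 linear regressions -> (batch_size, 13)
    O1 = sigmoid(M1)          # Step 2: nonlinearity; every value lands in (0, 1)
    P = np.dot(O1, W2) + B2   # Step 3: one more linear regression -> (batch_size, 1)
    return P
```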
- en: 'Then, we’ll try training the *entire* resulting function in the same way we
trained the standard linear regression earlier in this chapter: we’ll feed data
through the model, use the chain rule to figure out how much increasing the weights
@@ -1499,31 +1513,37 @@
id: totrans-173
prefs: []
type: TYPE_NORMAL
zh: Then, we'll try training the *entire* resulting function the same way we trained the standard linear regression earlier in this chapter: we'll feed data through the model, use the chain rule to figure out how much increasing the weights would increase (or decrease) the loss, and then update the weights at each iteration so that the loss decreases. Over time (we hope), we'll end up with a more accurate model than before, one that has "learned" the inherently nonlinear relationship between the features and the target.
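One way to picture that training loop is the sketch below: mean squared error for the loss, the chain rule applied layer by layer, and a plain gradient-descent update. The parameter names, dictionary layout, and learning rate are illustrative assumptions, not the book's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(X, y, params, lr=0.001):
    # Forward pass (Steps 1-3), keeping intermediates for the backward pass.
    M1 = np.dot(X, params["W1"]) + params["B1"]
    O1 = sigmoid(M1)
    P = np.dot(O1, params["W2"]) + params["B2"]
    loss = np.mean((y - P) ** 2)

    # Backward pass: chain rule, from the loss back to each weight.
    dP = -2.0 * (y - P) / X.shape[0]        # dLoss/dP
    dW2 = np.dot(O1.T, dP)                  # dLoss/dW2
    dB2 = dP.sum(axis=0, keepdims=True)     # dLoss/dB2
    dO1 = np.dot(dP, params["W2"].T)        # dLoss/dO1
    dM1 = dO1 * O1 * (1.0 - O1)             # uses sigma'(x) = sigma(x)(1 - sigma(x))
    dW1 = np.dot(X.T, dM1)                  # dLoss/dW1
    dB1 = dM1.sum(axis=0, keepdims=True)    # dLoss/dB1

    # Update every weight in the direction that decreases the loss.
    for name, grad in [("W1", dW1), ("B1", dB1), ("W2", dW2), ("B2", dB2)]:
        params[name] -= lr * grad
    return loss
```

Called in a loop over batches, this is exactly the cycle the paragraph describes: feed data through, compute how each weight moves the loss, nudge the weights, repeat.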
- en: It might be tough to wrap your mind around what’s going on based on this description,
so let’s look at an illustration.
id: totrans-174
prefs: []
type: TYPE_NORMAL
zh: It might be tough to wrap your mind around what's going on based on this description, so let's look at an illustration.
- en: Diagrams
id: totrans-175
prefs:
- PREF_H2
type: TYPE_NORMAL
zh: Diagrams
- en: '[Figure 2-10](#fig_02-10) is a diagram of what our more complicated model now
looks like.'
id: totrans-176
prefs: []
type: TYPE_NORMAL
zh: '[Figure 2-10](#fig_02-10) is a diagram of what our more complicated model now looks like.'
- en: '![Neural network forward pass](assets/dlfs_0210.png)'
id: totrans-177
prefs: []
type: TYPE_IMG
zh: '![Neural network forward pass](assets/dlfs_0210.png)'
- en: Figure 2-10\. Steps 1–3 translated into a computational graph of the kind we
saw in [Chapter 1](ch01.html#foundations)
id: totrans-178
prefs:
- PREF_H6
type: TYPE_NORMAL
zh: Figure 2-10\. Steps 1–3 translated into a computational graph of the kind we saw in [Chapter 1](ch01.html#foundations)
- en: 'You’ll see that we start with matrix multiplication and matrix addition, as
before. Now let’s formalize some terminology that was mentioned previously: when
we apply these operations in the course of a nested function, we’ll call the first