diff --git a/totrans/gen-dl_11.yaml b/totrans/gen-dl_11.yaml
index d125ed2..fd0b17a 100644
--- a/totrans/gen-dl_11.yaml
+++ b/totrans/gen-dl_11.yaml
@@ -1,7 +1,9 @@
- en: Chapter 8\. Diffusion Models
+ id: totrans-0
prefs:
- PREF_H1
type: TYPE_NORMAL
+ zh: 第8章。扩散模型
- en: Alongside GANs, diffusion models are one of the most influential and impactful
generative modeling techniques for image generation to have been introduced over
the last decade. Across many benchmarks, diffusion models now outperform previously
@@ -10,16 +12,21 @@
2 and Google’s ImageGen for text-to-image generation). Recently, there has been
  an explosion of diffusion models being applied across a wide range of tasks, reminiscent
  of the GAN proliferation that took place between 2017–2020.
+ id: totrans-1
prefs: []
type: TYPE_NORMAL
+ zh: 与GAN并列,扩散模型是过去十年中提出的最具影响力、最具冲击力的图像生成建模技术之一。在许多基准测试中,扩散模型如今已超越此前最先进的GAN,并迅速成为生成建模从业者的首选,特别是在视觉领域(例如,用于文本到图像生成的OpenAI的DALL.E
+ 2和Google的ImageGen)。最近,扩散模型在各种任务中的应用呈爆炸式增长,类似于2017年至2020年间GAN的普及。
- en: 'Many of the core ideas that underpin diffusion models share similarities with
earlier types of generative models that we have already explored in this book
(e.g., denoising autoencoders, energy-based models). Indeed, the name *diffusion*
takes inspiration from the well-studied property of thermodynamic diffusion: an
important link was made between this purely physical field and deep learning in
2015.^([1](ch08.xhtml#idm45387010500320))'
+ id: totrans-2
prefs: []
type: TYPE_NORMAL
+ zh: 许多支撑扩散模型的核心思想与本书中已经探索过的早期类型的生成模型(例如,去噪自动编码器,基于能量的模型)有相似之处。事实上,名称*扩散*灵感来自热力学扩散的深入研究:在2015年,这一纯物理领域与深度学习之间建立了重要联系。^([1](ch08.xhtml#idm45387010500320))
- en: Important progress was also being made in the field of score-based generative
models,^([2](ch08.xhtml#idm45387010496240))^,^([3](ch08.xhtml#idm45387010494000))
a branch of energy-based modeling that directly estimates the gradient of the
@@ -28,141 +35,211 @@
Stefano Ermon used multiple scales of noise perturbations applied to the raw data
to ensure the model—a *noise conditional score network* (NCSN)—performs well on
regions of low data density.
+ id: totrans-3
prefs: []
type: TYPE_NORMAL
+ zh: 在基于分数的生成模型领域也取得了重要进展,^([2](ch08.xhtml#idm45387010496240))^,^([3](ch08.xhtml#idm45387010494000))这是基于能量的模型的一个分支,它直接估计对数分布的梯度(也称为分数函数)来训练模型,作为使用对比散度的替代方法。特别是,杨松(Yang Song)和斯特凡诺·厄尔蒙(Stefano Ermon)对原始数据应用多个尺度的噪声扰动,以确保模型——一个*噪声条件分数网络*(NCSN)——在低数据密度区域表现良好。
- en: The breakthrough diffusion model paper came in the summer of 2020.^([4](ch08.xhtml#idm45387010490880))
Standing on the shoulders of earlier works, the paper uncovers a deep connection
between diffusion models and score-based generative models, and the authors use
this fact to train a diffusion model that can rival GANs across several datasets,
called the *Denoising Diffusion Probabilistic Model* (DDPM).
+ id: totrans-4
prefs: []
type: TYPE_NORMAL
+ zh: 突破性的扩散模型论文于2020年夏天发表。^([4](ch08.xhtml#idm45387010490880))在前人的基础上,该论文揭示了扩散模型和基于分数的生成模型之间的深刻联系,作者利用这一事实训练了一个可以在几个数据集上与GANs匹敌的扩散模型,称为*去噪扩散概率模型*(DDPM)。
- en: This chapter will walk through the theoretical requirements for understanding
how a denoising diffusion model works. You will then learn how to build your own
denoising diffusion model using Keras.
+ id: totrans-5
prefs: []
type: TYPE_NORMAL
+ zh: 本章将介绍理解去噪扩散模型工作原理的理论要求。然后,您将学习如何使用Keras构建自己的去噪扩散模型。
- en: Introduction
+ id: totrans-6
prefs:
- PREF_H1
type: TYPE_NORMAL
+ zh: 介绍
- en: To help explain the key ideas that underpin diffusion models, let’s begin with
a short story!
+ id: totrans-7
prefs: []
type: TYPE_NORMAL
+ zh: 为了帮助解释支撑扩散模型的关键思想,让我们从一个简短的故事开始!
- en: The DiffuseTV story describes the general idea behind a diffusion model. Now
let’s dive into the technicalities of how we build such a model using Keras.
+ id: totrans-8
prefs: []
type: TYPE_NORMAL
+ zh: DiffuseTV故事描述了扩散模型背后的一般思想。现在让我们深入探讨如何使用Keras构建这样一个模型的技术细节。
- en: Denoising Diffusion Models (DDM)
+ id: totrans-9
prefs:
- PREF_H1
type: TYPE_NORMAL
+ zh: 去噪扩散模型(DDM)
- en: The core idea behind a denoising diffusion model is simple—we train a deep learning
model to denoise an image over a series of very small steps. If we start from
pure random noise, in theory we should be able to keep applying the model until
we obtain an image that looks as if it were drawn from the training set. What’s
amazing is that this simple concept works so well in practice!
+ id: totrans-10
prefs: []
type: TYPE_NORMAL
+ zh: 去噪扩散模型背后的核心思想很简单——我们训练一个深度学习模型,在一系列非常小的步骤中对图像去噪。如果我们从纯随机噪声开始,理论上我们应该能够不断应用该模型,直到获得一幅看起来仿佛取自训练集的图像。令人惊奇的是,这个简单的概念在实践中效果如此出色!
- en: Let’s first get set up with a dataset and then walk through the forward (noising)
and backward (denoising) diffusion processes.
+ id: totrans-11
prefs: []
type: TYPE_NORMAL
+ zh: 让我们首先准备一个数据集,然后逐步介绍前向(加噪)和后向(去噪)扩散过程。
- en: Running the Code for This Example
+ id: totrans-12
prefs:
- PREF_H1
type: TYPE_NORMAL
+ zh: 运行此示例的代码
- en: The code for this example can be found in the Jupyter notebook located at *notebooks/08_diffusion/01_ddm/ddm.ipynb*
in the book repository.
+ id: totrans-13
prefs: []
type: TYPE_NORMAL
+ zh: 此示例的代码可以在书籍存储库中位于*notebooks/08_diffusion/01_ddm/ddm.ipynb*的Jupyter笔记本中找到。
- en: The code is adapted from the excellent [tutorial on denoising diffusion implicit
models](https://oreil.ly/srPCe) created by András Béres available on the Keras
website.
+ id: totrans-14
prefs: []
type: TYPE_NORMAL
+ zh: 该代码改编自András Béres在Keras网站上创建的优秀[去噪扩散隐式模型教程](https://oreil.ly/srPCe)。
- en: The Flowers Dataset
+ id: totrans-15
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: 花卉数据集
- en: We’ll be using the [Oxford 102 Flower dataset](https://oreil.ly/HfrKV) that
is available through Kaggle. This is a set of over 8,000 color images of a variety
of flowers.
+ id: totrans-16
prefs: []
type: TYPE_NORMAL
+ zh: 我们将使用通过Kaggle提供的[牛津102花卉数据集](https://oreil.ly/HfrKV)。这是一组包含各种花卉的8000多张彩色图像。
- en: You can download the dataset by running the Kaggle dataset downloader script
in the book repository, as shown in [Example 8-1](#downloading-flower-dataset).
This will save the flower images to the */data* folder.
+ id: totrans-17
prefs: []
type: TYPE_NORMAL
+ zh: 您可以通过在书籍存储库中运行Kaggle数据集下载脚本来下载数据集,如[示例8-1](#downloading-flower-dataset)所示。这将把花卉图像保存到*/data*文件夹中。
- en: Example 8-1\. Downloading the Oxford 102 Flower dataset
+ id: totrans-18
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-1。下载牛津102花卉数据集
- en: '[PRE0]'
+ id: totrans-19
prefs: []
type: TYPE_PRE
+ zh: '[PRE0]'
- en: '`As usual, we’ll load the images in using the Keras `image_dataset_from_directory`
function, resize the images to 64 × 64 pixels, and scale the pixel values to the
range [0, 1]. We’ll also repeat the dataset five times to increase the epoch length
and batch the data into groups of 64 images, as shown in [Example 8-2](#flower-preprocessing-ex).'
+ id: totrans-20
prefs: []
type: TYPE_NORMAL
+ zh: 通常情况下,我们将使用Keras的`image_dataset_from_directory`函数加载图像,将图像调整为64×64像素,并将像素值缩放到[0, 1]范围。我们还会将数据集重复五次以增加每个epoch的长度,并将数据按每组64张图像进行分批,如[示例8-2](#flower-preprocessing-ex)所示。
- en: Example 8-2\. Loading the Oxford 102 Flower dataset
+ id: totrans-21
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-2。加载牛津102花卉数据集
- en: '[PRE1]'
+ id: totrans-22
prefs: []
type: TYPE_PRE
+ zh: '[PRE1]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO1-1)'
+ id: totrans-23
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO1-1)'
- en: Load dataset (when required during training) using the Keras `image_dataset_from_directory`
function.
+ id: totrans-24
prefs: []
type: TYPE_NORMAL
+ zh: 使用Keras的`image_dataset_from_directory`函数加载数据集(在训练期间需要时)。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO1-2)'
+ id: totrans-25
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO1-2)'
- en: Scale the pixel values to the range [0, 1].
+ id: totrans-26
prefs: []
type: TYPE_NORMAL
+ zh: 将像素值缩放到范围[0, 1]。
- en: '[![3](Images/3.png)](#co_diffusion_models_CO1-3)'
+ id: totrans-27
prefs: []
type: TYPE_NORMAL
+ zh: '[![3](Images/3.png)](#co_diffusion_models_CO1-3)'
- en: Repeat the dataset five times.
+ id: totrans-28
prefs: []
type: TYPE_NORMAL
+ zh: 将数据集重复五次。
- en: '[![4](Images/4.png)](#co_diffusion_models_CO1-4)'
+ id: totrans-29
prefs: []
type: TYPE_NORMAL
+ zh: '[![4](Images/4.png)](#co_diffusion_models_CO1-4)'
- en: Batch the dataset into groups of 64 images.
+ id: totrans-30
prefs: []
type: TYPE_NORMAL
+ zh: 将数据集分成64张图像一组。
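The callouts above describe the full input pipeline. A minimal, hedged sketch of how such a pipeline could look in Keras follows (the dataset path and constant names are assumptions; the book's actual code is the [PRE1] block of Example 8-2):

```python
import tensorflow as tf

IMAGE_SIZE = 64
BATCH_SIZE = 64
DATASET_REPETITIONS = 5

# Load the flower images without labels, resized to 64 x 64 pixels.
train_data = tf.keras.utils.image_dataset_from_directory(
    "/data/flower-dataset",  # assumed location of the downloaded images
    labels=None,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=None,
    shuffle=True,
    seed=42,
)

# Scale pixel values to [0, 1], repeat to lengthen each epoch, then batch.
train = (
    train_data.map(lambda img: tf.cast(img, tf.float32) / 255.0)
    .repeat(DATASET_REPETITIONS)
    .batch(BATCH_SIZE, drop_remainder=True)
)
```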
- en: Example images from the dataset are shown in [Figure 8-2](Images/#flower_example_images).
+ id: totrans-31
prefs: []
type: TYPE_NORMAL
+ zh: 数据集中的示例图像显示在[图8-2](Images/#flower_example_images)中。
- en: '![](Images/gdl2_0802.png)'
+ id: totrans-32
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0802.png)'
- en: Figure 8-2\. Example images from the Oxford 102 Flower dataset
+ id: totrans-33
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-2。牛津102花卉数据集中的示例图像
- en: Now that we have our dataset we can explore how we should add noise to the images,
using a forward diffusion process.` `## The Forward Diffusion Process
+ id: totrans-34
prefs: []
type: TYPE_NORMAL
+ zh: 现在我们有了数据集,我们可以探讨如何向图像添加噪声,使用前向扩散过程。` `##前向扩散过程
- en: Suppose we have an image 𝐱_0 that we want to corrupt gradually over a large number of steps (say, T = 1,000), so that eventually it is indistinguishable from standard Gaussian noise (i.e., 𝐱_T should have zero mean and unit variance). How should we go about doing this?
+ id: totrans-35
prefs: []
type: TYPE_NORMAL
+ zh: 假设我们有一幅图像 𝐱_0,我们希望在大量步骤(比如 T = 1,000)中逐渐对其加噪破坏,以至于最终它与标准高斯噪声无法区分(即 𝐱_T 应具有零均值和单位方差)。我们应该如何做到这一点呢?
- en: We can define a function q that adds a small amount of Gaussian noise with variance β_t to an image 𝐱_{t-1}, to generate a new image 𝐱_t
+ id: totrans-48
prefs: []
type: TYPE_NORMAL
+ zh: 𝐱_t = \sqrt{\alpha_t}\,𝐱_{t-1} + \sqrt{1-\alpha_t}\,\epsilon_{t-1} = \sqrt{\alpha_t \alpha_{t-1}}\,𝐱_{t-2} + \sqrt{1-\alpha_t \alpha_{t-1}}\,\epsilon = \cdots = \sqrt{\bar{\alpha}_t}\,𝐱_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon
- en: Note that the second line uses the fact that we can add two Gaussians to obtain
a new Gaussian. We therefore have a way to jump from the original image 𝐱 0 to any step of the
@@ -323,12 +510,25 @@
t">1 - α ¯
t is the variance due to the noise ( ϵ
).
+ id: totrans-49
prefs: []
type: TYPE_NORMAL
+ zh: 请注意,第二行利用了两个高斯分布相加可得到一个新高斯分布这一事实。因此,我们有办法从原始图像 𝐱_0 直接跳到前向扩散过程的任意一步 𝐱_t。此外,我们可以用 \bar{\alpha}_t 值来定义扩散进度表,而不是原始的 β_t 值,并将 \bar{\alpha}_t 解释为由信号(原始图像 𝐱_0)引起的方差,1-\bar{\alpha}_t 解释为由噪声(\epsilon)引起的方差。
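To make the reparameterization concrete, here is a small NumPy sketch (the schedule values are illustrative assumptions) that jumps straight from 𝐱_0 to a noised 𝐱_t in a single step:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # an illustrative beta schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # alpha_bar_t = alpha_1 * ... * alpha_t

def noise_image(x0, t):
    """Jump straight to step t: x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = rng.random((64, 64, 3))         # a stand-in "image" with values in [0, 1]
x_t = noise_image(x0, t=500)         # heavily noised version of x0, in one step
```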
- en: 'The forward diffusion process q can therefore
also be written as follows:'
+ id: totrans-50
prefs: []
type: TYPE_NORMAL
+ zh: 前向扩散过程q也可以写成如下形式:
- en: q ( 𝐱_t | 𝐱_0 ) = \mathcal{N} ( 𝐱_t ; \sqrt{\bar{\alpha}_t}\,𝐱_0 , ( 1 - \bar{\alpha}_t ) \mathbf{I} )
+ id: totrans-51
prefs: []
type: TYPE_NORMAL
+ zh: q ( 𝐱_t | 𝐱_0 ) = \mathcal{N} ( 𝐱_t ; \sqrt{\bar{\alpha}_t}\,𝐱_0 , ( 1 - \bar{\alpha}_t ) \mathbf{I} )
- en: Diffusion Schedules
+ id: totrans-52
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: 扩散进度表
- en: Notice that we are also free to choose a different β_t at each timestep—they don't all have to be the same. How the β_t (or \bar{\alpha}_t) values change with t is called the *diffusion* *schedule*.
+ id: totrans-53
prefs: []
type: TYPE_NORMAL
+ zh: 请注意,我们也可以在每个时间步选择不同的 β_t——它们不必全部相同。β_t(或 \bar{\alpha}_t)值随 t 变化的方式称为*扩散进度表*。
- en: In the original paper (Ho et al., 2020), the authors chose a *linear diffusion
schedule* for β t
—that is, β t
@@ -363,54 +582,91 @@
T = 0.02\. This ensures that in the early
stages of the noising process we take smaller noising steps than in the later
stages, when the image is already very noisy.
+ id: totrans-54
prefs: []
type: TYPE_NORMAL
+ zh: 在原始论文中(Ho等人,2020年),作者为 β_t 选择了*线性扩散进度表*——即 β_t 随 t 线性增加,从 β_1 = 0.0001 到 β_T = 0.02。这确保了在加噪过程的早期阶段,我们采取的加噪步长比后期(此时图像已经非常嘈杂)更小。
- en: We can code up a linear diffusion schedule as shown in [Example 8-3](#linear_diffusion_schedule).
+ id: totrans-55
prefs: []
type: TYPE_NORMAL
+ zh: 我们可以编写一个线性扩散进度表,如[示例8-3](#linear_diffusion_schedule)所示。
- en: Example 8-3\. The linear diffusion schedule
+ id: totrans-56
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-3。线性扩散进度表
- en: '[PRE2]'
+ id: totrans-57
prefs: []
type: TYPE_PRE
+ zh: '[PRE2]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO2-1)'
+ id: totrans-58
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO2-1)'
- en: The diffusion times are equally spaced steps between 0 and 1.
+ id: totrans-59
prefs: []
type: TYPE_NORMAL
+ zh: 扩散时间是0到1之间等间隔的步骤。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO2-2)'
+ id: totrans-60
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO2-2)'
- en: The linear diffusion schedule is applied to the diffusion times to produce the
noise and signal rates.
+ id: totrans-61
prefs: []
type: TYPE_NORMAL
+ zh: 线性扩散进度表应用于扩散时间以产生噪声和信号速率。
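A hedged sketch of how the linear schedule of Example 8-3 could be written, mapping equally spaced diffusion times in [0, 1] to noise and signal rates (the helper name and rate bounds follow the description above; treat it as illustrative rather than the book's exact code):

```python
import tensorflow as tf

def linear_diffusion_schedule(diffusion_times):
    min_rate = 0.0001   # beta_1
    max_rate = 0.02     # beta_T
    betas = min_rate + diffusion_times * (max_rate - min_rate)
    alphas = 1.0 - betas
    alpha_bars = tf.math.cumprod(alphas)
    signal_rates = tf.sqrt(alpha_bars)       # sqrt(alpha_bar_t), multiplies x_0
    noise_rates = tf.sqrt(1.0 - alpha_bars)  # sqrt(1 - alpha_bar_t), multiplies eps
    return noise_rates, signal_rates

T = 1000
diffusion_times = tf.linspace(0.0, 1.0, T)   # equally spaced steps between 0 and 1
noise_rates, signal_rates = linear_diffusion_schedule(diffusion_times)
```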
- en: 'In a later paper it was found that a *cosine diffusion schedule* outperformed
the linear schedule from the original paper.^([5](ch08.xhtml#idm45387010764208))
A cosine schedule defines the following values of α ¯ t
:'
+ id: totrans-62
prefs: []
type: TYPE_NORMAL
+ zh: 在后续的一篇论文中发现,*余弦扩散进度表*优于原始论文中的线性进度表。^([5](ch08.xhtml#idm45387010764208)) 余弦进度表将 \bar{\alpha}_t 定义为以下值:
- en: \bar{\alpha}_t = \cos^2 \left( \frac{t}{T} \cdot \frac{\pi}{2} \right)
+ id: totrans-63
prefs: []
type: TYPE_NORMAL
+ zh: \bar{\alpha}_t = \cos^2 \left( \frac{t}{T} \cdot \frac{\pi}{2} \right)
- en: 'The updated equation is therefore as follows (using the trigonometric identity \cos^2(x) + \sin^2(x) = 1):'
+ id: totrans-64
prefs: []
type: TYPE_NORMAL
+ zh: 因此,更新后的方程如下(使用三角恒等式 \cos^2(x) + \sin^2(x) = 1):
- en: 𝐱_t = \cos \left( \frac{t}{T} \cdot \frac{\pi}{2} \right) 𝐱_0 + \sin \left( \frac{t}{T} \cdot \frac{\pi}{2} \right) \epsilon
+ id: totrans-65
prefs: []
type: TYPE_NORMAL
+ zh: 𝐱_t = \cos \left( \frac{t}{T} \cdot \frac{\pi}{2} \right) 𝐱_0 + \sin \left( \frac{t}{T} \cdot \frac{\pi}{2} \right) \epsilon
- en: This equation is a simplified version of the actual cosine diffusion schedule
used in the paper. The authors also add an offset term and scaling to prevent
the noising steps from being too small at the beginning of the diffusion process.
We can code up the cosine and offset cosine diffusion schedules as shown in [Example 8-4](#cosine_diffusion_schedule).
+ id: totrans-66
prefs: []
type: TYPE_NORMAL
+ zh: 这个方程是论文中使用的实际余弦扩散时间表的简化版本。作者还添加了一个偏移项和缩放,以防止扩散过程开始时噪声步骤太小。我们可以编写余弦和偏移余弦扩散时间表,如[示例8-4](#cosine_diffusion_schedule)所示。
- en: Example 8-4\. The cosine and offset cosine diffusion schedules
+ id: totrans-67
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-4\. 余弦和偏移余弦扩散时间表
- en: '[PRE3]'
+ id: totrans-68
prefs: []
type: TYPE_PRE
+ zh: '[PRE3]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO3-1)'
+ id: totrans-69
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO3-1)'
- en: The pure cosine diffusion schedule (without offset or rescaling).
+ id: totrans-70
prefs: []
type: TYPE_NORMAL
+ zh: 纯余弦扩散时间表(不包括偏移或重新缩放)。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO3-2)'
+ id: totrans-71
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO3-2)'
- en: The offset cosine diffusion schedule that we will be using, which adjusts the
schedule to ensure the noising steps are not too small at the start of the noising
process.
+ id: totrans-72
prefs: []
type: TYPE_NORMAL
+ zh: 我们将使用的偏移余弦扩散时间表会调整时间表,以确保在扩散过程开始时噪声步骤不会太小。
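A hedged sketch along the lines of Example 8-4; the pure cosine schedule follows directly from the equation above, while the min/max signal rates in the offset version are assumed values chosen so that the first noising steps are not too small:

```python
import math
import tensorflow as tf

def cosine_diffusion_schedule(diffusion_times):
    # Pure cosine schedule: signal = cos(t/T * pi/2), noise = sin(t/T * pi/2).
    signal_rates = tf.cos(diffusion_times * math.pi / 2)
    noise_rates = tf.sin(diffusion_times * math.pi / 2)
    return noise_rates, signal_rates

def offset_cosine_diffusion_schedule(diffusion_times):
    min_signal_rate = 0.02   # assumed offset/rescaling values
    max_signal_rate = 0.95
    start_angle = tf.acos(max_signal_rate)
    end_angle = tf.acos(min_signal_rate)
    diffusion_angles = start_angle + diffusion_times * (end_angle - start_angle)
    signal_rates = tf.cos(diffusion_angles)
    noise_rates = tf.sin(diffusion_angles)
    return noise_rates, signal_rates
```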
- en: We can compute the \bar{\alpha}_t values for each t to show how much signal ( \bar{\alpha}_t ) and noise ( 1 - \bar{\alpha}_t ) is let through at each stage of the process for the linear, cosine, and offset cosine diffusion schedules, as shown in [Figure 8-4](#signal_and_noise_linear).
+ id: totrans-73
prefs: []
type: TYPE_NORMAL
+ zh: 我们可以计算每个 t 的 \bar{\alpha}_t 值,以显示在线性、余弦和偏移余弦扩散时间表的每个阶段中有多少信号( \bar{\alpha}_t )和噪声( 1 - \bar{\alpha}_t )通过,如[图8-4](#signal_and_noise_linear)所示。
- en: '![](Images/gdl2_0804.png)'
+ id: totrans-74
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0804.png)'
- en: Figure 8-4\. The signal and noise at each step of the noising process, for the
linear, cosine, and offset cosine diffusion schedules
+ id: totrans-75
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-4\. 在扩散过程的每个步骤中的信号和噪声,对于线性、余弦和偏移余弦扩散时间表
- en: Notice how the noise level ramps up more slowly in the cosine diffusion schedule.
A cosine diffusion schedule adds noise to the image more gradually than a linear
diffusion schedule, which improves training efficiency and generation quality.
This can also be seen in images that have been corrupted by the linear and cosine
schedules ([Figure 8-5](#diff_schedule_examples)).
+ id: totrans-76
prefs: []
type: TYPE_NORMAL
+ zh: 请注意,余弦扩散时间表中的噪声水平上升得更慢。余弦扩散时间表比线性扩散时间表更渐进地向图像添加噪声,从而提高了训练效率和生成质量。这一点也可以从被线性和余弦时间表加噪破坏的图像中看出([图8-5](#diff_schedule_examples))。
- en: '![](Images/gdl2_0805.png)'
+ id: totrans-77
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0805.png)'
- en: 'Figure 8-5\. An image being corrupted by the linear (top) and cosine (bottom)
diffusion schedules, at equally spaced values of t from 0 to T (source: [Ho et
al., 2020](https://arxiv.org/abs/2006.11239))'
+ id: totrans-78
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-5\. 一个图像被线性(顶部)和余弦(底部)扩散时间表破坏,从0到T的等间距值(来源:[Ho等人,2020](https://arxiv.org/abs/2006.11239))
- en: The Reverse Diffusion Process
+ id: totrans-79
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: 反向扩散过程
- en: Now let’s look at the reverse diffusion process. To recap, we are looking to
build a neural network p
@@ -501,27 +800,47 @@
upper I right-parenthesis">𝒩 ( 0 , 𝐈
) and then apply the reverse diffusion process multiple
times in order to generate a novel image. This is visualized in [Figure 8-6](#reverse_diff).
+ id: totrans-80
prefs: []
type: TYPE_NORMAL
+ zh: 现在让我们看一下反向扩散过程。简而言之,我们要构建一个神经网络 p_\theta ( 𝐱_{t-1} | 𝐱_t ),它可以*撤销*扩散过程,即近似反向分布 q ( 𝐱_{t-1} | 𝐱_t )。如果我们能做到这一点,就可以从 \mathcal{N} ( 0 , \mathbf{I} ) 中采样随机噪声,然后多次应用反向扩散过程以生成新颖的图像。这在[图8-6](#reverse_diff)中可视化。
- en: '![](Images/gdl2_0806.png)'
+ id: totrans-81
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0806.png)'
- en: Figure 8-6\. The reverse diffusion process p_\theta ( 𝐱_{t-1} | 𝐱_t ) tries to *undo* the noise produced by the forward diffusion process
+ id: totrans-82
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-6。反向扩散过程 p_\theta ( 𝐱_{t-1} | 𝐱_t ) 试图*撤销*由前向扩散过程产生的噪声
- en: There are many similarities between the reverse diffusion process and the decoder
of a variational autoencoder. In both, we aim to transform random noise into meaningful
output using a neural network. The difference between diffusion models and VAEs
is that in a VAE the forward process (converting images to noise) is part of the
model (i.e., it is learned), whereas in a diffusion model it is unparameterized.
+ id: totrans-83
prefs: []
type: TYPE_NORMAL
+ zh: 反向扩散过程和变分自动编码器的解码器之间存在许多相似之处。 在两者中,我们的目标都是使用神经网络将随机噪声转换为有意义的输出。 扩散模型和VAE之间的区别在于,在VAE中,正向过程(将图像转换为噪声)是模型的一部分(即,它是学习的),而在扩散模型中,它是非参数化的。
- en: Therefore, it makes sense to apply the same loss function as in a variational autoencoder. The original DDPM paper derives the exact form of this loss function and shows that it can be optimized by training a network \epsilon_\theta to predict the noise \epsilon that has been added to a given image 𝐱_0 at timestep t.
+ id: totrans-84
prefs: []
type: TYPE_NORMAL
+ zh: 因此,将与变分自动编码器中相同的损失函数应用是有意义的。 原始的DDPM论文推导出了这个损失函数的确切形式,并表明可以通过训练一个网络ϵ θ来预测已添加到给定图像𝐱 0的噪声ϵ在时间步t。
- en: In other words, we sample an image 𝐱_0 and transform it by t noising steps to get the image 𝐱_t = \sqrt{\bar{\alpha}_t}\,𝐱_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon. We provide this new image and the noising rate \bar{\alpha}_t to the neural network and ask it to predict \epsilon, taking a gradient step against the squared error between the prediction \epsilon_\theta ( 𝐱_t ) and the true \epsilon.
+ id: totrans-85
prefs: []
type: TYPE_NORMAL
+ zh: 换句话说,我们对图像 𝐱_0 进行采样,并通过 t 个加噪步骤将其转换为图像 𝐱_t = \sqrt{\bar{\alpha}_t}\,𝐱_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon。我们将这个新图像和噪声率 \bar{\alpha}_t 提供给神经网络,并要求它预测 \epsilon,针对预测值 \epsilon_\theta ( 𝐱_t ) 与真实 \epsilon 之间的平方误差采取梯度步骤。
- en: 'We’ll take a look at the structure of the neural network in the next section.
    It is worth noting here that the diffusion model actually maintains two copies
    of the network: one that is actively trained using gradient descent and another
@@ -557,109 +895,170 @@
not as susceptible to short-term fluctuations and spikes in the training process,
making it more robust for generation than the actively trained network. We therefore
use the EMA network whenever we want to produce generated output from the network.'
+ id: totrans-86
prefs: []
type: TYPE_NORMAL
+ zh: 我们将在下一节中查看神经网络的结构。 值得注意的是,扩散模型实际上维护了两个网络副本:一个是通过梯度下降主动训练的网络,另一个是权重的指数移动平均(EMA)网络,该网络是在先前的训练步骤中对主动训练网络的权重进行指数移动平均。
+ EMA网络不太容易受到训练过程中的短期波动和峰值的影响,因此在生成方面比主动训练网络更稳健。 因此,每当我们想要从网络生成输出时,我们都会使用EMA网络。
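The EMA update itself is just a running weighted average of the weights taken after each gradient step; a minimal sketch (the 0.999 decay rate is an assumption):

```python
EMA = 0.999  # assumed decay rate

def update_ema_weights(network, ema_network):
    # Blend each EMA weight a small step toward the actively trained weight.
    for weight, ema_weight in zip(network.weights, ema_network.weights):
        ema_weight.assign(EMA * ema_weight + (1 - EMA) * weight)
```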
- en: The training process for the model is shown in [Figure 8-7](#diff_training_process).
+ id: totrans-87
prefs: []
type: TYPE_NORMAL
+ zh: 模型的训练过程如[图8-7](#diff_training_process)所示。
- en: '![](Images/gdl2_0807.png)'
+ id: totrans-88
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0807.png)'
- en: 'Figure 8-7\. The training process for a denoising diffusion model (source:
[Ho et al., 2020](https://arxiv.org/abs/2006.11239))'
+ id: totrans-89
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-7。去噪扩散模型的训练过程(来源:[Ho等人,2020](https://arxiv.org/abs/2006.11239))
- en: In Keras, we can code up this training step as illustrated in [Example 8-5](#diffusion_train_step).
+ id: totrans-90
prefs: []
type: TYPE_NORMAL
+ zh: 在Keras中,我们可以将这个训练步骤编码为[示例8-5](#diffusion_train_step)所示。
- en: Example 8-5\. The `train_step` function of the Keras diffusion model
+ id: totrans-91
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-5。Keras扩散模型的`train_step`函数
- en: '[PRE4]'
+ id: totrans-92
prefs: []
type: TYPE_PRE
+ zh: '[PRE4]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO4-1)'
+ id: totrans-93
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO4-1)'
- en: We first normalize the batch of images to have zero mean and unit variance.
+ id: totrans-94
prefs: []
type: TYPE_NORMAL
+ zh: 我们首先将图像批次归一化为零均值和单位方差。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO4-2)'
+ id: totrans-95
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO4-2)'
- en: Next, we sample noise to match the shape of the input images.
+ id: totrans-96
prefs: []
type: TYPE_NORMAL
+ zh: 接下来,我们对形状与输入图像匹配的噪声进行采样。
- en: '[![3](Images/3.png)](#co_diffusion_models_CO4-3)'
+ id: totrans-97
prefs: []
type: TYPE_NORMAL
+ zh: '[![3](Images/3.png)](#co_diffusion_models_CO4-3)'
- en: We also sample random diffusion times…
+ id: totrans-98
prefs: []
type: TYPE_NORMAL
+ zh: 我们还对随机扩散时间进行采样…
- en: '[![4](Images/4.png)](#co_diffusion_models_CO4-4)'
+ id: totrans-99
prefs: []
type: TYPE_NORMAL
+ zh: '[![4](Images/4.png)](#co_diffusion_models_CO4-4)'
- en: …and use these to generate the noise and signal rates according to the cosine
diffusion schedule.
+ id: totrans-100
prefs: []
type: TYPE_NORMAL
+ zh: …并使用这些根据余弦扩散计划生成噪声和信号速率。
- en: '[![5](Images/5.png)](#co_diffusion_models_CO4-5)'
+ id: totrans-101
prefs: []
type: TYPE_NORMAL
+ zh: '[![5](Images/5.png)](#co_diffusion_models_CO4-5)'
- en: Then we apply the signal and noise weightings to the input images to generate
the noisy images.
+ id: totrans-102
prefs: []
type: TYPE_NORMAL
+ zh: 然后我们将信号和噪声权重应用于输入图像以生成嘈杂的图像。
- en: '[![6](Images/6.png)](#co_diffusion_models_CO4-6)'
+ id: totrans-103
prefs: []
type: TYPE_NORMAL
+ zh: '[![6](Images/6.png)](#co_diffusion_models_CO4-6)'
- en: Next, we denoise the noisy images by asking the network to predict the noise
and then undoing the noising operation, using the provided `noise_rates` and `signal_rates`.
+ id: totrans-104
prefs: []
type: TYPE_NORMAL
+ zh: 接下来,我们通过要求网络预测噪声然后撤消添加噪声的操作,使用提供的`noise_rates`和`signal_rates`来去噪嘈杂的图像。
- en: '[![7](Images/7.png)](#co_diffusion_models_CO4-7)'
+ id: totrans-105
prefs: []
type: TYPE_NORMAL
+ zh: '[![7](Images/7.png)](#co_diffusion_models_CO4-7)'
- en: We can then calculate the loss (mean absolute error) between the predicted noise
and the true noise…
+ id: totrans-106
prefs: []
type: TYPE_NORMAL
+ zh: 然后我们可以计算预测噪声和真实噪声之间的损失(平均绝对误差)…
- en: '[![8](Images/8.png)](#co_diffusion_models_CO4-8)'
+ id: totrans-107
prefs: []
type: TYPE_NORMAL
+ zh: '[![8](Images/8.png)](#co_diffusion_models_CO4-8)'
- en: …and take a gradient step against this loss function.
+ id: totrans-108
prefs: []
type: TYPE_NORMAL
+ zh: …并根据这个损失函数采取梯度步骤。
- en: '[![9](Images/9.png)](#co_diffusion_models_CO4-9)'
+ id: totrans-109
prefs: []
type: TYPE_NORMAL
+ zh: '[![9](Images/9.png)](#co_diffusion_models_CO4-9)'
- en: The EMA network weights are updated to a weighted average of the existing EMA
weights and the trained network weights after the gradient step.
+ id: totrans-110
prefs: []
type: TYPE_NORMAL
+ zh: EMA网络权重更新为现有EMA权重和训练后的网络权重在梯度步骤后的加权平均值。
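Putting the callouts together, here is a hedged sketch of what a `train_step` along the lines of Example 8-5 could look like (the `normalizer` layer, the `cosine_diffusion_schedule` helper, the `BATCH_SIZE` constant, and the attribute names are assumptions; the book's actual implementation is the [PRE4] block above):

```python
import tensorflow as tf

BATCH_SIZE = 64  # assumed to match the data pipeline

class DiffusionModel(tf.keras.Model):
    # Assumes __init__ builds self.normalizer, self.network, and self.ema_network.

    def denoise(self, noisy_images, noise_rates, signal_rates, training):
        # Use the EMA copy for generation, the trained copy during training.
        network = self.network if training else self.ema_network
        pred_noises = network([noisy_images, noise_rates**2], training=training)
        pred_images = (noisy_images - noise_rates * pred_noises) / signal_rates
        return pred_noises, pred_images

    def train_step(self, images):
        images = self.normalizer(images, training=True)      # 1. zero mean, unit variance
        noises = tf.random.normal(shape=tf.shape(images))    # 2. sample noise
        diffusion_times = tf.random.uniform(                  # 3. random diffusion times
            shape=(BATCH_SIZE, 1, 1, 1), minval=0.0, maxval=1.0
        )
        noise_rates, signal_rates = cosine_diffusion_schedule(diffusion_times)  # 4.
        noisy_images = signal_rates * images + noise_rates * noises             # 5.

        with tf.GradientTape() as tape:
            pred_noises, _ = self.denoise(                     # 6. predict the noise
                noisy_images, noise_rates, signal_rates, training=True
            )
            noise_loss = self.loss(noises, pred_noises)        # 7. MAE loss

        grads = tape.gradient(noise_loss, self.network.trainable_weights)
        self.optimizer.apply_gradients(                        # 8. gradient step
            zip(grads, self.network.trainable_weights)
        )
        for w, ema_w in zip(self.network.weights, self.ema_network.weights):
            ema_w.assign(0.999 * ema_w + 0.001 * w)            # 9. EMA update
        return {"noise_loss": noise_loss}
```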
- en: The U-Net Denoising Model
+ id: totrans-111
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: U-Net去噪模型
- en: Now that we have seen the kind of neural network that we need to build (one
that predicts the noise added to a given image), we can look at the architecture
that makes this possible.
+ id: totrans-112
prefs: []
type: TYPE_NORMAL
+ zh: 现在我们已经看到了我们需要构建的神经网络的类型(一个预测添加到给定图像的噪声的网络),我们可以看一下使这种可能的架构。
- en: The authors of the DDPM paper used a type of architecture known as a *U-Net*.
A diagram of this network is shown in [Figure 8-8](#unet_diffusion), explicitly
showing the shape of the tensor as it passes through the network.
+ id: totrans-113
prefs: []
type: TYPE_NORMAL
+ zh: DDPM论文的作者使用了一种称为*U-Net*的架构类型。这个网络的图表显示在[图8-8](#unet_diffusion)中,明确显示了张量在通过网络时的形状。
- en: '![](Images/gdl2_0808.png)'
+ id: totrans-114
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0808.png)'
- en: Figure 8-8\. U-Net architecture diagram
+ id: totrans-115
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-8\. U-Net架构图
- en: 'In a similar manner to a variational autoencoder, a U-Net consists of two halves:
the downsampling half, where input images are compressed spatially but expanded
channel-wise, and the upsampling half, where representations are expanded spatially
@@ -669,129 +1068,198 @@
the network from input to output, one layer after another. A U-Net is different,
because the skip connections allow information to shortcut parts of the network
and flow through to later layers.'
+ id: totrans-116
prefs: []
type: TYPE_NORMAL
+ zh: 类似于变分自动编码器,U-Net由两部分组成:下采样部分,其中输入图像在空间上被压缩但在通道上被扩展,以及上采样部分,其中表示在空间上被扩展,而通道数量减少。然而,与VAE不同的是,在网络的上采样和下采样部分之间还有*跳跃连接*。VAE是顺序的;数据从输入到输出依次通过网络的每一层。U-Net不同,因为跳跃连接允许信息绕过网络的部分并流向后续层。
- en: A U-Net is particularly useful when we want the output to have the same shape
as the input. In our diffusion model example, we want to predict the noise added
to an image, which has exactly the same shape as the image itself, so a U-Net
is the natural choice for the network architecture.
+ id: totrans-117
prefs: []
type: TYPE_NORMAL
+ zh: 当我们希望输出具有与输入相同的形状时,U-Net特别有用。在我们的扩散模型示例中,我们希望预测添加到图像中的噪声,这个噪声与图像本身的形状完全相同,因此U-Net是网络架构的自然选择。
- en: First let’s take a look at the code that builds this U-Net in Keras, shown in
[Example 8-6](#unet_keras).
+ id: totrans-118
prefs: []
type: TYPE_NORMAL
+ zh: 首先让我们看一下在Keras中构建这个U-Net的代码,显示在[示例8-6](#unet_keras)中。
- en: Example 8-6\. A U-Net model in Keras
+ id: totrans-119
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-6\. Keras中的U-Net模型
- en: '[PRE5]'
+ id: totrans-120
prefs: []
type: TYPE_PRE
+ zh: '[PRE5]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO5-1)'
+ id: totrans-121
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO5-1)'
- en: The first input to the U-Net is the image that we wish to denoise.
+ id: totrans-122
prefs: []
type: TYPE_NORMAL
+ zh: U-Net的第一个输入是我们希望去噪的图像。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO5-2)'
+ id: totrans-123
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO5-2)'
- en: This image is passed through a `Conv2D` layer to increase the number of channels.
+ id: totrans-124
prefs: []
type: TYPE_NORMAL
+ zh: 这个图像通过一个`Conv2D`层传递,以增加通道数量。
- en: '[![3](Images/3.png)](#co_diffusion_models_CO5-3)'
+ id: totrans-125
prefs: []
type: TYPE_NORMAL
+ zh: '[![3](Images/3.png)](#co_diffusion_models_CO5-3)'
- en: The second input to the U-Net is the noise variance (a scalar).
+ id: totrans-126
prefs: []
type: TYPE_NORMAL
+ zh: U-Net的第二个输入是噪声方差(一个标量)。
- en: '[![4](Images/4.png)](#co_diffusion_models_CO5-4)'
+ id: totrans-127
prefs: []
type: TYPE_NORMAL
+ zh: '[![4](Images/4.png)](#co_diffusion_models_CO5-4)'
- en: This is encoded using a sinusoidal embedding.
+ id: totrans-128
prefs: []
type: TYPE_NORMAL
+ zh: 这是使用正弦嵌入编码的。
- en: '[![5](Images/5.png)](#co_diffusion_models_CO5-5)'
+ id: totrans-129
prefs: []
type: TYPE_NORMAL
+ zh: '[![5](Images/5.png)](#co_diffusion_models_CO5-5)'
- en: This embedding is copied across spatial dimensions to match the size of the
input image.
+ id: totrans-130
prefs: []
type: TYPE_NORMAL
+ zh: 这个嵌入被复制到空间维度以匹配输入图像的大小。
- en: '[![6](Images/6.png)](#co_diffusion_models_CO5-6)'
+ id: totrans-131
prefs: []
type: TYPE_NORMAL
+ zh: '[![6](Images/6.png)](#co_diffusion_models_CO5-6)'
- en: The two input streams are concatenated across channels.
+ id: totrans-132
prefs: []
type: TYPE_NORMAL
+ zh: 两个输入流在通道上连接。
- en: '[![7](Images/7.png)](#co_diffusion_models_CO5-7)'
+ id: totrans-133
prefs: []
type: TYPE_NORMAL
+ zh: '[![7](Images/7.png)](#co_diffusion_models_CO5-7)'
- en: The `skips` list will hold the output from the `DownBlock` layers that we wish
to connect to `UpBlock` layers downstream.
+ id: totrans-134
prefs: []
type: TYPE_NORMAL
+ zh: '`skips`列表将保存我们希望连接到下游`UpBlock`层的`DownBlock`层的输出。'
- en: '[![8](Images/8.png)](#co_diffusion_models_CO5-8)'
+ id: totrans-135
prefs: []
type: TYPE_NORMAL
+ zh: '[![8](Images/8.png)](#co_diffusion_models_CO5-8)'
- en: The tensor is passed through a series of `DownBlock` layers that reduce the
size of the image, while increasing the number of channels.
+ id: totrans-136
prefs: []
type: TYPE_NORMAL
+ zh: 张量通过一系列`DownBlock`层传递,这些层减小了图像的大小,同时增加了通道的数量。
- en: '[![9](Images/9.png)](#co_diffusion_models_CO5-9)'
+ id: totrans-137
prefs: []
type: TYPE_NORMAL
+ zh: '[![9](Images/9.png)](#co_diffusion_models_CO5-9)'
- en: The tensor is then passed through two `ResidualBlock` layers that hold the image
size and number of channels constant.
+ id: totrans-138
prefs: []
type: TYPE_NORMAL
+ zh: 然后,张量通过两个`ResidualBlock`层传递,这些层保持图像大小和通道数量恒定。
- en: '[![10](Images/10.png)](#co_diffusion_models_CO5-10)'
+ id: totrans-139
prefs: []
type: TYPE_NORMAL
+ zh: '[![10](Images/10.png)](#co_diffusion_models_CO5-10)'
- en: Next, the tensor is passed through a series of `UpBlock` layers that increase
the size of the image, while decreasing the number of channels. The skip connections
incorporate output from the earlier `DownBlock` layers.
+ id: totrans-140
prefs: []
type: TYPE_NORMAL
+ zh: 接下来,张量通过一系列`UpBlock`层传递,这些层增加图像的大小,同时减少通道数。跳跃连接将输出与较早的`DownBlock`层的输出合并。
- en: '[![11](Images/11.png)](#co_diffusion_models_CO5-11)'
+ id: totrans-141
prefs: []
type: TYPE_NORMAL
+ zh: '[![11](Images/11.png)](#co_diffusion_models_CO5-11)'
- en: The final `Conv2D` layer reduces the number of channels to three (RGB).
+ id: totrans-142
prefs: []
type: TYPE_NORMAL
+ zh: 最终的`Conv2D`层将通道数减少到三(RGB)。
- en: '[![12](Images/12.png)](#co_diffusion_models_CO5-12)'
+ id: totrans-143
prefs: []
type: TYPE_NORMAL
+ zh: '[![12](Images/12.png)](#co_diffusion_models_CO5-12)'
- en: The U-Net is a Keras `Model` that takes the noisy images and noise variances
as input and outputs a predicted noise map.
+ id: totrans-144
prefs: []
type: TYPE_NORMAL
+ zh: U-Net是一个Keras `Model`,它以嘈杂的图像和噪声方差作为输入,并输出预测的噪声图。
- en: 'To understand the U-Net in detail, we need to explore four more concepts: the
sinusoidal embedding of the noise variance, the `ResidualBlock`, the `DownBlock`,
and the `UpBlock`.'
+ id: totrans-145
prefs: []
type: TYPE_NORMAL
+ zh: 要详细了解U-Net,我们需要探索四个概念:噪声方差的正弦嵌入、`ResidualBlock`、`DownBlock`和`UpBlock`。
- en: Sinusoidal embedding
+ id: totrans-146
prefs:
- PREF_H3
type: TYPE_NORMAL
+ zh: 正弦嵌入
- en: '*Sinusoidal embedding* was first introduced in a paper by Vaswani et al.^([6](ch08.xhtml#idm45387008220416))
We will be using an adaptation of that original idea as utilized in Mildenhall
et al.’s paper titled “NeRF: Representing Scenes as Neural Radiance Fields for
View Synthesis.”^([7](ch08.xhtml#idm45387008216736))'
+ id: totrans-147
prefs: []
type: TYPE_NORMAL
+ zh: '*正弦嵌入*最初是由Vaswani等人在一篇论文中引入的。我们将使用Mildenhall等人在题为“NeRF: Representing Scenes
+ as Neural Radiance Fields for View Synthesis”的论文中使用的这个原始想法的改编。'
- en: The idea is that we want to be able to convert a scalar value (the noise variance)
into a distinct higher-dimensional vector that is able to provide a more complex
representation, for use downstream in the network. The original paper used this
idea to encode the discrete position of words in a sentence into vectors; the
NeRF paper extends this idea to continuous values.
+ id: totrans-148
prefs: []
type: TYPE_NORMAL
+ zh: 我们希望能够将标量值(噪声方差)转换为一个不同的高维向量,能够提供更复杂的表示,以便在网络中下游使用。原始论文使用这个想法将句子中单词的离散位置编码为向量;NeRF论文将这个想法扩展到连续值。
- en: 'Specifically, a scalar value *x* is encoded as shown in the following equation:'
+ id: totrans-149
prefs: []
type: TYPE_NORMAL
+ zh: 具体来说,标量值*x*被编码如下方程所示:
- en: \gamma ( x ) = \left( \sin ( 2 \pi e^{0f} x ) , \cdots , \sin ( 2 \pi e^{(L-1)f} x ) , \cos ( 2 \pi e^{0f} x ) , \cdots , \cos ( 2 \pi e^{(L-1)f} x ) \right)
+ id: totrans-150
prefs: []
type: TYPE_NORMAL
+ zh: \gamma ( x ) = \left( \sin ( 2 \pi e^{0f} x ) , \cdots , \sin ( 2 \pi e^{(L-1)f} x ) , \cos ( 2 \pi e^{0f} x ) , \cdots , \cos ( 2 \pi e^{(L-1)f} x ) \right)
- en: where we choose L = 16 to be half the size of our desired noise embedding length and f = \frac{\ln(1000)}{L-1} to be the maximum scaling factor for the frequencies.
+ id: totrans-151
prefs: []
type: TYPE_NORMAL
+ zh: 其中我们选择 L = 16,即我们期望的噪声嵌入长度的一半,f = \frac{\ln(1000)}{L-1} 是频率的最大缩放因子。
- en: This produces the embedding pattern shown in [Figure 8-9](#sinusoidal_embedding_image).
+ id: totrans-152
prefs: []
type: TYPE_NORMAL
+ zh: 这产生了[图8-9](#sinusoidal_embedding_image)中显示的嵌入模式。
- en: '![](Images/gdl2_0809.png)'
+ id: totrans-153
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0809.png)'
- en: Figure 8-9\. The pattern of sinusoidal embeddings for noise variances from 0
to 1
+ id: totrans-154
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-9。噪声方差从0到1的正弦嵌入模式
- en: We can code this sinusoidal embedding function as shown in [Example 8-7](#sinusoidal_embedding_diffusion).
This converts a single noise variance scalar value into a vector of length 32.
+ id: totrans-155
prefs: []
type: TYPE_NORMAL
+ zh: 我们可以将这个正弦嵌入函数编码如[示例8-7](#sinusoidal_embedding_diffusion)所示。这将一个单一的噪声方差标量值转换为长度为32的向量。
- en: Example 8-7\. The `sinusoidal_embedding` function that encodes the noise variance
+ id: totrans-156
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-7。编码噪声方差的`sinusoidal_embedding`函数
- en: '[PRE6]'
+ id: totrans-157
prefs: []
type: TYPE_PRE
+ zh: '[PRE6]'
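A hedged sketch of how such a sinusoidal embedding could be computed (the book's actual implementation is the [PRE6] block above); it assumes the noise variance `x` arrives with shape (batch, 1, 1, 1) and uses L = 16 frequencies so the result has 32 channels, matching the description:

```python
import math
import tensorflow as tf

NOISE_EMBEDDING_SIZE = 32  # 2 * L, with L = 16

def sinusoidal_embedding(x):
    # Frequencies spaced geometrically between 1 and 1000, i.e. f = ln(1000)/(L-1).
    frequencies = tf.exp(
        tf.linspace(
            tf.math.log(1.0), tf.math.log(1000.0), NOISE_EMBEDDING_SIZE // 2
        )
    )
    angular_speeds = 2.0 * math.pi * frequencies
    # Concatenate sin and cos components along the channel axis.
    return tf.concat(
        [tf.sin(angular_speeds * x), tf.cos(angular_speeds * x)], axis=3
    )
```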
- en: ResidualBlock
+ id: totrans-158
prefs:
- PREF_H3
type: TYPE_NORMAL
+ zh: 残差块
- en: Both the `DownBlock` and the `UpBlock` contain `ResidualBlock` layers, so let’s
start with these. We already explored residual blocks in [Chapter 5](ch05.xhtml#chapter_autoregressive),
when we built a PixelCNN, but we will recap here for completeness.
+ id: totrans-159
prefs: []
type: TYPE_NORMAL
+ zh: '`DownBlock`和`UpBlock`都包含`ResidualBlock`层,所以让我们从这些层开始。我们在[第5章](ch05.xhtml#chapter_autoregressive)中构建PixelCNN时已经探讨过残差块,但为了完整起见,我们将在这里进行回顾。'
- en: A *residual block* is a group of layers that contains a skip connection that
adds the input to the output. Residual blocks help us to build deeper networks
that can learn more complex patterns without suffering as greatly from vanishing
@@ -859,77 +1365,100 @@
that as neural networks become deeper, they are not necessarily as accurate as
their shallower counterparts—accuracy seems to become saturated at a certain depth
and then degrade rapidly.
+ id: totrans-160
prefs: []
type: TYPE_NORMAL
+ zh: '*残差块*是一组包含跳跃连接的层,将输入添加到输出中。残差块帮助我们构建更深的网络,可以学习更复杂的模式,而不会受到梯度消失和退化问题的严重影响。梯度消失问题是指随着网络变得更深,通过更深层传播的梯度很小,因此学习速度非常慢。退化问题是指随着神经网络变得更深,它们不一定像较浅的对应网络那样准确——准确性似乎在一定深度上饱和,然后迅速退化。'
- en: Degradation
+ id: totrans-161
prefs:
- PREF_H1
type: TYPE_NORMAL
+ zh: 退化
- en: The degradation problem is somewhat counterintuitive, but observed in practice
as the deeper layers must at least learn the identity mapping, which is not trivial—especially
considering other problems deeper networks face, such as the vanishing gradient
problem.
+ id: totrans-162
prefs: []
type: TYPE_NORMAL
+ zh: 退化问题有点反直觉,但在实践中观察到,因为更深的层至少必须学习恒等映射,这并不是微不足道的——尤其考虑到更深的网络面临的其他问题,比如梯度消失问题。
- en: The solution, first introduced in the ResNet paper by He et al. in 2015,^([8](ch08.xhtml#idm45387008052288))
is very simple. By including a skip connection *highway* around the main weighted
layers, the block has the option to bypass the complex weight updates and simply
pass through the identity mapping. This allows the network to be trained to great
depth without sacrificing gradient size or network accuracy.
+ id: totrans-163
prefs: []
type: TYPE_NORMAL
- en: A diagram of a `ResidualBlock` is shown in [Figure 8-10](#diffusion_residual).
Note that in some residual blocks, we also include an extra `Conv2D` layer with
kernel size 1 on the skip connection, to bring the number of channels in line
with the rest of the block.
+ id: totrans-164
prefs: []
type: TYPE_NORMAL
- en: '![](Images/gdl2_0810.png)'
+ id: totrans-165
prefs: []
type: TYPE_IMG
- en: Figure 8-10\. The `ResidualBlock` in the U-Net
+ id: totrans-166
prefs:
- PREF_H6
type: TYPE_NORMAL
- en: We can code a `ResidualBlock` in Keras as shown in [Example 8-8](#diffusion_residual_code).
+ id: totrans-167
prefs: []
type: TYPE_NORMAL
- en: Example 8-8\. Code for the `ResidualBlock` in the U-Net
+ id: totrans-168
prefs:
- PREF_H5
type: TYPE_NORMAL
- en: '[PRE7]'
+ id: totrans-169
prefs: []
type: TYPE_PRE
+ zh: '[PRE7]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO6-1)'
+ id: totrans-170
prefs: []
type: TYPE_NORMAL
- en: Check if the number of channels in the input matches the number of channels
that we would like the block to output. If not, include an extra `Conv2D` layer
on the skip connection to bring the number of channels in line with the rest of
the block.
+ id: totrans-171
prefs: []
type: TYPE_NORMAL
- en: '[![2](Images/2.png)](#co_diffusion_models_CO6-2)'
+ id: totrans-172
prefs: []
type: TYPE_NORMAL
- en: Apply a `BatchNormalization` layer.
+ id: totrans-173
prefs: []
type: TYPE_NORMAL
- en: '[![3](Images/3.png)](#co_diffusion_models_CO6-3)'
+ id: totrans-174
prefs: []
type: TYPE_NORMAL
- en: Apply two `Conv2D` layers.
+ id: totrans-175
prefs: []
type: TYPE_NORMAL
- en: '[![4](Images/4.png)](#co_diffusion_models_CO6-4)'
+ id: totrans-176
prefs: []
type: TYPE_NORMAL
- en: Add the original block input to the output to provide the final output from
the block.
+ id: totrans-177
prefs: []
type: TYPE_NORMAL
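A hedged sketch of a `ResidualBlock` along the lines of the callouts above (the kernel sizes and activation are assumptions; `width` is the desired number of output channels):

```python
from tensorflow.keras import layers

def ResidualBlock(width):
    def apply(x):
        input_width = x.shape[3]
        if input_width == width:
            residual = x                                        # channels already match
        else:
            residual = layers.Conv2D(width, kernel_size=1)(x)   # 1x1 conv on the skip
        x = layers.BatchNormalization(center=False, scale=False)(x)
        x = layers.Conv2D(width, kernel_size=3, padding="same", activation="swish")(x)
        x = layers.Conv2D(width, kernel_size=3, padding="same")(x)
        x = layers.Add()([x, residual])                          # skip connection
        return x

    return apply
```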
- en: DownBlocks and UpBlocks
+ id: totrans-178
prefs:
- PREF_H3
type: TYPE_NORMAL
@@ -937,6 +1466,7 @@
(=2 in our example) `ResidualBlock`s, while also applying a final `AveragePooling2D`
layer in order to halve the size of the image. Each `ResidualBlock` is added to
a list for use later by the `UpBlock` layers as skip connections across the U-Net.
+ id: totrans-179
prefs: []
type: TYPE_NORMAL
- en: An `UpBlock` first applies an `UpSampling2D` layer that doubles the size of
@@ -944,128 +1474,173 @@
the number of channels via `block_depth` (=2) `ResidualBlock`s, while also concatenating
the outputs from the `DownBlock`s through skip connections across the U-Net. A
diagram of this process is shown in [Figure 8-11](#diffusion_down_up_block).
+ id: totrans-180
prefs: []
type: TYPE_NORMAL
- en: '![](Images/gdl2_0811.png)'
+ id: totrans-181
prefs: []
type: TYPE_IMG
- en: Figure 8-11\. The `DownBlock` and corresponding `UpBlock` in the U-Net
+ id: totrans-182
prefs:
- PREF_H6
type: TYPE_NORMAL
- en: We can code the `DownBlock` and `UpBlock` using Keras as illustrated in [Example 8-9](#diffusion_down_up_code).
+ id: totrans-183
prefs: []
type: TYPE_NORMAL
- en: Example 8-9\. Code for the `DownBlock` and `UpBlock` in the U-Net model
+ id: totrans-184
prefs:
- PREF_H5
type: TYPE_NORMAL
- en: '[PRE8]'
+ id: totrans-185
prefs: []
type: TYPE_PRE
+ zh: '[PRE8]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO7-1)'
+ id: totrans-186
prefs: []
type: TYPE_NORMAL
- en: The `DownBlock` increases the number of channels in the image using a `ResidualBlock`
of a given `width`…
+ id: totrans-187
prefs: []
type: TYPE_NORMAL
- en: '[![2](Images/2.png)](#co_diffusion_models_CO7-2)'
+ id: totrans-188
prefs: []
type: TYPE_NORMAL
- en: …each of which are saved to a list (`skips`) for use later by the `UpBlock`s.
+ id: totrans-189
prefs: []
type: TYPE_NORMAL
- en: '[![3](Images/3.png)](#co_diffusion_models_CO7-3)'
+ id: totrans-190
prefs: []
type: TYPE_NORMAL
- en: A final `AveragePooling2D` layer reduces the dimensionality of the image by
half.
+ id: totrans-191
prefs: []
type: TYPE_NORMAL
- en: '[![4](Images/4.png)](#co_diffusion_models_CO7-4)'
+ id: totrans-192
prefs: []
type: TYPE_NORMAL
- en: The `UpBlock` begins with an `UpSampling2D` layer that doubles the size of the
image.
+ id: totrans-193
prefs: []
type: TYPE_NORMAL
- en: '[![5](Images/5.png)](#co_diffusion_models_CO7-5)'
+ id: totrans-194
prefs: []
type: TYPE_NORMAL
- en: The output from a `DownBlock` layer is glued to the current output using a `Concatenate`
layer.
+ id: totrans-195
prefs: []
type: TYPE_NORMAL
- en: '[![6](Images/6.png)](#co_diffusion_models_CO7-6)'
+ id: totrans-196
prefs: []
type: TYPE_NORMAL
- en: A `ResidualBlock` is used to reduce the number of channels in the image as it
passes through the `UpBlock`.
+ id: totrans-197
prefs: []
type: TYPE_NORMAL
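A hedged sketch of the `DownBlock` and `UpBlock` described in the callouts, assuming the `ResidualBlock` helper sketched earlier and a `block_depth` of 2:

```python
from tensorflow.keras import layers

def DownBlock(width, block_depth):
    def apply(x):
        x, skips = x
        for _ in range(block_depth):
            x = ResidualBlock(width)(x)   # increase channels to `width`
            skips.append(x)               # saved for the corresponding UpBlock
        x = layers.AveragePooling2D(pool_size=2)(x)  # halve the spatial size
        return x

    return apply

def UpBlock(width, block_depth):
    def apply(x):
        x, skips = x
        x = layers.UpSampling2D(size=2, interpolation="bilinear")(x)  # double the size
        for _ in range(block_depth):
            x = layers.Concatenate()([x, skips.pop()])  # skip connection across the U
            x = ResidualBlock(width)(x)                 # reduce channels to `width`
        return x

    return apply
```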
- en: Training the Diffusion Model
+ id: totrans-198
prefs:
- PREF_H2
type: TYPE_NORMAL
- en: We now have all the components in place to train our denoising diffusion model!
[Example 8-10](#diffusion_train_code) creates, compiles, and fits the diffusion
model.
+ id: totrans-199
prefs: []
type: TYPE_NORMAL
- en: Example 8-10\. Code for training the `DiffusionModel`
+ id: totrans-200
prefs:
- PREF_H5
type: TYPE_NORMAL
- en: '[PRE9]'
+ id: totrans-201
prefs: []
type: TYPE_PRE
+ zh: '[PRE9]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO8-1)'
+ id: totrans-202
prefs: []
type: TYPE_NORMAL
- en: Instantiate the model.
+ id: totrans-203
prefs: []
type: TYPE_NORMAL
- en: '[![2](Images/2.png)](#co_diffusion_models_CO8-2)'
+ id: totrans-204
prefs: []
type: TYPE_NORMAL
- en: Compile the model, using the AdamW optimizer (similar to Adam but with weight
decay, which helps stabilize the training process) and mean absolute error loss
function.
+ id: totrans-205
prefs: []
type: TYPE_NORMAL
- en: '[![3](Images/3.png)](#co_diffusion_models_CO8-3)'
+ id: totrans-206
prefs: []
type: TYPE_NORMAL
- en: Calculate the normalization statistics using the training set.
+ id: totrans-207
prefs: []
type: TYPE_NORMAL
+ zh: 使用训练集计算归一化统计数据。
- en: '[![4](Images/4.png)](#co_diffusion_models_CO8-4)'
+ id: totrans-208
prefs: []
type: TYPE_NORMAL
+ zh: '[![4](Images/4.png)](#co_diffusion_models_CO8-4)'
- en: Fit the model over 50 epochs.
+ id: totrans-209
prefs: []
type: TYPE_NORMAL
+ zh: 在50个epoch内拟合模型。
- en: The loss curve (noise mean absolute error [MAE]) is shown in [Figure 8-12](#diffusion_loss).
+ id: totrans-210
prefs: []
type: TYPE_NORMAL
+ zh: 损失曲线(噪音平均绝对误差[MAE])显示在[图8-12](#diffusion_loss)中。
- en: '![](Images/gdl2_0812.png)'
+ id: totrans-211
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0812.png)'
- en: Figure 8-12\. The noise mean absolute error loss curve, by epoch
+ id: totrans-212
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-12。按epoch绘制的噪声平均绝对误差损失曲线
- en: Sampling from the Denoising Diffusion Model
+ id: totrans-213
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: 从去噪扩散模型中采样
- en: In order to sample images from our trained model, we need to apply the reverse
diffusion process—that is, we need to start with random noise and use the model
to gradually undo the noise, until we are left with a recognizable picture of
a flower.
+ id: totrans-214
prefs: []
type: TYPE_NORMAL
+ zh: 为了从我们训练好的模型中采样图像,我们需要应用反向扩散过程-也就是说,我们需要从随机噪音开始,并使用模型逐渐消除噪音,直到我们得到一个可以识别的花朵图片。
- en: We must bear in mind that our model is trained to predict the total amount of
noise that has been added to a given noisy image from the training set, not just
the noise that was added at the last timestep of the noising process. However,
@@ -1073,8 +1648,10 @@
noise in one shot is clearly not going to work! We would rather mimic the forward
process and undo the predicted noise gradually over many small steps, to allow
the model to adjust to its own predictions.
+ id: totrans-215
prefs: []
type: TYPE_NORMAL
+ zh: 我们必须记住,我们的模型是经过训练的,用于预测在训练集中添加到给定嘈杂图像的总噪音量,而不仅仅是在噪音过程的最后一个时间步骤中添加的噪音。然而,我们不希望一次性消除所有噪音-在一次预测中从纯随机噪音中预测图像显然不会奏效!我们宁愿模仿正向过程,并在许多小步骤中逐渐消除预测的噪音,以使模型能够适应自己的预测。
- en: To achieve this, we can jump from x
t to x
t-1 in two steps—first by
@@ -1084,26 +1661,40 @@
- 1 timesteps, to produce x t-1
. This idea is shown in [Figure 8-13](#diffusion_one_step_sample).
+ id: totrans-216
prefs: []
type: TYPE_NORMAL
+ zh: 为了实现这一点,我们可以在两个步骤中从x t跳到x t-1,首先使用我们模型的噪音预测来计算原始图像x 0的估计,然后重新应用预测的噪音到这个图像,但只在t - 1个时间步骤内,产生x t-1。这个想法在[图8-13](#diffusion_one_step_sample)中显示。
- en: '![](Images/gdl2_0813.png)'
+ id: totrans-217
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0813.png)'
- en: Figure 8-13\. One step of the sampling process for our diffusion model
+ id: totrans-218
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-13。扩散模型采样过程的一步
- en: If we repeat this process over a number of steps, we’ll eventually get back
to an estimate for x 0
that has been guided gradually over many small steps. In fact, we are free to
choose the number of steps we take, and crucially, it doesn’t have to be the same
as the large number of steps in the training noising process (i.e., 1,000). It
can be much smaller—in this example we choose 20.
+ id: totrans-219
prefs: []
type: TYPE_NORMAL
+ zh: 如果我们重复这个过程多次,最终我们将得到一个经过许多小步骤逐渐引导的x 0的估计。实际上,我们可以自由选择采取的步数,关键是,它不必与训练噪音过程中的大量步数(即1,000)相同。它可以小得多-在这个例子中,我们选择了20。
- en: 'The following equation (Song et al., 2020) describes this process mathematically:'
+ id: totrans-220
prefs: []
type: TYPE_NORMAL
+ zh: 以下方程(Song等,2020)数学上描述了这个过程:
- en: 𝐱_{t-1} = \sqrt{\bar{\alpha}_{t-1}} \underbrace{\left( \frac{𝐱_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta^{(t)} ( 𝐱_t )}{\sqrt{\bar{\alpha}_t}} \right)}_{\text{predicted } 𝐱_0} + \underbrace{\sqrt{1-\bar{\alpha}_{t-1}-\sigma_t^2} \cdot \epsilon_\theta^{(t)} ( 𝐱_t )}_{\text{direction pointing to } 𝐱_t} + \underbrace{\sigma_t \epsilon_t}_{\text{random noise}}
+ id: totrans-221
prefs: []
type: TYPE_NORMAL
+ zh: 𝐱_{t-1} = \sqrt{\bar{\alpha}_{t-1}} \underbrace{\left( \frac{𝐱_t - \sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta^{(t)} ( 𝐱_t )}{\sqrt{\bar{\alpha}_t}} \right)}_{\text{predicted } 𝐱_0} + \underbrace{\sqrt{1-\bar{\alpha}_{t-1}-\sigma_t^2} \cdot \epsilon_\theta^{(t)} ( 𝐱_t )}_{\text{direction pointing to } 𝐱_t} + \underbrace{\sigma_t \epsilon_t}_{\text{random noise}}
- en: Let’s break this down. The first term inside the brackets on the righthand side of the equation is the estimated image 𝐱_0, calculated using the noise \epsilon_\theta^{(t)} ( 𝐱_t ) predicted by our network. We then scale this by the t-1 signal rate \sqrt{\bar{\alpha}_{t-1}} and reapply the predicted noise, but this time scaled by the t-1 noise rate \sqrt{1-\bar{\alpha}_{t-1}-\sigma_t^2}. Additional Gaussian random noise \sigma_t \epsilon_t is also added, with \sigma_t determining how random we want our generation process to be.
- prefs: []
- type: TYPE_NORMAL
+ id: totrans-222
+ prefs: []
+ type: TYPE_NORMAL
+ zh: 让我们来分解一下。方程式右侧括号内的第一项是估计的图像 𝐱_0,它使用我们网络预测的噪声 \epsilon_\theta^{(t)} ( 𝐱_t ) 计算得到。然后我们用 t-1 时刻的信号率 \sqrt{\bar{\alpha}_{t-1}} 缩放这个值,并重新应用预测的噪声,但这次用 t-1 时刻的噪声率 \sqrt{1-\bar{\alpha}_{t-1}-\sigma_t^2} 进行缩放。还会加入额外的高斯随机噪声 \sigma_t \epsilon_t,其中 \sigma_t 决定了我们希望生成过程有多随机。
- en: The special case σ
t = 0 for all t
corresponds to a type of model known as a *Denoising Diffusion Implicit Model*
@@ -1165,139 +1802,217 @@
random noise input will always give the same output. This is desirable as then
we have a well-defined mapping between samples from the latent space and the generated
outputs in pixel space.
+ id: totrans-223
prefs: []
type: TYPE_NORMAL
+ zh: 特殊情况 σ
+ t = 0 对于所有的 t
+ 对应于一种称为*去噪扩散隐式模型*(DDIM)的模型,由Song等人在2020年提出。^([9](ch08.xhtml#idm45387007342688))
+ 使用DDIM,生成过程完全是确定性的—也就是说,相同的随机噪声输入将始终产生相同的输出。这是可取的,因为这样我们在潜在空间的样本和像素空间中生成的输出之间有一个明确定义的映射。
- en: In our example, we will implement a DDIM, thus making our generation process
deterministic. The code for the DDIM sampling process (reverse diffusion) is shown
in [Example 8-11](#diffusion_sampling).
+ id: totrans-224
prefs: []
type: TYPE_NORMAL
+ zh: 在我们的示例中,我们将实现一个DDIM,从而使我们的生成过程确定性。DDIM采样过程(反向扩散)的代码显示在[示例 8-11](#diffusion_sampling)中。
- en: Example 8-11\. Sampling from the diffusion model
+ id: totrans-225
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例 8-11\. 从扩散模型中采样
- en: '[PRE10]'
+ id: totrans-226
prefs: []
type: TYPE_PRE
+ zh: '[PRE10]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO9-1)'
+ id: totrans-227
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO9-1)'
- en: Loop over a fixed number of steps (e.g., 20).
+ id: totrans-228
prefs: []
type: TYPE_NORMAL
+ zh: 循环执行固定数量的步骤(例如,20步)。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO9-2)'
+ id: totrans-229
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO9-2)'
- en: The diffusion times are all set to 1 (i.e., at the start of the reverse diffusion
process).
+ id: totrans-230
prefs: []
type: TYPE_NORMAL
+ zh: 扩散时间都设置为1(即在反向扩散过程开始时)。
- en: '[![3](Images/3.png)](#co_diffusion_models_CO9-3)'
+ id: totrans-231
prefs: []
type: TYPE_NORMAL
+ zh: '[![3](Images/3.png)](#co_diffusion_models_CO9-3)'
- en: The noise and signal rates are calculated according to the diffusion schedule.
+ id: totrans-232
prefs: []
type: TYPE_NORMAL
+ zh: 根据扩散计划计算噪声和信号率。
- en: '[![4](Images/4.png)](#co_diffusion_models_CO9-4)'
+ id: totrans-233
prefs: []
type: TYPE_NORMAL
+ zh: '[![4](Images/4.png)](#co_diffusion_models_CO9-4)'
- en: The U-Net is used to predict the noise, allowing us to calculate the denoised
image estimate.
+ id: totrans-234
prefs: []
type: TYPE_NORMAL
+ zh: U-Net用于预测噪声,从而使我们能够计算去噪图像的估计。
- en: '[![5](Images/5.png)](#co_diffusion_models_CO9-5)'
+ id: totrans-235
prefs: []
type: TYPE_NORMAL
+ zh: '[![5](Images/5.png)](#co_diffusion_models_CO9-5)'
- en: The diffusion times are reduced by one step.
+ id: totrans-236
prefs: []
type: TYPE_NORMAL
+ zh: 扩散时间减少一步。
- en: '[![6](Images/6.png)](#co_diffusion_models_CO9-6)'
+ id: totrans-237
prefs: []
type: TYPE_NORMAL
+ zh: '[![6](Images/6.png)](#co_diffusion_models_CO9-6)'
- en: The new noise and signal rates are calculated.
+ id: totrans-238
prefs: []
type: TYPE_NORMAL
+ zh: 计算新的噪声和信号率。
- en: '[![7](Images/7.png)](#co_diffusion_models_CO9-7)'
+ id: totrans-239
prefs: []
type: TYPE_NORMAL
+ zh: '[![7](Images/7.png)](#co_diffusion_models_CO9-7)'
- en: The `t-1` images are calculated by reapplying the predicted noise to the predicted
image, according to the `t-1` diffusion schedule rates.
+ id: totrans-240
prefs: []
type: TYPE_NORMAL
+ zh: 通过根据扩散计划率重新应用预测噪声到预测图像,计算出 `t-1` 图像。
- en: '[![8](Images/8.png)](#co_diffusion_models_CO9-8)'
+ id: totrans-241
prefs: []
type: TYPE_NORMAL
+ zh: '[![8](Images/8.png)](#co_diffusion_models_CO9-8)'
- en: After 20 steps, the final 𝐱 0
predicted images are returned.
+ id: totrans-242
prefs: []
type: TYPE_NORMAL
+ zh: 经过20步,最终的 𝐱 0
+ 预测图像被返回。
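A hedged sketch of the DDIM reverse-diffusion loop described in the callouts (the `denoise` method and `offset_cosine_diffusion_schedule` helper are assumptions carried over from the earlier sketches; the book's actual code is the [PRE10] block above):

```python
import tensorflow as tf

def reverse_diffusion(model, initial_noise, diffusion_steps):
    num_images = initial_noise.shape[0]
    step_size = 1.0 / diffusion_steps
    current_images = initial_noise
    for step in range(diffusion_steps):                       # 1. fixed number of steps
        diffusion_times = (                                   # 2. start at 1, walk toward 0
            tf.ones((num_images, 1, 1, 1)) - step * step_size
        )
        noise_rates, signal_rates = offset_cosine_diffusion_schedule(  # 3. current rates
            diffusion_times
        )
        pred_noises, pred_images = model.denoise(             # 4. predict noise and x_0
            current_images, noise_rates, signal_rates, training=False
        )
        next_diffusion_times = diffusion_times - step_size    # 5. one step back
        next_noise_rates, next_signal_rates = offset_cosine_diffusion_schedule(
            next_diffusion_times                              # 6. new rates
        )
        current_images = (                                     # 7. reapply noise at t-1
            next_signal_rates * pred_images + next_noise_rates * pred_noises
        )
    return pred_images                                         # 8. final predicted x_0
```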
- en: Analysis of the Diffusion Model
+ id: totrans-243
prefs:
- PREF_H2
type: TYPE_NORMAL
+ zh: 扩散模型的分析
- en: 'We’ll now take a look at three different ways that we can use our trained model:
for generation of new images, testing how the number of reverse diffusion steps
affects quality, and interpolating between two images in the latent space.'
+ id: totrans-244
prefs: []
type: TYPE_NORMAL
+ zh: 现在我们将看一下我们训练模型的三种不同用法:用于生成新图像,测试反向扩散步数如何影响质量,以及在潜在空间中两个图像之间的插值。
- en: Generating images
+ id: totrans-245
prefs:
- PREF_H3
type: TYPE_NORMAL
+ zh: 生成图像
- en: In order to produce samples from our trained model, we can simply run the reverse
diffusion process, ensuring that we denormalize the output at the end (i.e., take
the pixel values back into the range [0, 1]). We can achieve this using the code
in [Example 8-12](#diffusion_generation) inside the `DiffusionModel` class.
+ id: totrans-246
prefs: []
type: TYPE_NORMAL
+ zh: 为了从我们训练的模型中生成样本,我们只需运行逆扩散过程,确保最终去标准化输出(即,将像素值带回范围[0, 1])。我们可以在`DiffusionModel`类中使用[示例8-12](#diffusion_generation)中的代码来实现这一点。
- en: Example 8-12\. Generating images using the diffusion model
+ id: totrans-247
prefs:
- PREF_H5
type: TYPE_NORMAL
+ zh: 示例8-12。使用扩散模型生成图像
- en: '[PRE11]'
+ id: totrans-248
prefs: []
type: TYPE_PRE
+ zh: '[PRE11]'
- en: '[![1](Images/1.png)](#co_diffusion_models_CO10-1)'
+ id: totrans-249
prefs: []
type: TYPE_NORMAL
+ zh: '[![1](Images/1.png)](#co_diffusion_models_CO10-1)'
- en: Generate some initial noise maps.
+ id: totrans-250
prefs: []
type: TYPE_NORMAL
+ zh: 生成一些初始噪声图。
- en: '[![2](Images/2.png)](#co_diffusion_models_CO10-3)'
+ id: totrans-251
prefs: []
type: TYPE_NORMAL
+ zh: '[![2](Images/2.png)](#co_diffusion_models_CO10-3)'
- en: Apply the reverse diffusion process.
+ id: totrans-252
prefs: []
type: TYPE_NORMAL
+ zh: 应用逆扩散过程。
- en: '[![3](Images/3.png)](#co_diffusion_models_CO10-4)'
+ id: totrans-253
prefs: []
type: TYPE_NORMAL
+ zh: '[![3](Images/3.png)](#co_diffusion_models_CO10-4)'
- en: The images output by the network will have mean zero and unit variance, so we
need to denormalize by reapplying the mean and variance calculated from the training
data.
+ id: totrans-254
prefs: []
type: TYPE_NORMAL
+ zh: 网络输出的图像将具有零均值和单位方差,因此我们需要通过重新应用从训练数据计算得出的均值和方差来去标准化。
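A hedged sketch of a `generate` routine matching the callouts (the `normalizer` attribute and the `reverse_diffusion` helper are assumptions from the earlier sketches):

```python
import tensorflow as tf

def generate(model, num_images, diffusion_steps, image_size=64):
    # 1. Start from pure Gaussian noise maps.
    initial_noise = tf.random.normal(shape=(num_images, image_size, image_size, 3))
    # 2. Run the reverse diffusion process.
    generated_images = reverse_diffusion(model, initial_noise, diffusion_steps)
    # 3. Denormalize using the mean and variance computed from the training data.
    mean = model.normalizer.mean
    variance = model.normalizer.variance
    generated_images = mean + generated_images * tf.sqrt(variance)
    return tf.clip_by_value(generated_images, 0.0, 1.0)
```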
- en: In [Figure 8-14](#diffusion_samples_epoch) we can observe some samples from
the diffusion model at different epochs of the training process.
+ id: totrans-255
prefs: []
type: TYPE_NORMAL
+ zh: 在[图8-14](#diffusion_samples_epoch)中,我们可以观察到训练过程中不同时期扩散模型的一些样本。
- en: '![](Images/gdl2_0814.png)'
+ id: totrans-256
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0814.png)'
- en: Figure 8-14\. Samples from the diffusion model at different epochs of the training
process
+ id: totrans-257
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-14。训练过程中不同时期扩散模型的样本
- en: Adjusting the number of diffusion steps
+ id: totrans-258
prefs:
- PREF_H3
type: TYPE_NORMAL
+ zh: 调整扩散步数
- en: We can also test to see how adjusting the number of diffusion steps in the reverse
process affects image quality. Intuitively, the more steps taken by the process,
the higher the quality of the image generation.
+ id: totrans-259
prefs: []
type: TYPE_NORMAL
+ zh: 我们还可以测试调整逆向过程中扩散步数如何影响图像质量。直观地,过程中步数越多,图像生成的质量就越高。
- en: We can see in [Figure 8-15](#diffusion_steps_quality) that the quality of the
generations does indeed improve with the number of diffusion steps. With one giant
leap from the initial sampled noise, the model can only predict a hazy blob of
@@ -1306,19 +2021,27 @@
of diffusion steps, so there is a trade-off. There is minimal improvement between
20 and 100 diffusion steps, so we choose 20 as a reasonable compromise between
quality and speed in this example.
+ id: totrans-260
prefs: []
type: TYPE_NORMAL
+ zh: 我们可以在[图8-15](#diffusion_steps_quality)中看到,随着扩散步数的增加,生成的质量确实会提高。从初始抽样的噪声中一次性跳跃,模型只能预测出一个朦胧的颜色斑块。随着步数的增加,模型能够改进和锐化生成物。然而,生成图像所需的时间与扩散步数成线性关系,因此存在权衡。在20和100个扩散步之间的改进很小,因此在这个例子中我们选择20作为质量和速度之间的合理折衷。
- en: '![](Images/gdl2_0815.png)'
+ id: totrans-261
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0815.png)'
- en: Figure 8-15\. Image quality improves with the number of diffusion steps
+ id: totrans-262
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-15。随着扩散步数的增加,图像质量提高
- en: Interpolating between images
+ id: totrans-263
prefs:
- PREF_H3
type: TYPE_NORMAL
+ zh: 在图像之间进行插值
- en: Lastly, as we have seen previously with variational autoencoders, we can interpolate
between points in the Gaussian latent space in order to smoothly transition between
images in pixel space. Here we choose to use a form of spherical interpolation
@@ -1333,46 +2056,70 @@
ranges smoothly from 0 to 1 and a and b are the two randomly sampled Gaussian noise tensors
that we wish to interpolate between.
+ id: totrans-264
prefs: []
type: TYPE_NORMAL
+ zh: 最后,正如我们之前在变分自动编码器中看到的那样,我们可以在高斯潜在空间中的点之间进行插值,以便在像素空间中平滑过渡。在这里,我们选择使用一种球面插值的形式,以确保在混合两个高斯噪声图时方差保持恒定。具体来说,每一步的初始噪声图由 a \sin ( \frac{\pi}{2} t ) + b \cos ( \frac{\pi}{2} t ) 给出,其中 t 从0平滑地变化到1,a 和 b 是我们希望在其间插值的两个随机采样的高斯噪声张量。
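A minimal sketch of the spherical interpolation between two noise tensors described above (the helper name is an assumption); each interpolated noise map could then be passed through the reverse diffusion process to produce the images in Figure 8-16:

```python
import math
import tensorflow as tf

def spherical_interpolation(a, b, t):
    # Blend two Gaussian noise maps while keeping the overall variance constant.
    return tf.sin(t * math.pi / 2) * a + tf.cos(t * math.pi / 2) * b

a = tf.random.normal(shape=(64, 64, 3))
b = tf.random.normal(shape=(64, 64, 3))
noise_maps = [spherical_interpolation(a, b, t) for t in tf.linspace(0.0, 1.0, 10)]
```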
- en: The resulting images are shown in [Figure 8-16](#diffusion_interpolation).
+ id: totrans-265
prefs: []
type: TYPE_NORMAL
+ zh: 生成的图像显示在[图8-16](#diffusion_interpolation)中。
- en: '![](Images/gdl2_0816.png)'
+ id: totrans-266
prefs: []
type: TYPE_IMG
+ zh: '![](Images/gdl2_0816.png)'
- en: Figure 8-16\. Interpolating between images using the denoising diffusion model` `#
Summary
+ id: totrans-267
prefs:
- PREF_H6
type: TYPE_NORMAL
+ zh: 图8-16。使用去噪扩散模型在图像之间进行插值` `# 总结
- en: 'In this chapter we have explored one of the most exciting and promising areas
of generative modeling in recent times: diffusion models. In particular, we implemented
the ideas from a key paper on generative diffusion models (Ho et al., 2020) that
introduced the original Denoising Diffusion Probabilistic Model (DDPM). We then
extended this with the ideas from the Denoising Diffusion Implicit Model (DDIM)
paper to make the generation process fully deterministic.'
+ id: totrans-268
prefs: []
type: TYPE_NORMAL
+ zh: 在本章中,我们探索了近期最令人兴奋和有前途的生成建模领域之一:扩散模型。特别是,我们实现了一篇关于生成扩散模型的关键论文(Ho等人,2020)中介绍的原始去噪扩散概率模型(DDPM)的思想。然后,我们借鉴了去噪扩散隐式模型(DDIM)论文中的思想,使生成过程完全确定性。
- en: We have seen how diffusion models are formed of a forward diffusion process
and a reverse diffusion process. The forward diffusion process adds noise to the
training data through a series of small steps, while the reverse diffusion process
consists of a model that tries to predict the noise added.
+ id: totrans-269
prefs: []
type: TYPE_NORMAL
+ zh: 我们已经看到扩散模型由前向扩散过程和逆扩散过程组成。 前向扩散过程通过一系列小步骤向训练数据添加噪声,而逆扩散过程包括试图预测添加的噪声的模型。
- en: We make use of a reparameterization trick in order to calculate the noised images
at any step of the forward process without having to go through multiple noising
steps. We have seen how the chosen schedule of parameters used to add noise to
the data plays an important part in the overall success of the model.
+ id: totrans-270
prefs: []
type: TYPE_NORMAL
+ zh: 我们利用重新参数化技巧,以便在前向过程的任何步骤中计算带噪声的图像,而无需经历多个加噪步骤。 我们已经看到,用于向数据添加噪声的参数选择计划在模型的整体成功中起着重要作用。
- en: The reverse diffusion process is parameterized by a U-Net that tries to predict
the noise at each timestep, given the noised image and the noise rate at that
step. A U-Net consists of `DownBlock`s that increase the number of channels while
reducing the size of the image and `UpBlock`s that decrease the number of channels
while increasing the size. The noise rate is encoded using sinusoidal embedding.
+ id: totrans-271
prefs: []
type: TYPE_NORMAL
+ zh: 逆扩散过程由一个U-Net参数化,试图在每个时间步预测噪声,给定在该步骤的噪声图像和噪声率。 U-Net由`DownBlock`组成,它们增加通道数同时减小图像的大小,以及`UpBlock`,它们减少通道数同时增加大小。
+ 噪声率使用正弦嵌入进行编码。
- en: Sampling from the diffusion model is conducted over a series of steps. The U-Net
is used to predict the noise added to a given noised image, which is then used
to calculate an estimate for the original image. The predicted noise is then reapplied
@@ -1380,46 +2127,69 @@
may be significantly smaller than the number of steps used during training), starting
from a random point sampled from a standard Gaussian noise distribution, to obtain
the final generation.
+ id: totrans-272
prefs: []
type: TYPE_NORMAL
+ zh: 从扩散模型中进行采样是在一系列步骤中进行的。 使用U-Net来预测添加到给定噪声图像的噪声,然后用于计算原始图像的估计。 然后使用较小的噪声率重新应用预测的噪声。
+ 从标准高斯噪声分布中随机抽取的随机点开始,重复这个过程一系列步骤(可能明显小于训练过程中使用的步骤数),以获得最终生成。
- en: We saw how increasing the number of diffusion steps in the reverse process improves
the image generation quality, at the expense of speed. We also performed latent
space arithmetic in order to interpolate between two images.
+ id: totrans-273
prefs: []
type: TYPE_NORMAL
+ zh: 我们看到,在逆过程中增加扩散步骤的数量会提高图像生成质量,但会降低速度。 我们还执行了潜在空间算术,以在两个图像之间插值。
- en: ^([1](ch08.xhtml#idm45387010500320-marker)) Jascha Sohl-Dickstein et al., “Deep
Unsupervised Learning Using Nonequilibrium Thermodynamics,” March 12, 2015, [*https://arxiv.org/abs/1503.03585*](https://arxiv.org/abs/1503.03585)
+ id: totrans-274
prefs: []
type: TYPE_NORMAL
+ zh: ^([1](ch08.xhtml#idm45387010500320-marker)) Jascha Sohl-Dickstein等,“使用非平衡热力学进行深度无监督学习”,2015年3月12日,[*https://arxiv.org/abs/1503.03585*](https://arxiv.org/abs/1503.03585)
- en: ^([2](ch08.xhtml#idm45387010496240-marker)) Yang Song and Stefano Ermon, “Generative
Modeling by Estimating Gradients of the Data Distribution,” July 12, 2019, [*https://arxiv.org/abs/1907.05600*](https://arxiv.org/abs/1907.05600).
+ id: totrans-275
prefs: []
type: TYPE_NORMAL
+ zh: ^([2](ch08.xhtml#idm45387010496240-marker)) 杨松和Stefano Ermon,“通过估计数据分布的梯度进行生成建模”,2019年7月12日,[*https://arxiv.org/abs/1907.05600*](https://arxiv.org/abs/1907.05600)。
- en: ^([3](ch08.xhtml#idm45387010494000-marker)) Yang Song and Stefano Ermon, “Improved
Techniques for Training Score-Based Generative Models,” June 16, 2020, [*https://arxiv.org/abs/2006.09011*](https://arxiv.org/abs/2006.09011).
+ id: totrans-276
prefs: []
type: TYPE_NORMAL
+ zh: ^([3](ch08.xhtml#idm45387010494000-marker)) 杨松和Stefano Ermon,“改进训练基于分数的生成模型的技术”,2020年6月16日,[*https://arxiv.org/abs/2006.09011*](https://arxiv.org/abs/2006.09011)。
- en: ^([4](ch08.xhtml#idm45387010490880-marker)) Jonathon Ho et al., “Denoising Diffusion
Probabilistic Models,” June 19, 2020, [*https://arxiv.org/abs/2006.11239*](https://arxiv.org/abs/2006.11239).
+ id: totrans-277
prefs: []
type: TYPE_NORMAL
+ zh: ^([4](ch08.xhtml#idm45387010490880-marker)) Jonathon Ho等,“去噪扩散概率模型”,2020年6月19日,[*https://arxiv.org/abs/2006.11239*](https://arxiv.org/abs/2006.11239)。
- en: ^([5](ch08.xhtml#idm45387010764208-marker)) Alex Nichol and Prafulla Dhariwal,
“Improved Denoising Diffusion Probabilistic Models,” February 18, 2021, [*https://arxiv.org/abs/2102.09672*](https://arxiv.org/abs/2102.09672).
+ id: totrans-278
prefs: []
type: TYPE_NORMAL
+ zh: ^([5](ch08.xhtml#idm45387010764208-marker)) Alex Nichol和Prafulla Dhariwal,“改进去噪扩散概率模型”,2021年2月18日,[*https://arxiv.org/abs/2102.09672*](https://arxiv.org/abs/2102.09672)。
- en: ^([6](ch08.xhtml#idm45387008220416-marker)) Ashish Vaswani et al., “Attention
Is All You Need,” June 12, 2017, [*https://arxiv.org/abs/1706.03762*](https://arxiv.org/abs/1706.03762).
+ id: totrans-279
prefs: []
type: TYPE_NORMAL
+ zh: ^([6](ch08.xhtml#idm45387008220416-marker)) Ashish Vaswani等,“注意力就是一切”,2017年6月12日,[*https://arxiv.org/abs/1706.03762*](https://arxiv.org/abs/1706.03762)。
- en: '^([7](ch08.xhtml#idm45387008216736-marker)) Ben Mildenhall et al., “NeRF: Representing
Scenes as Neural Radiance Fields for View Synthesis,” March 1, 2020, [*https://arxiv.org/abs/2003.08934*](https://arxiv.org/abs/2003.08934).'
+ id: totrans-280
prefs: []
type: TYPE_NORMAL
+ zh: ^([7](ch08.xhtml#idm45387008216736-marker)) Ben Mildenhall等,“NeRF:将场景表示为神经辐射场进行视图合成”,2020年3月1日,[*https://arxiv.org/abs/2003.08934*](https://arxiv.org/abs/2003.08934)。
- en: ^([8](ch08.xhtml#idm45387008052288-marker)) Kaiming He et al., “Deep Residual
Learning for Image Recognition,” December 10, 2015, [*https://arxiv.org/abs/1512.03385*](https://arxiv.org/abs/1512.03385).
+ id: totrans-281
prefs: []
type: TYPE_NORMAL
+ zh: ^([8](ch08.xhtml#idm45387008052288-marker)) Kaiming He等,“用于图像识别的深度残差学习”,2015年12月10日,[*https://arxiv.org/abs/1512.03385*](https://arxiv.org/abs/1512.03385)。
- en: ^([9](ch08.xhtml#idm45387007342688-marker)) Jiaming Song et al., “Denoising
Diffusion Implicit Models,” October 6, 2020, [*https://arxiv.org/abs/2010.02502*](https://arxiv.org/abs/2010.02502)`
+ id: totrans-282
prefs: []
type: TYPE_NORMAL
+ zh: ^([9](ch08.xhtml#idm45387007342688-marker)) 宋嘉明等,“去噪扩散隐式模型”,2020年10月6日,[*https://arxiv.org/abs/2010.02502*](https://arxiv.org/abs/2010.02502)`