2024-02-08 19:15:19
wizardforcel committed Feb 8, 2024
1 parent d4d1847 commit 0a63444
Showing 1 changed file with 44 additions and 0 deletions.
44 changes: 44 additions & 0 deletions totrans/gen-dl_15.yaml
@@ -1279,10 +1279,12 @@
id: totrans-170
prefs: []
type: TYPE_NORMAL
zh: '[![4](Images/4.png)](#co_music_generation_CO1-4)'
- en: We remove the unnecessary extra dimension with a `Reshape` layer.
id: totrans-171
prefs: []
type: TYPE_NORMAL
zh: 我们使用 `Reshape` 层去除不必要的额外维度。
- en: The reason we use convolutional operations rather than requiring two independent
vectors into the network is because we would like the network to learn how one
bar should follow on from another in a consistent way. Using a neural network
@@ -1292,19 +1294,23 @@
id: totrans-172
prefs: []
type: TYPE_NORMAL
zh: 我们使用卷积操作而不是要求两个独立的向量进入网络的原因是,我们希望网络学习如何以一种一致的方式让一个小节跟随另一个小节。使用神经网络沿着时间轴扩展输入向量意味着模型有机会学习音乐如何跨越小节流动,而不是将每个小节视为完全独立于上一个的。
- en: Chords, style, melody, and groove
id: totrans-173
prefs:
- PREF_H3
type: TYPE_NORMAL
zh: 和弦、风格、旋律和 groove
- en: 'Let’s now take a closer look at the four different inputs that feed the generator:'
id: totrans-174
prefs: []
type: TYPE_NORMAL
zh: 现在让我们更仔细地看一下喂给生成器的四种不同输入:
- en: Chords
id: totrans-175
prefs: []
type: TYPE_NORMAL
zh: 和弦
- en: The chords input is a single noise vector of length `Z_DIM`. This vector’s job
is to control the general progression of the music over time, shared across tracks,
so we use a `TemporalNetwork` to transform this single vector into a different
@@ -1314,37 +1320,45 @@
id: totrans-176
prefs: []
type: TYPE_NORMAL
zh: 和弦输入是一个长度为 `Z_DIM` 的单一噪声向量。这个向量的作用是控制音乐随时间的总体进展,跨越轨道共享,因此我们使用 `TemporalNetwork`
将这个单一向量转换为每个小节的不同潜在向量。请注意,虽然我们称这个输入为和弦,但它实际上可以控制音乐中每个小节变化的任何内容,比如一般的节奏风格,而不是特定于任何特定轨道。
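The expansion described above, from one shared latent vector to one latent vector per bar, can be sketched in Keras. This is a minimal, hypothetical sketch rather than the book's actual `TemporalNetwork` code; the layer width and the `Z_DIM` and `N_BARS` values are assumptions:

```python
# Hypothetical TemporalNetwork sketch: expands one latent vector of
# length Z_DIM into a separate latent vector for each bar.
from tensorflow.keras import layers, models

Z_DIM = 32   # illustrative assumption
N_BARS = 2   # illustrative assumption

def build_temporal_network():
    z = layers.Input(shape=(Z_DIM,), name="temporal_input")
    # Treat the vector as a 1x1 "image" with Z_DIM channels so that
    # transposed convolutions can grow the bar (time) axis.
    x = layers.Reshape([1, 1, Z_DIM])(z)
    x = layers.Conv2DTranspose(
        512, kernel_size=(2, 1), strides=(1, 1),
        padding="valid", activation="relu")(x)        # time axis: 1 -> 2
    x = layers.Conv2DTranspose(
        Z_DIM, kernel_size=(N_BARS - 1, 1), strides=(1, 1),
        padding="valid", activation="relu")(x)        # channels back to Z_DIM
    # One Z_DIM vector per bar.
    x = layers.Reshape([N_BARS, Z_DIM])(x)
    return models.Model(z, x)

temporal_network = build_temporal_network()
print(temporal_network.output_shape)  # (None, N_BARS, Z_DIM)
```

The same idea applies whatever the network internals are: the point is that one vector in yields `N_BARS` vectors out, so the generator can vary its output from bar to bar while still being driven by a single shared input.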
- en: Style
id: totrans-177
prefs: []
type: TYPE_NORMAL
zh: 风格
- en: The style input is also a vector of length `Z_DIM`. This is carried forward
without transformation, so it is the same across all bars and tracks. It can be
thought of as the vector that controls the overall style of the piece (i.e., it
affects all bars and tracks consistently).
id: totrans-178
prefs: []
type: TYPE_NORMAL
zh: 风格输入也是长度为 `Z_DIM` 的向量。这个向量在不经过转换的情况下传递,因此在所有小节和轨道上都是相同的。它可以被视为控制乐曲整体风格的向量(即,它会一致地影响所有小节和轨道)。
- en: Melody
id: totrans-179
prefs: []
type: TYPE_NORMAL
zh: 旋律
- en: The melody input is an array of shape `[N_TRACKS, Z_DIM]`—that is, we provide
the model with a random noise vector of length `Z_DIM` for each track.
id: totrans-180
prefs: []
type: TYPE_NORMAL
zh: 旋律输入是一个形状为 `[N_TRACKS, Z_DIM]` 的数组—也就是说,我们为每个轨道提供长度为 `Z_DIM` 的随机噪声向量。
- en: Each of these vectors is passed through a track-specific `TemporalNetwork`,
where the weights are not shared between tracks. The output is a vector of length
`Z_DIM` for every bar of every track. The model can therefore use these input
vectors to fine-tune the content of every single bar and track independently.
id: totrans-181
prefs: []
type: TYPE_NORMAL
zh: 这些向量中的每一个都通过轨道特定的 `TemporalNetwork`,其中轨道之间的权重不共享。输出是每个轨道的每个小节的长度为 `Z_DIM` 的向量。因此,模型可以使用这些输入向量来独立地微调每个小节和轨道的内容。
- en: Groove
id: totrans-182
prefs: []
type: TYPE_NORMAL
zh: Groove
- en: The groove input is also an array of shape `[N_TRACKS, Z_DIM]`—a random noise
vector of length `Z_DIM` for each track. Unlike the melody input, these vectors
are not passed through the temporal network but instead are fed straight through,
@@ -1353,69 +1367,86 @@
id: totrans-183
prefs: []
type: TYPE_NORMAL
zh: groove 输入也是一个形状为 `[N_TRACKS, Z_DIM]` 的数组,即每个轨道的长度为 `Z_DIM` 的随机噪声向量。与旋律输入不同,这些向量不通过时间网络,而是直接传递,就像风格向量一样。因此,每个
groove 向量将影响轨道的整体属性,跨越所有小节。
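Taken together, the way the four inputs are expanded and combined can be summarized with plain shape bookkeeping. This is a sketch with illustrative dimension values; the `temporal_expand` stand-in simply copies a vector once per bar, whereas the real `TemporalNetwork` uses learned transposed convolutions:

```python
import numpy as np

# Illustrative dimension values (assumptions, not the book's settings).
Z_DIM, N_BARS, N_TRACKS = 32, 2, 4

chords = np.random.normal(size=(Z_DIM,))           # shared; varies per bar after expansion
style = np.random.normal(size=(Z_DIM,))            # shared; same for every bar and track
melody = np.random.normal(size=(N_TRACKS, Z_DIM))  # per track; varies per bar after expansion
groove = np.random.normal(size=(N_TRACKS, Z_DIM))  # per track; same for every bar

def temporal_expand(z):
    # Stand-in for a TemporalNetwork: produce one Z_DIM vector per bar.
    return np.stack([z] * N_BARS)                   # shape (N_BARS, Z_DIM)

chords_per_bar = temporal_expand(chords)                         # (N_BARS, Z_DIM)
melody_per_bar = np.stack([temporal_expand(m) for m in melody])  # (N_TRACKS, N_BARS, Z_DIM)

# Input to the bar generator for track t, bar b: one vector per component.
t, b = 0, 0
bar_input = np.concatenate(
    [chords_per_bar[b], style, melody_per_bar[t, b], groove[t]])
print(bar_input.shape)  # (4 * Z_DIM,) = (128,)
```

Note how this reproduces the pattern of Table 11-1: chords vary across bars but not tracks, style varies across neither, melody varies across both, and groove varies only across tracks.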
- en: We can summarize the responsibilities of each component of the MuseGAN generator
as shown in [Table 11-1](#musegan_sections).
id: totrans-184
prefs: []
type: TYPE_NORMAL
zh: 我们可以总结每个 MuseGAN 生成器组件的责任,如 [表11-1](#musegan_sections) 所示。
- en: Table 11-1\. Components of the MuseGAN generator
id: totrans-185
prefs: []
type: TYPE_NORMAL
zh: 表11-1\. MuseGAN 生成器的组件
- en: '| | Output differs across bars? | Output differs across parts? |'
id: totrans-186
prefs: []
type: TYPE_TB
zh: '| | 输出在小节之间不同吗? | 输出在部分之间不同吗? |'
- en: '| --- | --- | --- |'
id: totrans-187
prefs: []
type: TYPE_TB
zh: '| --- | --- | --- |'
- en: '| Style | X | X |'
id: totrans-188
prefs: []
type: TYPE_TB
zh: '| 风格 | X | X |'
- en: '| Groove | X | ✓ |'
id: totrans-189
prefs: []
type: TYPE_TB
zh: '| Groove | X | ✓ |'
- en: '| Chords | ✓ | X |'
id: totrans-190
prefs: []
type: TYPE_TB
zh: '| 和弦 | ✓ | X |'
- en: '| Melody | ✓ | ✓ |'
id: totrans-191
prefs: []
type: TYPE_TB
zh: '| 旋律 | ✓ | ✓ |'
- en: The final piece of the MuseGAN generator is the *bar generator*—let’s see how
we can use this to glue together the outputs from the chord, style, melody, and
groove components.
id: totrans-192
prefs: []
type: TYPE_NORMAL
zh: MuseGAN 生成器的最后一部分是 *小节生成器*—让我们看看如何使用它来将和弦、风格、旋律和 groove 组件的输出粘合在一起。
- en: The bar generator
id: totrans-193
prefs:
- PREF_H3
type: TYPE_NORMAL
zh: 小节生成器
- en: The bar generator receives four latent vectors—one from each of the chord, style,
melody, and groove components. These are concatenated to produce a vector of length
`4 * Z_DIM` as input. The output is a piano roll representation of a single bar
for a single track—i.e., a tensor of shape `[1, n_steps_per_bar, n_pitches, 1]`.
id: totrans-194
prefs: []
type: TYPE_NORMAL
zh: 小节生成器接收四个潜在向量——来自和弦、风格、旋律和 groove 组件。这些被连接起来产生长度为 `4 * Z_DIM` 的输入向量。输出是单个轨道的单个小节的钢琴卷表示—即,形状为
`[1, n_steps_per_bar, n_pitches, 1]` 的张量。
- en: The bar generator is just a neural network that uses convolutional transpose
layers to expand the time and pitch dimensions of the input vector. We create
one bar generator for every track, and weights are not shared between tracks.
The Keras code to build a `BarGenerator` is given in [Example 11-7](#example0706).
id: totrans-195
prefs: []
type: TYPE_NORMAL
zh: 小节生成器只是一个使用卷积转置层来扩展输入向量的时间和音高维度的神经网络。我们为每个轨道创建一个小节生成器,轨道之间的权重不共享。构建 `BarGenerator`
的 Keras 代码在 [示例11-7](#example0706) 中给出。
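A minimal sketch of what such a bar generator might look like in Keras follows. All layer widths and the `n_steps_per_bar = 16` and `n_pitches = 84` values are assumptions, not the book's actual listing:

```python
# Hypothetical BarGenerator sketch: a Dense layer followed by transposed
# convolutions that grow first the timestep axis, then the pitch axis.
from tensorflow.keras import layers, models

Z_DIM = 32            # illustrative assumption
N_STEPS_PER_BAR = 16  # illustrative assumption
N_PITCHES = 84        # illustrative assumption

def build_bar_generator():
    # Concatenated chords, style, melody, and groove vectors.
    z = layers.Input(shape=(4 * Z_DIM,), name="bar_input")
    x = layers.Dense(1024, activation="relu")(z)
    x = layers.Reshape([2, 1, 512])(x)
    # Expand along the timestep axis: 2 -> 4 -> 8 -> 16.
    for _ in range(3):
        x = layers.Conv2DTranspose(
            256, kernel_size=(2, 1), strides=(2, 1),
            padding="same", activation="relu")(x)
    # Expand along the pitch axis: 1 -> 7 -> 84.
    x = layers.Conv2DTranspose(
        256, kernel_size=(1, 7), strides=(1, 7),
        padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(
        1, kernel_size=(1, 12), strides=(1, 12),
        padding="same", activation="tanh")(x)
    # One bar of one track: [1, n_steps_per_bar, n_pitches, 1].
    x = layers.Reshape([1, N_STEPS_PER_BAR, N_PITCHES, 1])(x)
    return models.Model(z, x)

bar_generator = build_bar_generator()
```

The final `tanh` matches the WGAN-GP training setup mentioned in the text, and the leading dimension of size 1 is the bar axis, so the outputs for consecutive bars can later be concatenated along it.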
- en: Example 11-7\. Building the `BarGenerator`
id: totrans-196
prefs:
- PREF_H5
type: TYPE_NORMAL
zh: 示例11-7\. 构建 `BarGenerator`
- en: '[PRE8]'
id: totrans-197
prefs: []
@@ -1425,39 +1456,48 @@
id: totrans-198
prefs: []
type: TYPE_NORMAL
zh: '[![1](Images/1.png)](#co_music_generation_CO2-1)'
- en: The input to the bar generator is a vector of length `4 * Z_DIM`.
id: totrans-199
prefs: []
type: TYPE_NORMAL
zh: bar 生成器的输入是长度为 `4 * Z_DIM` 的向量。
- en: '[![2](Images/2.png)](#co_music_generation_CO2-2)'
id: totrans-200
prefs: []
type: TYPE_NORMAL
zh: '[![2](Images/2.png)](#co_music_generation_CO2-2)'
- en: After passing it through a `Dense` layer, we reshape the tensor to prepare it
for the convolutional transpose operations.
id: totrans-201
prefs: []
type: TYPE_NORMAL
zh: 通过一个 `Dense` 层后,我们重新塑造张量以准备进行卷积转置操作。
- en: '[![3](Images/3.png)](#co_music_generation_CO2-3)'
id: totrans-202
prefs: []
type: TYPE_NORMAL
zh: '[![3](Images/3.png)](#co_music_generation_CO2-3)'
- en: First we expand the tensor along the timestep axis…
id: totrans-203
prefs: []
type: TYPE_NORMAL
zh: 首先我们沿着时间步轴扩展张量…
- en: '[![4](Images/4.png)](#co_music_generation_CO2-4)'
id: totrans-204
prefs: []
type: TYPE_NORMAL
zh: '[![4](Images/4.png)](#co_music_generation_CO2-4)'
- en: …then along the pitch axis.
id: totrans-205
prefs: []
type: TYPE_NORMAL
zh: …然后沿着音高轴。
- en: '[![5](Images/5.png)](#co_music_generation_CO2-5)'
id: totrans-206
prefs: []
type: TYPE_NORMAL
zh: '[![5](Images/5.png)](#co_music_generation_CO2-5)'
- en: The final layer has a tanh activation applied, as we will be using a WGAN-GP
(which requires tanh output activation) to train the network.
id: totrans-207
@@ -1801,6 +1841,7 @@
id: totrans-260
prefs: []
type: TYPE_NORMAL
zh: 我们还探讨了如何调整标记化过程以处理多声部(多轨)音乐生成。网格标记化将乐谱的钢琴卷表示序列化,使我们能够在单个令牌流上训练 Transformer,这些令牌描述了在离散的、等间隔的时间步上每个音轨中出现的音符。基于事件的标记化则产生一个*配方*,通过单个指令流描述如何按顺序逐行创建音乐。这两种方法各有优缺点——基于 Transformer 的音乐生成方法的成败往往在很大程度上取决于标记化方法的选择。
- en: We also saw that generating music does not always require a sequential approach—MuseGAN
uses convolutions to generate polyphonic musical scores with multiple tracks,
by treating the score as an image where the tracks are individual channels of
Expand All @@ -1813,14 +1854,17 @@
id: totrans-261
prefs: []
type: TYPE_NORMAL
zh: 我们还看到,生成音乐并不总是需要顺序方法——MuseGAN 使用卷积来生成具有多轨的多声部乐谱,将乐谱视为一幅图像,其中各个轨道是图像的不同通道。MuseGAN 的新颖之处在于四个输入噪声向量(和弦、风格、旋律和 groove)的组织方式,使得可以完全控制音乐的高级特征。虽然底层的和声仍然不像巴赫的那样完美或多样化,但这是对一个极其难以掌握的问题的良好尝试,并突显了 GAN 处理各种问题的能力。
- en: '^([1](ch11.xhtml#idm45387004193120-marker)) Cheng-Zhi Anna Huang et al., “Music
Transformer: Generating Music with Long-Term Structure,” September 12, 2018, [*https://arxiv.org/abs/1809.04281*](https://arxiv.org/abs/1809.04281).'
id: totrans-262
prefs: []
type: TYPE_NORMAL
zh: ^([1](ch11.xhtml#idm45387004193120-marker)) Cheng-Zhi Anna Huang 等人,“Music Transformer: Generating Music with Long-Term Structure”,2018年9月12日,[*https://arxiv.org/abs/1809.04281*](https://arxiv.org/abs/1809.04281)。
- en: '^([2](ch11.xhtml#idm45387004128000-marker)) Hao-Wen Dong et al., “MuseGAN:
Multi-Track Sequential Generative Adversarial Networks for Symbolic Music Generation
and Accompaniment,” September 19, 2017, [*https://arxiv.org/abs/1709.06298*](https://arxiv.org/abs/1709.06298).'
id: totrans-263
prefs: []
type: TYPE_NORMAL
zh: ^([2](ch11.xhtml#idm45387004128000-marker)) Hao-Wen Dong 等人,“MuseGAN:用于符号音乐生成和伴奏的多轨序列生成对抗网络”,2017年9月19日,[*https://arxiv.org/abs/1709.06298*](https://arxiv.org/abs/1709.06298)。
