1
00:00:00,090 --> 00:00:01,798
In the previous video, we talked about
在上一个视频里 我们讲解了
(字幕整理:中国海洋大学 黄海广,[email protected] )
2
00:00:01,857 --> 00:00:03,868
a cost function for the neural network.
神经网络的代价函数
3
00:00:04,139 --> 00:00:07,079
In this video, let's start to talk about an algorithm,
在这个视频里 让我们来说说
4
00:00:07,200 --> 00:00:09,062
for trying to minimize the cost function.
让代价函数最小化的算法
5
00:00:09,240 --> 00:00:12,735
In particular, we'll talk about the back propagation algorithm.
具体来说 我们将主要讲解反向传播算法
6
00:00:13,834 --> 00:00:15,380
Here's the cost function that
这个就是我们上一个视频里写好的
7
00:00:15,520 --> 00:00:17,905
we wrote down in the previous video.
代价函数
8
00:00:17,972 --> 00:00:19,438
What we'd like to do is
我们要做的就是
9
00:00:19,484 --> 00:00:21,161
try to find parameters theta
设法找到参数
10
00:00:21,246 --> 00:00:23,440
to try to minimize j of theta.
使得J(θ)取到最小值
11
00:00:23,530 --> 00:00:25,782
In order to use either gradient descent
为了使用梯度下降法或者
12
00:00:25,832 --> 00:00:28,625
or one of the advance optimization algorithms.
其他某种高级优化算法
13
00:00:28,675 --> 00:00:30,206
What we need to do therefore is
我们需要做的就是
14
00:00:30,249 --> 00:00:31,598
to write code that takes
写好一个可以通过输入 参数 θ
15
00:00:31,645 --> 00:00:33,487
as input the parameters theta
然后计算 J(θ)
16
00:00:33,540 --> 00:00:34,965
and computes j of theta
和这些
17
00:00:35,014 --> 00:00:37,364
and these partial derivative terms.
偏导数项的代码
18
00:00:37,425 --> 00:00:38,763
Remember, that the parameters
记住 这些神经网络里
19
00:00:38,790 --> 00:00:40,710
in the neural network are these things,
对应的参数
20
00:00:40,760 --> 00:00:43,435
theta superscript l subscript ij,
也就是 θ 上标 (l) 下标 ij 的参数
21
00:00:43,492 --> 00:00:44,868
those are real numbers
这些都是实数
22
00:00:44,930 --> 00:00:47,185
and so, these are the partial derivative terms
所以这些都是我们需要计算的
23
00:00:47,249 --> 00:00:48,869
we need to compute.
偏导数项
24
00:00:48,900 --> 00:00:50,077
In order to compute the
为了计算代价函数
25
00:00:50,115 --> 00:00:51,840
cost function j of theta,
J(θ)
26
00:00:51,883 --> 00:00:53,986
we just use this formula up here
我们就是用上面这个公式
27
00:00:54,042 --> 00:00:55,617
and so, what I want to do
所以我们在本节视频里大部分时间
28
00:00:55,655 --> 00:00:56,850
for most of this video is
想要做的都是
29
00:00:56,897 --> 00:00:58,595
focus on talking about
重点关注
30
00:00:58,636 --> 00:00:59,952
how we can compute these
如何计算这些
31
00:00:59,989 --> 00:01:01,994
partial derivative terms.
偏导数项
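(For reference, the cost function referred to above, as it is usually written in this course, is reproduced below together with the gradient terms we are after; K is the number of output units, L the number of layers, and s_l the number of units in layer l.)

$$
J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[ y_k^{(i)}\,\log\big(h_\Theta(x^{(i)})\big)_k + \big(1-y_k^{(i)}\big)\,\log\Big(1-\big(h_\Theta(x^{(i)})\big)_k\Big)\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{ji}^{(l)}\big)^2
$$

$$
\text{and the terms to compute are}\quad \frac{\partial}{\partial \Theta_{ij}^{(l)}}\,J(\Theta).
$$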
32
00:01:02,031 --> 00:01:03,812
Let's start by talking about
我们从只有一个
33
00:01:03,858 --> 00:01:05,512
the case of when we have only
训练样本的情况
34
00:01:05,556 --> 00:01:06,839
one training example,
开始说起
35
00:01:06,872 --> 00:01:09,385
so imagine, if you will that our entire
假设 我们整个训练集
36
00:01:09,432 --> 00:01:11,301
training set comprises only one
只包含一个训练样本
37
00:01:11,351 --> 00:01:14,006
training example which is a pair xy.
也就是实数对
38
00:01:14,049 --> 00:01:15,591
I'm not going to write x1y1
我这里不写成x(1) y(1)
39
00:01:15,629 --> 00:01:16,375
just write this.
就写成这样
40
00:01:16,410 --> 00:01:17,665
I'll write our one training example
把这一个训练样本记为 (x, y)
41
00:01:17,718 --> 00:01:19,980
as (x, y), and let's step through
让我们粗看一遍
42
00:01:20,031 --> 00:01:21,423
the sequence of calculations
使用这一个训练样本
43
00:01:21,462 --> 00:01:24,332
we would do with this one training example.
来计算的顺序
44
00:01:25,754 --> 00:01:27,129
The first thing we do is
首先我们
45
00:01:27,167 --> 00:01:29,175
we apply forward propagation in
应用前向传播方法来
46
00:01:29,212 --> 00:01:31,773
order to compute what our hypothesis
计算一下在给定输入的时候
47
00:01:31,813 --> 00:01:34,238
actually outputs given the input.
假设函数实际输出的结果是什么
48
00:01:34,272 --> 00:01:36,734
Concretely, we call
具体地说 这里的
49
00:01:36,769 --> 00:01:39,025
a(1) the activation values
a(1) 就是第一层的激励值
50
00:01:39,071 --> 00:01:41,541
of this first layer, that is, the input layer.
也就是输入层所在的地方
51
00:01:41,600 --> 00:01:43,452
So, I'm going to set that to x
所以我准备把它设为 x
52
00:01:43,505 --> 00:01:45,389
and then we're going to compute
然后我们来计算
53
00:01:45,435 --> 00:01:47,506
z(2) equals theta(1) a(1)
z(2) 等于 θ(1) 乘以 a(1)
54
00:01:47,552 --> 00:01:49,919
and a(2) equals g, the sigmoid
然后 a(2) 就等于 g(z(2)) 函数
55
00:01:49,980 --> 00:01:52,250
activation function applied to z(2)
其中g是一个S型激励函数
56
00:01:52,310 --> 00:01:53,753
and this would give us our
这就会计算出第一个
57
00:01:53,800 --> 00:01:56,115
activations for the first middle layer.
隐藏层的激励值
58
00:01:56,162 --> 00:01:58,208
That is for layer two of the network
也就是神经网络的第二层
59
00:01:58,241 --> 00:02:00,649
and we also add those bias terms.
我们还增加这个偏差项
60
00:02:01,315 --> 00:02:03,132
Next we apply 2 more steps
接下来我们再用2次
61
00:02:03,176 --> 00:02:04,966
of this forward propagation
前向传播
62
00:02:05,013 --> 00:02:08,328
to compute a(3) and a(4)
来计算出 a(3) 和 最后的 a(4)
63
00:02:08,360 --> 00:02:11,458
which is also the output
同样也就是假设函数
64
00:02:11,505 --> 00:02:14,089
of our hypothesis h of x.
h(x) 的输出
65
00:02:14,711 --> 00:02:18,103
So this is our vectorized implementation of
所以这里我们实现了把前向传播
66
00:02:18,145 --> 00:02:19,228
forward propagation
向量化
67
00:02:19,276 --> 00:02:20,888
and it allows us to compute
这使得我们可以计算
68
00:02:20,938 --> 00:02:22,280
the activation values
神经网络结构里的
69
00:02:22,345 --> 00:02:24,056
for all of the neurons
每一个神经元的
70
00:02:24,110 --> 00:02:25,948
in our neural network.
激励值
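As a rough illustration only, here is a minimal Python/NumPy sketch of this vectorized forward pass for the four-layer example; the weight matrices Theta1, Theta2, Theta3 are hypothetical placeholders, and the bias units the lecture adds are omitted to keep the sketch short.

```python
import numpy as np

def sigmoid(z):
    # The S-shaped (logistic) activation function g(z).
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagate(x, Theta1, Theta2, Theta3):
    # Hypothetical weight matrices for a 4-layer network; bias units omitted.
    a1 = x                    # a(1) = x, the input layer activations
    z2 = Theta1 @ a1          # z(2) = Theta(1) * a(1)
    a2 = sigmoid(z2)          # a(2) = g(z(2)), first hidden layer
    z3 = Theta2 @ a2          # z(3) = Theta(2) * a(2)
    a3 = sigmoid(z3)          # a(3) = g(z(3)), second hidden layer
    z4 = Theta3 @ a3          # z(4) = Theta(3) * a(3)
    a4 = sigmoid(z4)          # a(4) = h_Theta(x), the output layer
    return a2, a3, a4
```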
71
00:02:27,934 --> 00:02:29,608
Next, in order to compute
接下来
72
00:02:29,650 --> 00:02:30,967
the derivatives, we're going to use
为了计算导数项 我们将
73
00:02:31,026 --> 00:02:33,589
an algorithm called back propagation.
采用一种叫做反向传播(Backpropagation)的算法
74
00:02:34,904 --> 00:02:37,765
The intuition of the back propagation algorithm
反向传播算法从直观上说
75
00:02:37,807 --> 00:02:38,430
is that for each node
就是对每一个结点
76
00:02:38,430 --> 00:02:41,065
we're going to compute the term
我们计算这样一项
77
00:02:41,126 --> 00:02:43,642
delta superscript l subscript j
δ下标 j 上标(l)
78
00:02:43,676 --> 00:02:45,130
that's going to somehow
这就用某种形式
79
00:02:45,171 --> 00:02:46,310
represent the error
代表了第 l 层的第 j 个结点的
80
00:02:46,361 --> 00:02:48,511
of node j in layer l.
误差
81
00:02:48,552 --> 00:02:49,682
So, recall that
我们还记得
82
00:02:49,716 --> 00:02:52,313
a superscript l subscript j
a 上标 (l) 下标 j
83
00:02:52,355 --> 00:02:54,138
that is the activation of
表示的是第 l 层第 j 个单元的
84
00:02:54,185 --> 00:02:56,182
the j-th unit in layer l
激励值
85
00:02:56,224 --> 00:02:58,001
and so, this delta term
所以这个 δ 项
86
00:02:58,045 --> 00:02:59,037
is in some sense
在某种程度上
87
00:02:59,082 --> 00:03:00,978
going to capture our error
就捕捉到了我们
88
00:03:01,012 --> 00:03:03,618
in the activation of that neural node.
在这个神经节点的激励值的误差
89
00:03:03,650 --> 00:03:05,798
That is, how much we might wish the activation
所以我们可能希望这个节点的
90
00:03:05,823 --> 00:03:07,975
of that node were slightly different.
激励值稍微不一样
91
00:03:08,047 --> 00:03:09,670
Concretely, taking the example
具体地讲 我们用
92
00:03:10,270 --> 00:03:11,100
neural network that we have
右边这个有四层
93
00:03:11,360 --> 00:03:12,700
on the right which has four layers.
的神经网络结构做例子
94
00:03:13,440 --> 00:03:15,710
And so capital L is equal to 4.
所以这里大写 L 等于4
95
00:03:16,060 --> 00:03:17,120
For each output unit, we're going to compute this delta term.
对于每一个输出单元 我们准备计算δ项
96
00:03:17,400 --> 00:03:19,130
So, delta for the j-th unit in the fourth layer is equal to
所以第四层的第j个单元的δ就等于
97
00:03:23,380 --> 00:03:24,490
just the activation of that
这个单元的激励值
98
00:03:24,720 --> 00:03:26,350
unit minus what was
减去训练样本里的
99
00:03:26,490 --> 00:03:28,650
the actual value of y in our training example.
真实值 y
100
00:03:29,900 --> 00:03:32,420
So, this term here can
所以这一项可以
101
00:03:32,580 --> 00:03:34,510
also be written h of
同样可以写成
102
00:03:34,710 --> 00:03:38,040
x subscript j, right.
h(x) 下标 j
103
00:03:38,330 --> 00:03:39,640
So this delta term is just
所以 δ 这一项就是
104
00:03:39,930 --> 00:03:40,900
the difference between what our
假设输出
105
00:03:41,290 --> 00:03:43,200
hypothesis outputs and what
和训练集y值
106
00:03:43,370 --> 00:03:44,870
was the value of y
之间的差
107
00:03:45,570 --> 00:03:46,900
in our training set whereas
这里
108
00:03:47,060 --> 00:03:48,610
y subscript j is
y 下标 j 就是
109
00:03:48,750 --> 00:03:49,910
the j-th element of the
我们标记训练集里向量
110
00:03:50,090 --> 00:03:53,340
vector value y in our labeled training set.
的第j个元素的值
111
00:03:56,200 --> 00:03:57,790
And by the way, if you
顺便说一句
112
00:03:57,970 --> 00:04:00,460
think of delta, a, and
如果你把 δ a 和 y 这三个
113
00:04:01,000 --> 00:04:02,350
y as vectors then you can
都看做向量
114
00:04:02,520 --> 00:04:03,760
also take those and come
那么你可以同样这样写
115
00:04:04,030 --> 00:04:05,890
up with a vectorized implementation of
并且得出一个向量化的表达式
116
00:04:06,010 --> 00:04:07,310
it, which is just
也就是
117
00:04:07,690 --> 00:04:09,840
delta 4 gets set as
δ(4)等于
118
00:04:10,700 --> 00:04:14,330
a4 minus y. Where
a(4) 减去 y 这里
119
00:04:14,560 --> 00:04:15,820
here, each of these delta
每一个变量
120
00:04:16,540 --> 00:04:18,080
4 a4 and y, each of
也就是 δ(4) a(4) 和 y
121
00:04:18,180 --> 00:04:19,860
these is a vector whose
都是一个向量
122
00:04:20,640 --> 00:04:22,040
dimension is equal to
并且向量维数等于
123
00:04:22,250 --> 00:04:24,150
the number of output units in our network.
输出单元的数目
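Putting these last few lines together, the output-layer error for this four-layer example is, per unit and in vectorized form:

$$
\delta_j^{(4)} = a_j^{(4)} - y_j = \big(h_\Theta(x)\big)_j - y_j,
\qquad
\delta^{(4)} = a^{(4)} - y.
$$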
124
00:04:25,210 --> 00:04:26,880
So we've now computed the
所以现在我们计算出
125
00:04:27,320 --> 00:04:28,670
error term delta
网络结构的
126
00:04:29,020 --> 00:04:30,170
4 for our network.
误差项 δ(4)
127
00:04:31,440 --> 00:04:32,950
What we do next is compute
我们下一步就是计算
128
00:04:33,620 --> 00:04:36,280
the delta terms for the earlier layers in our network.
网络中前面几层的误差项 δ
129
00:04:37,210 --> 00:04:38,690
Here's the formula for computing delta 3:
这个就是计算 δ(3) 的公式
130
00:04:39,010 --> 00:04:39,830
delta 3 is equal
δ(3) 等于
131
00:04:40,310 --> 00:04:42,050
to theta 3 transpose times delta 4.
θ(3) 的转置乘以 δ(4)
132
00:04:42,560 --> 00:04:44,190
And this dot times, this
然后这里的点乘
133
00:04:44,390 --> 00:04:46,390
is the element-wise multiplication operation
也就是我们在 MATLAB 里用过的
134
00:04:47,580 --> 00:04:48,380
that we know from MATLAB.
元素间的乘法操作
135
00:04:49,160 --> 00:04:50,760
So theta 3 transpose delta
所以 θ(3) 转置乘以
136
00:04:51,020 --> 00:04:52,860
4, that's a vector; g prime
δ(4) 这是一个向量
137
00:04:53,480 --> 00:04:55,080
z3 that's also a vector
g'(z(3)) 同样也是一个向量
138
00:04:55,800 --> 00:04:57,370
and so dot times is
所以点乘就是
139
00:04:57,530 --> 00:04:59,670
an element-wise multiplication between these two vectors.
两个向量的元素间对应相乘
140
00:05:01,460 --> 00:05:02,650
This term g prime of
其中这一项 g'(z(3))
141
00:05:02,740 --> 00:05:04,560
z3, that formally is actually
其实是对激励函数 g
142
00:05:04,950 --> 00:05:06,420
the derivative of the activation
在输入值为 z(3) 的时候
143
00:05:06,720 --> 00:05:08,740
function g evaluated at
所求的
144
00:05:08,890 --> 00:05:10,620
the input values given by z3.
导数
145
00:05:10,760 --> 00:05:12,620
If you know calculus, you
如果你掌握微积分的话
146
00:05:12,710 --> 00:05:13,470
can try to work it out yourself
你可以试着自己解出来
147
00:05:13,850 --> 00:05:16,100
and see that you can simplify it to the same answer that I get.
然后可以简化得到我这里的结果
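For completeness, here is the calculus step being alluded to: for the sigmoid activation g(z) = 1/(1 + e^{-z}),

$$
g'(z) = \frac{e^{-z}}{\big(1+e^{-z}\big)^2} = g(z)\,\big(1-g(z)\big),
\qquad\text{so}\qquad
g'\big(z^{(3)}\big) = a^{(3)} \;.\!*\; \big(1-a^{(3)}\big).
$$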
148
00:05:16,860 --> 00:05:19,690
But I'll just tell you pragmatically what that means.
但是我只是从实际角度告诉你这是什么意思
149
00:05:20,000 --> 00:05:21,260
What you do to compute this g
你计算这个 g'
150
00:05:21,460 --> 00:05:23,310
prime, these derivative terms is
这个导数项其实是
151
00:05:23,510 --> 00:05:25,660
just a(3) dot times 1
a(3) 点乘 (1-a(3))
152
00:05:26,010 --> 00:05:27,900
minus a(3), where a(3)
这里a(3)是
153
00:05:28,160 --> 00:05:29,420
is the vector of activations.
激励向量
154
00:05:30,150 --> 00:05:31,440
1 is the vector of
1是以1为元素的向量
155
00:05:31,600 --> 00:05:33,240
ones and a(3) is
a(3) 又是
156
00:05:34,020 --> 00:05:35,970
again the vector of
一个对那一层的
157
00:05:36,290 --> 00:05:38,850
activation values for that layer.
激励向量
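In the notation used here (with .* denoting the element-wise product), that formula reads:

$$
\delta^{(3)} = \big(\Theta^{(3)}\big)^{T}\,\delta^{(4)} \;.\!*\; g'\big(z^{(3)}\big),
\qquad
g'\big(z^{(3)}\big) = a^{(3)} \;.\!*\; \big(1 - a^{(3)}\big).
$$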
158
00:05:39,170 --> 00:05:40,210
Next you apply a similar
接下来你应用一个相似的公式
159
00:05:40,540 --> 00:05:42,850
formula to compute delta 2
来计算 δ(2)
160
00:05:43,220 --> 00:05:45,230
where again that can be
同样这里可以利用一个
161
00:05:45,670 --> 00:05:47,410
computed using a similar formula.
相似的公式
162
00:05:48,450 --> 00:05:49,950
Only now it is a2
只是在这里
163
00:05:50,120 --> 00:05:53,850
like so, and I won't
是 a(2)
164
00:05:53,960 --> 00:05:55,020
prove it here, but you
这里我并没有证明
165
00:05:55,110 --> 00:05:56,400
can actually, it's possible to
但是如果你懂微积分的话
166
00:05:56,490 --> 00:05:57,520
prove it if you know calculus
证明是完全可以做到的
167
00:05:58,240 --> 00:05:59,520
that this expression is equal
那么这个表达式从数学上讲
168
00:05:59,860 --> 00:06:02,010
to mathematically, the derivative of
就等于激励函数
169
00:06:02,190 --> 00:06:03,570
the activation function
g 函数的导数
170
00:06:04,040 --> 00:06:05,460
g, which I'm denoting
这里我用
171
00:06:05,910 --> 00:06:08,540
by g prime. And finally,
g' 来表示
172
00:06:09,270 --> 00:06:10,690
that's it and there is
最后 就到这儿结束了
173
00:06:10,860 --> 00:06:13,650
no delta1 term, because the
这里没有 δ(1) 项 因为
174
00:06:13,720 --> 00:06:15,590
first layer corresponds to the
第一层对应输入层
175
00:06:15,630 --> 00:06:16,940
input layer and those are just the
那只是表示
176
00:06:17,000 --> 00:06:18,200
features we observed in our
我们在训练集观察到的
177
00:06:18,300 --> 00:06:20,380
training sets, so there isn't any error associated with them.
所以不会存在误差
178
00:06:20,600 --> 00:06:22,080
It's not like, you know,
这就是说
179
00:06:22,120 --> 00:06:23,680
we don't really want to try to change those values.
我们是不想改变这些值的
180
00:06:24,320 --> 00:06:25,240
And so we have delta
所以这个例子中我们的 δ 项就只有
181
00:06:25,510 --> 00:06:28,090
terms only for layers 2 and 3 for this example.
第2层和第3层
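Continuing the hypothetical forward-propagation sketch above, the backward pass just described could look like this in Python/NumPy (bias units again omitted, so this illustrates the delta formulas rather than a complete implementation).

```python
def backward_deltas(y, Theta2, Theta3, a2, a3, a4):
    # y is the label vector for this single training example (same length as a4).
    delta4 = a4 - y                                  # delta(4) = a(4) - y
    delta3 = (Theta3.T @ delta4) * (a3 * (1 - a3))   # (Theta(3))' * delta(4) .* g'(z(3))
    delta2 = (Theta2.T @ delta3) * (a2 * (1 - a2))   # same formula, one layer earlier
    # No delta1: the input layer is just the observed features.
    return delta2, delta3, delta4
```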
182
00:06:30,170 --> 00:06:32,120
The name back propagation comes from
反向传播法这个名字
183
00:06:32,170 --> 00:06:33,260
the fact that we start by
源于我们从
184
00:06:33,350 --> 00:06:34,720
computing the delta term for
输出层开始计算
185
00:06:34,740 --> 00:06:36,190
the output layer and then
δ项
186
00:06:36,370 --> 00:06:37,480
we go back a layer and
然后我们返回到上一层
187
00:06:37,880 --> 00:06:39,670
compute the delta terms for the
计算第三隐藏层的
188
00:06:39,850 --> 00:06:41,050
third hidden layer and then we
δ项 接着我们
189
00:06:41,180 --> 00:06:42,540
go back another step to compute
再往前一步来计算
190
00:06:42,770 --> 00:06:44,070
delta 2 and so, we're sort of
δ(2) 所以说
191
00:06:44,660 --> 00:06:46,060
back propagating the errors from
我们是类似于把输出层的误差
192
00:06:46,280 --> 00:06:47,270
the output layer to layer 3
反向传播给了第3层
193
00:06:47,650 --> 00:06:50,180
to layer 2, and hence the name back propagation.
然后是再传到第二层 这就是反向传播的意思
194
00:06:51,270 --> 00:06:53,120
Finally, the derivation is
最后 这个推导过程是出奇的麻烦的
195
00:06:53,340 --> 00:06:56,510
surprisingly complicated, surprisingly involved but
出奇的复杂
196
00:06:56,820 --> 00:06:58,100
if you just do these few steps
但是如果你按照
197
00:06:58,280 --> 00:07:00,130
of computation it is possible
这样几个步骤计算
198
00:07:00,680 --> 00:07:02,540
to prove, via a frankly somewhat
就有可能通过一个坦白说有些
199
00:07:02,810 --> 00:07:04,440
complicated mathematical proof,
复杂的数学证明来证明
200
00:07:05,200 --> 00:07:07,410
It's possible to prove that if
如果你忽略正则化所产生的项