12 - 6 - Using An SVM (21 min).srt
1
00:00:00,140 --> 00:00:01,310
So far we've been talking about
目前为止 我们已经讨论了
2
00:00:01,640 --> 00:00:03,290
SVMs in a fairly abstract level.
SVM比较抽象的层面
3
00:00:03,980 --> 00:00:05,030
In this video I'd like to
在这个视频中 我将要
4
00:00:05,200 --> 00:00:06,460
talk about what you actually need
讨论到为了运行或者运用SVM
5
00:00:06,740 --> 00:00:09,410
to do in order to run or to use an SVM.
你实际上所需要的一些东西
6
00:00:11,320 --> 00:00:12,300
The support vector machine algorithm
支持向量机算法
7
00:00:12,850 --> 00:00:14,870
poses a particular optimization problem.
提出了一个特定的优化问题
8
00:00:15,530 --> 00:00:16,940
But as I briefly mentioned in
但是就如在之前的
9
00:00:17,120 --> 00:00:18,150
an earlier video, I really
视频中我简单提到的
10
00:00:18,380 --> 00:00:20,570
do not recommend writing your
我真的不建议你自己写
11
00:00:20,630 --> 00:00:22,810
own software to solve for the parameters theta yourself.
软件来求解参数θ
12
00:00:23,950 --> 00:00:26,110
So just as today, very
就像在今天
13
00:00:26,420 --> 00:00:27,730
few of us, or maybe almost essentially
我们中的很少人 或者其实
14
00:00:28,090 --> 00:00:29,400
none of us would think of
没有人考虑过
15
00:00:29,530 --> 00:00:31,680
writing code ourselves to invert a matrix
自己写代码来对矩阵求逆
16
00:00:31,950 --> 00:00:33,940
or take a square root of a number, and so on.
或求一个数的平方根等
17
00:00:34,190 --> 00:00:36,570
We just, you know, call some library function to do that.
我们只需要调用一些库函数来实现这些功能
18
00:00:36,700 --> 00:00:38,090
In the same way, the
同样的
19
00:00:38,850 --> 00:00:40,310
software for solving the SVM
用以解决SVM
20
00:00:40,620 --> 00:00:42,200
optimization problem is very
最优化问题的软件很
21
00:00:42,440 --> 00:00:43,880
complex, and there have
复杂 且已经有
22
00:00:43,990 --> 00:00:44,960
been researchers that have been
研究者做了
23
00:00:45,110 --> 00:00:47,560
doing essentially numerical optimization research for many years.
多年的数值优化研究了
24
00:00:47,850 --> 00:00:48,960
So you come up with good
因此你提出好的
25
00:00:49,150 --> 00:00:50,550
software libraries and good software
软件库和好的软件
26
00:00:50,930 --> 00:00:52,270
packages to do this.
包来做这样一些事儿
27
00:00:52,470 --> 00:00:53,480
And then strongly recommend just using
然后强烈建议使用
28
00:00:53,860 --> 00:00:55,260
one of the highly optimized software
高优化软件库中的一个
29
00:00:55,710 --> 00:00:57,780
libraries rather than trying to implement something yourself.
而不是尝试自己去实现
30
00:00:58,730 --> 00:01:00,680
And there are lots of good software libraries out there.
有许多好的软件库
31
00:01:00,970 --> 00:01:02,060
The two that I happen to
我正好用得最多的
32
00:01:02,210 --> 00:01:03,220
use the most often are the
两个是
33
00:01:03,400 --> 00:01:05,000
liblinear and libsvm, but there are really
liblinear和libsvm 但是确实有
34
00:01:05,410 --> 00:01:06,860
lots of good software libraries for
很多软件库可以用来
35
00:01:07,030 --> 00:01:08,430
doing this that you know, you can
做这件事儿 你可以
36
00:01:08,600 --> 00:01:10,190
link to many of the
连接许多
37
00:01:10,450 --> 00:01:11,860
major programming languages that you
你可能会用来编写学习算法的
38
00:01:11,950 --> 00:01:14,410
may be using to code up learning algorithm.
主要编程语言
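(For concreteness, here is a minimal sketch of calling an off-the-shelf SVM library instead of writing your own solver. The use of Python/scikit-learn, which wraps libsvm and liblinear, and the tiny made-up data set are assumptions purely for illustration; the course itself works in Octave/MATLAB.)

# Sketch only: let an existing library solve the SVM optimization problem for theta.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # m = 4 examples, n = 2 features
y = np.array([0, 0, 1, 1])                                       # binary labels

clf = SVC(C=1.0, kernel='linear')   # the package solves for the parameters; we only choose C and the kernel
clf.fit(X, y)
print(clf.predict([[0.9, 0.2]]))    # predicted class for a new example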
39
00:01:15,280 --> 00:01:16,460
Even though you shouldn't be writing
尽管你不去写
40
00:01:16,730 --> 00:01:18,330
your own SVM optimization software,
你自己的SVM(支持向量机)的优化软件
41
00:01:19,120 --> 00:01:20,680
there are a few things you need to do, though.
但是你也需要做几件事儿
42
00:01:21,420 --> 00:01:23,130
First is to come up
首先是提出
43
00:01:23,130 --> 00:01:24,230
with some choice of the
参数C的选择
44
00:01:24,320 --> 00:01:25,640
parameter C. We talked a
我们在之前的视频中
45
00:01:25,940 --> 00:01:26,930
little bit of the bias/variance properties of
讨论过偏差/方差在
46
00:01:27,040 --> 00:01:28,850
this in the earlier video.
这方面的性质
47
00:01:30,290 --> 00:01:31,480
Second, you also need to
第二 你也需要
48
00:01:31,630 --> 00:01:33,040
choose the kernel or the
选择核函数 或
49
00:01:33,410 --> 00:01:34,880
similarity function that you want to use.
你想要使用的相似函数
50
00:01:35,730 --> 00:01:37,080
So one choice might
其中一个选择是
51
00:01:37,280 --> 00:01:38,980
be if we decide not to use any kernel.
我们决定不使用任何核函数
52
00:01:40,560 --> 00:01:41,510
And the idea of no kernel
不使用核函数的想法
53
00:01:41,910 --> 00:01:43,600
is also called a linear kernel.
也叫线性核函数
54
00:01:44,130 --> 00:01:45,320
So if someone says, I use
因此 如果有人说他使用
55
00:01:45,530 --> 00:01:46,760
an SVM with a linear kernel,
了线性核的SVM(支持向量机)
56
00:01:47,180 --> 00:01:48,330
what that means is you know, they use
这就意味着 他使用了
57
00:01:48,490 --> 00:01:50,690
an SVM without
不带有
58
00:01:51,020 --> 00:01:52,250
using a kernel and it
核函数的SVM(支持向量机)
59
00:01:52,360 --> 00:01:53,410
was a version of the SVM
这是一个
60
00:01:54,120 --> 00:01:55,870
that just uses theta transpose X, right,
只是用了θTX
61
00:01:56,140 --> 00:01:57,620
that predicts 1 if theta 0
预测1 如果θ0
62
00:01:57,850 --> 00:01:59,420
plus theta 1 X1
+θ1X1
63
00:01:59,740 --> 00:02:01,000
plus so on plus theta
+...+θnXn
64
00:02:01,690 --> 00:02:04,160
N, X N is greater than equals 0.
这个式子大于等于0
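(Written out with the θ notation used above, the decision rule just described is:

predict y = 1  if  θᵀx = θ0 + θ1x1 + ... + θnxn ≥ 0 )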
65
00:02:05,520 --> 00:02:06,830
This term linear kernel, you
线性核函数这个术语
66
00:02:06,950 --> 00:02:08,250
can think of this as you know this
你可以把它想象成
67
00:02:08,480 --> 00:02:09,290
is the version of the SVM
SVM的一个版本
68
00:02:10,340 --> 00:02:12,320
that just gives you a standard linear classifier.
它只是给你一个标准的线性分类器
69
00:02:13,940 --> 00:02:14,700
So that would be one
因此它可以成为一个
70
00:02:15,040 --> 00:02:16,160
reasonable choice for some problems,
解决一些问题的合理选择
71
00:02:17,130 --> 00:02:18,080
and you know, there would be many software
且你知道的 有许多软件
72
00:02:18,470 --> 00:02:20,900
libraries, like liblinear, was
库 比如liblinear就是
73
00:02:21,210 --> 00:02:22,320
one example, out of many,
其中的一个例子
74
00:02:22,840 --> 00:02:23,880
one example of a software library
一个软件库的例子
75
00:02:24,560 --> 00:02:25,620
that can train an SVM
可以用来训练不带
76
00:02:25,980 --> 00:02:27,410
without using a kernel, also
核函数的SVM 也
77
00:02:27,760 --> 00:02:29,470
called a linear kernel.
叫线性内核函数
78
00:02:29,850 --> 00:02:31,340
So, why would you want to do this?
那么你为什么想要做这样一件事儿呢?
79
00:02:31,410 --> 00:02:32,820
If you have a large number of
如果你有大量的
80
00:02:33,150 --> 00:02:34,280
features, if N is
特征值 如果N
81
00:02:34,430 --> 00:02:37,800
large, and M the
很大 且M
82
00:02:37,990 --> 00:02:39,590
number of training examples is
训练的样本数
83
00:02:39,670 --> 00:02:41,050
small, then you know
很小 那么
84
00:02:41,230 --> 00:02:42,300
you have a huge number of
你有大量的
85
00:02:42,360 --> 00:02:43,630
features that if X, this is
特征值 如果X是
86
00:02:43,710 --> 00:02:45,850
an X in Rn, or Rn+1.
X属于Rn+1
87
00:02:46,010 --> 00:02:46,940
So if you have a
那么如果你已经有
88
00:02:47,080 --> 00:02:48,700
huge number of features already, with
大量的特征值 而只有
89
00:02:48,800 --> 00:02:50,540
a small training set, you know, maybe you
很小的训练数据集 也许你
90
00:02:50,610 --> 00:02:51,430
want to just fit a linear
就只想拟合一个线性
91
00:02:51,710 --> 00:02:52,890
decision boundary and not try
的判定边界 而不会去
92
00:02:53,060 --> 00:02:54,420
to fit a very complicated nonlinear
拟合一个非常复杂的非线性
93
00:02:54,860 --> 00:02:56,980
function, because you might not have enough data.
函数 因为你可能没有足够的数据
94
00:02:57,560 --> 00:02:59,330
And you might risk overfitting, if
你可能会过度拟合 如果
95
00:02:59,470 --> 00:03:00,530
you're trying to fit a very complicated function
你试着拟合非常复杂的函数的话
96
00:03:01,540 --> 00:03:03,220
in a very high dimensional feature space,
在一个非常高维的特征空间中
97
00:03:03,980 --> 00:03:04,990
but if your training set sample
但是如果你的训练集样本
98
00:03:05,040 --> 00:03:07,120
is small. So this
很小的话 因此
99
00:03:07,340 --> 00:03:08,600
would be one reasonable setting where
这将是一个合理的设置 在此
100
00:03:08,740 --> 00:03:09,950
you might decide to just
你可以决定
101
00:03:10,700 --> 00:03:11,960
not use a kernel, or
不使用核函数 或
102
00:03:12,250 --> 00:03:15,580
equivalently, to use what's called a linear kernel.
等价地 使用所谓的线性核函数
103
00:03:15,740 --> 00:03:16,740
A second choice for the kernel that
对于内核函数的第二个选择是
104
00:03:16,820 --> 00:03:18,010
you might make, is this Gaussian
你可能做出的是高斯
105
00:03:18,370 --> 00:03:19,920
kernel, and this is what we had previously.
内核函数 这个是我们之前有的
106
00:03:21,270 --> 00:03:22,350
And if you do this, then the
如果你选择这个 那么
107
00:03:22,440 --> 00:03:23,130
other choice you need to make
你需要做的另外一个选择是
108
00:03:23,420 --> 00:03:25,980
is to choose this parameter sigma squared
选择一个参数σ的平方
109
00:03:26,850 --> 00:03:29,800
and we talked a little bit about the bias variance tradeoff
我们也讨论过偏差和方差的权衡
110
00:03:30,820 --> 00:03:32,360
of how, if sigma squared is
如果σ2
111
00:03:32,600 --> 00:03:33,890
large, then you tend
很大 那么你就很有可能
112
00:03:34,160 --> 00:03:35,580
to have a higher bias, lower
会有一个较高的偏差 较低
113
00:03:35,770 --> 00:03:37,650
variance classifier, but if
方差的分类器 但是如果
114
00:03:37,800 --> 00:03:39,700
sigma squared is small, then you
σ2很小 那么你
115
00:03:40,060 --> 00:03:42,360
have a higher variance, lower bias classifier.
就会有较高的方差 较低偏差的分类器
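(As a concrete illustration of these two choices, a minimal sketch in Python/scikit-learn follows; this is an assumption for illustration only, since the course uses Octave/MATLAB packages. Note that scikit-learn's RBF kernel is exp(-γ‖x−l‖²), so γ corresponds to 1/(2σ²); the toy data is made up just so the call runs.)

# Sketch only: choosing C and sigma squared for a Gaussian (RBF) kernel SVM.
import numpy as np
from sklearn.svm import SVC

sigma = 1.0
gamma = 1.0 / (2.0 * sigma ** 2)    # scikit-learn's RBF is exp(-gamma * ||x - l||^2)

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # toy data, m = 4, n = 2
y = np.array([0, 1, 1, 0])                                       # not linearly separable

# Larger C -> lower bias, higher variance; larger sigma (smaller gamma) -> higher bias, lower variance.
clf = SVC(C=1.0, kernel='rbf', gamma=gamma)
clf.fit(X, y)
print(clf.predict([[0.1, 0.9]]))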
116
00:03:43,940 --> 00:03:45,350
So when would you choose a Gaussian kernel?
那么什么时候选择高斯内核函数呢?
117
00:03:46,210 --> 00:03:48,050
Well, if you have
如果你有
118
00:03:48,310 --> 00:03:49,540
features X, I mean in
特征值X 我的意思是在
119
00:03:49,820 --> 00:03:51,370
Rn, and if N
Rn 如果N
120
00:03:51,570 --> 00:03:53,890
is small, and, ideally, you know,
值很小 很理想地
121
00:03:55,660 --> 00:03:57,110
if m is large, right,
如果m值很大
122
00:03:58,470 --> 00:04:00,170
so that's if, you know, we have
那么如果我们有
123
00:04:00,550 --> 00:04:02,340
say, a two-dimensional training set,
如一个二维的训练集
124
00:04:03,130 --> 00:04:04,880
like the example I drew earlier.
就像我前面讲到的例子一样
125
00:04:05,470 --> 00:04:08,320
So n is equal to 2, but we have a pretty large training set.
那么n等于2 但是我们有相当大的训练集
126
00:04:08,680 --> 00:04:09,770
So, you know, I've drawn in a
我已经有了一个
127
00:04:09,950 --> 00:04:10,890
fairly large number of training examples,
相当大的训练样本了
128
00:04:11,650 --> 00:04:12,410
then maybe you want to use
那么可能你想用
129
00:04:12,540 --> 00:04:14,400
a kernel to fit a more
一个内核函数去拟合一个更加
130
00:04:14,910 --> 00:04:16,260
complex nonlinear decision boundary,
复杂的非线性的判定边界
131
00:04:16,650 --> 00:04:18,750
and the Gaussian kernel would be a fine way to do this.
那么高斯内核函数是一个不错的选择
132
00:04:19,480 --> 00:04:20,610
I'll say more towards the end
我会在这个视频的后面
133
00:04:20,720 --> 00:04:22,570
of the video, a little bit
部分讲到更多 一些关于
134
00:04:22,660 --> 00:04:23,760
more about when you might choose a
什么时候你可以选择
135
00:04:23,970 --> 00:04:26,310
linear kernel, a Gaussian kernel and so on.
线性内核函数 高斯内核函数等
136
00:04:27,860 --> 00:04:29,740
But if concretely, if you
但是如果具体地你
137
00:04:30,040 --> 00:04:31,210
decide to use a Gaussian
决定用高斯
138
00:04:31,720 --> 00:04:33,910
kernel, then here's what you need to do.
内核函数的话 那么这里就是你需要做的
139
00:04:35,380 --> 00:04:36,550
Depending on what support vector machine
根据你所要用的支持向量机
140
00:04:37,280 --> 00:04:38,990
software package you use, it
软件包 这
141
00:04:39,100 --> 00:04:40,960
may ask you to implement a
可能需要你实现一个
142
00:04:41,070 --> 00:04:42,200
kernel function, or to implement
核函数 或者实现
143
00:04:43,060 --> 00:04:43,880
the similarity function.
相似的函数
144
00:04:45,020 --> 00:04:46,750
So if you're using an
因此 如果你用
145
00:04:47,010 --> 00:04:49,820
octave or MATLAB implementation of
octave或者Matlab来实现
146
00:04:50,000 --> 00:04:50,720
an SVM, it may ask you
支持向量机的话 那么就需要你
147
00:04:50,810 --> 00:04:52,560
to provide a function to
提供一个函数来
148
00:04:52,690 --> 00:04:54,680
compute a particular feature of the kernel.
计算核函数的特征值
149
00:04:55,110 --> 00:04:56,480
So this is really computing f
因此这个是在一个特定值i
150
00:04:56,770 --> 00:04:57,890
subscript i for one
的情况下来
151
00:04:58,220 --> 00:04:59,560
particular value of i, where
计算fi
152
00:05:00,570 --> 00:05:02,310
f here is just a
这里的f只是一个
153
00:05:02,330 --> 00:05:03,570
single real number, so maybe
简单的实数
154
00:05:03,840 --> 00:05:05,060
I should move this better written
也许最好是写成
155
00:05:05,250 --> 00:05:07,230
f(i), but what you
f(i) 但是你所
156
00:05:07,510 --> 00:05:08,130
need to do is to write a kernel
需要做的是写一个核
157
00:05:08,480 --> 00:05:09,530
function that takes this input, you know,
函数 让它把这个作为输入 你知道的
158
00:05:10,610 --> 00:05:11,910
a training example or a
一个训练样本 或者一个
159
00:05:12,020 --> 00:05:13,140
test example whatever it takes
测试样本 不管是什么它
160
00:05:13,280 --> 00:05:14,640
in some vector X and takes
把向量X作为输入
161
00:05:14,990 --> 00:05:16,220
as input one of the
并把其中一个
162
00:05:16,370 --> 00:05:18,270
landmarks and but
标识作为输入 不过
163
00:05:18,880 --> 00:05:20,750
I've only written down X1 and
在这里我只写了X1和
164
00:05:20,950 --> 00:05:21,810
X2 here, because the
X2 因为这些
165
00:05:21,900 --> 00:05:23,750
landmarks are really training examples as well.
标识也是训练样本
166
00:05:24,470 --> 00:05:26,160
But what you
但是你所
167
00:05:26,400 --> 00:05:27,490
need to do is write software that
需要做的是写一个
168
00:05:27,670 --> 00:05:28,960
takes this input, you know, X1, X2
可以将这些X1,X2作为输入的软件
169
00:05:29,150 --> 00:05:30,320
and computes this sort
并用它们来计算
170
00:05:30,580 --> 00:05:31,950
of similarity function between them
这个相似函数
171
00:05:32,530 --> 00:05:33,470
and return a real number.
之后返回一个实数
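(A minimal sketch of such a similarity/kernel function, written here in Python/numpy for illustration; the course would have you write the equivalent in Octave/MATLAB, and the name gaussian_kernel is just a placeholder.)

import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    # Similarity between a vector x1 and a landmark x2: exp(-||x1 - x2||^2 / (2*sigma^2)).
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-diff.dot(diff) / (2.0 * sigma ** 2))

print(gaussian_kernel([1.0, 2.0], [1.5, 1.0]))   # a single real number, as the package expects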
172
00:05:36,180 --> 00:05:37,430
And so what some support vector machine
因此一些支持向量机的
173
00:05:37,580 --> 00:05:39,040
packages do is expect
包所做的是期望
174
00:05:39,510 --> 00:05:40,860
you to provide this kernel function
你能提供一个核函数
175
00:05:41,410 --> 00:05:44,580
that take this input you know, X1, X2 and returns a real number.
能够输入X1, X2 并返回一个实数
176
00:05:45,580 --> 00:05:46,460
And then it will take it from there
从这里开始
177
00:05:46,850 --> 00:05:49,070
and it will automatically generate all the features, and
它将自动地生成所有特征变量
178
00:05:49,410 --> 00:05:51,480
so automatically take X and
自动利用特征变量X
179
00:05:51,600 --> 00:05:53,370
map it to f1,
并用你写的函数对应到f1
180
00:05:53,420 --> 00:05:54,420
f2, down to f(m) using
f2 一直到f(m)
181
00:05:54,750 --> 00:05:56,200
this function that you write, and
并且
182
00:05:56,310 --> 00:05:57,190
generate all the features and
生成所有特征变量
183
00:05:57,650 --> 00:05:59,080
train the support vector machine from there.
并从这儿开始训练支持向量机
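(Behind the scenes, that feature generation amounts to evaluating the similarity of x against every landmark, i.e. every training example. A hedged sketch follows; gaussian_kernel and map_to_features are placeholder names, and the tiny landmark set is invented for illustration.)

import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return np.exp(-diff.dot(diff) / (2.0 * sigma ** 2))

def map_to_features(x, landmarks, sigma=1.0):
    # One feature f_i = similarity(x, l(i)) per landmark; the landmarks are the m training examples.
    return np.array([gaussian_kernel(x, l, sigma) for l in landmarks])

landmarks = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])   # hypothetical training set, m = 3
print(map_to_features([0.5, 0.5], landmarks))                 # the feature vector [f1, f2, f3]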
184
00:05:59,870 --> 00:06:00,800
But sometimes you do need to
但是有些时候你却一定要
185
00:06:00,880 --> 00:06:04,710
provide this function yourself.
自己提供这个函数
186
00:06:05,680 --> 00:06:06,770
Although, if you are using the Gaussian kernel, some SVM implementations will also include the Gaussian kernel
不过如果你使用高斯核函数 一些SVM的实现也会自带高斯核函数
187
00:06:06,980 --> 00:06:09,950
and a
和一
188
00:06:10,040 --> 00:06:10,990
few other kernels as well, since
些其他的核函数 这是因为
189
00:06:11,230 --> 00:06:13,580
the Gaussian kernel is probably the most common kernel.
高斯核函数可能是最常见的核函数
190
00:06:14,880 --> 00:06:16,290
Gaussian and linear kernels are
高斯核函数和线性核函数无疑是
191
00:06:16,380 --> 00:06:18,210
really the two most popular kernels by far.
最普遍的核函数
192
00:06:19,130 --> 00:06:20,230
Just one implementational note.
一个关于实现的注意事项
193
00:06:20,750 --> 00:06:21,820
If you have features of very
如果你有大小很不一样
194
00:06:22,080 --> 00:06:23,620
different scales, it is important
的特征变量
195
00:06:24,700 --> 00:06:26,270
to perform feature scaling before
在使用高斯函数之前
196
00:06:26,600 --> 00:06:27,780
using the Gaussian kernel.
将这些特征变量的大小按比例归一化
197
00:06:28,580 --> 00:06:29,180
And here's why.
原因如下
198
00:06:30,150 --> 00:06:31,600
If you imagine the computing
如果假设你在计算
199
00:06:32,290 --> 00:06:33,570
the norm between X and
X和l之间的范数
200
00:06:33,790 --> 00:06:34,890
l, right, so this term here,
就是这样一个式子