15 - 5 - Anomaly Detection vs. Supervised Learning (8 min).srt

1
00:00:00,180 --> 00:00:01,210
In the last video, we talked

2
00:00:01,580 --> 00:00:02,950
about the process of evaluating

3
00:00:03,790 --> 00:00:05,780
an anomaly detection algorithm and

4
00:00:05,910 --> 00:00:06,980
there we started to use some

5
00:00:07,210 --> 00:00:08,810
labeled data, with examples

6
00:00:08,880 --> 00:00:10,150
that we knew were either anomalous

7
00:00:11,010 --> 00:00:13,170
or not anomalous, with y equals 1 or y equals 0.

8
00:00:14,690 --> 00:00:15,380
So the question then arises, if

9
00:00:15,690 --> 00:00:17,700
we have this labeled data,

10
00:00:18,130 --> 00:00:19,620
we have some examples that are

11
00:00:19,750 --> 00:00:20,840
known to be anomalies and some

12
00:00:21,020 --> 00:00:21,850
that are known not to be

13
00:00:22,090 --> 00:00:23,540
anomalies, why don't we

14
00:00:23,640 --> 00:00:25,580
just use a supervised learning algorithm,

15
00:00:25,720 --> 00:00:26,790
so why don't we just use

16
00:00:27,110 --> 00:00:28,360
logistic regression or a neural

17
00:00:28,680 --> 00:00:29,770
network to try to

18
00:00:30,020 --> 00:00:31,260
learn directly from our labeled

19
00:00:31,550 --> 00:00:34,120
data, to predict whether y equals one or y equals zero?
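
A minimal sketch of the supervised baseline just mentioned, fitting logistic regression on labeled data to predict y; this is not from the lecture, scikit-learn is just one possible choice, and the data and the names X and y are illustrative assumptions:

import numpy as np
from sklearn.linear_model import LogisticRegression

# X: (m, n) feature matrix; y: labels, 1 = anomalous, 0 = normal.
# These values are made up purely for illustration.
X = np.array([[42.0, 0.31], [39.5, 0.28], [120.7, 0.95], [40.2, 0.33]])
y = np.array([0, 0, 1, 0])

clf = LogisticRegression().fit(X, y)          # learns p(y = 1 | x) directly from the labels
print(clf.predict(np.array([[41.0, 0.30]])))  # predict y for a new example
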
20
00:00:34,900 --> 00:00:35,900
In this video, I'll try to

21
00:00:36,160 --> 00:00:37,170
share with you some of

22
00:00:37,350 --> 00:00:38,820
the thinking and some guidelines for

23
00:00:39,130 --> 00:00:40,610
when you should probably use an

24
00:00:40,720 --> 00:00:42,160
anomaly detection algorithm and when

25
00:00:42,440 --> 00:00:43,500
it might be more fruitful to consider

26
00:00:43,920 --> 00:00:45,380
using a supervised learning algorithm.

27
00:00:47,160 --> 00:00:48,950
This slide shows what are

28
00:00:49,010 --> 00:00:50,130
the settings under which you should

29
00:00:50,900 --> 00:00:52,370
maybe use anomaly detection versus

30
00:00:52,930 --> 00:00:54,590
when supervised learning might be more fruitful.

31
00:00:56,030 --> 00:00:57,440
If you have a problem with a

32
00:00:57,560 --> 00:00:58,820
very small number of positive

33
00:00:59,720 --> 00:01:01,780
examples, and remember examples of

34
00:01:01,890 --> 00:01:03,000
y equals one are the

35
00:01:03,650 --> 00:01:05,520
anomalous examples, then

36
00:01:06,170 --> 00:01:08,160
you might consider using an anomaly detection algorithm instead.

37
00:01:09,260 --> 00:01:10,430
So having 0 to 20,

38
00:01:10,600 --> 00:01:12,740
maybe up to 50 positive examples,

39
00:01:13,450 --> 00:01:15,190
might be pretty typical, and usually,

40
00:01:15,680 --> 00:01:16,810
we have such a small set

41
00:01:17,130 --> 00:01:18,340
of positive examples,

42
00:01:19,270 --> 00:01:20,170
we are going to save the positive

43
00:01:20,510 --> 00:01:21,530
examples just for the cross

44
00:01:21,840 --> 00:01:24,440
validation sets and test sets.

45
00:01:24,850 --> 00:01:26,130
In contrast, in a typical

46
00:01:26,510 --> 00:01:28,560
normal anomaly detection setting,

47
00:01:29,340 --> 00:01:30,630
we will often have a relatively

48
00:01:31,010 --> 00:01:32,340
large number of negative examples,

49
00:01:33,110 --> 00:01:34,300
of these normal examples of

50
00:01:34,910 --> 00:01:36,710
normal aircraft engines.

51
00:01:37,720 --> 00:01:38,900
And we can then use this very

52
00:01:39,200 --> 00:01:40,240
large number of negative examples,

53
00:01:41,470 --> 00:01:42,510
with which to fit the model

54
00:01:43,000 --> 00:01:44,090
p of x. And so, there is

55
00:01:44,190 --> 00:01:45,930
this idea in many anomaly detection

56
00:01:46,320 --> 00:01:48,510
applications, you have

57
00:01:48,760 --> 00:01:50,220
very few positive examples, and

58
00:01:50,320 --> 00:01:52,540
lots of negative examples, and when

59
00:01:52,810 --> 00:01:54,960
we are doing the process of

60
00:01:55,220 --> 00:01:57,520
estimating p of x, of fitting all those Gaussian parameters,

61
00:01:58,650 --> 00:02:00,690
we need only negative examples to do that.

62
00:02:00,850 --> 00:02:01,680
So if you have a lot of negative data,

63
00:02:02,140 --> 00:02:04,310
we can still fit p of x pretty well.
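
A minimal sketch of the idea just described, fitting the Gaussian parameters of p of x from negative (normal) examples only; this is illustrative rather than the lecture's own code, and the name X_train is an assumption:

import numpy as np

def fit_gaussian(X_train):
    # X_train: (m, n) matrix containing only normal (y = 0) examples
    mu = X_train.mean(axis=0)       # per-feature mean
    sigma2 = X_train.var(axis=0)    # per-feature variance
    return mu, sigma2

def p(x, mu, sigma2):
    # p(x) as a product of independent univariate Gaussians, one per feature
    coeff = 1.0 / np.sqrt(2.0 * np.pi * sigma2)
    return float(np.prod(coeff * np.exp(-((x - mu) ** 2) / (2.0 * sigma2))))
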
64
00:02:05,340 --> 00:02:07,090
In contrast, for supervised learning,

65
00:02:07,760 --> 00:02:08,790
more typically we would have

66
00:02:09,180 --> 00:02:10,810
a reasonably large number of

67
00:02:11,050 --> 00:02:12,370
both positive and negative examples.

68
00:02:13,920 --> 00:02:14,970
And so this is one

69
00:02:15,070 --> 00:02:16,240
way to look at your problem

70
00:02:16,770 --> 00:02:17,860
and decide if you should use

71
00:02:18,240 --> 00:02:20,180
an anomaly detection algorithm or a supervised learning algorithm.

72
00:02:21,750 --> 00:02:24,750
Here is another way people often think about anomaly detection algorithms.

73
00:02:25,530 --> 00:02:26,890
So, for anomaly detection applications,

74
00:02:27,900 --> 00:02:28,890
often there are many

75
00:02:29,040 --> 00:02:30,600
different types of anomalies.

76
00:02:31,280 --> 00:02:31,770
So think about aircraft engines.

77
00:02:32,040 --> 00:02:34,680
You know, there are so many different ways for aircraft engines to go wrong.

78
00:02:34,880 --> 00:02:36,980
Right? There are so many things that could go wrong that could break an aircraft engine.

79
00:02:38,830 --> 00:02:40,080
And so, if that's the

80
00:02:40,120 --> 00:02:40,940
case, and you have a pretty small

81
00:02:41,140 --> 00:02:43,560
set of positive examples, then

82
00:02:44,430 --> 00:02:46,760
it can be difficult for

83
00:02:47,580 --> 00:02:48,380
an algorithm to learn from your small

84
00:02:48,740 --> 00:02:50,130
set of positive examples what the anomalies look like.

85
00:02:50,180 --> 00:02:51,880
And in particular,

86
00:02:52,800 --> 00:02:54,050
you know, future anomalies may look

87
00:02:54,220 --> 00:02:55,750
nothing like the ones you've seen so far.

88
00:02:56,050 --> 00:02:57,540
So maybe in your set

89
00:02:57,790 --> 00:02:59,030
of positive examples, maybe you

90
00:02:59,190 --> 00:02:59,740
had seen 5 or 10, or 20

91
00:02:59,950 --> 00:03:02,960
different ways that an aircraft engine could go wrong.

92
00:03:03,780 --> 00:03:05,600
But maybe tomorrow, you

93
00:03:05,780 --> 00:03:07,110
need to detect a totally

94
00:03:07,440 --> 00:03:08,870
new set, a totally new

95
00:03:09,250 --> 00:03:10,620
type of anomaly, a totally

96
00:03:10,820 --> 00:03:12,170
new way for an aircraft

97
00:03:12,570 --> 00:03:13,870
engine to be broken that

98
00:03:14,090 --> 00:03:15,660
you have just never seen before,

99
00:03:15,950 --> 00:03:17,010
and if that is the case,

100
00:03:17,550 --> 00:03:18,460
then it might be more

101
00:03:18,650 --> 00:03:20,020
promising to just model

102
00:03:20,480 --> 00:03:21,770
the negative examples, with a

103
00:03:21,970 --> 00:03:23,620
sort of a Gaussian model

104
00:03:23,970 --> 00:03:24,950
p of x, rather than try

105
00:03:25,290 --> 00:03:26,250
too hard to model the positive

106
00:03:26,640 --> 00:03:27,850
examples, because, you know,

107
00:03:28,040 --> 00:03:29,310
tomorrow's anomaly may be

108
00:03:29,420 --> 00:03:32,680
nothing like the ones you've seen so far.

109
00:03:33,140 --> 00:03:34,640
In contrast, in some other

110
00:03:34,790 --> 00:03:36,170
problems, you have enough

111
00:03:36,600 --> 00:03:37,790
positive examples for an algorithm

112
00:03:38,730 --> 00:03:40,850
to get a sense of what the positive examples are like.

113
00:03:40,980 --> 00:03:42,860
And in particular, if you

114
00:03:42,960 --> 00:03:44,270
think that future positive examples

115
00:03:44,870 --> 00:03:45,690
are likely to be similar

116
00:03:46,130 --> 00:03:46,980
to ones in the training set,

117
00:03:47,670 --> 00:03:49,090
then in that setting it might

118
00:03:49,230 --> 00:03:51,720
be more reasonable to have a supervised learning algorithm

119
00:03:52,550 --> 00:03:53,390
that looks at a lot of

120
00:03:53,520 --> 00:03:54,760
the positive examples, looks at a

121
00:03:54,930 --> 00:03:56,530
lot of the negative examples, and

122
00:03:56,650 --> 00:03:58,980
uses that to try to distinguish between positives and negatives.

123
00:04:01,620 --> 00:04:02,780
So hopefully this gives you

124
00:04:02,870 --> 00:04:04,180
a sense of, if you have

125
00:04:04,520 --> 00:04:05,690
a specific problem, whether you should think

126
00:04:05,950 --> 00:04:07,800
about using an anomaly

127
00:04:08,110 --> 00:04:09,450
detection algorithm or a supervised learning algorithm.

128
00:04:11,110 --> 00:04:12,340
And the key difference really is

129
00:04:12,520 --> 00:04:13,870
that in anomaly detection, often

130
00:04:14,020 --> 00:04:15,040
we have such a small

131
00:04:15,330 --> 00:04:17,200
number of positive examples that it

132
00:04:17,240 --> 00:04:18,640
is not possible for a learning

133
00:04:19,330 --> 00:04:21,810
algorithm to learn that much from the positive examples.

134
00:04:22,430 --> 00:04:23,440
And so what we do instead

135
00:04:23,890 --> 00:04:25,050
is take a large set of

136
00:04:25,230 --> 00:04:26,420
negative examples, and have it just

137
00:04:27,050 --> 00:04:28,070
learn a lot, learn p

138
00:04:28,230 --> 00:04:29,300
of x from just the negative

139
00:04:29,500 --> 00:04:31,730
examples of the normal aircraft engines, say.

140
00:04:32,190 --> 00:04:33,480
And we reserve the small

141
00:04:33,640 --> 00:04:36,740
number of positive examples for evaluating our algorithm,

142
00:04:37,350 --> 00:04:39,680
to use in either the cross validation sets or the test sets.
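
A sketch of how those few reserved positive examples might be used: score the cross-validation set with p of x and pick the threshold epsilon with the best F1 score, since the classes are very skewed; this mirrors the evaluation procedure from the previous video, and the names p_val and y_val are illustrative assumptions:

import numpy as np

def select_epsilon(p_val, y_val):
    # p_val: p(x) for each cross-validation example
    # y_val: labels for those examples, 1 = anomaly, 0 = normal
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), 1000):
        pred = (p_val < eps).astype(int)         # flag as anomaly when p(x) < eps
        tp = np.sum((pred == 1) & (y_val == 1))  # true positives
        fp = np.sum((pred == 1) & (y_val == 0))  # false positives
        fn = np.sum((pred == 0) & (y_val == 1))  # false negatives
        if tp == 0:
            continue                             # precision/recall undefined here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps
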
143
00:04:41,210 --> 00:04:42,380
And just as a side comment about

144
00:04:42,620 --> 00:04:43,970
these many different types of

145
00:04:44,090 --> 00:04:45,490
anomalies: you know, in

146
00:04:45,790 --> 00:04:46,910
some earlier videos we talked

147
00:04:47,050 --> 00:04:49,060
about the email spam examples.

148
00:04:50,020 --> 00:04:51,510
In those examples, there are

149
00:04:51,910 --> 00:04:53,450
actually many different types of spam email.

150
00:04:53,930 --> 00:04:54,750
There's spam email trying to

151
00:04:55,030 --> 00:04:57,650
sell you things, spam email trying to steal your passwords,

152
00:04:58,470 --> 00:05:01,060
these are called phishing emails, and many different types of spam emails.

153
00:05:01,820 --> 00:05:03,490
But for the spam problem, we usually

154
00:05:03,930 --> 00:05:05,660
have enough examples of spam

155
00:05:06,000 --> 00:05:07,400
email to see, you know,

156
00:05:07,490 --> 00:05:08,650
most of these different types of

157
00:05:08,890 --> 00:05:10,200
spam email, because we have a

158
00:05:10,410 --> 00:05:11,650
large set of examples of

159
00:05:11,860 --> 00:05:13,050
spam, and that's why we

160
00:05:13,330 --> 00:05:14,800
usually think of spam as

161
00:05:14,980 --> 00:05:16,510
a supervised learning setting, even

162
00:05:16,710 --> 00:05:17,390
though, you know, there may be

163
00:05:17,530 --> 00:05:19,230
many different types of spam.

164
00:05:21,890 --> 00:05:23,170
And so, if we look at

165
00:05:23,310 --> 00:05:24,940
some applications of anomaly detection

166
00:05:25,600 --> 00:05:27,290
versus supervised learning, we'll find

167
00:05:27,480 --> 00:05:29,280
that, in fraud detection, if

168
00:05:29,410 --> 00:05:31,040
you have many different types

169
00:05:31,450 --> 00:05:32,510
of ways for people to

170
00:05:32,680 --> 00:05:34,120
try to commit fraud, and a

171
00:05:34,170 --> 00:05:35,730
relatively small training set, a

172
00:05:35,880 --> 00:05:37,500
small number of fraudulent users

173
00:05:37,920 --> 00:05:40,300
on your website, then I would use an anomaly detection algorithm.

174
00:05:41,310 --> 00:05:42,520
I should say, if you

175
00:05:42,650 --> 00:05:44,560
have, if you are a

176
00:05:44,700 --> 00:05:46,810
very major online retailer, and

177
00:05:46,930 --> 00:05:48,170
if you actually have had a

178
00:05:48,330 --> 00:05:49,230
lot of people try to commit

179
00:05:49,390 --> 00:05:50,420
fraud on your website, so if

180
00:05:50,480 --> 00:05:51,340
you actually have a lot of

181
00:05:51,410 --> 00:05:53,760
examples where y equals 1, then

182
00:05:53,970 --> 00:05:55,410
you know, sometimes fraud detection

183
00:05:55,700 --> 00:05:58,030
could actually shift over to the supervised learning column.

184
00:05:58,850 --> 00:06:01,000
But, if you

185
00:06:01,210 --> 00:06:02,440
haven't seen that many

186
00:06:02,940 --> 00:06:04,480
examples of users doing

187
00:06:04,690 --> 00:06:05,720
strange things on your website,

188
00:06:05,920 --> 00:06:07,970
then, more frequently, fraud detection

189
00:06:08,510 --> 00:06:09,730
is actually treated with an

190
00:06:09,990 --> 00:06:12,060
anomaly detection algorithm, rather than a supervised learning algorithm.

191
00:06:14,140 --> 00:06:15,160
Other examples: we talked about

192
00:06:15,310 --> 00:06:16,810
manufacturing already. Hopefully, you'll

193
00:06:16,950 --> 00:06:18,230
see more normal examples,

194
00:06:19,110 --> 00:06:19,840
not that many anomalies.

195
00:06:20,520 --> 00:06:21,560
But then again, for some manufacturing

196
00:06:22,180 --> 00:06:23,900
processes, if you're

197
00:06:23,990 --> 00:06:25,690
manufacturing very large volumes

198
00:06:25,860 --> 00:06:26,780
and you've seen a lot

199
00:06:27,230 --> 00:06:29,220
of bad examples, maybe manufacturing

200
00:06:29,790 --> 00:06:31,690
could shift to the supervised learning column as well.

201
00:06:32,630 --> 00:06:33,680
But, if you haven't seen that