<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>Navigating Misinformation - How to identify and verify what you see on the web</title>
<link href="/2022/05/26/self-directed-course-navigating-misinformation/"/>
<url>/2022/05/26/self-directed-course-navigating-misinformation/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h2><p>This article is a set of learning notes for the self-directed course <a href="https://journalismcourses.org/course/misinformation/">Navigating Misinformation - How to identify and verify what you see on the web</a>.</p><p>The main purpose of this course is to help researchers learn how fact-checkers identify and verify online content, namely how responsible reporting works in an age of misinformation/disinformation.<br>The course covers the topics below:</p><ol><li>Discovery of problematic content</li><li>Basic verification of online sources</li><li>Advanced verification of online sources<ol><li>How date and time stamps work on social posts</li><li>How to geo-locate where a photo or video was taken</li><li>Tools overview to help determine the time in a photo or video</li><li>Verification challenge</li></ol></li></ol><h2 id="Discovery-of-problematic-content"><a href="#Discovery-of-problematic-content" class="headerlink" title="Discovery of problematic content"></a>Discovery of problematic content</h2><p>To keep track of misleading claims and content, journalists monitor multiple social media platforms. The first and foremost thing to figure out is <strong>what should be monitored - groups and/or topics</strong>, and what you choose will depend on the social platform. In general, journalists use Reddit, Facebook and Twitter as information sources.</p><h3 id="Information-sources"><a href="#Information-sources" class="headerlink" title="Information sources"></a>Information sources</h3><h4 id="Reddit"><a href="#Reddit" class="headerlink" title="Reddit"></a>Reddit</h4><p>Reddit is the eighth most popular website in the world, even more popular than Twitter. <strong>Misinformation that ends up circulating widely on Facebook and Twitter often appears on Reddit first</strong>. Reddit is made up of a collection of open forums called <strong>subreddits</strong>, which can be discovered through the general search page. Once you have found an interesting subreddit, you can search for its name to discover similar subreddits. Also, keep an eye out for new subreddits mentioned in the comments.</p><h4 id="Twitter"><a href="#Twitter" class="headerlink" title="Twitter"></a>Twitter</h4><p>There are two key ways to monitor Twitter activity: <strong>terms and lists</strong>.<br>The terms include keywords, domains, hashtags and usernames. More specifically, journalists focus on websites and particular accounts that are likely to produce misleading content, and on tweets that include certain keywords or hashtags, like “snowflakes” and “#Lockherup”. The <a href="https://developer.twitter.com/en/docs/api-reference-index">Twitter Search API</a> provides a powerful way to form a query. Below is an example of Twitter search operators:</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/07/tweets_monitoring.png" alt="Twitter search operators"></p>
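<p>As a quick illustration, the operators above can simply be composed into a plain query string before being pasted into the Twitter search box or a search URL. This is a minimal sketch in Python (standard library only); <code>build_query</code> is a hypothetical helper, not part of any Twitter library:</p><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">from urllib.parse import quote_plus</span><br><span class="line"></span><br><span class="line">def build_query(keywords=(), hashtags=(), from_users=()):</span><br><span class="line">    # Compose a Twitter search query from the basic operators shown above.</span><br><span class="line">    parts = list(keywords)</span><br><span class="line">    parts += ["#" + t.lstrip("#") for t in hashtags]</span><br><span class="line">    parts += ["from:" + u.lstrip("@") for u in from_users]</span><br><span class="line">    parts.append("-filter:retweets")  # drop retweets to reduce duplicates</span><br><span class="line">    return " ".join(parts)</span><br><span class="line"></span><br><span class="line">q = build_query(keywords=["snowflakes"], hashtags=["Lockherup"])</span><br><span class="line">print(q)              # snowflakes #Lockherup -filter:retweets</span><br><span class="line">print(quote_plus(q))  # URL-encoded form, ready to drop into a search URL</span><br></pre></td></tr></table></figure>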
<p>On the other hand, using Twitter lists is another effective way of quickly putting together groups of accounts to monitor. The lists can be created by any Twitter user who is following a group of accounts as a unit. Journalists use Twitter lists to capitalize on the expertise of other journalists; however, Twitter has not provided an API to easily search Twitter lists based on keywords. Thus, we have to use a Google hack to search through Twitter lists.<br>The hack is: for any topic keywords that you are interested in, enter the query <code>site:twitter.com/*/lists [keywords]</code> in the Google search bar. Google will return keyword-related public lists from all Twitter users. What’s more, by going to the list creator’s profile and clicking <code>More</code> and then <code>Lists</code>, you can find more lists that may interest you.<br>By doing so recursively, you can combine the lists that you have found into a super list.</p>
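<p>To make the hack repeatable, here is a minimal sketch (plain Python, standard library only; the topic list is made up for illustration) that prints a ready-to-open Google search URL for each topic:</p><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">from urllib.parse import quote_plus</span><br><span class="line"></span><br><span class="line"># Hypothetical topic keywords; swap in whatever you are monitoring.</span><br><span class="line">topics = ["vaccine", "election fraud", "climate change"]</span><br><span class="line"></span><br><span class="line">for topic in topics:</span><br><span class="line">    query = "site:twitter.com/*/lists " + topic</span><br><span class="line">    # A regular Google web search URL with the query URL-encoded.</span><br><span class="line">    print("https://www.google.com/search?q=" + quote_plus(query))</span><br></pre></td></tr></table></figure>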
<h4 id="Facebook"><a href="#Facebook" class="headerlink" title="Facebook"></a>Facebook</h4><p>The potential to monitor Facebook is narrower for two reasons. First, only content that users have designated as public is available. Second, Facebook does not support direct, programmatic access to the public feed.</p><h3 id="Monitoring-Reddit-Facebook-and-Instagram-with-CrowdTangle"><a href="#Monitoring-Reddit-Facebook-and-Instagram-with-CrowdTangle" class="headerlink" title="Monitoring Reddit, Facebook and Instagram with CrowdTangle"></a>Monitoring Reddit, Facebook and Instagram with CrowdTangle</h3><p><a href="https://www.crowdtangle.com/">CrowdTangle</a> was made free after being acquired by Facebook. It takes search queries, groups and pages and creates custom social feeds for Facebook, Instagram, and Reddit. If you give it a search query, it creates a feed of posts from the platform that match that query. If you give it a list of accounts, it creates a feed of posts from those accounts.</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/08/crowdtangle.png" alt="CrowdTangle user interface"></p><h3 id="Monitoring-Twitter-with-TweetDeck"><a href="#Monitoring-Twitter-with-TweetDeck" class="headerlink" title="Monitoring Twitter with TweetDeck"></a>Monitoring Twitter with TweetDeck</h3><p>By far the easiest way to monitor multiple Twitter streams in real time is TweetDeck. With TweetDeck, you can arrange an unlimited number of real-time streams of tweets side-by-side in columns that can easily be cycled through.</p><p><img src="https://firstdraftnews.org/wp-content/uploads/2017/08/tweetdeck.png" alt="TweetDeck"></p><h2 id="Basic-verification-of-online-sources"><a href="#Basic-verification-of-online-sources" class="headerlink" title="Basic verification of online sources"></a>Basic verification of online sources</h2><p>When attempting to verify a piece of content, journalists always investigate five elements:</p><ol><li><p><strong>Provenance</strong>: verify whether the content is original</p><p> If we are not looking at the original, all the metadata about the source and date will be wrong and useless. Journalists face the challenge that footage can easily jump from platform to platform or spread within a platform, so we should always be suspicious about a piece of content’s originality.</p></li><li><p><strong>Source</strong>: verify who created the content</p><p> Note that the source means who captured the content, not who uploaded it. To verify the source, one can rely on two approaches: directly contact the user, and check that the user’s location and the event location are the same.</p></li><li><p><strong>Date</strong>: verify when the content was captured</p><p> Never assume the upload date is the date the content was captured.</p></li><li><p><strong>Location</strong>: verify where the content was captured</p><p> The geolocation can be easily manipulated on social media platforms, so it is better to double-check the location on a map or satellite image.</p></li><li><p><strong>Motivation</strong>: verify why the content was captured</p><p> The user can be an accidental eyewitness or a responsible stakeholder.</p></li></ol><p>With the help of reverse image search tools such as <a href="https://images.google.com/">Google Images</a> and <a href="https://chrome.google.com/webstore/detail/reveye-reverse-image-sear/keaaclcjhehbbapnphnmpiklalfhelgf?hl=en">RevEye</a>, one can easily accomplish this verification.</p><h2 id="Advanced-verification-of-online-sources"><a href="#Advanced-verification-of-online-sources" class="headerlink" title="Advanced verification of online sources"></a>Advanced verification of online sources</h2><ol><li><p>Wolfram Alpha</p><p> Wolfram Alpha is a knowledge engine that brings together available information from across the web. It has a powerful tool for checking the weather at any particular location on any date. When you are trying to <strong>double-check the date on an image or video</strong>, it can be very useful.</p></li><li><p>Shadow Analysis</p><p> Check whether the shadows have the right shape, the right length and a consistent direction.</p></li><li><p>Geo-location</p><p> Only a very small percentage of social media posts are geo-tagged by users themselves. Luckily, <a href="https://www.google.com/maps/">high-quality satellite and street view imagery</a> allows you to place yourself on the map and stand where the user was standing when they captured the footage.</p></li></ol><h2 id="References"><a href="#References" class="headerlink" title="References"></a>References</h2><ul><li><a href="https://firstdraftnews.org/articles/monitor-social-media/">How to begin to monitor social media for misinformation</a></li></ul>]]></content>
<categories>
<category> Factchecking </category>
</categories>
</entry>
<entry>
<title>Minutes of the First AGI Salon of Deecamp Group 28</title>
<link href="/2019/07/21/AGI-salon-1/"/>
<url>/2019/07/21/AGI-salon-1/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Deecamp-28组AGI沙龙活动纪要"><a href="#Deecamp-28组AGI沙龙活动纪要" class="headerlink" title="Minutes of the Deecamp Group 28 AGI Salon"></a>Minutes of the Deecamp Group 28 AGI Salon</h2><ul><li>Date: Sunday, July 21, 2019, 8:00 pm - 10:30 pm</li><li>Venue: Room 132, Teaching Building 1, University of Chinese Academy of Sciences</li><li>Host on duty: 朱正源</li><li>Salon topic: sharing quantitative investment models and practical demos</li></ul><h2 id="沙龙内容"><a href="#沙龙内容" class="headerlink" title="Salon content"></a>Salon content</h2><h3 id="主持人宣讲"><a href="#主持人宣讲" class="headerlink" title="Host's opening talk"></a>Host's opening talk</h3><p><img src="https://user-images.githubusercontent.com/13566583/61606864-d7ecfc00-ac7e-11e9-9dac-80333cea971e.jpg" alt="agi-host"></p><h3 id="刘兆丰"><a href="#刘兆丰" class="headerlink" title="刘兆丰"></a>刘兆丰</h3><blockquote><p>Presented <strong>Using Deep Reinforcement Learning to Trade</strong>: introduced the development of reinforcement learning, explained a quantitative trading decision model that combines deep neural networks, recurrent neural networks and reinforcement learning, and showed the model's experimental results on real data.<br><img src="https://user-images.githubusercontent.com/13566583/61606998-6bbec800-ac7f-11e9-8897-4da210ada6f3.jpg" alt="8061563762930_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61607007-737e6c80-ac7f-11e9-814f-7a1fad39e1a6.jpg" alt="8071563762947_ pic_hd"></p></blockquote><h3 id="朱正源"><a href="#朱正源" class="headerlink" title="朱正源"></a>朱正源</h3><blockquote><p>Presented <a href="https://docs.google.com/presentation/d/1W7RGD3X_MZB3dfzaTrQdGYzv_zuay2JpYOTQUjXzK5A/edit#slide=id.g4461849552_8_1825">Introduction to Quantitative Investment with Deep Learning</a>: shared a stock prediction model for quantitative investment that uses the time-series regression models common in industry, builds several hypotheses in advance, and makes sliding-window predictions with an LSTM; also described the realities of quantitative trading. The stock market is risky, so invest with caution~<br><img src="https://user-images.githubusercontent.com/13566583/61607014-7bd6a780-ac7f-11e9-801e-524c6cbea4c3.jpg" alt="8081563762955_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61607019-7da06b00-ac7f-11e9-84ab-703ba1ad8525.jpg" alt="8091563762967_ pic_hd"></p></blockquote><h3 id="葛景琳"><a href="#葛景琳" class="headerlink" title="葛景琳"></a>葛景琳</h3><blockquote><p><strong>Designer's talk</strong>: introduced the stages and goals of the early-stage research and gave an overview of the process for driving a new product project forward.</p></blockquote><p><img src="https://user-images.githubusercontent.com/13566583/61606932-23070f00-ac7f-11e9-903b-2874c5ec77fb.jpg" alt="8101563762973_ pic_hd"><br><img src="https://user-images.githubusercontent.com/13566583/61606934-269a9600-ac7f-11e9-8e03-d950207daf6f.jpg" alt="8111563762981_ pic_hd"></p><h2 id="集体合照"><a href="#集体合照" class="headerlink" title="Group photo"></a>Group photo</h2><p><img src="https://user-images.githubusercontent.com/13566583/61606966-4b8f0900-ac7f-11e9-876f-d2d8e60502ce.jpg" alt="8041563762837_ pic_hd"></p><h2 id="沙龙讨论内容"><a href="#沙龙讨论内容" class="headerlink" title="Discussion points"></a>Discussion points</h2><ol><li>It is suggested to adjust the equipment before the event starts</li><li>Keep the salon on schedule</li><li>Be relaxed: no need for honorifics such as "senior brother/sister", and do not be overly modest</li><li>The exact format of the demo presentations will be decided once the industry mentors are in place</li><li>The next salon is tentatively scheduled for the next class-free evening</li></ol><h2 id="特别鸣谢Deecamp全体人员对本沙龙的支持与帮助"><a href="#特别鸣谢Deecamp全体人员对本沙龙的支持与帮助" class="headerlink" title="Special thanks to everyone at Deecamp for supporting and helping with this salon~"></a>Special thanks to everyone at Deecamp for supporting and helping with this salon~</h2>]]></content>
<categories>
<category> Quant </category>
</categories>
<tags>
<tag> agi </tag>
<tag> salon </tag>
</tags>
</entry>
<entry>
<title>DeepInvestment introduction</title>
<link href="/2019/07/12/DeepInvestment/"/>
<url>/2019/07/12/DeepInvestment/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="The-slide-shows-all-you-need"><a href="#The-slide-shows-all-you-need" class="headerlink" title="The slide shows all you need"></a>The slide shows all you need</h2><p><strong>Whose money I want to make</strong>: Essentially according to Game Theory</p><p><a href="https://docs.google.com/presentation/d/1W7RGD3X_MZB3dfzaTrQdGYzv_zuay2JpYOTQUjXzK5A/edit#slide=id.g4461849552_8_1825">Introduction to Quantitative Investment with Deep Learning</a></p>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> deepLearning </tag>
<tag> quantitativeInvestment </tag>
</tags>
</entry>
<entry>
<title>Mutation test and Deep Learning</title>
<link href="/2019/06/09/mutationtest-deeplearning/"/>
<url>/2019/06/09/mutationtest-deeplearning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Brief-Introduction-to-Mutation-Test"><a href="#Brief-Introduction-to-Mutation-Test" class="headerlink" title="Brief Introduction to Mutation Test"></a>Brief Introduction to Mutation Test</h2><blockquote><p>Mutation testing is a mature technology for testing data quality assessment in traditional software.<br>Mutation testing is a form of white-box testing.<br>Mutation testing (or mutation analysis or program mutation) is used to design new software tests and evaluate the quality of existing software tests. </p></blockquote><h3 id="Goal"><a href="#Goal" class="headerlink" title="Goal"></a>Goal</h3><p>The goals of mutation testing are multiple:</p><ul><li>identify weakly tested pieces of code (those for which mutants are not killed)</li><li>identify weak tests (those that never kill mutants)</li><li>compute the mutation score</li><li>learn about error propagation and state infection in the program</li></ul><h3 id="Example"><a href="#Example" class="headerlink" title="Example"></a>Example</h3><p>Selecting some <strong>mutation operations</strong>, and applying them to the source code for each executable code segment in turn. </p><p>The result of using a mutation operation on a program is called a mutant heterogeneity. </p><p><strong>If the test unit can detect the error (ie, a test fails), then the mutant is said to have been killed.</strong></p><h4 id="Foo"><a href="#Foo" class="headerlink" title="Foo"></a>Foo</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">foo</span>(<span class="params">x: <span class="built_in">int</span>, y: <span class="built_in">int</span></span>) -> <span class="built_in">int</span>:</span><br><span class="line">z = <span class="number">0</span></span><br><span class="line">If x><span class="number">0</span> <span class="keyword">and</span> y><span class="number">0</span>:</span><br><span class="line">z = x</span><br><span class="line"><span class="keyword">return</span> z</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">foo</span>(<span class="params">x: <span class="built_in">int</span>, y: <span class="built_in">int</span></span>) -> <span class="built_in">int</span>:</span><br><span class="line">z = <span class="number">0</span></span><br><span class="line">If x><span class="number">0</span> <span class="keyword">and</span> y>=<span class="number">0</span>:</span><br><span class="line">z = x</span><br><span class="line"><span class="keyword">return</span> z</span><br></pre></td></tr></table></figure><p>Given some test cases, we find that unit test cannot find variants<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Success</span><br><span class="line">assertEquals(<span class="number">2</span>, foo(<span class="number">2</span>, <span 
class="number">2</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">2</span>, <span class="number">1</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(-<span class="number">1</span>, <span class="number">2</span>))</span><br></pre></td></tr></table></figure><br>Add new tests to achieve the effect of eliminating variants:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="literal">False</span></span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">2</span>, <span class="number">0</span>))</span><br></pre></td></tr></table></figure></p><h4 id="Bar"><a href="#Bar" class="headerlink" title="Bar"></a>Bar</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">bar</span>(<span class="params">a: <span class="built_in">int</span>, b:<span class="built_in">int</span></span>) -> <span class="built_in">int</span>:</span><br><span class="line"> <span class="keyword">if</span> a <span class="keyword">and</span> b:</span><br><span class="line"> c = <span class="number">1</span></span><br><span class="line"> <span class="keyword">else</span>:</span><br><span class="line"> c = <span class="number">0</span></span><br><span class="line"> <span class="keyword">return</span> c</span><br><span class="line"><span class="comment"># Here is an mutation which operator is `and`</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">bar</span>(<span class="params">a: <span class="built_in">int</span>, b:<span class="built_in">int</span></span>) -> <span class="built_in">int</span>:</span><br><span class="line"> <span class="keyword">if</span> a <span class="keyword">or</span> b:</span><br><span class="line"> c = <span class="number">1</span></span><br><span class="line"> <span class="keyword">else</span>:</span><br><span class="line"> c = <span class="number">0</span></span><br><span class="line"> <span class="keyword">return</span> c</span><br></pre></td></tr></table></figure><p>Given a test case that will absolutely pass:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Success</span><br><span class="line">assertEquals(<span class="number">1</span>, foo(<span class="number">1</span>, <span class="number">1</span>))</span><br></pre></td></tr></table></figure><br>But we need to kill the mutation by adding more test cases:<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">Failed:</span><br><span class="line">assertEquals(<span class="number">0</span>, 
foo(<span class="number">1</span>, <span class="number">0</span>))</span><br><span class="line">assertEquals(<span class="number">0</span>, foo(<span class="number">0</span>, <span class="number">1</span>))</span><br><span class="line">assertEquals(<span class="number">1</span>, foo(<span class="number">0</span>, <span class="number">0</span>)) </span><br></pre></td></tr></table></figure></p><h3 id="Inspriation"><a href="#Inspriation" class="headerlink" title="Inspriation"></a>Inspriation</h3><p>In deep learning, you can also create variants by changing the operators in the model. </p><p>Adding the idea of the mutation test to the deep learning model, if the performance of the model after the mutation is unchanged, then there is a problem with the test set</p><p>It is necessary to add or generate higher quality test data to achieve the data enhancement effect.</p><h2 id="A-comparison-of-traditional-and-DL-software-development"><a href="#A-comparison-of-traditional-and-DL-software-development" class="headerlink" title="A comparison of traditional and DL software development"></a>A comparison of traditional and DL software development</h2><p><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g3uypojkkyj20z20kc77j.jpg" alt=""></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://en.wikipedia.org/wiki/Mutation_testing">wiki: Mutation Testing</a></li><li><a href="https://www.testwo.com/article/869">突变测试——通过一个简单的例子快速学习这种有趣的测试技术</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> deepLearning </tag>
<tag> mutationTest </tag>
</tags>
</entry>
<entry>
<title>Comparison of ON-LSTM and DIORA</title>
<link href="/2019/05/31/ON-LSTM-and-DIORA/"/>
<url>/2019/05/31/ON-LSTM-and-DIORA/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="ON-LSTM"><a href="#ON-LSTM" class="headerlink" title="ON-LSTM"></a>ON-LSTM</h2><p>The inspiration under the hood: how to introduce grammatical tree structure into an LSTM in an unsupervised approach.</p><h3 id="Introduction-Ordered-Neurons-ON"><a href="#Introduction-Ordered-Neurons-ON" class="headerlink" title="Introduction: Ordered Neurons(ON)"></a>Introduction: Ordered Neurons(ON)</h3><ol><li>The neurons inside ON-LSTM are specifically <code>ordered</code> to <code>express richer information</code>: the neurons are ordered by their update frequency.</li><li>The specific order of neurons is meant to integrate the hierarchical structure (tree structure) into the LSTM, allowing the LSTM to <code>automatically learn the hierarchical structure</code>.</li><li><code>High/Low level information</code>: should be kept for a longer/shorter time in the corresponding coding interval.</li><li><code>cumax()</code>: a special function used to generate the two special master gates, $F_1$ and $F_2$.</li></ol><h3 id="The-nuts-and-bolts-in-Mathematic"><a href="#The-nuts-and-bolts-in-Mathematic" class="headerlink" title="The nuts and bolts in Mathematics"></a>The nuts and bolts in Mathematics</h3><div class="row"><iframe src="https://drive.google.com/file/d/1UjxnKAcMtydDEr_-PuvVMRLK_2pjx80M/preview" style="width:100%; height:550px"></iframe></div><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://kexue.fm/archives/6621">ON-LSTM:用有序神经元表达层次结构</a></li></ol>]]></content>
<categories>
<category> NLP </category>
</categories>
<tags>
<tag> LSTM </tag>
</tags>
</entry>
<entry>
<title>Using Scheduled Sample to improve sentence quality</title>
<link href="/2019/05/10/schedule-sampling/"/>
<url>/2019/05/10/schedule-sampling/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><p><strong>Note that the author is not Yoshua Bengio</strong></p><h2 id="Overview"><a href="#Overview" class="headerlink" title="Overview"></a>Overview</h2><p>In Seq2Seq sequence learning task, using Scheduled Sampling can improve performance of RNN model.</p><p>The ditribution bewteen traning stage and evaluating stage are different and reults in <strong>error accumulation question</strong> in evaluating stage. </p><p>The former methods deal with this error accumullation problem is <code>Teacher Forcing</code>.</p><p>Scheduled Sampling can solve the problem through take generated words as input for decoder in certain probability. </p><p>Note that scheduled sampling is only applied in training stage.</p><h2 id="Algorithm-Details"><a href="#Algorithm-Details" class="headerlink" title="Algorithm Details"></a>Algorithm Details</h2><p>In training stage, when generate word $t$, Instead of take ground truth word $y_{t<em>1}$ as input, Scheduled Sampling take previous generated word $g</em>{t-1}$ in certain probability.</p><p>Assume that in $i_{th}$ mini-batch, Schduled Sampling define a probability $\epsilon_i$ to control the input of decoder. And $\epsilon_i$ is a probability variable that decreasing as $i$ increasing.</p><p>There are three decreasing methods:<br>$$Linear Decay: \epsilon_i = max(\epsilon, (k-c)*i), where \epsilon restrict minimum of \epsilon_i, k and c controll the range of decay$$<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2wav8w8fvj20hs0d841u.jpg" alt=""></p><p><strong>Warning</strong>:<br>In time step $t$, Scheduled Sampling will take $y_{t-1}$ according to $\epsilon<em>i$ as input. And take $g</em>{t-1}$ according to $1-\epsilon_i$ as input.</p><p>As a result, decoder will tend to use generated word as input.</p><h2 id="Implementation"><a href="#Implementation" class="headerlink" title="Implementation"></a>Implementation</h2><h3 id="Parameters"><a href="#Parameters" class="headerlink" title="Parameters"></a>Parameters</h3><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">parser.add_argument(<span class="string">'--scheduled_sampling_start'</span>, <span class="built_in">type</span>=<span class="built_in">int</span>, default=<span class="number">0</span>, <span class="built_in">help</span>=<span class="string">'at what epoch to start decay gt probability, -1 means never'</span>)</span><br><span class="line">parser.add_argument(<span class="string">'--scheduled_sampling_increase_every'</span>, <span class="built_in">type</span>=<span class="built_in">int</span>, default=<span class="number">5</span>,<span class="built_in">help</span>=<span class="string">'every how many epochs to increase scheduled sampling probability'</span>)</span><br><span class="line">parser.add_argument(<span class="string">'--scheduled_sampling_increase_prob'</span>, <span class="built_in">type</span>=<span class="built_in">float</span>, default=<span class="number">0.05</span>,<span class="built_in">help</span>=<span class="string">'How much to update the prob'</span>)</span><br><span class="line">parser.add_argument(<span class="string">'--scheduled_sampling_max_prob'</span>, <span class="built_in">type</span>=<span class="built_in">float</span>, default=<span class="number">0.25</span>,<span 
class="built_in">help</span>=<span class="string">'Maximum scheduled sampling prob.'</span>)</span><br></pre></td></tr></table></figure><h3 id="Assign-scheduled-sampling-probability"><a href="#Assign-scheduled-sampling-probability" class="headerlink" title="Assign scheduled sampling probability"></a>Assign scheduled sampling probability</h3><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># scheduled sampling probability is min(epoch*0.01, 0.25)</span></span><br><span class="line">frac = (epoch - opt.scheduled_sampling_start) // opt.scheduled_sampling_increase_every</span><br><span class="line">opt.ss_prob = <span class="built_in">min</span>(opt.scheduled_sampling_increase_prob * frac, opt.scheduled_sampling_max_prob)</span><br><span class="line">model.ss_prob = opt.ss_prob</span><br><span class="line"></span><br><span class="line"><span class="comment"># choose the word when decoding</span></span><br><span class="line"><span class="keyword">if</span> self.ss_prob > <span class="number">0.0</span>:</span><br><span class="line"> sample_prob = torch.FloatTensor(batch_size).uniform_(<span class="number">0</span>, <span class="number">1</span>).cuda()</span><br><span class="line"> sample_mask = sample_prob < self.ss_prob</span><br><span class="line"> <span class="keyword">if</span> sample_mask.<span class="built_in">sum</span>() == <span class="number">0</span>: <span class="comment"># use ground truth</span></span><br><span class="line"> last_word = caption[:, i].clone()</span><br><span class="line"> <span class="keyword">else</span>: <span class="comment"># use previous generated words</span></span><br><span class="line"> sample_ind = sample_mask.nonzero().view(-<span class="number">1</span>)</span><br><span class="line"> last_word = caption[:, i].data.clone()</span><br><span class="line"> <span class="comment"># fetch prev distribution: shape Nx(M+1)</span></span><br><span class="line"> prob_prev = torch.exp(log_probs.data)</span><br><span class="line"> last_word.index_copy_(<span class="number">0</span>, sample_ind,</span><br><span class="line"> torch.multinomial(prob_prev, <span class="number">1</span>).view(-<span class="number">1</span>).index_select(<span class="number">0</span>, sample_ind))</span><br><span class="line"> last_word = Variable(last_word)</span><br><span class="line"><span class="keyword">else</span>:</span><br><span class="line"> last_word = caption[:, i].clone()</span><br></pre></td></tr></table></figure><h2 id="Result"><a href="#Result" class="headerlink" title="Result"></a>Result</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2wdkxqmgvj20sg09ywnl.jpg" alt=""></p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a 
href="https://cloud.tencent.com/developer/article/1081168">【序列到序列学习】使用Scheduled Sampling改善翻译质量</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> videoCaptioning </tag>
</tags>
</entry>
<entry>
<title>basic_knowledge_supplement</title>
<link href="/2019/05/06/basic-knowledge-supplement/"/>
<url>/2019/05/06/basic-knowledge-supplement/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h1 id="Machine-Learning"><a href="#Machine-Learning" class="headerlink" title="Machine Learning"></a>Machine Learning</h1><h2 id="Basic-knowledge"><a href="#Basic-knowledge" class="headerlink" title="Basic knowledge"></a>Basic knowledge</h2><h3 id="Bias-and-variance"><a href="#Bias-and-variance" class="headerlink" title="Bias and variance"></a>Bias and variance</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2rmgy3ib1j208s0dhtbb.jpg" alt=""></p><ol><li>Bias: represents fitting ability; a naive model will lead to high bias because of underfitting.</li><li>Variance: represents stability; a complex model will lead to high variance because of overfitting.</li></ol><p>$$\text{Generalization Error} = \text{Bias}^2 + \text{Variance} + \text{Irreducible Error}$$</p><h3 id="Generative-model-and-Discriminative-Model"><a href="#Generative-model-and-Discriminative-Model" class="headerlink" title="Generative model and Discriminative Model"></a>Generative model and Discriminative Model</h3><ol><li>Discriminative Model<br>Learn a <code>function</code> or <code>conditional probability model P(Y|X)</code> (posterior probability) directly.</li><li>Generative Model<br>Learn a <code>joint probability model P(X, Y)</code> and then use it to calculate <code>P(Y|X)</code>.</li></ol><h3 id="Search-hyper-parameter"><a href="#Search-hyper-parameter" class="headerlink" title="Search hyper-parameter"></a>Search hyper-parameter</h3><ol><li>Grid search</li><li>Random search</li></ol><h3 id="Euclidean-distance-and-Cosine-distance"><a href="#Euclidean-distance-and-Cosine-distance" class="headerlink" title="Euclidean distance and Cosine distance"></a>Euclidean distance and Cosine distance</h3><p>Example: A=[2, 2, 2] and B=[5, 5, 5] represent <code>two</code> users’ review scores for <code>three</code> movies.<br>The Euclidean distance is $\sqrt{3^2 + 3^2 + 3^2} = 3\sqrt{3}$, while the cosine similarity is $1$ (i.e., the cosine distance is $0$). As a result, cosine distance is not affected by differences in the absolute magnitude of the scores.</p><p>After normalization, the two are essentially equivalent:<br>$$D=\|x-y\|^2 = \|x\|^2+\|y\|^2-2\|x\|\|y\|\cos A = 2-2\cos A = 2(1-\cos A)$$</p>
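<p>A minimal sketch (plain Python, standard library only) that reproduces the example above and shows why the two measures disagree:</p><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">import math</span><br><span class="line"></span><br><span class="line">A = [2, 2, 2]</span><br><span class="line">B = [5, 5, 5]</span><br><span class="line"></span><br><span class="line">euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(A, B)))</span><br><span class="line">dot = sum(a * b for a, b in zip(A, B))</span><br><span class="line">cos_sim = dot / (math.sqrt(sum(a * a for a in A)) * math.sqrt(sum(b * b for b in B)))</span><br><span class="line"></span><br><span class="line">print(round(euclidean, 3))  # 5.196, i.e. 3 * sqrt(3): the raw scores differ a lot</span><br><span class="line">print(round(cos_sim, 3))    # 1.0: the rating patterns point in the same direction</span><br></pre></td></tr></table></figure>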
<h3 id="Confusion-Matrix"><a href="#Confusion-Matrix" class="headerlink" title="Confusion Matrix"></a>Confusion Matrix</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2rmrezxi4j20wq042t9q.jpg" alt=""></p><ol><li>accuracy: $ACC = \frac{TP+TN}{TP+TN+FP+FN}$</li><li>precision: $P = \frac{TP}{TP+FP}$</li><li>recall: $R = \frac{TP}{TP+FN}$</li><li>F1: $F_1 = \frac{2TP}{2TP+FP+FN}$</li></ol><h3 id="deal-with-missing-value"><a href="#deal-with-missing-value" class="headerlink" title="deal with missing value"></a>deal with missing values</h3><ol><li>Many missing values: drop the feature column.</li><li>Few missing values: fill in a value<ol><li>Fill with an outlier: <code>data.fillna(0)</code></li><li>Fill with the mean value: <code>data.fillna(data.mean())</code></li></ol></li></ol><h3 id="Describe-your-project"><a href="#Describe-your-project" class="headerlink" title="Describe your project"></a>Describe your project</h3><ol><li>Abstract the real-world problem into a mathematical one</li><li>Describe your data</li><li>Preprocessing and feature selection</li><li>Model training and tuning</li></ol><h2 id="Algorithm"><a href="#Algorithm" class="headerlink" title="Algorithm"></a>Algorithm</h2><h3 id="Logistic-regreesion"><a href="#Logistic-regreesion" class="headerlink" title="Logistic regression"></a>Logistic regression</h3><h4 id="Defination"><a href="#Defination" class="headerlink" title="Definition"></a>Definition</h4><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ro85d70dj209q017mwy.jpg" alt=""></p><h4 id="Loss-negative-log-los"><a href="#Loss-negative-log-los" class="headerlink" title="Loss: negative log loss"></a>Loss: negative log loss</h4><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2roe9667aj20ay04y0sp.jpg" alt=""></p><h3 id="Support-Vector-Machine"><a href="#Support-Vector-Machine" class="headerlink" title="Support Vector Machine"></a>Support Vector Machine</h3><h3 id="Decision-Tree"><a href="#Decision-Tree" class="headerlink" title="Decision Tree"></a>Decision Tree</h3><ol><li>ID3: use <code>information gain</code></li><li>C4.5: use <code>information gain rate</code></li></ol><h3 id="Ensemble-Learning"><a href="#Ensemble-Learning" class="headerlink" title="Ensemble Learning"></a>Ensemble Learning</h3><h4 id="Boosting-AdaBoost-GBDT"><a href="#Boosting-AdaBoost-GBDT" class="headerlink" title="Boosting: AdaBoost GBDT"></a>Boosting: <code>AdaBoost</code> <code>GBDT</code></h4><p>Serial strategy: each new learner is built on top of the previous one.</p><h4 id="GBDT-Gradient-Boosting-Decision-Tree"><a href="#GBDT-Gradient-Boosting-Decision-Tree" class="headerlink" title="GBDT(Gradient Boosting Decision Tree)"></a>GBDT(Gradient Boosting Decision Tree)</h4><h4 id="XGBoost"><a href="#XGBoost" class="headerlink" title="XGBoost"></a>XGBoost</h4><h4 id="Bagging-Random-forest-and-Dropout-in-Neural-Network"><a href="#Bagging-Random-forest-and-Dropout-in-Neural-Network" class="headerlink" title="Bagging: Random forest and Dropout in Neural Network"></a>Bagging: <code>Random forest</code> and <code>Dropout in Neural Network</code></h4><p>Parallel strategy: no dependency between learners.</p><h1 id="Deep-Learning"><a href="#Deep-Learning" class="headerlink" title="Deep Learning"></a>Deep Learning</h1><h2 id="Basic-Knowledge"><a href="#Basic-Knowledge" class="headerlink" title="Basic Knowledge"></a>Basic Knowledge</h2><h3 
id="Overfitting-and-underfitting"><a href="#Overfitting-and-underfitting" class="headerlink" title="Overfitting and underfitting"></a>Overfitting and underfitting</h3><h4 id="Deal-with-overfitting"><a href="#Deal-with-overfitting" class="headerlink" title="Deal with overfitting"></a>Deal with overfitting</h4><ol><li><p>Data enhancement</p><ol><li>image: translation, rotation, scaling</li><li>GAN: generate new data</li><li>NLP: generate new data via neural machine translation</li></ol></li><li><p>Decrease the complexity of model</p><ol><li>neural network: decrease layer numbers and neuron numbers</li><li>decision tree: decrease tree depth and pruning</li></ol></li><li><p>Constrain weight:</p><ol><li>L1 regularization</li><li>L2 regularization</li></ol></li><li><p>Ensemble learning:</p><ol><li>Neural network: Dropout</li><li>Decision tree: random forest, GBDT</li></ol></li><li><p>early stopping</p></li></ol><h4 id="Deal-with-underfitting"><a href="#Deal-with-underfitting" class="headerlink" title="Deal with underfitting"></a>Deal with underfitting</h4><ol><li>add new feature</li><li>add model complexity</li><li>decrease regularization</li></ol><h3 id="Back-propagation-TODO-https-github-com-imhuay-Algorithm-Interview-Notes-Chinese-blob-master-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-E5-9F-BA-E7-A1-80-md"><a href="#Back-propagation-TODO-https-github-com-imhuay-Algorithm-Interview-Notes-Chinese-blob-master-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-A-E6-B7-B1-E5-BA-A6-E5-AD-A6-E4-B9-A0-E5-9F-BA-E7-A1-80-md" class="headerlink" title="Back-propagation TODO:https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md"></a>Back-propagation TODO:<a href="https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md">https://github.com/imhuay/Algorithm_Interview_Notes-Chinese/blob/master/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0/A-%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E5%9F%BA%E7%A1%80.md</a></h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubtex9hwj207101kjr8.jpg" alt=""></p><blockquote><p>上标 (l) 表示网络的层,(L) 表示输出层(最后一层);下标 j 和 k 指示神经元的位置;w_jk 表示 l 层的第 j 个神经元与(l-1)层第 k 个神经元连线上的权重</p></blockquote><p>MSE as loss function:<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubuojv5xj207z03ljr9.jpg" alt=""></p><p>another expression:<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g2ubub5wrkj20e907y0sv.jpg" alt=""></p><h3 id="Activation-function-improve-ability-of-expression"><a href="#Activation-function-improve-ability-of-expression" class="headerlink" title="Activation function: improve ability of expression"></a>Activation function: improve ability of expression</h3><h4 id="sigmoid-z"><a href="#sigmoid-z" class="headerlink" title="sigmoid(z)"></a>sigmoid(z)</h4><p>$$\sigma(z)=\frac{1}{1+exp(-z)}, where the range is [0, 1]$$</p><p>the derivative of simoid is:<br>TODO: to f(x)<br>$$f’(x)=f(x)(1-f(x))$$</p><h3 id="Batch-Normalization"><a href="#Batch-Normalization" class="headerlink" title="Batch Normalization"></a>Batch Normalization</h3><p>Goal: restrict data point to same distribution through normalization data before each layer.</p><h3 id="Optimizers"><a href="#Optimizers" class="headerlink" title="Optimizers"></a>Optimizers</h3><h4 id="SGD"><a href="#SGD" class="headerlink" title="SGD"></a>SGD</h4><p>Stochastic Gradient Descent, update 
weights each mini-batch</p><h4 id="Momentum"><a href="#Momentum" class="headerlink" title="Momentum"></a>Momentum</h4><p>Add former gradients with decay into current gradient.</p><h4 id="Adagrad"><a href="#Adagrad" class="headerlink" title="Adagrad"></a>Adagrad</h4><p>Dynamically adjust learning rate when training. </p><p>Learning rate is in reverse ratio to the sum of parameters.</p><h4 id="Adam"><a href="#Adam" class="headerlink" title="Adam"></a>Adam</h4><p>Dynamically adjust learning rate when training. </p><p>utilize first order moment estisourmation and second order moment estimation to make sure the steadiness.</p><h4 id="How-to-deal-with-L1-not-differentiable"><a href="#How-to-deal-with-L1-not-differentiable" class="headerlink" title="How to deal with L1 not differentiable"></a>How to deal with L1 not differentiable</h4><p>Update parameters along the axis direction.</p><h3 id="How-to-initialize-the-neural-network"><a href="#How-to-initialize-the-neural-network" class="headerlink" title="How to initialize the neural network"></a>How to initialize the neural network</h3><p>Init network with <strong>Gaussian Distribution</strong> or <strong>Uniform Distribution</strong>.</p><p>Glorot Initializer:<br>$$W_{i,j}~U(-\sqrt{\frac{6}{m+n}}, \sqrt{\frac{6}{m+n}})$$</p><h1 id="Computer-Vision"><a href="#Computer-Vision" class="headerlink" title="Computer Vision"></a>Computer Vision</h1><h2 id="Models-and-History"><a href="#Models-and-History" class="headerlink" title="Models and History"></a>Models and History</h2><ul><li>2015 VGGNet(16/19): Very Deep Convolutional Networks for Large-Scale Image Recognition, ICLR 2015.</li><li>2015 GoogleNet: </li><li>2016 Inception-v1/v2/v3: Rethinking the Inception Architecture for Computer Vision, CVPR 2016.</li><li>2016 ResNet: Deep Residual Learning for Image Recognition, CVPR 2016.</li><li>2017 Xception: Xception: Deep Learning with Depthwise Separable Convolutions, CVPR 2017.</li><li>2017 InceptionResNet-v1/v2、Inception-v4</li><li>2017 MobileNet: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, arXiv 2017.</li><li>2017 DenseNet: Densely Connected Convolutional Networks, CVPR 2017.</li><li>2017 NASNet: Learning Transferable Architectures for Scalable Image Recognition, arXiv 2017.</li><li>2018 MobileNetV2: MobileNetV2: Inverted Residuals and Linear Bottlenecks, CVPR 2018.</li></ul><h2 id="Basic-knowledge-1"><a href="#Basic-knowledge-1" class="headerlink" title="Basic knowledge"></a>Basic knowledge</h2><h1 id="Practice-experience"><a href="#Practice-experience" class="headerlink" title="Practice experience"></a>Practice experience</h1><h2 id="Loss-function-decline-to-0-0000"><a href="#Loss-function-decline-to-0-0000" class="headerlink" title="Loss function decline to 0.0000"></a>Loss function decline to 0.0000</h2><p>Because of <strong>overflow</strong> in Tensorflow or other framework. it is better to initialize parameters in a reasonable interval. 
The solution is <strong>Xavier initialization</strong> and <strong>Kaiming initialization</strong>.</p><h2 id="Do-not-normaolize-the-bias-in-neural-network"><a href="#Do-not-normaolize-the-bias-in-neural-network" class="headerlink" title="Do not normaolize the bias in neural network"></a>Do not normaolize the bias in neural network</h2><p>That will lead to underfitting because of sparse $b$</p><h2 id="Do-not-set-learning-rate-too-large"><a href="#Do-not-set-learning-rate-too-large" class="headerlink" title="Do not set learning rate too large"></a>Do not set learning rate too large</h2><p>When using Adam optimizer, try $10^{-3}$ to $10^{-4}$</p><h2 id="Do-not-add-activation-before-sotmax-layer"><a href="#Do-not-add-activation-before-sotmax-layer" class="headerlink" title="Do not add activation before sotmax layer"></a>Do not add activation before sotmax layer</h2><h2 id="Do-not-forget-to-shuffle-training-data"><a href="#Do-not-forget-to-shuffle-training-data" class="headerlink" title="Do not forget to shuffle training data"></a>Do not forget to shuffle training data</h2><p>For the sake of overfitting</p><h2 id="Do-not-use-same-label-in-a-batch"><a href="#Do-not-use-same-label-in-a-batch" class="headerlink" title="Do not use same label in a batch"></a>Do not use same label in a batch</h2><p>For the sake of overfitting</p><h2 id="Do-not-use-vanilla-SGD-optimizer"><a href="#Do-not-use-vanilla-SGD-optimizer" class="headerlink" title="Do not use vanilla SGD optimizer"></a>Do not use vanilla SGD optimizer</h2><p>Avoid getting into saddle point</p><h2 id="Please-checkout-gradient-in-each-layer"><a href="#Please-checkout-gradient-in-each-layer" class="headerlink" title="Please checkout gradient in each layer"></a>Please checkout gradient in each layer</h2><p>For the sake of potential gradient explosion, we need to use <strong>gradient clip</strong> to cut off gradient</p><h2 id="Please-checkout-your-labels-are-not-random"><a href="#Please-checkout-your-labels-are-not-random" class="headerlink" title="Please checkout your labels are not random"></a>Please checkout your labels are not random</h2><h2 id="Problem-of-classification-confidence"><a href="#Problem-of-classification-confidence" class="headerlink" title="Problem of classification confidence"></a>Problem of classification confidence</h2><p>Symptom: When losses increasing, but the accuracy still increasing</p><p>For the sake of <strong>confidence</strong>: [0.9,0.01,0.02,0.07] in epoch 5 VS [0.5,0.4,0.05,0.05] in epoch 20.</p><p>Overall, this phenomenon is kind of <strong>overfitting</strong>.</p><h2 id="Do-not-use-batch-normalization-layer-with-small-batch-size"><a href="#Do-not-use-batch-normalization-layer-with-small-batch-size" class="headerlink" title="Do not use batch normalization layer with small batch size"></a>Do not use batch normalization layer with small batch size</h2><p>The data in batch size can not represent the statistical feature over whole dataset。</p><h2 id="Set-BN-layer-in-the-front-of-Activation-or-behind-Activation"><a href="#Set-BN-layer-in-the-front-of-Activation-or-behind-Activation" class="headerlink" title="Set BN layer in the front of Activation or behind Activation"></a>Set BN layer in the front of Activation or behind Activation</h2><h2 id="Improperly-Use-dropout-in-Conv-layer-may-lead-to-worse-performance"><a href="#Improperly-Use-dropout-in-Conv-layer-may-lead-to-worse-performance" class="headerlink" title="Improperly Use dropout in Conv layer may lead to worse performance"></a>Improperly Use dropout in Conv 
layer may lead to worse performance</h2><p>It is better to use dropout layer in a low probability such as 0.1 or 0.2.</p><p>Just like add some noise to Conv layer for normalization.</p><h2 id="Do-not-initiate-weight-to-0-but-bias-can"><a href="#Do-not-initiate-weight-to-0-but-bias-can" class="headerlink" title="Do not initiate weight to 0, but bias can"></a>Do not initiate weight to 0, but bias can</h2><h2 id="Do-not-forget-your-bias-in-each-FNN-layer"><a href="#Do-not-forget-your-bias-in-each-FNN-layer" class="headerlink" title="Do not forget your bias in each FNN layer"></a>Do not forget your bias in each FNN layer</h2><h2 id="Evaluation-accuracy-better-than-training-accuracy"><a href="#Evaluation-accuracy-better-than-training-accuracy" class="headerlink" title="Evaluation accuracy better than training accuracy"></a>Evaluation accuracy better than training accuracy</h2><p>Because the distributions between training set and test set have large difference.</p><p>Try methods in transfer learning.</p><h2 id="KL-divergence-goes-negative-number"><a href="#KL-divergence-goes-negative-number" class="headerlink" title="KL divergence goes negative number"></a>KL divergence goes negative number</h2><p>Need to pay attention to softmax for computing probability.</p><h2 id="Nan-values-appear-in-numeral-calculation"><a href="#Nan-values-appear-in-numeral-calculation" class="headerlink" title="Nan values appear in numeral calculation"></a>Nan values appear in numeral calculation</h2><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://blog.csdn.net/LoseInVain/article/details/83021356">深度学习debug沉思录</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>The first and second generation of Neural Turing Machine</title>
<link href="/2019/05/05/Hybrid-computing-using-a-neural-network-with-dynamic-external-memory/"/>
<url>/2019/05/05/Hybrid-computing-using-a-neural-network-with-dynamic-external-memory/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Hand-writing-pdf-version"><a href="#Hand-writing-pdf-version" class="headerlink" title="Hand-writing pdf version:"></a>Hand-writing pdf version:</h2><div class="row"><iframe src="https://drive.google.com/file/d/1H63IlKB8ekJWUO8rcKfGHYtBhZfRRnR4/preview" style="width:100%; height:550px"></iframe></div><h2 id="Structure"><a href="#Structure" class="headerlink" title="Structure"></a>Structure</h2><p>A computer has a CPU and RAM.</p><p>A Differentiable Neural Computer (DNC) has a neural network as <strong>the controller</strong> that takes the role of <strong>the CPU</strong>.<br>The memory is an $N \times W$ <strong>matrix</strong> that takes the role of <strong>the RAM</strong>, where $N$ is the number of locations and $W$ is the length of each piece of memory.</p><h2 id="Memory-augmentation-and-attention-mechanism"><a href="#Memory-augmentation-and-attention-mechanism" class="headerlink" title="Memory augmentation and attention mechanism"></a>Memory augmentation and attention mechanism</h2><blockquote><p>The episodic memories or event memories are known to depend on the hippocampus in the human brain.</p></blockquote><p>The main point is that the memory of the network is external to the network itself.</p><p>The attention mechanism defines some distributions over the $N$ locations.<br>The $i$-th component of a weighting vector communicates how much attention the controller should give to the content in the $i$-th location of the memory.</p><h2 id="Differntiability"><a href="#Differntiability" class="headerlink" title="Differentiability"></a>Differentiability</h2><p>Every unit and operation in this structure is differentiable.</p><h2 id="Weightings"><a href="#Weightings" class="headerlink" title="Weightings"></a>Weightings</h2><p>When the controller wants to do something which involves memory, it does not just look at every location of the memory.<br>Instead, it focuses its attention on those locations which contain the information it is looking for.</p><p>The weighting produced for an input is a distribution over the $N$ locations describing their relative importance in a particular process (reading or writing).</p><p>Note that the weightings are produced by means of a vector emitted by the controller, which is called the <strong>interface vector</strong>.</p><h2 id="Three-interactions-between-controller-and-memory"><a href="#Three-interactions-between-controller-and-memory" class="headerlink" title="Three interactions between controller and memory"></a>Three interactions between controller and memory</h2><p>The interactions between the controller and the memory are mediated by the <strong>interface vector</strong>.</p><h3 id="Content-lookup"><a href="#Content-lookup" class="headerlink" title="Content lookup"></a>Content lookup</h3><p>A particular set of values within the interface vector, which we collect in something called the key vector, is compared to the content of each location. This comparison is made by means of a similarity measure.</p>
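<p>To make the content-lookup step concrete, here is a minimal NumPy sketch (the function name and the toy numbers are illustrative, not taken from any particular DNC implementation): the key is compared to every memory row with cosine similarity, sharpened by a key-strength scalar, and normalized into a weighting over the $N$ locations:</p><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">import numpy as np</span><br><span class="line"></span><br><span class="line">def content_lookup(memory, key, beta):</span><br><span class="line">    # memory: (N, W) matrix; key: (W,) vector; beta: scalar key strength.</span><br><span class="line">    # Cosine similarity between the key and every memory row.</span><br><span class="line">    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)</span><br><span class="line">    # Sharpen by beta, then softmax into a weighting over the N locations.</span><br><span class="line">    scores = beta * sims</span><br><span class="line">    weights = np.exp(scores - scores.max())</span><br><span class="line">    return weights / weights.sum()</span><br><span class="line"></span><br><span class="line">M = np.random.randn(8, 4)               # 8 locations, word size 4</span><br><span class="line">w = content_lookup(M, M[3], beta=10.0)  # query with the content of location 3</span><br><span class="line">print(w.round(2))                       # the weighting peaks at location 3</span><br></pre></td></tr></table></figure>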
<h3 id="Temporal-memory-linkage"><a href="#Temporal-memory-linkage" class="headerlink" title="Temporal memory linkage"></a>Temporal memory linkage</h3><p>The transitions between consecutively written locations are recorded in an $N \times N$ matrix, called the temporal link matrix “L”. The sequence in which the controller writes to the memory is information in itself, and it is something we want to store.</p><p>DNC stores the ‘temporal link’ to keep track of the order things were written in, and records the current ‘usage’ level of each memory location.</p><h3 id="Dynamic-memory-allocation"><a href="#Dynamic-memory-allocation" class="headerlink" title="Dynamic memory allocation"></a>Dynamic memory allocation</h3><p>Each location has a usage level represented as a number from 0 to 1. A weighting that picks out an unused location is sent to the write head, so that it knows where to store new information. The word “dynamic” refers to the ability of the controller to reallocate memory that is no longer required, erasing its content.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://towardsdatascience.com/rps-intro-to-differentiable-neural-computers-e6640b5aa73a">Differentiable Neural Computers: An Overview</a></li><li><a href="https://deepmind.com/blog/differentiable-neural-computers/">DeepMind: Differentiable neural computers</a></li><li><a href="https://slideplayer.com/slide/14373603/">DNC-slide</a></li><li><a href="https://www.slideshare.net/databricks/demystifying-differentiable-neural-computers-and-their-brain-inspired-origin-with-luis-leal">Demystifying Differentiable Neural Computers and Their Brain Inspired Origin with Luis Leal</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> agi </tag>
</tags>
</entry>
<entry>
<title>Inspiration of On-intelligence</title>
<link href="/2019/04/24/on-intelligence/"/>
<url>/2019/04/24/on-intelligence/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><p>Please note that all these ideas may prove to be wrong or will be revised.</p><h2 id="Artificial-Intelligence-wrong-way"><a href="#Artificial-Intelligence-wrong-way" class="headerlink" title="Artificial Intelligence: wrong way"></a>Artificial Intelligence: wrong way</h2><p>We are on the wrong way of Artificial General Intelligence(AGI).</p><p>The biggest mistake is the belief that intelligence is defined by intelligent behavior.<br>Object detection or other tasks are the manifestations of intelligence not the intelligence itself.</p><p>The great brain uses vast amounts of memory to create a model of the world, everything you know and have learned is stored in this model.</p><p>The ability to make predictions about the future that is the crux of intelligence.</p><h2 id="Neural-Networks"><a href="#Neural-Networks" class="headerlink" title="Neural Networks:"></a>Neural Networks:</h2><ol><li>We must include time as brain function: real brains process rapidly changing streams of information.</li><li>The importance of feedback: In thalamus(丘脑), connections going backward toward the input exceed the connections going forward by almost a factor of ten. But <strong>back propagation is not really feedback</strong>, because it is only occurred during the learning phase.</li><li>Brain is organized as a repeating hierarchy.</li></ol><p>History shows that the best solution to scitific problems are simple and elegant.</p><h2 id="The-Human-Brain-all-your-knowledge-of-the-world-is-a-model-based-on-patterns"><a href="#The-Human-Brain-all-your-knowledge-of-the-world-is-a-model-based-on-patterns" class="headerlink" title="The Human Brain: all your knowledge of the world is a model based on patterns"></a>The Human Brain: all your knowledge of the world is a model based on patterns</h2><ol><li>The neocortex is about 2 milimeters thick and has six layers, each approximated by one card.</li><li>The mind is the creation of the cells in the brain. 
<strong>There is nothing else.</strong> And remember that the cortex is built using a common repeated element.</li><li>The cortex uses the same computational tool to accomplish everything it does.</li></ol><p>According to <a href="https://en.wikipedia.org/wiki/Vernon_Benjamin_Mountcastle#Research_and_career">Mountcastle</a>‘s proposal:<br>The algorithm of the cortex must be expressed independently of any particular function or sense.<br>The cortex does something universal that can be applied to any type of sensory or motor system.</p><blockquote><p>When scientists and engineers try to understand vision or make a computer that can “see”, they devise terminologies and techniques specific to vision.<br>They talk about edges, textures, and three-dimensional representations.<br>If they want to understand spoken language, they build algorithms based on rules of grammar, syntax and semantics.</p></blockquote><p><strong>But these approaches are not how the brain solves these problems, and are therefore likely to fail.</strong></p><p>Attention mechanism:<br>About three times every second, your eyes make a sudden movement called a saccade.<br>Much vision research ignores saccades and the rapidly changing patterns of vision.</p><p>Existence may be objective, but the spatial-temporal patterns flowing into the axon bundles in our brains are all we have to go on.</p><h2 id="Memory"><a href="#Memory" class="headerlink" title="Memory"></a>Memory</h2><p>The brain does not “compute” the answers to problems; it <strong>retrieves the answers from memory</strong>.<br>The entire cortex is a memory system rather than a computer.</p><p>The memory consists of <code>invariant representations</code>, which handle variations in the world automatically.</p><ul><li><p>The neocortex stores <strong>sequences of patterns</strong>.<br>There are thousands of detailed memories stored in the synapses of our brains that are rarely used.<br>At any point in time we recall only a tiny fraction of what we know. (Reciting A-Z is easy; Z-A is hard.)</p></li><li><p>The neocortex recalls patterns <strong>auto-associatively</strong>.<br>Your eyes only see parts of a body, but your brain fills in the rest.<br>At any time, a piece can activate the whole. This is the essence of auto-associative memory, or inference.<br><strong>Thoughts and memories are associatively linked; notice that random thoughts never really occur!</strong></p></li><li><p>The neocortex stores patterns in an <strong>invariant form</strong>.<br>We do not remember or recall things with complete fidelity,<br>because the brain remembers the important relationships in the world, independent of the details.<br>To make a specific prediction, the brain must combine knowledge of the invariant structure with the most recent details.</p><blockquote><p>When listening to a familiar song played on a piano, your cortex predicts the next note before it is played. 
And when listening to people speak, you often know what they are going to say before they have finished speaking.</p></blockquote></li><li><p>The neocortex stores patterns in a <strong>hierarchy</strong>.</p></li></ul><h2 id="A-New-Framework-of-Intelligence-Hierarchy"><a href="#A-New-Framework-of-Intelligence-Hierarchy" class="headerlink" title="A New Framework of Intelligence: Hierarchy"></a>A New Framework of Intelligence: Hierarchy</h2><p>The brain is using memories to form predictions about what it expects to experience before experiencing it.<br>When a prediction is violated, attention is drawn to the error.<br>Incorrect predictions result in confusion and prompt you to pay attention.<br><strong>Your brain has made a model of the world and is constantly checking that model against reality.</strong></p><p>By comparing the actual sensory input with recalled memory, the animal not only understands where it is but can see into the future.</p><h2 id="How-the-Cortex-Works"><a href="#How-the-Cortex-Works" class="headerlink" title="How the Cortex Works"></a>How the Cortex Works</h2><p>If you don’t have a picture of a puzzle’s solution, <strong>the bottom-up method</strong> is sometimes the only way to proceed.</p><p>Here is an interesting metaphor: </p><blockquote><p>Many of the puzzle pieces will not be used in the ultimate solution, but you don’t know which ones or how many.</p></blockquote><p>I cannot fully agree with Hawkins’s ideas in this part. We still don’t know how the cortex actually works.</p><h2 id="How-the-Cortex-Learns"><a href="#How-the-Cortex-Learns" class="headerlink" title="How the Cortex Learns"></a>How the Cortex Learns</h2><blockquote><p>Donald O. Hebb, Hebbian learning: when two neurons fire at the same time, the synapse between them gets strengthened.</p></blockquote><ol><li>Forming the classifications of patterns.</li><li>Building memory sequences.</li></ol><p>Note that prior to the neocortex, the brain has:</p><ol><li>The basal ganglia (基底神经节): primitive motor system.</li><li>The cerebellum (小脑): learned precise timing relationships of events.</li><li>The hippocampus (海马体): stored memories of specific events and places.</li></ol><p><strong>The hippocampus is the top region of the neocortex, not a separate structure.</strong></p><p>There are many more secrets to be discovered than we currently know.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="https://en.wikipedia.org/wiki/On_Intelligence">On intelligence, Jeff Hawkins</a></li><li><a href="https://en.wikipedia.org/wiki/Vernon_Benjamin_Mountcastle#Research_and_career">Mountcastle</a></li></ul>
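<h2 id="A-minimal-Hebbian-update-sketch"><a href="#A-minimal-Hebbian-update-sketch" class="headerlink" title="A minimal Hebbian update sketch"></a>A minimal Hebbian update sketch</h2><p>The Hebbian rule quoted above (“fire together, wire together”) can be written as a one-line weight update. A minimal sketch, assuming numpy; the learning rate and the outer-product form are the standard textbook formulation rather than anything specified in the book.</p><pre><code class="py">import numpy as np

def hebbian_update(W, pre, post, lr=0.01):
    """Strengthen the weight between every pre/post pair that is active together.
    W: (n_post, n_pre) weight matrix, pre/post: activity vectors (0/1 or firing rates)."""
    return W + lr * np.outer(post, pre)

# toy usage: two neurons that fire at the same time get a stronger synapse
W = np.zeros((2, 2))
pre = np.array([1.0, 0.0])
post = np.array([1.0, 0.0])
W = hebbian_update(W, pre, post)  # only W[0, 0] grows
</code></pre>]]></content>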
<categories>
<category> AI </category>
</categories>
<tags>
<tag> agi </tag>
</tags>
</entry>
<entry>
<title>零样本学习的视频描述</title>
<link href="/2019/04/07/zero-shot-for-VD/"/>
<url>/2019/04/07/zero-shot-for-VD/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Hand-writing-pdf-version"><a href="#Hand-writing-pdf-version" class="headerlink" title="Hand-writing pdf version"></a>Hand-writing pdf version</h2><div class="row"><iframe src="https://drive.google.com/file/d/1XtGej5wnl5hiJebI38wYpQrI6TIjndSs/preview" style="width:100%; height:550px"></iframe></div><h2 id="Hand-writing-image-version"><a href="#Hand-writing-image-version" class="headerlink" title="Hand-writing image version"></a>Hand-writing image version</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7a6lduaj20zf19u1d4.jpg" alt=""><br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7aj7ktxj20zf19uk31.jpg" alt=""><br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g1u7as81fnj20zf19utfw.jpg" alt=""></p>]]></content>
<categories>
<category> VideoCaptioning </category>
</categories>
<tags>
<tag> zeroshot </tag>
</tags>
</entry>
<entry>
<title>基于分层强化学习的视频描述</title>
<link href="/2019/03/10/HRL-video-captioning/"/>
<url>/2019/03/10/HRL-video-captioning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:Video Captioning via Hierarchical Reinforcement Learning</p></li><li><p>论文链接:<a href="https://ieeexplore.ieee.org/document/8578541/">https://ieeexplore.ieee.org/document/8578541/</a></p></li><li><p>论文源码:</p><ul><li>None</li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。 </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>视频描述中细粒度的动作描述仍然是该领域中一个巨大的挑战。该论文创新点分为两部分:1. 通过层级化的强化学习框架,使用高层manager识别粗粒度的视频信息并控制描述生成的目标,使用低层的worker识别细粒度的动作并完成目标。2. 提出Charades数据集。</p><hr><h1 id="Video-Captioning-via-Hierarchical-Reinforcement-Learning"><a href="#Video-Captioning-via-Hierarchical-Reinforcement-Learning" class="headerlink" title="Video Captioning via Hierarchical Reinforcement Learning"></a>Video Captioning via Hierarchical Reinforcement Learning</h1><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xt4i7vsnj20qs0k47lv.jpg" alt=""></p><h2 id="Framework-of-model"><a href="#Framework-of-model" class="headerlink" title="Framework of model"></a>Framework of model</h2><ol><li><p>Work processing</p><ul><li><p><strong>Pretrained CNN</strong> encoding stage we obtain:<br>video frame features: $v={v_i}$, where $i$ is index of frames.</p></li><li><p>Language Model encoding stage we obtain:<br>Worker : $h^{E_w}={h_i^{E_w}}$ from low-level <strong>Bi-LSTM</strong> encoder<br>Manager: $h^{E_m}={h_i^{E_m}}$ from high <strong>LSTM</strong> encoder</p></li><li><p>HRL agent decoding stage we obtain:<br>Language description:$a<em>{1}a</em>{2}…a_{T}$, where $T$ is the length of generated caption.</p></li></ul></li><li><p>Details in HRL agent:</p><ol><li>High-level manager:<ul><li>Operate at lower temporal resolution.</li><li>Emits a goal for worker to accomplish.</li></ul></li><li>Low-level worker<ul><li>Generate a word for each time step by following the goal.</li></ul></li><li>Internal critic <ul><li>Determin if the worker has accomplished the goal</li></ul></li></ol></li><li><p>Details in Policy Network:</p><ol><li>Attention Module:<ol><li>At each time step t: $c<em>t^W=\sum\alpha</em>{t,i}^{W}h^{E_w}_i$</li><li>Note that attention score $\alpha<em>{t,i}^{W}=\frac{exp(e</em>{t, i})}{\sum_{k=1}^{n}exp(e<em>t, k)}$, where $e</em>{t,i}=w^{T} tanh(W<em>{a} h</em>{i}^{E<em>w} + U</em>{a} h^{W}_{t-1})$</li></ol></li><li>Manager and Worker:<ol><li>Manage: take $[c_t^M, h_t^M]$ as input to produce goal. Goal is obtained through a MLP.</li><li>Worker: receive the goal $g_t$ and take the concatenation of $c_t^W, g<em>t, a</em>{t-1}$ as input, and outputs the probabilities of $\pi_t$ over all action $a_t$.</li></ol></li><li>Internal Critic:<ol><li>evaluate worker’s progress. 
Using an RNN struture takes a word sequence as input to discriminate whether end.</li><li>Internal Critic RNN take $h^I_{t-1}, a_t$ as input, and generate probability $p(z_t)$.</li></ol></li></ol></li><li><p>Details in Learning:</p><ol><li>Definition of Reward:<br>$R(a<em>t)$ = $\sum</em>{k=0} \gamma^{k} f(a_{t+k})$ , where $f(x)=CIDEr(sent+x)-CIDEr(sent)$ and $sent$ is previous generated caption.</li><li>Pseudo Code of HRL training algorithm:<pre><code class="py"><span class="keyword">import</span> training_pairs<span class="keyword">import</span> pretrained_CNN, internal_critic<span class="keyword">for</span> i <span class="keyword">in</span> range(M):Initial_random(minibatch)<span class="keyword">if</span> Train_Worker: goal_exploration(enable=<span class="literal">False</span>) sampled_capt = LSTM() <span class="comment"># a_1, a_2, ..., a_T</span> Reward = [r_i <span class="keyword">for</span> r_i <span class="keyword">in</span> calculate_R(sampled_caption)] Manager(enable=<span class="literal">False</span>) worker_policy = Policy_gradient(Reward)<span class="keyword">elif</span> Train_Manager: Initial_ramdom_process(N) greedy_decoded_cap = LSTM() Reward = [r_i <span class="keyword">for</span> r_i <span class="keyword">in</span> calculate_R(sampled_caption)] Worker(enable=<span class="literal">False</span>) manager_policy = Policy_gradient(Reward)</code></pre></li></ol></li></ol><h2 id="All-in-one"><a href="#All-in-one" class="headerlink" title="All in one"></a>All in one</h2><p><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xt54v9puj21ao0p27c8.jpg" alt=""></p><h2 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h2><ol><li><a href="http://ms-multimedia-challenge.com/2017/challenge">MSR-VTT</a><blockquote><p>该数据集包含50个小时的视频和26万个相关视频描述。</p></blockquote></li></ol><ol><li><a href="https://mila.quebec/en/publications/public-datasets/m-vad/">Charades</a><blockquote><p>Charades Captions:室内互动的9848个视频,包含157个动作的66500个注解,46个类别的物体的41104个标签,和共27847个文本描述。</p></blockquote></li></ol><h2 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h2><ol><li><p>实验可视化<br><img src="https://ws1.sinaimg.cn/large/ca26ff18gy1g0xs2qfw1rj220k0hce5u.jpg" alt=""></p></li><li><p>模型对比<br><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g0xs1f57tkj21120hwgpl.jpg" alt=""></p></li></ol>]]></content>
<categories>
<category> VideoCaptioning </category>
</categories>
<tags>
<tag> reinforcement_learning </tag>
</tags>
</entry>
<entry>
<title>HTM_theory</title>
<link href="/2019/03/03/HTM-theory/"/>
<url>/2019/03/03/HTM-theory/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><ol><li>Overview<br>cellular structure are same.(bio pic)</li></ol><p>hierarchical structure.(need a pic)and each region is performing the same set of processes on the input data.</p><p>[SDR] sparse distributed representations(0 or 1)</p><p>Input: 1. Motor commands 2. sensory input</p><p>Encoder: takes a datatype and converts it into a sparse distributed representations. </p><p>Temporal means that systems learn continuously, every time it receives input it is attemptiing to predict what is going to happen next.</p><ol><li>Sparse Distributed Representation(SDR)<br>Terms: 1. n=Bit array length 2. w=Bits of array 3. sparsiity 4. Dense Bit Array Capacity=2**of bits<br>Bit array</li></ol><p>Capacity=n!/w!(n-w)!<br><img src="https://ws1.sinaimg.cn/mw690/ca26ff18gy1g0pke65e1dj218g11awlp.jpg" alt=""></p><p>OVERLAP/UNION<br>Similarity can be represented by overlap/union? of SDR.</p><p>MATCH</p><ol><li>Overlap Sets and Subsampling</li><li><p>Scalar Encoder(retina/cochlea)</p><ul><li>Scalar Encoder: consecutive one</li><li>Random Distributed Scalar Encoder: random one</li></ul></li><li><p>Data-time encoder </p></li><li>Input Space& Connections<ul><li>Spactial Pooler: maintain a fixed sparsity & maintain a overlap properties.</li></ul></li></ol>]]></content>
</entry>
<entry>
<title>基于时序结构的视频描述</title>
<link href="/2019/03/02/describing-videos-by-exploiting-tempporal-structure/"/>
<url>/2019/03/02/describing-videos-by-exploiting-tempporal-structure/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:Describing Videos by Exploiting Temporal Structure</p></li><li><p>论文链接:<a href="https://arxiv.org/pdf/1502.08029">https://arxiv.org/pdf/1502.08029</a></p></li><li><p>论文源码:</p><ul><li><a href="https://github.com/tsenghungchen/SA-tensorflow">https://github.com/tsenghungchen/SA-tensorflow</a></li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。 </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>本文是蒙特利尔大学发表在ICCV2015的研究成果,其主要创新点在于提出了时序结构并且利用注意力机制达到了在2015年的SOTA。通过3D-CNN捕捉视频局部信息和注意力机制捕捉全局信息相结合,可以全面提升模型效果。<br>其另一个重要成果是MVAD电影片段描述数据集,此<a href="https://mila.quebec/en/publications/public-datasets/m-vad/">数据集</a>已经成为了当前视频描述领域主流的数据集。</p><hr><h2 id="Describing-Videos-by-Exploiting-Temporal-Structure"><a href="#Describing-Videos-by-Exploiting-Temporal-Structure" class="headerlink" title="Describing Videos by Exploiting Temporal Structure"></a>Describing Videos by Exploiting Temporal Structure</h2><h3 id="视频描述任务介绍:"><a href="#视频描述任务介绍:" class="headerlink" title="视频描述任务介绍:"></a>视频描述任务介绍:</h3><p>根据视频生成单句的描述,一例胜千言:</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcxfmvyxuj20si0hqqfg.jpg" alt=""></p><p> A monkey pulls a dog’s tail and is chased by the dog.</p><p>2015年较早的模型:<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcwjf53bsj214x0ksajx.jpg" alt="LSTM-YT模型"></p><h3 id="2015年之前的模型存在的问题"><a href="#2015年之前的模型存在的问题" class="headerlink" title="2015年之前的模型存在的问题"></a>2015年之前的模型存在的问题</h3><ol><li>输出的描述没有考虑到动态的<strong>时序结构</strong>。</li><li>之前的模型利用一个特征向量来表示视频中的所有帧,导致无法识别视频中物体出现的<strong>先后顺序</strong>。</li></ol><h3 id="论文思路以及创新点"><a href="#论文思路以及创新点" class="headerlink" title="论文思路以及创新点"></a>论文思路以及创新点</h3><ol><li>通过局部和全局的时序结构来产生视频描述:</li></ol><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0odfi9c82j20zk0eih1g.jpg" alt=""></p><p> 针对Decoder生成的每一个单词,模型都会关注视频中特定的某一帧。</p><ol><li>使用3-D CNN来捕捉视频中的动态时序特征。</li></ol><h3 id="模型结构设计"><a href="#模型结构设计" class="headerlink" title="模型结构设计"></a>模型结构设计</h3><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0oe02i6tjj21c80k47e5.jpg" alt=""></p><ul><li>Encoder(3-D CNN + 2-D GoogLeNet)的设置:3 * 3 * 3 的三维卷积核,并且是3-D CNN在行为识别数据集上预训练好的。</li></ul><p>每个卷积层后衔接ReLu激活函数和Local max-pooling, dropout参数设置为0.5。</p><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0oe2iz7iaj20v20ien2y.jpg" alt=""></p><ul><li>Decoder(LSTM)的设置:使用了additive attention作为注意力机制,下图为在两个数据集上的超参数设置:<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0osevs2qoj21620pwafz.jpg" alt=""></li></ul><h3 id="实验细节"><a href="#实验细节" class="headerlink" title="实验细节"></a>实验细节</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ol><li><a href="http://www.cs.utexas.edu/users/ml/clamp/videoDescription/">Microsoft Research Video Description dataset</a></li></ol><blockquote><p>1970条Youtobe视频片段:每条大约10到30秒,并且只包含了一个活动,其中没有对话。1200条用作训练,100条用作验证,670条用作测试。</p></blockquote><ol><li><a href="https://mila.quebec/en/publications/public-datasets/m-vad/">Montreal Video Annotation Dataset</a></li></ol><blockquote><p>数据集包含从92部电影的49000个视频片段,并且每个视频片段都被标注了描述语句。</p></blockquote><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ul><li>BLEU</li></ul><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0os1pgs0mj20qe0kgqh7.jpg" alt=""></p><ul><li>METEOR</li></ul><p><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0os2ibx04j20su0mgh2g.jpg" 
alt=""></p><ul><li>CIDER</li><li>Perplexity</li></ul><h4 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h4><ol><li>实验可视化<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0ornzkuylj220a144u0x.jpg" alt="实验结果"></li></ol><p>柱状图表示每一帧生成对应颜色每个单词时的注意力权重。</p><ol><li>模型对比<br><img src="https://ws1.sinaimg.cn/large/ca26ff18ly1g0ormgxp41j22120ggten.jpg" alt="模型对比"></li></ol><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a href="https://arxiv.org/pdf/1502.08029">Describing Videos by Exploiting Temporal Structure</a></li></ul>]]></content>
<categories>
<category> VideoCaptioning </category>
</categories>
<tags>
<tag> videoCaptioning </tag>
</tags>
</entry>
<entry>
<title>视频描述领域的第一篇深度模型论文</title>
<link href="/2019/01/19/first-deep-model-in-video-captioning/"/>
<url>/2019/01/19/first-deep-model-in-video-captioning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:Translating Videos to Natural Language Using Deep Recurrent Neural Networks</p></li><li><p>论文链接:<a href="https://www.cs.utexas.edu/users/ml/papers/venugopalan.naacl15.pdf">https://www.cs.utexas.edu/users/ml/papers/venugopalan.naacl15.pdf</a></p></li><li><p>论文源码:</p><ul><li><a href="https://github.com/vsubhashini/caffe/tree/recurrent/examples/youtube">https://github.com/vsubhashini/caffe/tree/recurrent/examples/youtube </a></li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。 </li></ul></li></ol><hr><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>假设我们在未来已经实现了通用人工智能,当我们回首向过去看,到底哪个时代会被投票选为最重要的“Aha Moment”呢?</p><p>作为没有预知未来能力的普通人。为了回答这个问题,首先需要明确的一点就是:我们现在究竟处在实现通用人工智能之前的哪个位置?</p><p>一个常用的比喻便是,如果把从开始尝试到最终实现通用人工智能比作一条一公里的公路的话。大部分人可能会认为我们已经走了200米到500米之间。但是真实的情况可能是,我们仅仅走过了5厘米不到。</p><p>因为在通往正确道路的各种尝试中,有很大一部分会犯方向性错误。当我们在错误的道路上越走越远的时候,那么肯定无法到达终点。推倒现有成果重新来过便是不可避免的。我们需要时时刻刻保持谨小慎微,以躲避“岔路口”。</p><p>现在有理由相信(其实是因为不得不掩耳盗铃),我们正走在一条正确的道路上。如果非要说现在的技术有哪些让我感觉不那么符合我的直觉的地方的话,我肯定会抢着回答:We are not living in the books or images.</p><p>公元前五亿年前,当我们还是扁形虫的时候,那时候我们便会在未知的环境中为了生存下去作出连续的决策。</p><p>公元前两亿年前,我们进化成啮齿类动物,并且拥有了一套完整的操作系统。不变的是,不断连续变化的生存环境。</p><p>公元前四百万年前,原始人类进化出了大脑皮层之后,终于拥有了进行推理和思考的能力。但是这一切是在他们发明文字和语言之前。</p><p>现如今,当人类巨灵正在尝试创造出超越本身智能的超智能体时,却神奇的忽略了超智能体也应该生存在不断变化的、充满危险的世界之中。</p><p>回到最开始的问题,我一定会把票投在利用神经模型来处理视频流的模型上。</p><hr><h2 id="Translating-Videos-to-Natural-Language-Using-Deep-Recurrent-Neural-Networks"><a href="#Translating-Videos-to-Natural-Language-Using-Deep-Recurrent-Neural-Networks" class="headerlink" title="Translating Videos to Natural Language Using Deep Recurrent Neural Networks"></a>Translating Videos to Natural Language Using Deep Recurrent Neural Networks</h2><h3 id="视频描述任务介绍:"><a href="#视频描述任务介绍:" class="headerlink" title="视频描述任务介绍:"></a>视频描述任务介绍:</h3><p>根据视频生成单句的描述,一例胜千言:</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcxfmvyxuj20si0hqqfg.jpg" alt=""></p><p> A monkey pulls a dog’s tail and is chased by the dog.</p><h3 id="视频描述的前世:"><a href="#视频描述的前世:" class="headerlink" title="视频描述的前世:"></a>视频描述的前世:</h3><p>管道方法(PipeLine Approach)</p><ol><li>从视频中识别出<code>主语</code>、<code>动作</code>、<code>宾语</code>、<code>场景</code></li><li>计算被识别出实体的置信度</li><li>根据最高置信度的实体与预先设置好的模板,进行句子生成</li></ol><p> 在神经模型风靡之前,传统方法集中使用<strong>隐马尔科夫模型识别实体</strong>和<strong>条件随机场生成句子</strong></p><h3 id="神经模型的第一次尝试:"><a href="#神经模型的第一次尝试:" class="headerlink" title="神经模型的第一次尝试:"></a>神经模型的第一次尝试:</h3><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzcwjf53bsj214x0ksajx.jpg" alt="LSTM-YT模型"></p><ol><li>从视频中,每十帧取出一帧进行分析</li></ol><p> <img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzczmke1dbj20vg0astdq.jpg" alt=""><br> 人类眼睛的帧数是每秒24帧,从仿生学的观点出发,模型也不需要处理视频中所有的帧。再对视频帧进行缩放以便计算机进行处理。</p><ol><li>使用CNN提取特征并进行平均池化(Mean Pooling)</li></ol><p> <img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzczv0iwy6j20x40mktg0.jpg" alt=""></p><ul><li><p>预训练的Alexnet[2012]:在120万张图片上进行预训练[ImageNet LSVRC-2012],提取最后一层(第七层全连接层)的特征(4096维)。注意:提取的向量不是最后进行分类的1000维特征向量。</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd0iak8vhj216o0eqacn.jpg" alt="Alexnet"></p></li><li><p>对所有的视频帧进行池化</p></li></ul><ol><li>句子生成</li></ol><p> <img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd06rquyyj20qk0nwgo6.jpg" alt="RNN生成句子"></p><h3 id="迁移学习和微调模型"><a href="#迁移学习和微调模型" class="headerlink" title="迁移学习和微调模型"></a>迁移学习和微调模型</h3><ol><li>在图片描述任务进行预训练</li></ol><p> 
<img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fzd0skao7sj216c0jm0zn.jpg" alt="transfer-learning from image captioning"></p><ol><li>微调(Fine-tuning)<br> 需要注意的是,在视频描述过程中:<ul><li>将输入从图片转换为视频;</li><li>添加了平均池化特征这个技巧;</li><li>模型进行训练的时候使用了更低的学习率</li></ul></li></ol><h3 id="实验细节"><a href="#实验细节" class="headerlink" title="实验细节"></a>实验细节</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ol><li><a href="http://www.cs.utexas.edu/users/ml/clamp/videoDescription/">Microsoft Research Video Description dataset</a></li></ol><blockquote><p>1970条Youtobe视频片段:每条大约10到30秒,并且只包含了一个活动,其中没有对话。1200条用作训练,100条用作验证,670条用作测试。</p></blockquote><p><img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd1cmxyalj217g0lcdyf.jpg" alt="dataset"></p><ol><li><a href="https://blog.csdn.net/daniaokuye/article/details/78699138">MSCOCO数据集下载</a></li><li><a href="https://blog.csdn.net/gaoyueace/article/details/80564642">Flickr30k数据集下载</a></li></ol><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ul><li>SVO(Subject, Verb, Object accuracy)</li><li>BLEU</li><li>METEOR</li><li>Human evaluation</li></ul><h4 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h4><ol><li>SVO正确率:</li></ol><p> <img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd5gnjzg6j20p20f8n0d.jpg" alt="result on SVO"></p><ol><li>BLEU值和METEOR值</li></ol><p> <img src="https://ws1.sinaimg.cn/mw690/ca26ff18ly1fzd5p0xo78j20nm0aotak.jpg" alt="result on BLEU and METEOR"></p><h3 id="站在2019年回看2015年的论文"><a href="#站在2019年回看2015年的论文" class="headerlink" title="站在2019年回看2015年的论文"></a>站在2019年回看2015年的论文</h3><p>以19年的后见之明来考察这篇论文,虽然论文没有Attention和强化学习加持的,但是也开辟了用神经模型完成视频描述任务的先河。</p><p>回顾一下以前提出的问题,如何才能实现:</p><ol><li>常识推理。</li><li>空间位置。</li><li>根据不同粒度回复问题。</li></ol><p>答案很有可能在我们身上,大脑皮质中的前额皮质掌管着人格(就是你脑中出现的那个声音,就是他)。大脑皮质虽然仅仅是大脑最外层的两毫米厚的薄薄一层(<a href="https://zh.wikipedia.org/wiki/%E5%A4%A7%E8%84%91%E7%9A%AE%E8%B4%A8">没错,我确定就是两毫米</a>),但是它起到的作用却是史无前例的。</p><p>以大脑皮质作为启发,最少我们也需要让人工大脑皮质也“生存”在一个类似于现实世界中的环境当中。因此视频是一个很好的起点,但也仅仅是个起点。</p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ul><li><a href="https://waitbutwhy.com/2017/04/neuralink.html">Neuralink and the Brain’s Magical Future</a></li><li><a href="https://www.cs.utexas.edu/~vsub/pdf/Translating_Videos_slides.pdf">Translating Videos to Natural Language Using Deep Recurrent Neural Networks – Slides</a></li><li><a href="https://medium.com/@smallfishbigsea/a-walk-through-of-alexnet-6cbd137a5637">A Walk-through of AlexNet</a></li><li><a href="https://www.cs.utexas.edu/users/ml/clamp/videoDescription/#data">Collecting Multilingual Parallel Video Descriptions Using Mechanical Turk</a></li><li><a href="https://zh.wikipedia.org/wiki/%E5%A4%A7%E8%84%91%E7%9A%AE%E8%B4%A8">大脑皮质</a></li></ul>]]></content>
<categories>
<category> AGI </category>
</categories>
<tags>
<tag> videoCaptioning </tag>
</tags>
</entry>
<entry>
<title>Tips for Examination of Network Software Design</title>
<link href="/2018/12/26/Network-software-design-exam-tips/"/>
<url>/2018/12/26/Network-software-design-exam-tips/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Question-distribution"><a href="#Question-distribution" class="headerlink" title="Question distribution"></a>Question distribution</h2><ul><li>10 choice questions(20%)</li><li>10 true or false questions(20%)</li><li>6 essay questions(60%)</li></ul><h2 id="Examation-contains-three-parts-Network-amp-Design-amp-Programming"><a href="#Examation-contains-three-parts-Network-amp-Design-amp-Programming" class="headerlink" title="Examation contains three parts: Network & Design & Programming"></a>Examation contains three parts: Network & Design & Programming</h2><h3 id="Network"><a href="#Network" class="headerlink" title="Network"></a>Network</h3><ul><li>IP address:<ul><li><code>public address</code>: an IP address that can be <strong>accessed over the Internet</strong>. And your public IP address is the <strong>globally unique</strong> and <strong>can be found</strong>, and can only be assigned to a unique device.</li><li><code>private IP address</code>: The devices with private IP address will <strong>use your router’s public IP address to communicate</strong>. Note that to allow direct access to a local device which is assign a private IP address, a Network Address Translator(NAT) should be used.</li><li><code>how to compute total ip address in a subnet</code>: <ol><li>transform ip address into binary address.</li><li>count zero from tail to first one.</li><li>subtract 2(reserve address and broadcast address)</li></ol></li></ul></li></ul><hr><ul><li>Port(some default ports for common protocol):<ul><li><strong>http</strong>: 80</li><li><strong>https</strong>: 443</li><li><strong>ntp</strong>: 123</li><li><strong>ssh/tcp</strong>: 22</li><li><strong>mongoDB</strong>: 27017</li><li>DNS: 53</li><li>FTP: 21</li><li>Telnet: 23</li></ul></li></ul><hr><ul><li><p>DNS(duplicated):The Domain Name System(DNS) is the <strong>phonebook</strong> of the Internet. </p><ul><li>DNS <strong>translate domain names to IP address</strong> so browsers can load Internet resources.</li><li>DNS is hierarchical with a few authoritative serves at the top level.<ol><li>Your router or ISP provides information about DNS server to contact when doing a look up.</li><li>Low level DNS servers cache mappings, which could become stale due to DNS propagation delays. </li><li>DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live(TTL)</li></ol></li></ul></li></ul><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fynty6a1xwj20e50vnjt4.jpg" alt=""></p><hr><ul><li>CDN(duplicated):<ul><li>Definition: <strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li><li>Improve performance in two ways:<ul><li>Users receive content at <strong>data centers close to them</strong>.</li><li>Your servers do not have to serve requests that the CND fulfills.</li></ul></li></ul></li></ul><hr><ul><li>Main routing protocol:<ul><li><code>OSPF</code>(Open Shortest Path First): OSPF is an <strong>interior gateway protocal</strong>.</li><li><code>IS-IS</code>(Intermediate System-to-Intermediate System): It is just like OSPF. IS-IS associates routers into areas of intra-area and inter-area.</li><li><code>BGP</code>(Border Gateway Protocol): It is used as the edge of your network. 
BGP <strong>constructs a routing table of networks</strong> reachable among Autonomous Systems(AS) number defined by the user.</li></ul></li></ul><hr><h3 id="Design"><a href="#Design" class="headerlink" title="Design"></a>Design</h3><h2 id="software-requirements-analysis-not-key-point"><a href="#software-requirements-analysis-not-key-point" class="headerlink" title="- software requirements analysis: not key point"></a>- software requirements analysis: not key point</h2><ul><li>main principles and key technologies for High concurrency programming.(duplicated)<ol><li><code>Security</code>, or correctness, is when a program executes concurrently with the expected results<ol><li>The <strong>visibility</strong></li><li>The <strong>order</strong></li><li>The <strong>atomic</strong></li></ol></li><li><code>Activeness</code>: Program must confront to <strong>deadlocks and livelocks</strong></li><li><code>Performance</code>: <strong>Less context switching, less kernel calls, less consistent traffic</strong>, and so on.</li></ol></li></ul><hr><ul><li>generic phases of software engineering<ol><li><code>Requirements analysis</code></li><li><code>Software Design</code></li><li><code>Implementation</code></li><li><code>Verification/Testing</code></li><li><code>Deployment</code></li><li><code>Maintenance</code></li></ol></li></ul><hr><ul><li>Agile Development(duplicated)<ol><li><code>Agile Values</code>:<ol><li>Individuals and <strong>interactions</strong> over processes and tools</li><li>Working software over <strong>comprehensive documentation</strong></li><li><strong>Customer collaboration</strong> over contract negotiation</li><li><strong>Responding to change</strong> over following a plan</li></ol></li><li><code>Agile Methods</code>:<ol><li>Frequently <strong>deliver small incremental</strong> units of functionality</li><li><strong>Define, build, test and evaluate cycles</strong></li><li>Maximize speed of <strong>feedback loop</strong></li></ol></li></ol></li></ul><hr><h3 id="Programming"><a href="#Programming" class="headerlink" title="Programming"></a>Programming</h3><ul><li>MVC(model-view-controller) model:<ul><li>MVC is an <strong>architectural pattern</strong> commonly used for developing <strong>user interfaces</strong> and allowing for effcient <strong>code reuse</strong> and <strong>paraller development</strong>.<ul><li>Model[probe]: an object carrying data. It can also have logic to <strong>update controller</strong> if its data changes.</li><li>View[frontend]: it can be any <strong>output representation of information</strong>, such as chart or a diagram.</li><li>Controller[backend]: accpet input and converts it to commands for the model or view</li></ul></li></ul></li></ul><hr><ul><li>NoSQL database(duplicated):<ol><li>NoSQL is <code>Not only SQL</code>, it has the advantages below: <ul><li><strong>Not using</strong> the <strong>relational model</strong> nor the SQL language. 
It is a collection of <strong>data items represented in a key-value store, document-store, wide column store, or a graph database</strong>.</li><li>Designed to run on <strong>large clusters</strong></li><li><strong>No schema</strong></li><li>Open Source</li></ul></li><li>NoSQL properties in detail:<ul><li><strong>Flexible scalability</strong></li><li><strong>Dynamic schema</strong> of data</li><li><strong>Efficient reading</strong></li><li><strong>Cost saving</strong></li></ul></li><li>NoSQL Technologies:<ul><li><strong>MapReduce</strong> programming model</li><li><strong>Key-value</strong> stores</li><li><strong>Document databases</strong></li><li><strong>Column-family stores</strong></li><li><strong>Graph databases</strong> </li></ul></li></ol></li></ul><hr><ul><li>Websocket(duplicated):<ul><li>WebSocket is a <strong>computer communications protocal</strong>, providing <strong>Bidirectional full-duplex communication channels</strong> over a single TCP connection and it is defined in <strong>RFC6445</strong>.</li><li>WebSocket is a different protocol from HTTP. Both protocols are located at <strong>layer 7 in the OSI model</strong> and depend on <strong>TCP at layer 4</strong>.</li><li>The WebSocket protocol enables interaction between a web client and a web server with lower overheads, facilitating real-time data transfer from and to the server.</li><li>working progress<ul><li>There are four main functions in Tornado<ul><li><code>open()</code>: Invoked when a new websocket is opened.</li><li><code>on_message(message)</code>: Handle incoming messages on the WebSocket</li><li><code>on_close()</code>: Invoke when the WebSocket is closed.</li><li><code>write_message(message)</code>: Sends the given message to the client of this Web Socket.</li></ul></li></ul></li></ul></li></ul><hr><ul><li>Differences between <code>git</code> and <code>svn</code>:<ol><li>Git is a <strong>distrubuted</strong> version control system; SVN is a non-distributed version control system.</li><li>Git has a <strong>centralized</strong> server and repository; SVN has <strong>non-centralized</strong> server and repository.</li><li>The content in Git is stored as <strong>metadata</strong>; SVN stores <strong>files of content</strong>.</li><li>Git branches are <strong>easier</strong> to work with than SVN branches.</li><li>Git does not have the <strong>global revision number</strong> feature like SVN has.</li><li>Git has <strong>better content protection</strong> than SVN,</li><li>Git was developed for <strong>Linux kernel</strong> by Linus Torvalds; SVN was deveploped by <strong>CollabNet</strong>.</li></ol></li></ul><h2 id="Essay-Questions"><a href="#Essay-Questions" class="headerlink" title="Essay Questions"></a>Essay Questions</h2><h3 id="Main-role-of-IP-address-port-DNS-CDN-for-network-software-design"><a href="#Main-role-of-IP-address-port-DNS-CDN-for-network-software-design" class="headerlink" title="Main role of IP address, port, DNS, CDN for network software design"></a>Main role of IP address, port, DNS, CDN for network software design</h3><ol><li>An internet Protocal address(IP address) is <strong>a numerical label</strong> assigned to each device connected to a computer network that <strong>uses the Internet Protocal for communication</strong>.</li><li>In computer networking, <strong>a port is an endpoint of communication</strong> and <strong>a logical construct that identifies a specific process</strong> or a type of network device.</li><li><strong>DNS(Domain Name System)</strong> is a <strong>hierarchical 
decentralized naming system</strong> for computers connected to the Internet or a private network,</li><li><strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li></ol><h3 id="Difference-between-git-and-svn"><a href="#Difference-between-git-and-svn" class="headerlink" title="Difference between git and svn:"></a>Difference between <strong>git</strong> and <strong>svn</strong>:</h3><ol><li>Git is a <strong>distrubuted</strong> version control system; SVN is a non-distributed version control system.</li><li>Git has a <strong>centralized</strong> server and repository; SVN has <strong>non-centralized</strong> server and repository.</li><li>The content in Git is stored as <strong>metadata</strong>; SVN stores <strong>files of content</strong>.</li><li>Git branches are <strong>easier</strong> to work with than SVN branches.</li><li>Git does not have the <strong>global revision number</strong> feature like SVN has.</li><li>Git has <strong>better content protection</strong> than SVN,</li><li>Git was developed for <strong>Linux kernel</strong> by Linus Torvalds; SVN was deveploped by <strong>CollabNet</strong>.</li></ol><h3 id="main-principles-and-key-technologies-for-High-concurrency-programming"><a href="#main-principles-and-key-technologies-for-High-concurrency-programming" class="headerlink" title="main principles and key technologies for High concurrency programming"></a>main principles and key technologies for High concurrency programming</h3><ol><li><code>Security</code>, or correctness, is when a program executes concurrently with the expected results<ol><li>The <strong>visibility</strong></li><li>The <strong>order</strong></li><li>The <strong>atomic</strong></li></ol></li><li><code>Activeness</code>: Program must confront to <strong>deadlocks and livelocks</strong></li><li><code>Performance</code>: <strong>Less context switching, less kernel calls, less consistent traffic</strong>, and so on.</li></ol><h3 id="NoSQL-and-SQL-database"><a href="#NoSQL-and-SQL-database" class="headerlink" title="NoSQL and SQL database"></a>NoSQL and SQL database</h3><ol><li>RDBMS(Relational database management system)<ol><li>A relational database like SQL is a collection of <strong>data items organized by tables</strong>. It has features below:<ol><li><code>ACID</code> is a set of <strong>properties of relational database transactions</strong>.</li><li><code>Atomicity</code>: Each transaction is all or nothing</li><li><code>Consistency</code>: Any transaction will bring the database from one valid state to server.</li><li><code>Isolation</code>: Executing transaction has been committed, it will remain so.</li></ol></li></ol></li><li><p>NoSQL</p><ol><li>NoSQL is <code>Not only SQL</code>, it has the advantages below: <ul><li><strong>Not using</strong> the <strong>relational model</strong> nor the SQL language. 
It is a collection of <strong>data items represented in a key-value store, document-store, wide column store, or a graph database</strong>.</li><li>Designed to run on <strong>large clusters</strong></li><li><strong>No schema</strong></li><li>Open Source</li></ul></li><li>NoSQL properties in detail:<ul><li><strong>Flexible scalability</strong></li><li><strong>Dynamic schema</strong> of data</li><li><strong>Efficient reading</strong></li><li><strong>Cost saving</strong></li></ul></li><li>NoSQL Technologies:<ul><li><strong>MapReduce</strong> programming model</li><li><strong>Key-value</strong> stores</li><li><strong>Document databases</strong></li><li><strong>Column-family stores</strong></li><li><strong>Graph databases</strong> </li></ul></li></ol></li><li><p>SQL VS NoSQL</p><ol><li>Relational data model VS Document data model</li><li>Structured data VS semi-structured data</li><li>strict schema VS dynamic/flexible schema</li><li>relational data VS Non-relational data</li></ol></li></ol><h3 id="Websocket-working-progress"><a href="#Websocket-working-progress" class="headerlink" title="Websocket(working progress)"></a>Websocket(working progress)</h3><ul><li>WebSocket is a <strong>computer communications protocal</strong>, providing <strong>Bidirectional full-duplex communication channels</strong> over a single TCP connection and it is defined in <strong>RFC6445</strong>.</li><li>WebSocket is a different protocol from HTTP. Both protocols are located at <strong>layer 7 in the OSI model</strong> and depend on <strong>TCP at layer 4</strong>.</li><li>The WebSocket protocol enables interaction between a web client and a web server with lower overheads, facilitating real-time data transfer from and to the server.</li><li>working progress<ul><li>There are four main functions in Tornado<ul><li><code>open()</code>: Invoked when a new websocket is opened.</li><li><code>on_message(message)</code>: Handle incoming messages on the WebSocket</li><li><code>on_close()</code>: Invoke when the WebSocket is closed.</li><li><code>write_message(message)</code>: Sends the given message to the client of this Web Socket.</li></ul></li></ul></li></ul><h3 id="Agile-Development-and-scrum"><a href="#Agile-Development-and-scrum" class="headerlink" title="Agile Development and scrum"></a>Agile Development and scrum</h3><ol><li>Agile Values:<ol><li>Individuals and interactions over processes and tools</li><li>Working software over comprehensive documentation</li><li>Customer collaboration over contract negotiation</li><li>Responding to change over following a plan</li></ol></li><li><p>Agile Methods:</p><ol><li>Frequently deliver small incremental units of functionality</li><li>Define, build, test and evaluate cycles</li><li>Maximize speed of feedback loop</li></ol></li><li><p>Scrum is 3 roles:</p><ol><li>Development Team</li><li>Product Owner</li><li>Scrum Master</li></ol></li><li><p>Scrum is 4 events:</p><ol><li>Sprint Planning</li><li>Daily Stand-up Meeting</li><li>Sprint Review</li><li>Sprint Retrospective</li></ol></li><li><p>Scrum is 4 artifacts:</p><ol><li>Product Backlog</li><li>Sprint Backlog</li><li>User Stories</li><li>Scrum Board</li></ol></li></ol><h3 id="Explain-how-DNS-work"><a href="#Explain-how-DNS-work" class="headerlink" title="Explain how DNS work"></a>Explain how DNS work</h3><p>Definition: The Domain Name System(DNS) is the <strong>phonebook</strong> of the Internet. 
</p><ul><li>DNS <strong>translate domain names to IP address</strong> so browsers can load Internet resources.</li><li>DNS is hierarchical with a few authoritative serves at the top level.<ol><li>Your router or ISP provides information about DNS server to contact when doing a look up.</li><li>Low level DNS servers cache mappings, which could become stale due to DNS propagation delays. </li><li>DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live(TTL)</li></ol></li></ul><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fynty6a1xwj20e50vnjt4.jpg" alt=""></p><h3 id="Difference-between-docker-and-virtual-host"><a href="#Difference-between-docker-and-virtual-host" class="headerlink" title="Difference between docker and virtual host"></a>Difference between docker and virtual host</h3><ol><li>Virtual Machine definition: Virtualization is the technique of importing a Guest operating system <strong>on top of a Host operating system</strong>.</li><li>Docker definition: A container image is <strong>a lightweight, stand-alone, executable package of a piece of software</strong> that includes everything needed to run it.</li><li>Docker is the service to run <strong>multiple containers on a machine</strong> (node) which can be on a vitual machine or on a physical machine.</li><li>A virtual machine is an <strong>entire operating system</strong> (which normally is not lightweight).</li></ol><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fymtosr7vyj20vt0f0whj.jpg" alt="Difference between docker and virtual machine"></p><h3 id="Type-of-software-test"><a href="#Type-of-software-test" class="headerlink" title="Type of software test"></a>Type of software test</h3><ul><li><p>Black Box testing: Black box testing is a software testing method where testers <strong>are not required to know coding or internal structure</strong> of the software. Black box testing method relies on testing software with various inputs and validating results against expected output.</p></li><li><p>White Box testing: White box testing strategy deals with the <strong>internal logic and structure of the code</strong>. The tests written based on the white box testing strategy incorporate coverage of the code written, branches, paths, statements and internal logic of the code etc.</p></li><li><p>Equivalence Partitioning:Equivalence Partitioning is also known as Equivalence Class Partitioning is a software testing technique and not a type of testing by itself. Equivalence partitioning technique is <strong>used in black box and gray box testing types</strong>. Equivalence partitioning <strong>classifies test data into Equivalence classes as positive Equivalence classes and negative Equivalence classes</strong>, such classification ensures both positive and negative conditions are tested.</p></li></ul><h3 id="Explain-how-CDN-work"><a href="#Explain-how-CDN-work" class="headerlink" title="Explain how CDN work"></a>Explain how CDN work</h3><ul><li>Definition: <strong>CDN(Content dilivery network/Content distributed network) is a geographically distributed network</strong> of proxy servers and their data centers. 
The goal is to distribute service spatially relative to end-users to <strong>provide high availability and high performance</strong>.</li><li>Improve performance in two ways:<ul><li>Users receive content at data centers close to them.</li><li>Your servers do not have to serve requests that the CND fulfills</li></ul></li></ul><p>To minimize the distance between the visitors and your website’s server, a CDN stores a cached version of its content in multiple geographical locations (a.k.a., points of presence, or PoPs). Each PoP contains a number of caching servers responsible for content delivery to visitors within its proximity.</p><p>In essence, CDN puts your content in many places at once, providing superior coverage to your users. For example, when someone in London accesses your US-hosted website, it is done through a local UK PoP. This is much quicker than having the visitor’s requests, and your responses, travel the full width of the Atlantic and back.<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fymv8bwce7j20d00kqgne.jpg" alt="CDN"></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h2><ul><li><a href="https://www.iplocation.net/public-vs-private-ip-address">What is the difference between public and private IP address?</a></li><li><a href="http://www.differencebetween.net/technology/software-technology/difference-between-git-and-svn/">Difference Between Git and SVN</a></li><li><a href="https://www.tutorialspoint.com/design_pattern/mvc_pattern.htm">Design Patterns - MVC Pattern</a></li><li><a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">Model-view-controller</a></li><li><a href="https://en.wikipedia.org/wiki/IP_address">IP address</a></li><li><a href="https://en.wikipedia.org/wiki/Port_(computer_networking">Port(computer networking)</a>)</li><li><a href="https://en.wikipedia.org/wiki/Domain_Name_System">Domain Name System</a></li><li><a href="https://en.wikipedia.org/wiki/Content_delivery_network">Content delivery network/Content distributed network</a></li><li><a href="https://en.wikipedia.org/wiki/WebSocket">Websocket wiki</a></li><li><a href="https://www.tornadoweb.org/en/stable/websocket.html">tornado.websocket — Bidirectional communication to the browser</a></li><li><a href="https://wenku.baidu.com/view/df90877bf121dd36a22d827d.html">网络常见协议及端口号</a></li><li><a href="https://www.cloudflare.com/learning/dns/what-is-dns/">How does DNS work</a></li><li><a href="http://freewimaxinfo.com/routing-protocol-types.html">Routing Protocols Types (RIP, IGRP, OSPF, EGP, EIGRP, BGP, IS-IS)</a></li><li><a href="https://www.virtually-limitless.com/vcix-nv-study-guide/configure-dynamic-routing-protocols-ospf-bgp-is-is/">Configure dynamic routing protocols: OSPF, BGP, IS-IS</a></li><li><a href="https://stackoverflow.com/questions/48396690/docker-vs-virtual-machine">Docker vs Virtual Machine</a></li><li><a href="https://www.geeksforgeeks.org/types-software-testing/">Types of Software Testing</a></li><li><a href="https://www.testingexcellence.com/white-box-testing/">white-box-testing</a></li><li><a href="https://www.testingexcellence.com/types-of-software-testing-complete-list/">types-of-software-testing-complete-list</a></li><li><a href="https://blog.csdn.net/ITer_ZC/article/details/40748587">聊聊高并发专栏</a></li></ul>]]></content>
<categories>
<category> BUPT </category>
</categories>
<tags>
<tag> exam </tag>
</tags>
</entry>
<entry>
<title>信息论-信道容量迭代算法python实现版</title>
<link href="/2018/12/26/information-theory-channel-capacity-iteration-algorithm/"/>
<url>/2018/12/26/information-theory-channel-capacity-iteration-algorithm/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1oFkI8WYPzQhvC1FLV7Fw8NR70EqNtENc">传送门</a></p>]]></content>
<categories>
<category> BUPT </category>
</categories>
<tags>
<tag> exam </tag>
<tag> infomationTheory </tag>
</tags>
</entry>
<entry>
<title>Neuralink and Brains'Magical Future</title>
<link href="/2018/12/25/Neuralink-and-Brains-Magical-Future/"/>
<url>/2018/12/25/Neuralink-and-Brains-Magical-Future/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h3 id="Part1-The-Human-Colossus"><a href="#Part1-The-Human-Colossus" class="headerlink" title="Part1: The Human Colossus"></a>Part1: The Human Colossus</h3><p>In this part, the brief history of humanbeing is displayed. The process of evolution:</p><ol><li>Sponge(600 Million BC): Data is just like saved in <code>Cache</code>~</li><li>Jellyfish(580 Million BC): The first animal has <code>nerves net</code> to save data from environment. Note that nerves net not only exist in its head but also in the whole body.</li><li>Flatworm(550 Million BC) and Frog(265 Million BC): The flatworm has nervous system in charge of everything.</li><li>Rodent(225 Million BC) and Tree mammal(80 Million BC): More complex animals.</li><li>Hominid(4 Million BC): The early version of neocortex. Hominid could think(complex thoughts, reason through decisions, long-term plans). When language had appeared, knowledges are saved in an intricate system(neural net). Homimid already has enough knowledge from their ancestors.</li><li>Computer Colossus(1990s): Computer network that can not learning to think.</li></ol><h3 id="Part2-The-Brain"><a href="#Part2-The-Brain" class="headerlink" title="Part2: The Brain"></a>Part2: The Brain</h3><p>Three membranes around brain: dura mater, arachnoid mater, pia mater.<br>Looking into brain, there are three parts: neomammalian, paleomammalian, reptilian.</p><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz4u66o1a6j20ob0ll4qp.jpg" alt=""></p><ol><li>The Reptilian Brain(爬行脑): the brain stem<ol><li>The medulla[mi’dula] oblongata[abon’gata] (延髓): control involuntary things like heart rate, breathing, and blood pressure.</li><li>The pons(脑桥): generate actions about the little things like bladder control, facial expressions.</li><li>The mid brain(中脑): eyes moving.</li><li>The cerebellum(小脑): Stay balanced.</li></ol></li></ol><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz4ukscpc0j20j50h9wst.jpg" alt=""></p><ol><li>The Paleo-Mammalian Brain(古哺乳脑): the limbic system(边缘脑)<ol><li>The amygdala(杏仁核): deal with anxiety, fear, happy feeling.</li><li>The hippocampus(海马体): a board for memory to direction.</li><li>The thalamus(丘脑): sensory middleman that receives information from your sensory organ and sends them to your cortex for processing.</li></ol></li></ol><p><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fz502o3cl3j20m80fntl1.jpg" alt=""></p><ol><li>The Neo-Mammalian Brain(新哺乳脑): The Cortex(皮质)<ol><li>The frontal lobe(前叶): Handle with reasoning, planning, executive function. And <strong>the adult in your head</strong> call <strong>prefrontal cortex</strong>(前额皮质).</li><li>The parietal lobe(顶叶): Controls sense of touch.</li><li>The temporal lobe(额叶): where your memory lives</li><li>The occipital lobe(枕叶): entirely dedicated to vision.</li></ol></li></ol><p>Inspiration from neural nets:<br><code>Neuroplasticity</code>: Neurons’ ability to alter themselves chemically, structurally, and even functionally, allow your brain’s neural network to optimize itself to the external world. Neuroplasticity makes sure that human can grow and change and learn new things throughout their whole lives.</p><h3 id="Reference"><a href="#Reference" class="headerlink" title="Reference:"></a>Reference:</h3><ul><li><a href="https://waitbutwhy.com/2017/04/neuralink.html">neuralink from wait but why</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> agi </tag>
</tags>
</entry>
<entry>
<title>意识先验</title>
<link href="/2018/12/09/ConsciousnessPrior-Bengio/"/>
<url>/2018/12/09/ConsciousnessPrior-Bengio/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h1 id="意识先验理论"><a href="#意识先验理论" class="headerlink" title="意识先验理论"></a>意识先验理论</h1><h2 id="如何理解意识先验"><a href="#如何理解意识先验" class="headerlink" title="如何理解意识先验"></a>如何理解意识先验</h2><p>首先,意识先验这篇论文没有实验结果,是一篇纯粹的开脑洞的、理论性的文章。</p><p>论文中提到的意识先验更多的是对<strong>不同层次</strong>的信息的<strong>表征</strong>提取。例如:人类创造了高层次的概念,如符号(自然语言)来简化我们的思维。</p><p>2007 年,Bengio 与 Yann LeCun 合著的论文着重强调表征必须是多层的、逐渐抽象的。13年,Bengio 在综述论文中,增加了对解纠缠(Disentangling)的强调。</p><h3 id="RNN是个很好的例子"><a href="#RNN是个很好的例子" class="headerlink" title="RNN是个很好的例子"></a>RNN是个很好的例子</h3><p>RNN的隐藏状态包含一个低维度的子状态,可以用来解释过去,帮助预测未来,也可以作为自然语言来呈现。</p><p><img src="https://github.com/824zzy/blogResources/blob/master/picResources/ConsciousnessPrior.png?raw=true" alt="意识先验网络示意图"></p><h2 id="表征RNN(Representation-RNN-F)"><a href="#表征RNN(Representation-RNN-F)" class="headerlink" title="表征RNN(Representation RNN / F)"></a>表征RNN(Representation RNN / F)</h2><p>$$h_t = F(s_t,h_t−1)$$</p><p>Bengio提出表征RNN($F$)和表征状态$h_t$。其中$F$是包含了大脑中所有的神经连接权重。它们可以看作是我们的知识和经验,将一种表示状态映射到另一种表示状态。</p><p>表征RNN与一个人在不同环境学习到的知识、学识和经验相对应。即使有相同的$F$, 人们的反应和未来的想法也会不尽相同。表征状态$h_t$对应大脑所有神经元状态的聚合。并且他们可以被看作是当时环境(最底层信息)的表征。</p><h2 id="意识RNN-Consciousness-RNN-C"><a href="#意识RNN-Consciousness-RNN-C" class="headerlink" title="意识RNN (Consciousness RNN / C)"></a>意识RNN (Consciousness RNN / C)</h2><p>$$c_t=C(h<em>t,c</em>{t-1},z_t)$$</p><p>没有人能够有意识地体会到大脑里所有神经元是如何运作的。因为只有一小部分神经元与大脑此时正在思考的想法和概念相对应。因此意识是大脑神经元一个小的子集,或者说是副产品(by-product)。</p><p>因此Bengio认为,意识RNN本身应该包含某种注意力机制(当前在神经机器翻译中使用的)。他引入注意力作为额外的机制来描述大脑选择关注什么,以及如何预测或行动。</p><p>简而言之,意识RNN应该只“注意”意识向量更新自身时的重要细节,以<strong>减少计算量</strong>。</p><h2 id="验证网络(Verifier-Network-V)"><a href="#验证网络(Verifier-Network-V)" class="headerlink" title="验证网络(Verifier Network / V)"></a>验证网络(Verifier Network / V)</h2><p>$$V(h<em>t,c</em>{t-k})\in R$$</p><p>Bengio的思想还包含了一种训练方法,他称之为验证网络$V$。网络的目标是将当前的$h<em>t$表示与之前的意识状态$c</em>{t-k}$相匹配。在他的设想中可以用变分自动编码器(VAE)或GAN进行训练。</p><h2 id="语言与符号主义的联结"><a href="#语言与符号主义的联结" class="headerlink" title="语言与符号主义的联结"></a>语言与符号主义的联结</h2><p>深度学习的主要目标之一就是设计出能够习得更好表征的算法。好的表征理应是高度抽象的、高维且稀疏的,但同时,也能和自然语言以及符号主义 AI 中的『高层次要素』联系在一起。</p><p>语言和符号人工智能的联系在于:语言是一种“选择性的过程”,语言中的语句可以忽略世界上的大部分细节,而专注于少数。符号人工智能只需要了解世界的一个特定方面,而不是拥有一切的模型。</p><p>Bengio关于如何使这一点具体化的想法是:先有一个“意识”,它迫使一个模型拥有不同类型的“意识流”,这些“意识流”可以独立运作,捕捉世界的不同方面。例如,如果我在想象与某人交谈,我对那个人、他们的行为以及我与他们的互动有一种意识,但我不会在那一刻对我的视觉流中的所有像素进行建模。</p><h3 id="思考:快与慢"><a href="#思考:快与慢" class="headerlink" title="思考:快与慢"></a>思考:快与慢</h3><p>人类的认知任务可以分为系统 1 认知(System 1 cognition)和系统 2 认知(System 2 cognition)。系统 1 认知任务是那些你可以在不到 1 秒时间内无意识完成的任务。例如你可以很快认出手上拿着的物体是一个瓶子,但是无法向其他人解释如何完成这项任务。这也是当前深度学习擅长的事情,「感知」。<br>系统 2 认知任务与系统 1 任务的方式完全相反,它们很「慢」且有意识。例如计算「23*56」,大多数人需要有意识地遵循一定的规则、按照步骤完成计算。完成的方法可以用语言解释,而另一个人可以理解并重现。这是算法,是计算机科学的本意,符号主义 AI 的目标,也属于此类。<br>人类联合完成系统 1 与系统 2 任务,人工智能也理应这样。</p><h2 id="还有很多问题需要解决"><a href="#还有很多问题需要解决" class="headerlink" title="还有很多问题需要解决"></a>还有很多问题需要解决</h2><h3 id="训练的目标函数是什么?"><a href="#训练的目标函数是什么?" class="headerlink" title="训练的目标函数是什么?"></a>训练的目标函数是什么?</h3><p>标准的深度学习算法的目标函数通常基于最大似然,但是我们很难指望最大似然的信号能够一路经由反向传播穿过用于预测的网络,穿过意识RNN,最终到达表征 RNN。</p><p>最大似然与意识先验的思想天然存在冲突。「人类从不在像素空间进行想象与生成任务,人类只在高度抽象的语义空间使用想象力,生成一张像素级的图像并非人类需要完成的任务。」因此,在训练目标里引入基于表征空间的项目就变得顺理成章。<strong>不在原始数据空间内定义目标函数</strong></p><h3 id="梯度下降是否适用于意识先验?"><a href="#梯度下降是否适用于意识先验?" class="headerlink" title="梯度下降是否适用于意识先验?"></a>梯度下降是否适用于意识先验?</h3><blockquote><p>Jaderberg, M., Czarnecki, W. M., Osindero, S., Vinyals, O., Graves, A., Silver, D., & Kavukcuoglu, K. (2016). Decoupled neural interfaces using synthetic gradients. 
arXiv preprint arXiv:1608.05343.</p></blockquote><p>除了目标函数之外,意识先验的优化方式也会和经典深度学习有所不同。Bengio: 什么样的优化方式最适合意识先验?我仍然不知道这个问题的答案。<br>在他看来,一类很有前景的研究是合成梯度(synthetic gradient)。</p><p>有了合成梯度之后,每一层的梯度可以单独更新了。但是当时间步继续拉长,问题仍然存在。理论上反向传播可以处理相当长的序列,但是鉴于人类处理时间的方式并非反向传播,可以轻松跨越任意时长,等「理论上」遇到一千乃至一万步的情况,实际上就不奏效了。</p><h3 id="信用分配的仍然是最大的问题"><a href="#信用分配的仍然是最大的问题" class="headerlink" title="信用分配的仍然是最大的问题"></a>信用分配的仍然是最大的问题</h3><blockquote><p>Ke, N. R., Goyal, A., Bilaniuk, O., Binas, J., Mozer, M. C., Pal, C., & Bengio, Y. (2018). Sparse Attentive Backtracking: Temporal CreditAssignment Through Reminding. arXiv preprint arXiv:1809.03702.</p></blockquote><p>换言之,我们对时间的信用分配(credit assignment)问题的理解仍然有待提高。「比如你在开车的时候听到『卟』的一声,但是你没在意。三个小时之后你停下车,看到有一个轮胎漏气了,立刻,你的脑海里就会把瘪轮胎和三小时前的『卟』声联系起来——不需要逐个时间步回忆,直接跳到过去的某个时间,当场进行信用分配。」。受人脑的信用分配方式启发,Bengio 的团队尝试了一种稀疏注意回溯(Sparse Attentive Backtracking)方法。「我们有一篇关于时间信用分配的工作,是 NIPS 2018 的论文,能够跳过成千上万个时间步,利用对记忆的访问直接回到过去——就像人脑在获得一个提醒时所作的那样——直接对一件事进行信用分配。」</p><h2 id="关于意识先验的代码"><a href="#关于意识先验的代码" class="headerlink" title="关于意识先验的代码"></a>关于意识先验的代码</h2><ul><li>论文:<a href="https://ai-on.org/pdf/bengio-consciousness-prior.pdf">Experiments on the Consciousness Prior</a></li><li>代码:<a href="https://github.com/AI-ON/TheConsciousnessPrior/tree/master/src">TheConsciousnessPrior github</a></li></ul><h2 id="参考与引用"><a href="#参考与引用" class="headerlink" title="参考与引用"></a>参考与引用</h2><ul><li><a href="http://thegrandjanitor.com/2018/05/09/a-read-on-the-consciousness-prior-by-prof-yoshua-bengio/">A READ ON “THE CONSCIOUSNESS PRIOR” BY PROF. YOSHUA BENGIO</a></li><li><a href="https://www.quora.com/What-is-Yoshua-Bengios-new-Consciousness-Prior-paper-about">What is Yoshua Bengio’s new “Consciousness Prior” paper about?</a></li><li><a href="https://www.reddit.com/r/MachineLearning/comments/72h5zf/r_the_consciousness_prior/">reddit</a></li><li><a href="https://www.jiqizhixin.com/articles/2018-11-29-7">Yoshua Bengio访谈笔记:用意识先验糅合符号主义与联结主义</a></li></ul>]]></content>
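<p>为了把上文的 $F$、$C$、$V$ 三个公式串起来,下面给出一段极简的示意代码(并非论文官方实现,维度、top-k 注意力与验证打分的具体形式都是笔者为说明符号而做的假设):</p><pre><code class="python">import numpy as np

# F: 表征RNN, h_t = F(s_t, h_{t-1});  C: 意识RNN, 从 h_t 中"注意"少量重要维度;
# V: 验证网络, 给 (h_t, c_{t-k}) 的匹配程度打一个实数分。
rng = np.random.default_rng(0)
d_s, d_h, k = 8, 32, 4                  # 输入维度、表征维度、意识关注的维度个数(假设值)
W_sh = rng.normal(size=(d_s, d_h)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1

def F(s_t, h_prev):
    """表征RNN:把当前输入与上一时刻表征映射为新的表征状态。"""
    return np.tanh(s_t @ W_sh + h_prev @ W_hh)

def C(h_t, c_prev, z_t):
    """意识RNN:只保留 h_t 中幅值最大的 k 个维度(简化的 top-k 注意力),其余置零。"""
    idx = np.argsort(np.abs(h_t))[-k:]
    c_t = np.zeros_like(h_t)
    c_t[idx] = h_t[idx]
    return 0.9 * c_t + 0.1 * c_prev + 0.01 * z_t   # 混入历史意识状态与噪声 z_t

def V(h_t, c_old):
    """验证网络:这里简单地用内积衡量当前表征与过去意识状态的匹配程度。"""
    return float(h_t @ c_old)

h, c, history = np.zeros(d_h), np.zeros(d_h), []
for t in range(10):
    h = F(rng.normal(size=d_s), h)      # s_t 用随机向量模拟底层观察
    c = C(h, c, rng.normal(size=d_h))
    history.append(c)
print("V(h_t, c_{t-5}) =", V(h, history[-5]))
</code></pre>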
<categories>
<category> AI </category>
</categories>
<tags>
<tag> agi </tag>
<tag> bengio </tag>
</tags>
</entry>
<entry>
<title>最大概率汉语切分作业</title>
<link href="/2018/12/06/Chinese-segmentation-Homework/"/>
<url>/2018/12/06/Chinese-segmentation-Homework/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1ns3HetlP-8Np6GdF-mjaa0zIaB8k9lDG#scrollTo=XX1EqIqD3Y9q">传送门</a></p>]]></content>
<categories>
<category> NLP </category>
</categories>
<tags>
<tag> chineseSegmentation </tag>
<tag> bupt </tag>
</tags>
</entry>
<entry>
<title>数据挖掘文本分类作业</title>
<link href="/2018/12/06/DataMining-Homework/"/>
<url>/2018/12/06/DataMining-Homework/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Talk-is-Cheap-Let-me-show-you-the-code"><a href="#Talk-is-Cheap-Let-me-show-you-the-code" class="headerlink" title="Talk is Cheap, Let me show you the code"></a>Talk is Cheap, Let me show you the code</h2><p><a href="https://colab.research.google.com/drive/1AIzOZinBCn7iHo8Dx6AgLMRBzy2WglA-#scrollTo=ZPYreCTDWQB_&uniqifier=4">传送门</a></p>]]></content>
<categories>
<category> ML </category>
</categories>
<tags>
<tag> bupt </tag>
<tag> dataMining </tag>
</tags>
</entry>
<entry>
<title>世界模型(World Model)实验以及原理</title>
<link href="/2018/11/24/world-model-experiment-and-priciple/"/>
<url>/2018/11/24/world-model-experiment-and-priciple/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Basic-Concepts-in-Reinforcement-Learning"><a href="#Basic-Concepts-in-Reinforcement-Learning" class="headerlink" title="Basic Concepts in Reinforcement Learning"></a>Basic Concepts in Reinforcement Learning</h2><div class="row"><iframe src="https://drive.google.com/file/d/1LceWiaaxbhfrV1mEvwP0ojCefpY5A_gt/preview" style="width:100%; height:550px"></iframe></div><h2 id="世界模型的实验"><a href="#世界模型的实验" class="headerlink" title="世界模型的实验"></a>世界模型的实验</h2><p>[World models on colab]{<a href="https://colab.research.google.com,/drive/1sF2iUdhMbm2mwdECvy01l9iWEFiIodlb#scrollTo=pCg_b9DOwDN6}">https://colab.research.google.com,/drive/1sF2iUdhMbm2mwdECvy01l9iWEFiIodlb#scrollTo=pCg_b9DOwDN6}</a></p><h3 id="补充:在colab上显示游戏环境"><a href="#补充:在colab上显示游戏环境" class="headerlink" title="补充:在colab上显示游戏环境"></a>补充:在colab上显示游戏环境</h3><p>通过<code>xvfb</code>,我们可以很方便的在colab上观察训练过程与模型训练结果。</p><p>demo: <a href="https://colab.research.google.com/drive/13XzgZo_CZuMYrgbiIJiurD-_tdAuUJOl#scrollTo=4fOPouQND0FA">Policy Gradient display on Colab</a></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><p><a href="https://arxiv.org/pdf/1511.09249.pdf">On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models</a></p></li><li><p><a href="https://ai.intel.com/demystifying-deep-reinforcement-learning/">Guest Post (Part I): Demystifying Deep Reinforcement Learning</a> </p></li><li><a href="http://kvfrans.com/simple-algoritms-for-solving-cartpole/">Simple reinforcement learning methods to learn CartPole</a></li><li><a href="https://arxiv.org/pdf/1802.08864.pdf">One Big Net For Everything</a></li><li><a href="http://kvfrans.com/simple-algoritms-for-solving-cartpole/">Simple reinforcement learning methods to learn CartPole</a></li><li><a href="https://medium.com/applied-data-science/how-to-build-your-own-world-model-using-python-and-keras-64fb388ba459">Hallucinogenic Deep Reinforcement Learning Using Python and Keras</a></li></ul>]]></content>
<categories>
<category> WorldModel </category>
</categories>
<tags>
<tag> math </tag>
<tag> demo </tag>
</tags>
</entry>
<entry>
<title>通用人工智能(AGI)</title>
<link href="/2018/11/09/artificial-general-intelligence/"/>
<url>/2018/11/09/artificial-general-intelligence/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Artificial-General-Intelligence-Lex-Fridman"><a href="#Artificial-General-Intelligence-Lex-Fridman" class="headerlink" title="Artificial General Intelligence(Lex Fridman)"></a>Artificial General Intelligence(Lex Fridman)</h2><h3 id="Something-about-Lex-Fridman"><a href="#Something-about-Lex-Fridman" class="headerlink" title="Something about Lex Fridman"></a>Something about Lex Fridman</h3><h3 id="MIT-AGI-Misson-Engineer-Intelligence"><a href="#MIT-AGI-Misson-Engineer-Intelligence" class="headerlink" title="MIT AGI Misson: Engineer Intelligence"></a>MIT AGI Misson: Engineer Intelligence</h3><p>Goals:</p><ol><li>avoid the pitfalls of “black box”: Media often reports AI like fiction. Hype is the first enemy to us.</li><li>avoid the pitfalls of “I am just a scientist”.</li></ol><h3 id="How-far-away-from-creating-intelligent-systems"><a href="#How-far-away-from-creating-intelligent-systems" class="headerlink" title="How far away from creating intelligent systems"></a>How far away from creating intelligent systems</h3><p>Analogy: we are in the dark room looking for a switch with no knowledge of where the light switch is.</p><p>Exploration travel for the sake of discovery and adventure is human compulsion.</p><h2 id="Building-machines-that-see-and-think-like-people-Josh-Tenenbaum"><a href="#Building-machines-that-see-and-think-like-people-Josh-Tenenbaum" class="headerlink" title="Building machines that see, and think like people(Josh Tenenbaum)"></a>Building machines that see, and think like people(Josh Tenenbaum)</h2><h3 id="Something-about-Josh-Tenenbaum"><a href="#Something-about-Josh-Tenenbaum" class="headerlink" title="Something about Josh Tenenbaum"></a>Something about Josh Tenenbaum</h3><h3 id="AI-technologies-no-real-AI"><a href="#AI-technologies-no-real-AI" class="headerlink" title="AI technologies no real AI"></a>AI technologies no real AI</h3><p>Intelligence is not just about pattern recognition, it is about modeling the world.</p><h3 id="Sally-port-visual-intelligence-of-our-near-term-focus"><a href="#Sally-port-visual-intelligence-of-our-near-term-focus" class="headerlink" title="Sally port: visual intelligence of our near-term focus"></a>Sally port: visual intelligence of our near-term focus</h3><p>Some part of your brain is tracking the whole world around you. And you track your world model to plan your actions</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx3a8gz9dfj20hs0am0v8.jpg" alt=""></p><h3 id="The-roots-for-common-sense"><a href="#The-roots-for-common-sense" class="headerlink" title="The roots for common sense"></a>The roots for common sense</h3><p>reverse engineer our brain, figure out how brain can formulate a goal and be able to acheive it. 
</p><h3 id="How-do-we-build-this-architecture"><a href="#How-do-we-build-this-architecture" class="headerlink" title="How do we build this architecture?"></a>How do we build this architecture?</h3><ul><li>symbolic language for knowledge representation</li><li>probabilistic inference in generative models to capture uncertainty</li><li>neural network for pattern recognition</li></ul><p><strong>Inference means that our model runs a few low precision simulations for a few time steps</strong><br><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx3zgiybhzj20e6043js1.jpg" alt=""></p><p>Mental simulation engines based on probabilistic programs<br><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx3zfqc8u1j208x0d8wg5.jpg" alt=""></p><h2 id="OpenAI-Meta-Learning-and-Self-play-Ilya-Sutskever"><a href="#OpenAI-Meta-Learning-and-Self-play-Ilya-Sutskever" class="headerlink" title="OpenAI Meta-Learning and Self-play(Ilya Sutskever)"></a>OpenAI Meta-Learning and Self-play(Ilya Sutskever)</h2><h3 id="Why-do-neural-networks-work"><a href="#Why-do-neural-networks-work" class="headerlink" title="Why do neural networks work?"></a>Why do neural networks work?</h3><p>shortest program that fits training data is the best possible generalization.</p><p>Reinforcement Learning is a good framework for building intelligent agent.</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx443zm4tuj20hd086acq.jpg" alt=""></p><p>But note that RL framework is not quite complete because it assumes the reward is given by the environnment. But in reality, <strong>the agent rewards itself</strong>.</p><h3 id="Reinfocement-Learning-algorithms-in-a-nutshell"><a href="#Reinfocement-Learning-algorithms-in-a-nutshell" class="headerlink" title="Reinfocement Learning algorithms in a nutshell"></a>Reinfocement Learning algorithms in a nutshell</h3><p>Try something new add randomness directions and compare the result to your expectation. </p><p>If the result was better than expected, do more of the same in the future.</p><h4 id="Model-free-RL-Two-classes-of-algorithms"><a href="#Model-free-RL-Two-classes-of-algorithms" class="headerlink" title="Model-free RL: Two classes of algorithms"></a>Model-free RL: Two classes of algorithms</h4><ul><li>Policy Gradients:<ol><li>Just take the gradient</li><li>Stable, easy to use</li><li>Very few tricks needed</li><li><strong>On policy</strong></li></ol></li><li>Q-learning based:<ol><li>Less stable, more sample efficient</li><li>won’t explain how it works</li><li><strong>Off policy</strong>: can be trained on data generated by some other policy</li></ol></li></ul><h4 id="Meta-learning"><a href="#Meta-learning" class="headerlink" title="Meta-learning"></a>Meta-learning</h4><p>Our dream is:</p><ul><li>Learn to learn</li><li>Train a system on many tasks</li><li>Resulting system can solve new tasks quickly</li></ul><h3 id="Exploration-in-RL-a-key-challenge"><a href="#Exploration-in-RL-a-key-challenge" class="headerlink" title="Exploration in RL: a key challenge"></a>Exploration in RL: a key challenge</h3><p>Random behavior must generate some reward and you must get rewards from time to time, otherwise learning will not occur. 
So if the reward is too sparse, agent cannot learn.</p><h3 id="It-would-be-nice-if-learning-was-hierarchical"><a href="#It-would-be-nice-if-learning-was-hierarchical" class="headerlink" title="It would be nice if learning was hierarchical"></a>It would be nice if learning was hierarchical</h3><blockquote><p>Current RL learns by trying out random actions at each timestep</p></blockquote><p>Agent may require a real “model” to really solve this problem.</p><h3 id="Self-Play-that-is-very-cool"><a href="#Self-Play-that-is-very-cool" class="headerlink" title="Self-Play: that is very cool"></a><strong>Self-Play</strong>: that is very cool</h3><p>Crux: The agents create the environment by virture of the agent acting in the environment</p><p>Here comes the question: can we train AGI via self-play among multi-agents?</p><p>It’s unknown.</p><h2 id="MSRA-presentation-given-by-Yoshua-Bengio"><a href="#MSRA-presentation-given-by-Yoshua-Bengio" class="headerlink" title="MSRA presentation given by Yoshua Bengio"></a>MSRA presentation given by Yoshua Bengio</h2><h3 id="Principle-in-Bengio’s-idea"><a href="#Principle-in-Bengio’s-idea" class="headerlink" title="Principle in Bengio’s idea."></a>Principle in Bengio’s idea.</h3><h4 id="World-Models"><a href="#World-Models" class="headerlink" title="World Models"></a>World Models</h4><p>we(human-being) have a mental model, that could capture facts of our world to some extending and humans generalize better than other animals thanks to a more accurate internal model of the <strong>underlying causal relationships</strong> </p><h4 id="Shortcoming-in-current-model"><a href="#Shortcoming-in-current-model" class="headerlink" title="Shortcoming in current model"></a>Shortcoming in current model</h4><p>So long as our machine learning models ‘cheat’ by relying only on superficial statistical<br>regularities, however they remain vulnerable to out-of-distribution examples.</p><h3 id="Possible-solutions"><a href="#Possible-solutions" class="headerlink" title="Possible solutions"></a>Possible solutions</h3><h4 id="Prediction"><a href="#Prediction" class="headerlink" title="Prediction"></a>Prediction</h4><p>To predict future situations(e.g., the effect of planned actions) far from anything seen before while involving known concepts, an essential component of reasoning intelligence and science.</p><h4 id="Invariance"><a href="#Invariance" class="headerlink" title="Invariance"></a>Invariance</h4><p>Our systems need to be invariant about deep understanding : models for recognition and generation clearly don’t understand in the crucial abstractions. 
</p><h4 id="Imagination"><a href="#Imagination" class="headerlink" title="Imagination"></a>Imagination</h4><p>Real-life applications often require generalizations in regimes not seen during training, so humans can project themselves in situation they have never seen or never experience.</p><h4 id="Subjective-Knowledge"><a href="#Subjective-Knowledge" class="headerlink" title="Subjective Knowledge"></a>Subjective Knowledge</h4><p>Our brain can come up with control policies that can influence specific aspects of the world: an agent acquires by interacting in the world which is that it’s not universal knowledge, it’s subjective knowledge</p><h3 id="Present-Development-Stage"><a href="#Present-Development-Stage" class="headerlink" title="Present Development Stage"></a>Present Development Stage</h3><h4 id="More-elements-as-prior"><a href="#More-elements-as-prior" class="headerlink" title="More elements as prior"></a>More elements as prior</h4><ol><li>Spatial & temporal scales</li><li>Marginal independence</li><li>Simple dependencies between factors<ol><li>Consciousness prior: arXiv(1709.08568)</li></ol></li><li>Causal / Mechanism independence<ol><li>Controllable factors</li></ol></li></ol><h4 id="Content-based-Attention-Attention-Mechanism"><a href="#Content-based-Attention-Attention-Mechanism" class="headerlink" title="Content-based Attention(Attention Mechanism)"></a>Content-based Attention(Attention Mechanism)</h4><p>to select a few relevant abstract concepts making a thought.</p><h4 id="TODO-future-work"><a href="#TODO-future-work" class="headerlink" title="TODO future work"></a>TODO future work</h4><p>The ability to do credit assignment through very long time spans. There are also shortcoming for current RNN architecture: I can remember something I did laster year.</p><h2 id="Godel-Machines-Meta-Learning-and-LSTMS-Jurgen-Schmidhuber"><a href="#Godel-Machines-Meta-Learning-and-LSTMS-Jurgen-Schmidhuber" class="headerlink" title="Godel Machines, Meta-Learning, and LSTMS(Jurgen Schmidhuber)"></a>Godel Machines, Meta-Learning, and LSTMS(Jurgen Schmidhuber)</h2><ul><li>Simplicity is beauty</li><li>History of science is a history of compression progress. </li><li>Humans are curious and curiosity strategy is a discovery of evolution(A guy who explores the unknown world has a higher chance of solving problems that he needs to survive in this world)</li><li>Consciousness may be a byproduct of problem-solving.</li><li>What we do now since 2015(now is 2018) is CM(controller model) system which we give the controller the opportunity to learn by itself </li></ul><h3 id="AMA-Ask-me-anthing-on-reddit-Jurgen-Schimidhuber"><a href="#AMA-Ask-me-anthing-on-reddit-Jurgen-Schimidhuber" class="headerlink" title="AMA(Ask me anthing) on reddit: Jurgen Schimidhuber"></a>AMA(Ask me anthing) on reddit: Jurgen Schimidhuber</h3><h4 id="Question-What’s-something-that’s-true-but-almost-nobody-agrees-with-you-on"><a href="#Question-What’s-something-that’s-true-but-almost-nobody-agrees-with-you-on" class="headerlink" title="Question: What’s something that’s true, but almost nobody agrees with you on?"></a>Question: What’s something that’s true, but almost nobody agrees with you on?</h4><p>Intelligence is just the product of a few principles that will be considered very simple in<br>hindsignt. There are partial justification:</p><p>Theoretically optimal in some abstract sense although they just consist of a few formulas:</p><p>Humanbeing make predictions based on observations. 
Every AI scientist wants to find a<br>theoretically optimal way of predicting:</p><p>Normally we do not know the true conditional probability: $P(next|past)$.<br>But assume we do know that $p$ is in some set $P$ of distributions.</p><p>Given weights $w_q$ for each $q$ in $P$, we obtain the Bayes mixture: $M(x)=\sum_q w_q q(x)$. We can predict using $M$ instead of the optimal but unknown $p$.</p><p>Let $L_M(n)$ and $L_p(n)$ be the total expected losses of the $M$-predictor and the $p$-predictor.</p><p>Then $L_M(n)-L_p(n)$ is at most of the order of $\sqrt{L_p(n)}$. That is, $M$ is not much worse than $p$. And in general, no other predictor can do better than that!</p><p>Once we have an optimal predictor, in principle we also should have an optimal decision maker or reinforcement learner that always picks the action sequences with the highest predicted success; that is a universal AI.</p><h4 id="His-favorite-Theory-of-Consciousness-TOC"><a href="#His-favorite-Theory-of-Consciousness-TOC" class="headerlink" title="His favorite Theory of Consciousness(TOC)"></a>His favorite Theory of Consciousness(TOC)</h4><p>Karl Popper famously said: “All life is problem solving.” No theory of consciousness is necessary to define the objectives of a general problem solver. From an AGI point of view, consciousness is at best a by-product of a general problem solving procedure.</p><p>Where do the symbols and self-symbols underlying consciousness and sentience come from? I think they come from data compression during problem solving.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://www.youtube.com/watch?v=-GV_A9Js2nM&list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4&index=2&t=0s">MIT AGI: Artificial General Intelligence (Lex Fridman)</a></li><li><a href="https://www.youtube.com/watch?v=7ROelYvo8f0&list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4&index=8&t=0s">MIT AGI: Building machines that see, learn, and think like people (Josh Tenenbaum)</a></li><li><a href="https://www.youtube.com/watch?v=9EN_HoEk3KY&list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4&index=4&t=0s">MIT AGI: OpenAI Meta-Learning and Self-Play (Ilya Sutskever)</a></li><li><a href="https://www.youtube.com/results?search_query=bengio&pbjreload=10">Bengio Interview</a></li><li><a href="https://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/">I am Jürgen Schmidhuber, AMA!</a></li><li><a href="https://www.youtube.com/watch?v=3FIo6evmweo">MIT AI: Godel Machines, Meta-Learning, and LSTMs (Juergen Schmidhuber)</a></li></ol>]]></content>
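<p>A tiny numerical illustration of the Bayes mixture predictor described above (a sketch only: the finite class of Bernoulli distributions and the uniform prior weights are assumptions for the demo, not part of the original argument):</p><pre><code class="python">import numpy as np

# Candidate class P: three Bernoulli distributions; the true p is one of them.
# Mixture prediction M(x) = sum_q w_q * q(x); weights follow the Bayesian update.
rng = np.random.default_rng(1)
thetas = np.array([0.2, 0.5, 0.8])       # hypothesised values of P(x=1)
w = np.ones_like(thetas) / len(thetas)   # prior weights w_q
true_theta = 0.8                         # the true but "unknown" p

for t in range(200):
    x = rng.binomial(1, true_theta)                  # observe a sample from p
    likelihood = thetas if x else (1 - thetas)       # q(x) for every q in P
    m_x = float(w @ likelihood)                      # mixture prediction M(x)
    w = w * likelihood / m_x                         # Bayesian weight update

print("posterior weights:", np.round(w, 3))          # mass concentrates on theta=0.8
</code></pre>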
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
<tag> agi </tag>
<tag> mit </tag>
</tags>
</entry>
<entry>
<title>元学习(meta-learning)</title>
<link href="/2018/11/09/meta-learning-introduction/"/>
<url>/2018/11/09/meta-learning-introduction/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Two-problems-we-confront"><a href="#Two-problems-we-confront" class="headerlink" title="Two problems we confront"></a>Two problems we confront</h2><ol><li>Sample efficiency: models typically need 6000 samples per digit to recognize digit handwriting.</li><li>Poor transferablity: models don’t learn from previous experience or learned knowledge. </li></ol><p>So meta-learning is the solution to the two questions above. And we try to define it as “learning how to learn”. Our dream is:</p><ul><li>Learn to learn</li><li>Train a system on many tasks</li><li>Resulting system can solve new tasks quickly</li></ul><h2 id="Some-basic-concepts"><a href="#Some-basic-concepts" class="headerlink" title="Some basic concepts"></a>Some basic concepts</h2><h3 id="Few-shot-Learning"><a href="#Few-shot-Learning" class="headerlink" title="Few-shot Learning"></a>Few-shot Learning</h3><p>In deep learning, we use regularization to make sure we are not overfitting out model with a small dataset, but we are <strong>overfitting our task</strong>. Therefore what we learned cannot be generalized to other tasks.</p><p>We often get stuck when test samples that are <strong>not common</strong> in dataset.</p><p>In <strong>one-shot-learning</strong>, we will only provide one training sample per category. There is an example:</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx2w9gmja6j20m80fzk0s.jpg" alt=""></p><p>In this one-shot learning, we often train a RNN to learn the training data and labels. When we represent with a test input, we should predict its label correctly.</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx2wk8ezr8j20m806rq3e.jpg" alt=""></p><p>In meta-testing, we provide many datasets again with classes that never trained before. Once we have learned from hundred tasks, we should discover the general pattern in classifying objects.</p><h2 id="Recurrent-Models"><a href="#Recurrent-Models" class="headerlink" title="Recurrent Models"></a>Recurrent Models</h2><h3 id="Memory-Augmented-Neural-Networks"><a href="#Memory-Augmented-Neural-Networks" class="headerlink" title="Memory-Augmented Neural Networks"></a>Memory-Augmented Neural Networks</h3><p>One of the meta-learning methods using an external memory network with RNN. Note that in supervised learning, we provide both input and label in the same time step $t$. However, in this model, the label is not provided untild the next time step $t+1$(shown below).</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx2xk82uvzj20m807aaa7.jpg" alt=""></p><p>When updating the model, instead of updating the model immediately, we wait until a batch of tasks is completed. We later merge all we learned from these tasks for a single update.</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx2y80btkjj20m8058wex.jpg" alt=""></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://medium.com/@jonathan_hui/meta-learning-how-we-address-the-shortcomings-of-our-deep-networks-a008aa4b5b2b">RL — Meta-Learning</a></li><li><a href="https://medium.com/huggingface/from-zero-to-research-an-introduction-to-meta-learning-8e16e677f78a">From zero to research — An introduction to Meta-learning</a></li><li><a href="https://www.youtube.com/watch?v=9EN_HoEk3KY&list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4&index=4&t=0s">MIT AGI: OpenAI Meta-Learning and Self-Play (Ilya Sutskever)</a></li></ol>]]></content>
<categories>
<category> ReinforcementLearning </category>
</categories>
<tags>
<tag> meta-learning </tag>
</tags>
</entry>
<entry>
<title>混合密度网络</title>
<link href="/2018/10/31/mixture-density-network-note/"/>
<url>/2018/10/31/mixture-density-network-note/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="My-Implementation-with-Tensorflow-Eager-Execution"><a href="#My-Implementation-with-Tensorflow-Eager-Execution" class="headerlink" title="My Implementation with Tensorflow Eager Execution"></a>My Implementation with Tensorflow Eager Execution</h2><p><a href="https://colab.research.google.com/drive/1113X6Yx-XglANAzl933nb6vUynpy6VYk#scrollTo=km0IIHkaPH0T">IPython Notebook on Colab</a></p><h2 id="Key-equations-for-Mixture-Density-Networks"><a href="#Key-equations-for-Mixture-Density-Networks" class="headerlink" title="Key equations for Mixture Density Networks"></a>Key equations for Mixture Density Networks</h2><div class="row"><iframe src="https://drive.google.com/file/d/1T7ROj1FnzSdJUjQyd_4T4sRK4klQ1s2y/preview" style="width:100%; height:550px"></iframe></div><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="http://blog.otoro.net/2015/11/24/mixture-density-networks-with-tensorflow/">Mixture Density Networks with Tensorflow</a></li><li><a href="http://blog.otoro.net/2015/06/14/mixture-density-networks/">Mixture Density Networks</a></li><li><a href="https://github.com/hardmaru/pytorch_notebooks/blob/master/mixture_density_networks.ipynb">Mixture Density Networks with Pytorch</a></li><li><a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/bishop-ncrg-94-004.pdf">Mixture Density Networks</a></li></ol>]]></content>
<categories>
<category> WorldModel </category>
</categories>
<tags>
<tag> math </tag>
<tag> evolutionStategy </tag>
</tags>
</entry>
<entry>
<title>Evolution strategies</title>
<link href="/2018/10/30/CMA-ES-note/"/>
<url>/2018/10/30/CMA-ES-note/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h3 id="Handwriting-Version-for-this-post"><a href="#Handwriting-Version-for-this-post" class="headerlink" title="Handwriting Version for this post"></a>Handwriting Version for this post</h3><div class="row"><iframe src="https://drive.google.com/file/d/1A9MUO2PQ3OjwCoZVr8XBh-yKa5lovO42/preview" style="width:100%; height:550px"></iframe></div><h3 id="Problems-in-BackPropagation-and-Gradient-Descent"><a href="#Problems-in-BackPropagation-and-Gradient-Descent" class="headerlink" title="Problems in BackPropagation and Gradient Descent"></a>Problems in BackPropagation and Gradient Descent</h3><ol><li><p>the gradient of reward signals given to the agent is realised many timesteps in the future. Questions above can be seem as <strong>Credit Assignment</strong></p></li><li><p>there is the issue of being stuck in a local optimum.</p></li></ol><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fwq7har0x0j206o06g3zn.jpg" width="50%" height="30%"></p><h3 id="Pseudo-code-of-Basic-Evolution-Strategy"><a href="#Pseudo-code-of-Basic-Evolution-Strategy" class="headerlink" title="Pseudo code of Basic Evolution Strategy"></a>Pseudo code of Basic Evolution Strategy</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">solver = EvolutionStrategy()</span><br><span class="line"><span class="keyword">while</span> <span class="literal">True</span>:</span><br><span class="line"> <span class="comment"># ask the ES to give us a set of candidate solutions</span></span><br><span class="line"> solutions = solver.ask()</span><br><span class="line"> <span class="comment"># create an array to hold the fitness results.</span></span><br><span class="line"> fitness_list = np.zeros(solver.popsize)</span><br><span class="line"> <span class="comment"># evaluate the fitness for each given solution.</span></span><br><span class="line"> <span class="keyword">for</span> i <span class="keyword">in</span> <span class="built_in">range</span>(solver.popsize):</span><br><span class="line"> fitness_list[i] = evaluate(solutions[i])</span><br><span class="line"> <span class="comment"># give list of fitness results back to ES</span></span><br><span class="line"> solver.tell(fitness_list)</span><br><span class="line"> <span class="comment"># get best parameter, fitness from ES</span></span><br><span class="line"> best_solution, best_fitness = solver.result()</span><br><span class="line"> <span class="keyword">if</span> best_fitness > MY_REQUIRED_FITNESS:</span><br><span class="line"> <span class="keyword">break</span></span><br></pre></td></tr></table></figure><h3 id="Advantages-in-Evolution-Strategies"><a href="#Advantages-in-Evolution-Strategies" class="headerlink" title="Advantages in Evolution Strategies"></a>Advantages in Evolution Strategies</h3><ol><li>Easier to scale in a distributed setting(easy to parallelize).</li><li>It does not suffer in settings with sparse rewards.</li><li>It has fewer hyperparameters.</li><li>It is 
effective at finding solutions for RL tasks.</li></ol><p><img src="http://blog.otoro.net/assets/20171031/schaffer/simplees.gif" width="30%" height="30%"></p><h3 id="Improvement-of-Covariance-Matrix-Adaptive-Evolution-Strategy"><a href="#Improvement-of-Covariance-Matrix-Adaptive-Evolution-Strategy" class="headerlink" title="Improvement of Covariance Matrix Adaptive Evolution Strategy"></a>Improvement of Covariance Matrix Adaptive Evolution Strategy</h3><p>We want to explore more and increase the standard deviation of our search space.<br>And there are times when we are confident we are close to good optima and just want to fine-tune the solution.<br><img src="http://blog.otoro.net/assets/20171031/schaffer/cmaes.gif" width="30%" height="30%"></p><h3 id="Details-of-algorithm"><a href="#Details-of-algorithm" class="headerlink" title="Details of algorithm"></a>Details of algorithm</h3><ol><li><p>Calculate the fitness score of each candidate solution in generation $(g)$.<br><img src="http://blog.otoro.net/assets/20171031/rastrigin/cmaes_step1.png" width="30%" height="30%"></p></li><li><p>Isolates the best 25% of the population in generation $(g)$, in purple.</p></li></ol><p><img src="http://blog.otoro.net/assets/20171031/rastrigin/cmaes_step2.png" width="30%" height="30%"></p><ol><li>Using only the best solutions, along with the mean $\mu^{(g)}$ of the current generation (the green dot), calculate the covariance matrix $C^{(g+1)}$ of the next generation.</li></ol><p><img src="http://blog.otoro.net/assets/20171031/rastrigin/cmaes_step3.png" width="30%" height="30%"></p><ol><li>Sample a new set of candidate solutions using the updated mean $\mu^{(g+1)}$ and covariance matrix $C^{(g+1)}$.</li></ol><p><img src="http://blog.otoro.net/assets/20171031/rastrigin/cmaes_step4.png" width="30%" height="30%"></p><h3 id="OpenAI-Evolution-Strategy"><a href="#OpenAI-Evolution-Strategy" class="headerlink" title="OpenAI Evolution Strategy"></a>OpenAI Evolution Strategy</h3><p>In particular, $\sigma$ is fixed to a constant number, and only the $\mu$ parameter is updated at each generation. </p><p><img src="http://blog.otoro.net/assets/20171031/schaffer/openes.gif" width="30%" height="30%"></p><p>Although its performance is not the best, it is possible to scale to over a thousand parallel workers.</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx399cbkkfj20p205iq3y.jpg" alt=""><br><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fx39a8kr44j20os0a2di4.jpg" alt=""></p><h3 id="Comparision-among-Evolution-Strategies"><a href="#Comparision-among-Evolution-Strategies" class="headerlink" title="Comparision among Evolution Strategies"></a>Comparision among Evolution Strategies</h3><p><img src="http://blog.otoro.net/assets/20171031/mnist_results.svg" alt=""></p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li><a href="http://blog.otoro.net/2017/10/29/visual-evolution-strategies/">A Visual Guide to Evolution Strategy</a></li><li><a href="http://blog.otoro.net/2017/11/12/evolving-stable-strategies/">Evolving Stable Strategies</a></li><li><a href="https://blog.openai.com/evolution-strategies/">Evolution Strategies as a Scalable Alternative to Reinforcement Learning</a></li><li><a href="https://worldmodels.github.io/">world model</a></li></ul>]]></content>
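<p>上面的伪代码只给出了 solver 的接口;下面是一个可以直接运行的极简实现,只保留"按精英样本更新均值与协方差"的思路(省略了 CMA-ES 的步长自适应与演化路径,属于笔者的简化示意,并非完整算法):</p><pre><code class="python">import numpy as np

class SimpleES:
    def __init__(self, dim, popsize=64, elite_frac=0.25, sigma=0.5, seed=0):
        self.rng = np.random.default_rng(seed)
        self.mu = np.zeros(dim)                      # 当前代的均值
        self.cov = np.eye(dim) * sigma ** 2          # 当前代的协方差
        self.popsize = popsize
        self.n_elite = max(1, int(popsize * elite_frac))

    def ask(self):
        """按当前均值与协方差采样一代候选解。"""
        self.solutions = self.rng.multivariate_normal(self.mu, self.cov, self.popsize)
        return self.solutions

    def tell(self, fitness_list):
        """取 fitness 最高的一部分精英,更新下一代的均值与协方差。"""
        self.last_fitness = np.asarray(fitness_list)
        elite = self.solutions[np.argsort(self.last_fitness)[-self.n_elite:]]
        self.mu = elite.mean(axis=0)
        diff = elite - self.mu
        self.cov = diff.T @ diff / self.n_elite + 1e-6 * np.eye(len(self.mu))

    def result(self):
        i = int(np.argmax(self.last_fitness))
        return self.solutions[i], float(self.last_fitness[i])

# 用法与上文伪代码一致:最大化 -||x||^2,最优解在原点附近
solver = SimpleES(dim=10)
for _ in range(100):
    solutions = solver.ask()
    solver.tell([-np.sum(s ** 2) for s in solutions])
best_solution, best_fitness = solver.result()
print(best_fitness)
</code></pre>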
<categories>
<category> WorldModel </category>
</categories>
<tags>
<tag> math </tag>
<tag> evolutionStategy </tag>
</tags>
</entry>
<entry>
<title>英国人工智能发展战略</title>
<link href="/2018/10/15/UK-AI-strategy/"/>
<url>/2018/10/15/UK-AI-strategy/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="英国2018年国家战略最新进展"><a href="#英国2018年国家战略最新进展" class="headerlink" title="英国2018年国家战略最新进展"></a>英国2018年国家战略最新进展</h2><p>英国政府于2018年4月公布了人工智能行业协议(AI Sector Deal)。这是英国政府产业战略的一部分,旨在将英国定位为人工智能领域的全球领导者。该协议涉及广泛的领域:促进公共和私人研发,投资于STEM教育,改善数字基础设施,开发人工智能人才,并领导全球关于数据伦理的对话。其中包括超过£3亿英镑用于私营部门投资的国内外科技公司,阿兰·图灵研究所创建图灵的奖学金,和促进伦理创新数据中心。该中心是该项目的一个关键项目,因为政府希望领导AI伦理的全球治理。该中心于2018年6月开始进行公众咨询。</p><p>在该行业协议公布的十天前,英国上议院人工智能特别委员会(House of Lords’s Select Committee on AI)发表了一份题为(AI in the UK: ready, willing, and able?)的报告。这份报告是为期10个月的调查的结果,该调查的任务是调查人工智能技术进步的经济、伦理和社会影响。该报告列出了一些建议供政府考虑,包括呼吁审查技术公司对数据的潜在垄断,鼓励开发审计数据集的新方法,并为与人工智能合作的英国中小企业创建一个增长基金。报告还指出,英国有机会领导人工智能的全球治理,并建议在2019年举办一次全球峰会,为人工智能的使用和发展建立国际规范。2018年6月,英国政府发布了一份针对上议院的官方回应,对报告中的每一项建议进行评论。</p><h2 id="英国人工智能分类以及体系结构"><a href="#英国人工智能分类以及体系结构" class="headerlink" title="英国人工智能分类以及体系结构"></a>英国人工智能分类以及体系结构</h2><h3 id="分类一"><a href="#分类一" class="headerlink" title="分类一"></a>分类一</h3><ol><li>医疗健康</li><li>自动驾驶</li><li>金融服务</li></ol><h3 id="分类二"><a href="#分类二" class="headerlink" title="分类二"></a>分类二</h3><ul><li>网络安全</li><li>个性化和综合医疗</li><li>个性化教育和培训</li><li>智能城市综合交通</li><li>提高基础设施的效率</li><li>个性化的公共服务</li><li>在医药和航空航天等关键领域进行数字化制造。</li></ul><h2 id="参考与引用"><a href="#参考与引用" class="headerlink" title="参考与引用"></a>参考与引用</h2><ol><li><a href="https://medium.com/politics-ai/an-overview-of-national-ai-strategies-2a70ec6edfd">https://medium.com/politics-ai/an-overview-of-national-ai-strategies-2a70ec6edfd</a></li><li><a href="https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/652097/Growing_the_artificial_intelligence_industry_in_the_UK.pdf">https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/652097/Growing_the_artificial_intelligence_industry_in_the_UK.pdf</a></li><li><a href="https://publications.parliament.uk/pa/ld201719/ldselect/ldai/100/100.pdf">https://publications.parliament.uk/pa/ld201719/ldselect/ldai/100/100.pdf</a></li><li><a href="https://www.gov.uk/government/publications/artificial-intelligence-sector-deal/ai-sector-deal">https://www.gov.uk/government/publications/artificial-intelligence-sector-deal/ai-sector-deal</a></li></ol>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
<tag> strategy </tag>
</tags>
</entry>
<entry>
<title>胶囊(Capsule)网络</title>
<link href="/2018/10/15/capsule-note-and-demo/"/>
<url>/2018/10/15/capsule-note-and-demo/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="手写笔记"><a href="#手写笔记" class="headerlink" title="手写笔记"></a>手写笔记</h2><p>第一部分根据国外博客进行make sense的直观理解。<br>第二部分根据苏神博客的进行数学方面的推导加强理解。</p><div class="row"><iframe src="https://drive.google.com/file/d/1mA1dSM1q-12pfpr75842DWO0zcvsIFbt/preview" style="width:100%; height:550px"></iframe></div><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://kexue.fm/archives/5155">https://kexue.fm/archives/5155</a></li><li><a href="https://kexue.fm/archives/5112">https://kexue.fm/archives/5112</a></li><li><a href="https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418">https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418</a></li><li><a href="https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66">https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66</a></li></ol>]]></content>
<categories>
<category> Capsule </category>
</categories>
<tags>
<tag> code </tag>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>变分自编码器及其应用</title>
<link href="/2018/10/03/VAE-learning/"/>
<url>/2018/10/03/VAE-learning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="引言"><a href="#引言" class="headerlink" title="引言"></a>引言</h2><p>笔记主要参考变分自编码器的原论文<a href="https://arxiv.org/pdf/1312.6114.pdf">《Auto-Encoding Variational Bayes》</a>,与<a href="https://kexue.fm/archives/5253">苏神的博客</a></p><h2 id="VAE模型"><a href="#VAE模型" class="headerlink" title="VAE模型"></a>VAE模型</h2><p>VAE的目标(与GAN相同):希望构建一个从隐变量$Z$生成目标数据$X$的模型。<br>VAE的核心是:<strong>进行分布之间的变换。</strong></p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fvzqckwnkhj20my0aegn1.jpg" alt=""></p><p>VAE的Encoder有两个,一个用来计算均值,一个用来计算方差。</p><h3 id="问题所在"><a href="#问题所在" class="headerlink" title="问题所在"></a>问题所在</h3><p>但生成模型的难题是判断生成分布与真实分布的相似度。(即我们只知道抽样结果,不知道分布表达式)<br><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fvuzgu3recj20or0c7400.jpg" alt=""></p><h3 id="大部分教程所述的VAE"><a href="#大部分教程所述的VAE" class="headerlink" title="大部分教程所述的VAE"></a>大部分教程所述的VAE</h3><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fvxa4kzerpj20qt0dy40q.jpg" alt=""><br>模型思路是:先从标准正态分布中采样一个Z,然后根据Z来算一个X。<br>若VAE结构确实是这个图的话,我们其实完全不清楚:究竟经过重新采样出来的$Z_k$,是不是还对应着原来的$X_k$。</p><p>其实,在整个VAE模型中,我们并没有去使用$p(Z)$(隐变量空间的分布)是正态分布的假设,我们用的是假设$p(Z|X)$(后验分布)是正态分布!</p><p>但是,训练好的神经网络<br>并且VAE会让所有的$P(Z|X)$都向标准正态分布看齐:<br>$$p(Z)=\sum_X p(Z|X)p(X)=\sum_X \mathcal{N}(0,I)p(X)=\mathcal{N}(0,I) \sum_X p(X) = \mathcal{N}(0,I)$$</p><h3 id="真实的VAE"><a href="#真实的VAE" class="headerlink" title="真实的VAE"></a>真实的VAE</h3><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fvuzt27ie8j20rf0imjv6.jpg" alt=""><br>VAE是为每个样本构造<strong>专属</strong>的正态分布,然后采样来重构。</p><p>但是神经网络经过训练之后的方差会接近0。采样只会得到确定的结果。</p><p>因此还需要使所有的正态分布都向<strong>标准正态分布</strong>(模型的假设)看齐。为了使所有的P(Z|X)都向$\mathcal{N}(0,I)$看齐,我们需要:</p><h4 id="编码器:使用神经网络方法拟合参数"><a href="#编码器:使用神经网络方法拟合参数" class="headerlink" title="编码器:使用神经网络方法拟合参数"></a>编码器:使用神经网络方法拟合参数</h4><p>构建两个神经网络:$\mu_k=f_1(X<em>k), log</em>{\sigma^2}=f_2(X_k)$来拟合均值和方差。当二者尽量接近零的时候,分布也就达到了$\mathcal{N}(0,I)$。<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">z_mean = Dense(latent_dim)(h)</span><br><span class="line">z_log_var = Dense(latent_dim)(h)</span><br></pre></td></tr></table></figure></p><p>针对两个损失的比例选取,使用KL散度$KL(N(\mu,\sigma^2)||N(0,I))$作为额外的loss。上式的计算结果为:</p><p>$$\mathcal{L}<em>{\mu,\sigma^2}=\frac{1}{2} \sum</em>{i=1}^d \Big(\mu<em>{(i)}^2 + \sigma</em>{(i)}^2 - \log \sigma_{(i)}^2 - 1\Big)$$</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kl_loss = - <span class="number">0.5</span> * K.<span class="built_in">sum</span>(<span class="number">1</span> + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-<span class="number">1</span>)</span><br></pre></td></tr></table></figure><h4 id="解码器:保证生成能力"><a href="#解码器:保证生成能力" class="headerlink" title="解码器:保证生成能力"></a>解码器:保证生成能力</h4><p>我们最终的目标则是最小化误差$\mathcal{D}(\hat{X_k},X_k)^2$。</p><p>解码器重构$X$的过程是希望没噪声的,而$KL loss$则希望有高斯噪声的,两者是对立的。所以,VAE跟GAN一样,内部其实是包含了一个对抗的过程,只不过它们两者是混合起来,共同进化的。</p><h4 id="reparameterization-trick(重参数技巧)"><a href="#reparameterization-trick(重参数技巧)" class="headerlink" title="reparameterization trick(重参数技巧)"></a>reparameterization trick(重参数技巧)</h4><p>在反向传播优化均值和方差的过程中,“采样”操作是<strong>不可导</strong>的,但是采样的结果是可导的。</p><p>因此可以利用标准正太分布采样出的$\epsilon$直接估算$Z=\mu+\epsilon\times\sigma$。<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">sampling</span>(<span class="params">args</span>):</span><br><span class="line"> <span class="string">"""Reparameterization trick by sampling fr an isotropic unit Gaussian.</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string"> # Arguments:</span></span><br><span class="line"><span class="string"> args (tensor): mean and log of variance of Q(z|X)</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string"> # Returns:</span></span><br><span class="line"><span class="string"> z (tensor): sampled latent vector</span></span><br><span class="line"><span class="string"> """</span></span><br><span class="line"></span><br><span class="line"> z_mean, z_log_var = args</span><br><span class="line"> batch = K.shape(z_mean)[<span class="number">0</span>]</span><br><span class="line"> dim = K.int_shape(z_mean)[<span class="number">1</span>]</span><br><span class="line"> <span class="comment"># by default, random_normal has mean=0 and std=1.0</span></span><br><span class="line"> epsilon = K.random_normal(shape=(batch, dim))</span><br><span class="line"> <span class="keyword">return</span> z_mean + K.exp(<span class="number">0.5</span> * z_log_var) * epsilon</span><br></pre></td></tr></table></figure></p><p>于是“采样”操作不再参与梯度下降,改为采样的结果参与,使得整个模型可以训练。</p><h2 id="DEMO-基于CNN和VAE的作诗机器人"><a href="#DEMO-基于CNN和VAE的作诗机器人" class="headerlink" title="DEMO:基于CNN和VAE的作诗机器人"></a>DEMO:基于CNN和VAE的作诗机器人</h2><h3 id="模型结构"><a href="#模型结构" class="headerlink" title="模型结构"></a>模型结构</h3><p><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fvuzt1xl9cj20rf0imjv6.jpg" alt=""><br>先将每个字embedding为向量,然后用层叠CNN来做编码,接着池化得到一个encoder的结果,根据这个结果生成计算均值和方差,然后生成正态分布并重新采样。在解码截断,由于现在只有一个encoder的输出结果,而最后要输出多个字,所以先接了多个不同的全连接层,得到多样的输出,然后再接着全连接层。</p><h4 id="GCNN-Gated-Convolutional-Networks"><a href="#GCNN-Gated-Convolutional-Networks" class="headerlink" title="GCNN(Gated Convolutional Networks)"></a>GCNN(Gated Convolutional Networks)</h4><p>这里的CNN不是普通的CNN+ReLU,而是facebook提出的GCNN,其实就是做两个不同的、外形一样的CNN,一个不加激活函数,一个用sigmoid激活,然后把结果乘起来。这样一来sigmoid那部分就相当于起到了一个“门(gate)”的作用。</p><h2 id="参考与引用"><a href="#参考与引用" class="headerlink" title="参考与引用"></a>参考与引用</h2><ul><li><a href="https://kexue.fm/archives/5253">https://kexue.fm/archives/5253</a></li><li><a href="https://arxiv.org/pdf/1312.6114.pdf">https://arxiv.org/pdf/1312.6114.pdf</a></li><li><a href="https://kexue.fm/archives/5332">https://kexue.fm/archives/5332</a></li><li><a href="https://jaan.io/what-is-variational-autoencoder-vae-tutorial/">https://jaan.io/what-is-variational-autoencoder-vae-tutorial/</a></li></ul>]]></content>
<categories>
<category> VAE </category>
</categories>
<tags>
<tag> code </tag>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>从Pycharm到SpaceVim</title>
<link href="/2018/09/27/space-vim-usage/"/>
<url>/2018/09/27/space-vim-usage/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><blockquote><p>Pycharm太吃内存,因此转向使用SpaceVim。基本配置记录如下:</p></blockquote><h2 id="环境基本配置"><a href="#环境基本配置" class="headerlink" title="环境基本配置"></a>环境基本配置</h2><h3 id="安装SpaceVim"><a href="#安装SpaceVim" class="headerlink" title="安装SpaceVim"></a>安装SpaceVim</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">curl -sLf https://spacevim.org/install.sh | bash</span><br><span class="line"><span class="comment"># 如果当前环境没有安装curl,并且没有管理员权限</span></span><br><span class="line">wget https://spacevim.org/install.sh</span><br><span class="line">bash install.sh</span><br><span class="line"><span class="comment"># 启动vim并且安装插件</span></span><br></pre></td></tr></table></figure><h3 id="配置-SpaceVim-autoload-SpaceVim-vim"><a href="#配置-SpaceVim-autoload-SpaceVim-vim" class="headerlink" title="配置.SpaceVim/autoload/SpaceVim.vim"></a>配置.SpaceVim/autoload/SpaceVim.vim</h3><ul><li><code>set timeoutlen=10</code>:空格键延迟</li><li><code>let g:spacevim_relativenumber=0</code>:禁用相对行号</li></ul><h3 id="配置-SpaceVim-d-init-vim"><a href="#配置-SpaceVim-d-init-vim" class="headerlink" title="配置.SpaceVim.d/init.vim"></a>配置.SpaceVim.d/init.vim</h3><ul><li><code>[[layers]] \n name="lang#python"</code>:使用python</li><li><code>disable_plugins=["neomake.vim"]</code>:禁用插件(当使用conda的时候,这个插件会报错)</li></ul><h3 id="更新-SPUpdate"><a href="#更新-SPUpdate" class="headerlink" title="更新:SPUpdate"></a>更新<code>:SPUpdate</code></h3><h2 id="常用命令备忘"><a href="#常用命令备忘" class="headerlink" title="常用命令备忘"></a>常用命令备忘</h2><h3 id="基本操作"><a href="#基本操作" class="headerlink" title="基本操作"></a>基本操作</h3><ul><li><code>e</code>: 打开一个空的编辑器</li><li><code>:e file_name</code>: 打开文件(该操作<strong>也可以</strong>在nerdtree上完成)</li></ul><h3 id="space-b系列-缓冲区"><a href="#space-b系列-缓冲区" class="headerlink" title="space+b系列(缓冲区)"></a>space+b系列(缓冲区)</h3><ul><li><code>N+l</code>:在右侧新建buffer</li><li><code>d</code>:删除buffer</li></ul><h3 id="space-f系列-文件管理"><a href="#space-f系列-文件管理" class="headerlink" title="space+f系列(文件管理)"></a>space+f系列(文件管理)</h3><ul><li><code>t</code>:开关nerdtree </li></ul><h3 id="space-w系列-窗口管理"><a href="#space-w系列-窗口管理" class="headerlink" title="space+w系列(窗口管理)"></a>space+w系列(窗口管理)</h3><ul><li><code>o</code>:切换到下一个窗口</li><li><code>/</code>和<code>-</code>:分别在右侧和下方分隔窗口</li><li><code>F</code>: 新建Tab</li></ul><h3 id="space-t系列(Toggle管理)"><a href="#space-t系列(Toggle管理)" class="headerlink" title="space+t系列(Toggle管理)"></a>space+t系列(Toggle管理)</h3><ul><li><code>t</code>:打开tab管理</li></ul><h3 id="space-c系列-注释)"><a href="#space-c系列-注释)" class="headerlink" title="space+c系列(注释)"></a>space+c系列(注释)</h3><ul><li><code>l</code>:注释选中行(配合<code>V</code>模式)</li></ul><h3 id="space-s系列-搜索"><a href="#space-s系列-搜索" class="headerlink" title="space+s系列(搜索)"></a>space+s系列(搜索)</h3><ul><li><code>s</code>:直接在当前文件搜索</li></ul><h3 id="space-l系列(语言)"><a href="#space-l系列(语言)" class="headerlink" title="space+l系列(语言)"></a>space+l系列(语言)</h3><ul><li><code>r</code>:执行代码</li></ul><h2 id="参考与引用"><a href="#参考与引用" class="headerlink" title="参考与引用"></a>参考与引用</h2><ul><li><a href="https://spacevim.org/cn/layers">https://spacevim.org/cn/layers</a></li><li><a href="https://everettjf.gitbooks.io/spacevimtutorial">https://everettjf.gitbooks.io/spacevimtutorial</a></li></ul>]]></content>
<categories>
<category> Linux </category>
</categories>
<tags>
<tag> IDE </tag>
<tag> Tools </tag>
</tags>
</entry>
<entry>
<title>对象差分注意力机制</title>
<link href="/2018/09/21/Object-Difference-Attention-paper-note/"/>
<url>/2018/09/21/Object-Difference-Attention-paper-note/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><!-- 论文基本信息:方便查阅和追踪 --><!-- 论文基本信息的获取:1. 直接从论文pdf中获取2. 从paperweekly首页上方搜索论文;若未检索到,点击推荐论文输入论文名即可自动获取信息--><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:Object-Difference Attention: A Simple Relational Attention for Visual Question Answering</p></li><li><p>论文链接:<a href="http://www.acmmm.org/2018/accepted-papers/">http://www.acmmm.org/2018/accepted-papers/</a></p></li><li><p>论文源码:</p><ul><li>None</li></ul></li><li><p>关于作者:</p><ul><li>吴晨飞,北邮AI Lab博士</li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。 </li></ul></li></ol><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><!-- Ex: 论文摘要的中文翻译最近对话生成的神经模型为会话代理生成响应提供了很大的希望,但往往是短视的,一次预测一个话语而忽略它们对未来结果的影响。对未来的对话方向进行建模对于产生连贯,有趣的对话至关重要,这种对话需要传统的NLP对话模式借鉴强化学习。在本文中,我们将展示如何整合这些目标,应用深度强化学习来模拟聊天机器人对话中的未来奖励。该模型模拟两个虚拟代理之间的对话,使用策略梯度方法来奖励显示三个有用会话属性的序列:信息性,连贯性和易于回答(与前瞻性功能相关)。我们在多样性,长度以及人类评判方面评估我们的模型,表明所提出的算法产生了更多的交互式响应,并设法在对话模拟中促进更持久的对话。这项工作标志着基于对话的长期成功学习神经对话模型的第一步。--><p>注意机制极大地促进了视觉问答技术(VQA)的发展。注意力分配在注意力机制中起着至关重要的作用,它根据对象(如图像区域或定界框)回答问题的重要性对图像中的对象(如图像区域或包围盒)进行不同的权重。现有的工作大多集中在融合图像特征和文本特征来计算注意力分布,而不需要比较<strong>不同的图像对象</strong>。作为注意力的一个主要属性,<strong>分离度</strong>取决于不同对象之间的比较。这种比较为更好地分配注意力提供了更多的信息。为了实现对目标的可感知性,我们提出了一种对象差分注意(ODA)方法,通过在图像中实现不同图像对象之间的差值运算来计算注意概率。实验结果表明,我们基于ODA的VQA模型得到了最先进的结果。此外,还提出了一种关系注意的一般形式。除了ODA之外,本文还介绍了其他一些相关的注意事项。实验结果表明,这些关系关注在不同类型的问题上都有优势。</p><h2 id="对象差分注意力机制:视觉问答中一个简单的关系注意力机制"><a href="#对象差分注意力机制:视觉问答中一个简单的关系注意力机制" class="headerlink" title="对象差分注意力机制:视觉问答中一个简单的关系注意力机制"></a>对象差分注意力机制:视觉问答中一个简单的关系注意力机制</h2><!-- Ex: ## 强化学习在对话生成领域的应用 --><h3 id="引言"><a href="#引言" class="headerlink" title="引言"></a>引言</h3><h4 id="本文术语"><a href="#本文术语" class="headerlink" title="本文术语"></a>本文术语</h4><!-- 针对论文中不常用的术语进行简短的解释,方便读者理解 --><ol><li><p>序列编码的方式:</p><ol><li><strong>RNN</strong>: $y<em>t=f(y</em>{t-1},x_t)$</li><li><strong>CNN</strong>: $y<em>t=f(x</em>{t-1},x<em>t,x</em>{t+1})$</li><li><strong>Attention</strong>: $y_t=f(x_t, A, B), if A = B = X: Self Attention$</li></ol></li><li><p>注意力机制的例子<br>$$Attention(Q,K,V)$$</p></li><li><p>应用于VQA的注意力机制编年史:</p><ol><li>one-step linear fusion</li><li>multi-step linear fusion</li><li>bilinear fusion</li><li>multi-feature attention</li></ol></li><li><p>Mutan机制</p></li></ol><h4 id="论文写作动机"><a href="#论文写作动机" class="headerlink" title="论文写作动机"></a>论文写作动机</h4><!-- 当前研究领域存在的问题Ex:标准的Seq-to-Seq模型用于对话系统时常常使用MLE作为模型的评价标准,但这往往导致下面两个主要缺点:系统倾向于产生一些普适性的回应,也就是dull response,这些响应可以回答很多问题但却并不是我们想要的,我们想要的是有趣、多样性、丰富的回应;系统的回复不具有前瞻性,有时会导致陷入死循环,导致对话轮次较少。也就是产生的响应没有考虑对方是否容易回答的情况。--><ol><li>现有的工作大多集中在融合图像特征和文本特征来计算注意力分布,而<strong>忽略了</strong>比较不同的图像对象之间的差异。<br> <img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvisv9uyyhj20i10cw46l.jpg" alt=""><br> 如上图,想要回答出问题<code>图中最高的花是什么?</code>,我们建立的模型就需要不仅仅关注潜在答案<code>玫瑰</code>,也应该关注<code>兰花</code>。</li><li>如何合理分配现有问题的注意力?</li></ol><h3 id="解决问题的方法"><a href="#解决问题的方法" class="headerlink" title="解决问题的方法"></a>解决问题的方法</h3><h4 id="玫瑰例子"><a href="#玫瑰例子" class="headerlink" title="玫瑰例子"></a>玫瑰例子</h4><p>对于回答<code>图中最高的花是什么?</code>,一共分几步?</p><ol><li>找到图中所有的花。</li><li>比较不同的花对于正确答案的重要性。</li></ol><p>正确的答案就会在<strong>比较</strong>的过程中产生。若以这个例子作为启发,一种新型的注意力机制的思路便产生了:ODA在问题的指导下,通过将每个图像对象与其他所有对象进行对比,计算出图像中物体的注意注意力分布。</p><h4 id="模型细节"><a href="#模型细节" class="headerlink" title="模型细节"></a>模型细节</h4><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjiq8dptpj20pw0bdgqo.jpg" 
alt=""></p><ol><li><p>将数据Embedding</p><ol><li>$V^f=RCNN(image)$,其中$v^f$是一个$m\times{d_v}$维的embedding,代表拉出的$m$个框。</li><li>$Q^f=GRU(question)$,其中$Q^f$代表$d_q$维的问题embedding。</li><li>$V=relu(Conv1d(V^f))$</li><li>$Q=relu(Linear(Q^f))$</li></ol></li><li><p>对象差分注意力<br>$$\hat{V}=softmax([(V_i-V<em>j)\odot{Q}]</em>{m\odot{md}}W_f)^{T}V$$<br>该模型的优点:</p><ol><li>通过对比(差分)),我们可以选择更重要的对象。</li><li>计算复杂度相对与传统注意力机制模型(Mutan)低。</li><li>”即插即用“的特性使得该模型十分容易应用到其他领域。</li></ol></li><li><p>决策阶段</p><ol><li><p>通过对$\hat{V}$计算$p$次,并且将结果拼接在一起。<br>$$\hat{Z}=[\hat{V}^{1};\hat{V}^{2};…;\hat{V}^{p}]$$</p><blockquote><p>可以参考Attention is all you need模型的multi-head</p></blockquote></li><li>将图片的特征和问题的特征相结合<br>$$H=\sum^s_{s=1}(\hat{Z}W_v^{(s)}\odot{QW_q^{(s)}})$$</li><li>预测<br>$$\hat{a}=\sigma(W_{h}H)$$</li></ol></li></ol><h4 id="扩展:相关性注意力"><a href="#扩展:相关性注意力" class="headerlink" title="扩展:相关性注意力"></a>扩展:相关性注意力</h4><p>针对模型中$(V_i-V_j)\odot{Q}$部分进行扩展,可以得到不同类型的注意力机制<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjt8ggw48j20dk06emya.jpg" alt=""></p><h3 id="实验结果分析"><a href="#实验结果分析" class="headerlink" title="实验结果分析"></a>实验结果分析</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ul><li>VQA1.0 dataset</li><li>VQA2.0 dataset</li><li>COCO-QA dataset</li></ul><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ul><li>针对VQA1.0和VQA2.0,使用准确率:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjtavlwoxj209701hgli.jpg" alt=""></li><li>针对COCO_QA使用:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjtbn6b1pj207m00tdfo.jpg" alt=""></li></ul><h4 id="实验结果评价"><a href="#实验结果评价" class="headerlink" title="实验结果评价"></a>实验结果评价</h4><ul><li>在VQA1.0上与最先进的模型对比<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjtf0nyn4j20qs0c8wh1.jpg" alt=""></li><li>在VQA2.0上与最先进的模型对比<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjtfgxjdxj20ht05qmy9.jpg" alt=""></li><li>在VQA3.0上与最先进的模型对比<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvjtg34t3dj20mm05twfl.jpg" alt=""></li></ul><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>从感性的角度来说,对象差分注意力机制符合人类根据图片回答问题的思考过程。未来的研究方向应该是通过对世界的常识性知识建立一个世界模型,通过先验知识减少计算量和对大量带有标签的数据的依赖性。</p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ol><li><a href="https://kexue.fm/archives/4765">https://kexue.fm/archives/4765</a></li></ol>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>世界模型的解读</title>
<link href="/2018/09/21/world-model-to-learn-them-all/"/>
<url>/2018/09/21/world-model-to-learn-them-all/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><!-- 论文基本信息:方便查阅和追踪 --><!-- 论文基本信息的获取:从paperweekly首页上方搜索论文;若未检索到,点击推荐论文输入论文名即可自动获取信息--><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:World Models</p><!-- Ex: 1. 论文名:Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs. --></li><li><p>论文链接:<a href="https://arxiv.org/pdf/1803.10122.pdf">https://arxiv.org/pdf/1803.10122.pdf</a></p><!-- Ex: 2. https://arxiv.org/abs/1606.01541 --></li><li><p>论文源码:</p><ul><li><a href="https://worldmodels.github.io/">https://worldmodels.github.io/</a></li></ul></li></ol><!-- - https://github.com/liuyuemaicha/Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow - https://github.com/agsarthak/Goal-oriented-Dialogue-Systems--><ol><li><p>关于作者:</p><!-- 建议从google schoolar获取详细信息 - first_author: position, times_cited--><ul><li>David Ha:</li><li>Jurgen Schimidhuber: LSTM之父,无需多言</li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。</li></ul></li></ol><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>通过探索并且建立流行的强化学习环境的生成神经网络模型。<strong>世界模型</strong>可以在无监督的情况下快速训练,以学习环境的压缩时空表示。通过使用从世界模型中提取的特征作为Agent的输入,可以训练一个非常紧凑和简单的策略来解决所需的任务。甚至可以训练Agent完全在它自己的世界模型所产生的<strong>梦</strong>中,并将这个策略转移回实际环境中。</p><h2 id="时代背景:可以预见的寒冬-—-Yann-Lecun"><a href="#时代背景:可以预见的寒冬-—-Yann-Lecun" class="headerlink" title="时代背景:可以预见的寒冬 —- Yann Lecun"></a>时代背景:可以预见的寒冬 —- Yann Lecun</h2><ul><li>深度学习缺少推断能力:关于世界的常识与对于任务背景的认知</li><li>从零开始学习真的很低效:深度学习需要一个记忆模块(预训练)</li><li>自监督学习(Learn how to learn)</li></ul><h2 id="世界模型"><a href="#世界模型" class="headerlink" title="世界模型"></a>世界模型</h2><h3 id="论文写作动机"><a href="#论文写作动机" class="headerlink" title="论文写作动机"></a>论文写作动机</h3><ol><li><p>哲学问题:人类究竟如何认识世界?<br> 人类通过有限的感知能力(眼睛、鼻子、耳朵、皮肤),逐渐建立一个自己的<strong>心智模型</strong>。人类的一切决策和动作则均根据每个人自己的内部模型的<strong>预测</strong>而产生。</p></li><li><p>人类如何处理日常生活的信息流?<br> 通过<strong>注意力机制</strong>学习客观世界时空方面的抽象表达。</p></li><li><p>人类的潜意识如何工作?<br> 以棒球为例子,击球手在如此短的时间(短于视觉信号到达大脑的时间!)内需要作出何时击球的动作。<br> 人类可以完成击球的原因便是因为人类天生的心智模型可以预测棒球的运动路线。</p></li><li><p>我们是否可以让模型根据环境自觉建立特有的模型进行自学习?</p></li><li><p>Jurgen历史性的工作总结:强化学习背景下的RNN-based世界模型!</p></li></ol><h3 id="模型细节"><a href="#模型细节" class="headerlink" title="模型细节"></a>模型细节</h3><h4 id="总览Agent模型"><a href="#总览Agent模型" class="headerlink" title="总览Agent模型"></a>总览Agent模型</h4><ol><li><p>视觉感知元件:<strong>压缩</strong>视觉获取到的信息/环境表征</p></li><li><p>记忆元件:根据历史信息对客观环境进行<strong>预 测</strong></p></li><li><p>决策模块:根据视觉感知元件和记忆元件选择<strong>行动策略</strong></p></li></ol><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvm2l4tc84j20wf0ledkn.jpg" alt=""></p><h4 id="VAE-V-Model"><a href="#VAE-V-Model" class="headerlink" title="VAE(V) Model"></a>VAE(V) Model</h4><p>Agent通过VAE可以从观察的每一帧中学习出抽象的、压缩的表示。</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvm3sg25puj20wn09zq7g.jpg" alt=""></p><h4 id="MDN-RNN-M-Model"><a href="#MDN-RNN-M-Model" class="headerlink" title="MDN-RNN(M) Model"></a>MDN-RNN(M) Model</h4><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmurcu30uj20u10l8jta.jpg" alt=""><br>其中下一时刻的预测$z<em>{t+1}$使用概率的形式表示为$P(z</em>{t+1}|a_t,z_t,h_t)$,其中$a_t$是在$t$时刻的动作。<br>并且在采样阶段,通过调整温度参数$\tau$来控制模型的模糊度。(这个参数对后续训练控制器$C$十分有效)</p><p>模型最上方的<strong>MDN</strong>表示<strong>Mixture Density Network</strong>,输出的是预测的z的高斯混合模型。</p><h4 id="Controller-C-Model"><a href="#Controller-C-Model" class="headerlink" title="Controller(C) Model"></a>Controller(C) 
Model</h4><p>这个模块用来根据最大累计Reward决定Agent下一个时刻的行动。论文中故意将这个模块设置的尽量小并且简单。</p><p>因此控制器是一个简单的单层线性模型:<br>$$a_t=W_c[z_t h_t]+b_c$$</p><p>特别指出,优化控制器参数的方法不是传统的梯度下降,而是<strong>Covariance-Matrix Adaptation Evolution Strategy</strong></p><h4 id="结合三个模块"><a href="#结合三个模块" class="headerlink" title="结合三个模块"></a>结合三个模块</h4><p>下面的流程图展示了$V$、$M$和$C$如何与环境进行交互:首先每个时间步$t$原始的观察输入由$V$进行处理生成压缩后的$z(t)$。随后$C$的输入是$z(t)$和$M$的隐状态$h(t)$。随后$C$输出动作矢量$a(t)$影响环境。$M$以当前时刻的$z(t)$和$a(t)$作为输入,预测下一时刻的隐状态$h(t+1)$。</p><blockquote><p>在代码中,对$M$模块的输入有很多种方式。我不太认同图中把$C$选择的动作也当做$M$的输入。</p></blockquote><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmvf1dkd5j20p50ic77b.jpg" alt=""></p><p>通过伪代码表示模型:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">def</span> <span class="title function_">rollout</span>(<span class="params">controller</span>):</span><br><span class="line"> obs = env.reset()</span><br><span class="line"> h = rnn.initial_state()</span><br><span class="line"> done = <span class="literal">False</span></span><br><span class="line"> cumulative_reward = <span class="number">0</span></span><br><span class="line"> <span class="keyword">while</span> <span class="keyword">not</span> done:</span><br><span class="line"> z = vae.encode(obs)</span><br><span class="line"> a = controller.action([z, h])</span><br><span class="line"> obs, reward, done = env.step(a)</span><br><span class="line"> cumulative_reward += reward</span><br><span class="line"> h = rnn.forward([a, z, h])</span><br><span class="line"> <span class="keyword">return</span> cumulative_reward</span><br></pre></td></tr></table></figure></p><h3 id="实验设计1"><a href="#实验设计1" class="headerlink" title="实验设计1"></a>实验设计1</h3><p>两个实验的环境均选自<code>OpenAI Gym</code></p><h4 id="实验环境"><a href="#实验环境" class="headerlink" title="实验环境"></a>实验环境</h4><p>CarRacing-v0(Car Racing Experiment)</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmw8osn7xj20vh0kt43n.jpg" alt=""></p><p>动作空间有</p><ol><li>左转</li><li>右转</li><li>加速</li><li>刹车</li></ol><h4 id="实验实现流程"><a href="#实验实现流程" class="headerlink" title="实验实现流程"></a>实验实现流程</h4><ol><li>根据随机的策略收集10,000次游戏过程</li><li>根据每个游戏过程的每一帧训练VAE模型,输出结果为$z\in \mathcal{R}^{32}$</li><li>训练MDN-RNN模型,输出结果为$P(z_{t+1}|a_t,z_t,h_t)$</li><li>定义控制器(c),$a_t=W_c[z_t h_t]+b_c$</li><li>使用CMS-ES算法得到最大化累计Reward的$$W_b$与$b_c$</li></ol><p>模型参数共有:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmwg6zv1ej20gg063jru.jpg" alt=""></p><h4 id="实验结果"><a href="#实验结果" class="headerlink" title="实验结果"></a>实验结果</h4><ol><li><p>只添加V Model(VAE)<br>如果没有M Model(MDN-RNN)模块,控制器的公式便为:$a<em>t=W</em>{c}z_t+b_c$。<br>实验结果表明这会导致Agent不稳定的驾驶行为。<br>在这种情况下,尝试控制器添加一层隐含层,虽然实验效果有所提升,但是仍然没能达到很好的效果。</p></li><li><p>世界模型完全体(VAE+MDN-RNN)<br>实验结果表明,Agent驾驶得更加稳定。<br>因为$h_t$包含了当前环境关于未来信息的概率分布,因此Agent可以向一级方程式选手和棒球手一样迅速做出判断。</p></li><li><p>世界模型与其他模型的对比:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmxb7t9zdj20op09odi1.jpg" alt=""></p></li><li><p>对世界模型当前状态$z_{t+1}$进行可视化<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmxeae7jyj20g00g4jwu.jpg" alt=""><br>上图将$\tau$设置为0.25(这个参数可以调节生成环境的模糊程度)</p></li></ol><h3 id="实验设计2"><a href="#实验设计2" 
class="headerlink" title="实验设计2"></a>实验设计2</h3><p><strong>我们是否可以让Agent在自己的梦境中学习,并且改变其对真实环境的策略</strong><br>如果世界模型对其<strong>目的</strong>有了充分的认识,那么我们就可以使用世界模型代替Agent真实观察到的环境。(类比我们下楼梯的时候,根本不需要小心翼翼地看着楼梯)<br>最终,Agent将不会直接观察到现实世界,而只会看到世界模型<strong>让</strong>它看到的事物。</p><h4 id="实验环境-1"><a href="#实验环境-1" class="headerlink" title="实验环境"></a>实验环境</h4><p>VizDoom Experiment</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvmxq3o7ihj20g10bywio.jpg" alt=""></p><p>游戏目的是控制Agent躲避怪物发出的火球。</p><h4 id="实验实现流程-1"><a href="#实验实现流程-1" class="headerlink" title="实验实现流程"></a>实验实现流程</h4><p>模型的M Model(MDN-RNN)主要负责预测Agent下一时刻(帧)是否会死亡。<br>当Agent在其世界模型中进行训练的时候,便不需要V Model对真实环境的像素进行编码了。</p><ol><li>从随机策略中选取10,000局游戏(同实验一)</li><li>根据每次游戏的每一帧训练VAE模型,得到$z\in \mathcal{R}^{64}$($z$的维度变成了64),之后使用VAE模型将收集的图像转换为隐空间表示。</li><li>训练MDN-RNN模型,输出结果为$P(z<em>{t+1},d</em>{t+1}|a_t,z_t,h_t)$</li><li>定义控制器为$a_t=W_c[z_t h_t]$</li><li>使用CMA-ES算法从世界模型构建的虚拟环境中得到最大化累计生存时间的$W_c$</li></ol><p>模型的参数共有:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvn0lmh3rdj20li07dwf8.jpg" alt=""></p><h4 id="模糊化世界模型"><a href="#模糊化世界模型" class="headerlink" title="模糊化世界模型"></a>模糊化世界模型</h4><p>通过增加模糊度参数$\tau$,会使得游戏变得更难(世界模型生成的环境更加模糊)。<br>如果Agent在高模糊度参数表现的很好的话,那么在正常模式下通常表现的更好。</p><p>也就是说,即使<strong>V model(VAE)不能够正确的捕捉每一帧全部的信息</strong>,Agent也能够完成真实环境给定的任务。</p><p>实验结果表明,模糊度参数太低相当于没有利用这个参数,但是太高的话模型又相当于”近视“了。因此需要找到一个合适的模糊度参数值。</p><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><h4 id="泛化:迭代式训练程序"><a href="#泛化:迭代式训练程序" class="headerlink" title="泛化:迭代式训练程序"></a>泛化:迭代式训练程序</h4><ol><li>随机初始化M Model(MDN-RNN)和C Model(Controller)的参数</li><li>对真实环境进行N次试验。保存每次试验的动作$a_t$和观察$x_t$</li><li>训练M Model(MDN-RNN),得到$P(x<em>{t+1},r</em>{t+1},a<em>{t+1},d</em>{t+1}|x_t,a_t,h_t)$;训练C Model(Controller)并且M中的最优化期望rewards。</li><li>回到第2步如果任务没有结束</li></ol><p>这个泛化程序的特点是从M model中不仅仅要得到预测的观察$x$和是否结束任务$done$,</p><p>一般的seq2seq模型,倾向于生成安全、普适的响应,因为这种响应更符合语法规则,在训练集中出现频率也较高,最终生成的概率也最大,而有意义的响应生成概率往往比他们小。通过MMI来计算输入输出之间的依赖性和相关性,可以减少模型对他们的生成概率。</p><h4 id="从信息到记忆:海马体的魔术"><a href="#从信息到记忆:海马体的魔术" class="headerlink" title="从信息到记忆:海马体的魔术"></a>从信息到记忆:海马体的魔术</h4><p>神经科学的研究(2017 Foster)发现了海马体重映现象:当动物休息或者睡觉的时候,其大脑会重新放映最近的经历。并且海马体重映现象对巩固记忆十分重要。</p><h4 id="注意力:只关心任务相关的特征"><a href="#注意力:只关心任务相关的特征" class="headerlink" title="注意力:只关心任务相关的特征"></a>注意力:只关心任务相关的特征</h4><p>神经科学的研究(2013 Pi)发现,主要视觉神经元只有在受到奖励的时候才会被从抑制状态激活。这表明人类通常从任务相关的特征中学习,而非接收到的所有特征。(该结论至少在成年人中成立)</p><h4 id="未来的展望"><a href="#未来的展望" class="headerlink" title="未来的展望"></a>未来的展望</h4><p>当前的问题主要出现在M Model(MDN-RNN)上:受限于RNN模型的信息存储能力。人类的大脑能够存储几十年甚至几百年的记忆,但是神经网络会因为梯度消失导致训练困难。</p><p>如果想让Agent可以探索更加复杂的世界,那么未来的工作可能是设计出一个可以<strong>代替MDN-RNN结构的模型</strong>,或者开发出一个<strong>外部记忆模块</strong>。</p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ol><li><a href="https://arxiv.org/pdf/1803.10122.pdf">https://arxiv.org/pdf/1803.10122.pdf</a></li></ol>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
<tag> JurgenSchmidhuber </tag>
</tags>
</entry>
<entry>
<title>自复制自动机理论读书笔记</title>
<link href="/2018/09/18/theory-of-self-reproducing-automata/"/>
<url>/2018/09/18/theory-of-self-reproducing-automata/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><p>未完待续已完成10%</p><h1 id="Theory-of-Self-Reproducing-Automata-自复制自动机理论"><a href="#Theory-of-Self-Reproducing-Automata-自复制自动机理论" class="headerlink" title="Theory of Self-Reproducing Automata(自复制自动机理论)"></a>Theory of Self-Reproducing Automata(自复制自动机理论)</h1><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>冯诺依曼从20世纪40年代后期开始研究自动机理论。按照时间顺序他完成了五篇著作:</p><ol><li>自动机的通用逻辑理论(The General and Logical Theory of Automata)</li><li>复杂自动机的理论和结构(Theory and Organization of Complicated Automata)</li><li>概率逻辑:从不可靠组件合成可靠整体(Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components)</li><li>自动机理论:构建、复制以及同质性(The Theory of Automata:Constructions, Reproduction, Homogeneity)</li><li>计算机与人脑(The Computer and the Brain)</li></ol><p>冯诺依曼构思了一套基本逻辑单元构成的复杂系统理论。</p><blockquote><p>如果一个数学主题已经远离了所有的实证源头,而且仅仅跟一些非常‚抽象‛的领域有交叉的时候,这个数学主题就会濒临衰退了……无论这一阶段何时来到,唯一补救的办法就是在它的源头处重生:重新注入或多或少的实证经验 —— 冯诺依曼</p></blockquote><h2 id="冯诺依曼的自动机理论"><a href="#冯诺依曼的自动机理论" class="headerlink" title="冯诺依曼的自动机理论"></a>冯诺依曼的自动机理论</h2><h3 id="生物与人工自动机"><a href="#生物与人工自动机" class="headerlink" title="生物与人工自动机"></a>生物与人工自动机</h3><p>通过考察两种主要类型的自动机:人工的和生物自动机。</p><p>模拟与数字计算机是最重要的一类人工自动机,但通讯或信息处理目的而造的其他的人造系统也包括在其中,如电话和收音机广播系统等。生物自动机则包括了<code>神经系统</code>、<code>自复制</code>和<code>自修复系统</code>,以及<code>生命的进化与适应</code>等特性。</p><p>冯纽曼花费了很大的精力来比较生物与人工自动机的异同。我们可以将<br>这些结论概括成如下几个方面:</p><ol><li>模拟与数字的不同: <ul><li>自然生命体是一种混合体,同时包含了模拟与数字过程。</li><li>神经元是“有或无”的,因此数字真值函数逻辑是神经行为的一种初级近似。</li><li>神经元的激活有有赖于空间上的刺激加总,这些都是连续而非离散的过程。</li><li>在复杂的有机体中,数字运算通常与模拟过程交替进行。</li></ul></li><li>基本元件所用到的物理和生物的材料<ul><li>计算机的基本元件要比神经元大得多,而且需要更多的能量,但是它们的速度要快很多。</li><li>生物自动机是通过一种更加并行的方式工作的,而数字计算机则是串行结构。【注:此为冯诺依曼受限于时代背景的结论】</li><li>真空管和神经元尺寸的不同是由于它们所用材料的机械稳定性不同而引起的。真空管要更容易被损坏却不好修复。而当神经元膜受到破坏以后,会很容易地被修复。</li></ul></li><li>复杂性<ul><li>人,也包括天地万物,是一种比他们能够构建的人工自动机更复杂得多的生物自动机。</li><li>人类对于它自己的逻辑设计的细节理解要远远比不上对他所构建的最大型计算机的理解。</li></ul></li><li>逻辑组织<ul><li>在一个特定的计算机中,有高速的电子寄存器,有低速的磁芯,以及更慢的磁带单元。【注:当前时代背景下可以类比多级存储体系】</li><li>神经环路中的脉冲、神经阈值的改变、神经系统的组织以及基因中的编码就也构成了这层级实例。</li></ul></li><li>可靠性<ul><li>生物自动机在这方面显然要胜过人工自动机,就是因为它们有着强大的自检验自修复的功能。【注:癌症和衰老呢?】</li></ul></li></ol><h2 id="自动机理论的数学原理"><a href="#自动机理论的数学原理" class="headerlink" title="自动机理论的数学原理"></a>自动机理论的数学原理</h2><p>起始于数理逻辑,而朝向分析、概率以及热力学靠近。</p><h3 id="控制与信息理论"><a href="#控制与信息理论" class="headerlink" title="控制与信息理论"></a>控制与信息理论</h3><p>图灵机和<code>McCulloch & Pitts</code>的神经网络分别处于信息理论的两个极端。</p><h4 id="McCulloch-amp-Pitts-的神经网络:组合方法"><a href="#McCulloch-amp-Pitts-的神经网络:组合方法" class="headerlink" title="McCulloch & Pitts 的神经网络:组合方法"></a>McCulloch & Pitts 的神经网络:<strong>组合方法</strong></h4><blockquote><p>神经网络由非常简单的零件组成复杂结构。因此只需要对底层的零件作公理化定义就可以得到非常复杂的组合</p></blockquote><p>神经元的定义如下:我们用一个小圆圈代表一个神经元,从圆圈延伸出的直线则代表神经突触。箭头表示某神经元的突触作用于另一个神经元之上,也就是信号的传送方向。神经元有两个状态:激发和非激发。</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fve2xio54pj20d908szlq.jpg" alt=""></p><p>人类神经元有神奇的涌现结果:没有轮廓的三角形,但是你的眼睛却可以帮你勾勒出它的轮廓。<br><img src="http://ww1.sinaimg.cn/large/ca26ff18gy1fwexofxywsj20mu0iagov.jpg" alt=""></p><h4 id="图灵机"><a href="#图灵机" class="headerlink" title="图灵机"></a>图灵机</h4><blockquote><p>是对于整个自动机进行了公理化的定义,他仅仅定义了自动机的功能,并没有涉及到具体的零件。</p></blockquote><p>对于高复杂度的形式逻辑对象,很难提前预测它的行为,最好的办法就是把它实际制造出来运行。这是根据哥德尔定理得出的结论:</p><blockquote><p>从逻辑上说,对于一个对象的描述要比这个对象本身要高一个级别。因此,前者总是比后者要长。</p></blockquote><h2 id="大数之道"><a href="#大数之道" class="headerlink" 
title="大数之道"></a>大数之道</h2><p>生命应该是同概率完全整合在一起的,生命可以在<strong>错误里面持续运行</strong>!在生命中的误差,不会像在计算过程中那样不断的扩散放大。生命是十分完善且具有适应性的系统,一旦中间发生了某种问题,系统会自动地认识到这个问题的严重程度。</p><pre><code>1. 如果无关紧要,那么系统就会无视问题,继续运作2. 如果问题对系统比较重要,系统就会把发生故障的区域封闭起来,绕过它,通过其他补救渠道继续运行。</code></pre><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ul><li>Theory of Self-Reproducing Automata[von Neumann]</li><li><a href="http://swarmagents.cn.13442.m8849.cn/thesis/program/jake_358.pdf">http://swarmagents.cn.13442.m8849.cn/thesis/program/jake_358.pdf</a></li></ul>]]></content>
<categories>
<category> AI </category>
</categories>
<tags>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>Neural-Machine-Translation-by-tensorflow</title>
<link href="/2018/09/09/Neural-Machine-Translation-by-tensorflow/"/>
<url>/2018/09/09/Neural-Machine-Translation-by-tensorflow/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="模型使用的公开数据集"><a href="#模型使用的公开数据集" class="headerlink" title="模型使用的公开数据集"></a>模型使用的公开数据集</h2><p>IWSLT Evaluation Campaign<br>WMT Evaluation Campaign</p><hr><h2 id="基本seq2seq模型"><a href="#基本seq2seq模型" class="headerlink" title="基本seq2seq模型"></a>基本seq2seq模型</h2>]]></content>
<categories>
<category> slide </category>
</categories>
<tags>
<tag> tensorflow </tag>
<tag> slide </tag>
</tags>
</entry>
<entry>
<title>论文笔记:A-Diversity-Promoting-Objective-Function-for-Neural-Conversation-Models</title>
<link href="/2018/09/09/paper-note-A-Diversity-Promoting-Objective-Function-for-Neural-Conversation-Models/"/>
<url>/2018/09/09/paper-note-A-Diversity-Promoting-Objective-Function-for-Neural-Conversation-Models/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><!-- 论文基本信息:方便查阅和追踪 --><!-- 论文基本信息的获取:从paperweekly首页上方搜索论文;若未检索到,点击推荐论文输入论文名即可自动获取信息--><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:A Diversity-Promoting Objective Function for Neural Conversation Models</p><!-- Ex: 1. 论文名:Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs. --></li><li><p>论文链接:<a href="https://arxiv.org/pdf/1510.03055.pdf">https://arxiv.org/pdf/1510.03055.pdf</a></p><!-- Ex: 2. https://arxiv.org/abs/1606.01541 --></li><li><p>论文源码:</p><ul><li>None</li></ul></li></ol><!-- - https://github.com/liuyuemaicha/Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow - https://github.com/agsarthak/Goal-oriented-Dialogue-Systems--><ol><li><p>关于作者:</p><!-- 建议从google schoolar获取详细信息 - first_author: position, times_cited--><ul><li>Jiwei Li:斯坦福大学博士毕业生,截至发稿被引次数:2156</li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。</li></ul></li></ol><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><p>文章提出使用最大互信息(Maximum Mutual Information MMI)代替原始的最大似然(Maximum Likelihood)作为目标函数,目的是使用互信息减小“I don’t Know”这类无聊响应的生成概率。</p><h2 id="一种促进神经对话模型多样性的目标函数"><a href="#一种促进神经对话模型多样性的目标函数" class="headerlink" title="一种促进神经对话模型多样性的目标函数"></a>一种促进神经对话模型多样性的目标函数</h2><h3 id="预备知识"><a href="#预备知识" class="headerlink" title="预备知识"></a>预备知识</h3><ul><li>Seq2Seq模型:<br> <img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3ej4gjdpj20cu09a41n.jpg" alt=""></li></ul><h3 id="论文写作动机"><a href="#论文写作动机" class="headerlink" title="论文写作动机"></a>论文写作动机</h3><p>越来越多的研究者开始探索数据驱动的对话生成方法。主要分为三派:</p><ul><li>基于短语的统计方法(Ritter 2011): 传统的基于短语的翻译系统通过将源句分成多个块,然后逐句翻译来完成任务.</li><li>神经网络方法</li><li>Seq2Seq模型(Sordoni 2015)</li></ul><p>Seq2Seq神经网络模型生成的回复往往十分保守。(I don’t know)</p><h3 id="问题的解决思路"><a href="#问题的解决思路" class="headerlink" title="问题的解决思路"></a>问题的解决思路</h3><h4 id="最大互信息模型"><a href="#最大互信息模型" class="headerlink" title="最大互信息模型"></a>最大互信息模型</h4><ol><li><p>符号表示</p><ul><li>$S={s_1, s<em>2, …, S</em>{N_s}}$: 输入句子序列</li><li>$T={t_1, t<em>2, …, t</em>{N_s}, EOS}$: 目标句子序列,其中$EOS$表示句子结束。</li></ul></li><li><p>MMI评判标准</p><ol><li>MMI-antiLM:<br>对标准的目标函数:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3aq0qijej209x02i747.jpg" alt=""><br>进行了改进:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3da2abhvj20az01zglj.jpg" alt=""><br>在原始目标函数基础上添加了目标序列本身的概率$logp(T)$,$p(T)$就是一句话存在的概率,也就是一个模型,前面的lambda是惩罚因子,越大说明对语言模型惩罚力度越大。由于这里用的是减号,所以相当于在原本的目标上减去语言模型的概率,也就降低了“I don’t know”这类高频句子的出现概率。</li><li>MMI-bidi:<br>在标准的目标函数基础上添加$logp(S|T)$,也就是T的基础上产生S的概率,而且可以通过改变lambda的大小衡量二者的重要性。后者可以表示在响应输入模型时产生输入的概率,自然像“I don’t know”这种答案的概率会比较低,而这里使用的是相加,所以会降低这种相应的概率。<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3b3rrm7wj20ex04naad.jpg" alt=""></li></ol></li></ol><h4 id="MMI-antiLM"><a href="#MMI-antiLM" class="headerlink" title="MMI-antiLM"></a>MMI-antiLM</h4><p>如上所说,MMI-antiLM模型使用第一个目标函数,引入了$logp(T)$,如果lambda取值不合适可能会导致产生的响应不符合语言模型,所以在实际使用过程中会对其进行修正。由于解码过程中往往第一个单词或者前面几个单词是根据encode向量选择的,后面的单词更倾向于根据前面decode的单词和语言模型选择,而encode的信息影响较小。也就是说我们只需要对前面几个单词进行惩罚,后面的单词直接根据语言模型选择即可,这样就不会使整个句子不符合语言模型了。使用下式中的$U(T)$代替$p(T)$,式中$g(k)$表示要惩罚的句子长度:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3btiaol9j20dd08g0ta.jpg" alt=""><br>此外,我们还想要加入响应句子的长度这个因素,也作为模型相应的依据,所以将上面的目标函数修正为下式:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3bu955smj209q01la9y.jpg" alt=""></p><h4 id="MMI-bidi"><a href="#MMI-bidi" class="headerlink" 
title="MMI-bidi"></a>MMI-bidi</h4><p>MMI-bidi模型引入了$p(S|T)$项,这就需要先计算出完整的T序列再将其传入一个提前训练好的反向seq2seq模型中计算该项的值。但是考虑到S序列会产生无数个可能的T序列,我们不可能将每一个T都进行计算,所以这里引入beam-search只计算前200个序列T来代替。然后再计算两项和,进行得分重排。论文中也提到了这么做的缺点,比如最终的效果会依赖于选择的前N个序列的效果等等,但是实际的效果还是可以的。</p><h3 id="实验设计"><a href="#实验设计" class="headerlink" title="实验设计"></a>实验设计</h3><h4 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h4><ol><li>Twitter Conversation Triple Dataset: 包含2300万个对话片段。</li><li>OpenSubtitiles Dataset</li></ol><h4 id="对比实验方法:"><a href="#对比实验方法:" class="headerlink" title="对比实验方法:"></a>对比实验方法:</h4><ol><li>SEQ2SEQ</li><li>SEQ2SEQ(greedy)</li><li>SMT(statistical machine translation): 2011</li><li>SMT + neural reranking: 2015</li></ol><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><ol><li>BLEU</li><li>distinct-1</li><li>distinct-2</li></ol><h3 id="实验结果分析"><a href="#实验结果分析" class="headerlink" title="实验结果分析"></a>实验结果分析</h3><h4 id="实验结果评价"><a href="#实验结果评价" class="headerlink" title="实验结果评价"></a>实验结果评价</h4><p>最终在Twitter和OpenSubtitle两个数据集上面进行测试,效果展示BLEU得分都比标准的seq2seq模型要好。</p><ul><li>Twitter<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3coi0jx5j20qk06nq4x.jpg" alt=""></li><li>OpenSubtitle<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fv3cp1j5taj20cs05tgmj.jpg" alt=""></li></ul><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>一般的seq2seq模型,倾向于生成安全、普适的响应,因为这种响应更符合语法规则,在训练集中出现频率也较高,最终生成的概率也最大,而有意义的响应生成概率往往比他们小。通过MMI来计算输入输出之间的依赖性和相关性,可以减少模型对他们的生成概率。</p><!-- ### 批注版论文> 1. 黄色表示研究领域的问题> 2. 紫色表示论文叙述内容的重点> 3. 绿色表示该论文的解决思路> 4. 蓝色表示该论文的公式以及定义 --><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ol><li><a href="http://paperweek.ly/">http://paperweek.ly/</a></li><li><a href="https://scholar.google.com/">https://scholar.google.com/</a></li></ol>]]></content>
<categories>
<category> Paper </category>
</categories>
<tags>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>demo驱动学习:Image_Caption</title>
<link href="/2018/08/26/Image-Caption-demo-by-tensorflow/"/>
<url>/2018/08/26/Image-Caption-demo-by-tensorflow/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Introduction-to-demo"><a href="#Introduction-to-demo" class="headerlink" title="Introduction to demo"></a>Introduction to demo</h2><p>Source Code:<a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/image_captioning_with_attention.ipynb">image_captioning_with_attention</a></p><h3 id="Related-Papers"><a href="#Related-Papers" class="headerlink" title="Related Papers"></a>Related Papers</h3><p><a href="https://arxiv.org/pdf/1502.03044.pdf">Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.</a></p><h3 id="Goal-of-this-end2end-model"><a href="#Goal-of-this-end2end-model" class="headerlink" title="Goal of this end2end model"></a>Goal of this end2end model</h3><ol><li>Generate a caption, such as “a surfer riding on a wave”, according to an image.<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fun2xxvjt8j20hs0buamq.jpg" alt=""></li><li>Use an attention based model that enables us to see which parts of the image the model focuses on as it generates a caption.<br><img src="http://ww1.sinaimg.cn/mw690/ca26ff18ly1fun2yatwwhj20zz0ehk1c.jpg" alt=""></li></ol><h3 id="Dateset"><a href="#Dateset" class="headerlink" title="Dateset"></a>Dateset</h3><p><strong>MS-COCO</strong>:This dataset contains >82,000 images, each of which has been annotated with at least 5 different captions.</p><h2 id="Frame-work-of-demo"><a href="#Frame-work-of-demo" class="headerlink" title="Frame work of demo:"></a>Frame work of demo:</h2><ol><li>Download and prepare the MS-COCO dataset</li><li>Limit the size of the training set for faster training</li><li><p>Preprocess the images using InceptionV3: extract features from the last convolutional layer.</p><ol><li>Initialize InceptionV3 and load the pretrained Imagenet weights</li><li>Caching the features extracted from InceptionV3</li></ol></li><li><p>Preprocess and tokenize the captions</p><ol><li>First, tokenize the captions will give us a vocabulary of all the unique words in the data (e.g., “surfing”, “football”, etc).</li><li>Next, limit the vocabulary size to the top 5,000 words to save memory. 
We’ll replace all other words with the token “UNK” (for unknown).</li><li>Finally, we create a word –> index mapping and vice-versa.</li><li>We will then pad all sequences to the be same length as the longest one.</li></ol></li><li><p>create a tf.data dataset to use for training our model.</p></li></ol><ol><li><p>Model</p><ol><li>extract the features from the lower convolutional layer of InceptionV3 giving us a vector of shape (8, 8, 2048).</li><li>This vector is then passed through the CNN Encoder(which consists of a single Fully connected layer).</li><li>The RNN(here GRU) attends over the image to predict the next word.</li></ol></li><li><p>Training</p><ol><li>We extract the features stored in the respective .npy files and then pass those features through the encoder.</li><li>The encoder output, hidden state(initialized to 0) and the decoder input (which is the start token) is passed to the decoder.</li><li>The decoder returns the predictions and the decoder hidden state.</li><li>The decoder hidden state is then passed back into the model and the predictions are used to calculate the loss.</li><li>Use teacher forcing to decide the next input to the decoder.</li><li>Teacher forcing is the technique where the target word is passed as the next input to the decoder.</li><li>The final step is to calculate the gradients and apply it to the optimizer and backpropagate.</li></ol></li><li><p>Caption</p><ol><li>The evaluate function is similar to the training loop, except we don’t use teacher forcing here. The input to the decoder at each time step is its previous predictions along with the hidden state and the encoder output.</li><li>Stop predicting when the model predicts the end token.</li><li>And store the attention weights for every time step.</li></ol></li></ol><h2 id="Problems-undesirable"><a href="#Problems-undesirable" class="headerlink" title="Problems undesirable"></a>Problems undesirable</h2><h3 id="Version"><a href="#Version" class="headerlink" title="Version"></a>Version</h3><ul><li>The code requires TensorFlow version <strong>>=1.9</strong>. 1.10.0 is better.</li><li><code>cudatoolkit</code></li></ul><h3 id="GPU-lose-connect"><a href="#GPU-lose-connect" class="headerlink" title="GPU lose connect"></a>GPU lose connect</h3><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><ol><li><a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/image_captioning_with_attention.ipynb">https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/generative_examples/image_captioning_with_attention.ipynb</a></li></ol>]]></content>
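A rough sketch of the caption-preprocessing steps listed above (top-5,000 vocabulary, an out-of-vocabulary token playing the role of "UNK", and post-padding to the longest caption), using the tf.keras text utilities the notebook relies on. `train_captions` is a tiny placeholder list, and the punctuation filter string is only an assumption.

```python
import tensorflow as tf

train_captions = ["<start> a surfer riding on a wave <end>",
                  "<start> a man riding a wave on a surfboard <end>"]  # placeholder data

top_k = 5000
tokenizer = tf.keras.preprocessing.text.Tokenizer(
    num_words=top_k,                          # keep only the most frequent words
    oov_token="<unk>",                        # everything else maps to this token
    filters='!"#$%&()*+.,-/:;=?@[\\]^_`{|}~ ')
tokenizer.fit_on_texts(train_captions)        # builds the word -> index mapping

seqs = tokenizer.texts_to_sequences(train_captions)
cap_vector = tf.keras.preprocessing.sequence.pad_sequences(seqs, padding='post')
print(cap_vector.shape)                       # (num_captions, max_caption_length)
```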
<categories>
<category> Demo </category>
</categories>
<tags>
<tag> imageCaption </tag>
</tags>
</entry>
<entry>
<title>Visual-Question-Learning(VQA)学习笔记</title>
<link href="/2018/08/23/Visual-Question-Learning/"/>
<url>/2018/08/23/Visual-Question-Learning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><!-- 论文基本信息:方便查阅和追踪 --><!-- 论文基本信息的获取:1. 直接从论文pdf中获取2. 从paperweekly首页上方搜索论文;若未检索到,点击推荐论文输入论文名即可自动获取信息--><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="论文基本信息"></a>论文基本信息</h2><ol><li><p>论文名:Visual Question Answering: Datasets, Algorithms, and Future Challenges</p></li><li><p>论文链接:<a href="https://arxiv.org/pdf/1610.01465.pdf">https://arxiv.org/pdf/1610.01465.pdf</a></p><!-- Ex: https://arxiv.org/abs/1606.01541 --></li><li><p>论文源码</p><ul><li>None</li></ul></li><li><p>关于作者</p><ul><li>Kushal Kafle</li><li>Christopher Kanan</li></ul></li><li><p>关于笔记作者:</p><ul><li>朱正源,北京邮电大学研究生,研究方向为多模态与认知计算。 </li></ul></li></ol><h2 id="论文推荐理由"><a href="#论文推荐理由" class="headerlink" title="论文推荐理由"></a>论文推荐理由</h2><!-- Ex: 论文摘要的中文翻译最近对话生成的神经模型为会话代理生成响应提供了很大的希望,但往往是短视的,一次预测一个话语而忽略它们对未来结果的影响。对未来的对话方向进行建模对于产生连贯,有趣的对话至关重要,这种对话需要传统的NLP对话模式借鉴强化学习。在本文中,我们将展示如何整合这些目标,应用深度强化学习来模拟聊天机器人对话中的未来奖励。该模型模拟两个虚拟代理之间的对话,使用策略梯度方法来奖励显示三个有用会话属性的序列:信息性,连贯性和易于回答(与前瞻性功能相关)。我们在多样性,长度以及人类评判方面评估我们的模型,表明所提出的算法产生了更多的交互式响应,并设法在对话模拟中促进更持久的对话。这项工作标志着基于对话的长期成功学习神经对话模型的第一步。 --><p> 视觉问答(Visual Question answering, VQA)是近年来计算机视觉和自然语言处理领域的一个热点问题。在VQA中,一个算法需要回答关于图像的基于文本的问题。自2014年发布第一个VQA数据集以来,已经发布了更多的数据集,并提出了许多算法。在这篇综述中,我们从问题的形成、现有数据集、评估指标和算法的角度,批判性地研究了VQA的当前状态。特别地,我们讨论了当前数据集在正确训练和评估VQA算法方面的局限性。然后,我们详尽地回顾了VQA的现有算法。最后,我们讨论了未来VQA和图像理解研究的可能方向。</p><h2 id="视觉问答:数据集,算法和未来的挑战"><a href="#视觉问答:数据集,算法和未来的挑战" class="headerlink" title="视觉问答:数据集,算法和未来的挑战"></a>视觉问答:数据集,算法和未来的挑战</h2><!-- Ex: ## 强化学习在对话生成领域的应用 --><h3 id="引言"><a href="#引言" class="headerlink" title="引言"></a>引言</h3><h4 id="VQA的研究价值"><a href="#VQA的研究价值" class="headerlink" title="VQA的研究价值"></a>VQA的研究价值</h4><ol><li><p>大部分计算机视觉任务不能完整的理解图像<br>图像分类、物体检测、动作识别等任务很难获取到物体的<strong>空间位置信息</strong>并且根据它们的属性和关系进行<strong>推理</strong>。</p></li><li><p>人类对<strong>Grand Unified Theory</strong>的痴迷追求</p><ul><li>目标识别任务:图像里面有什么?[分类]</li><li>目标检测任务:图像里面有猫吗?[拉框]</li><li>属性分类任务:图像里面的猫是什么颜色的?</li><li>场景分类:图像是在室内吗?</li><li>计数任务:图像里面有多少猫?<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvah1uwsvij208c04ogn4.jpg" alt=""></li></ul></li><li><p>通过视觉图灵测试:</p><ul><li>基准问题测试</li><li>建立评价指标 </li></ul></li></ol><h3 id="VQA的数据集"><a href="#VQA的数据集" class="headerlink" title="VQA的数据集"></a>VQA的数据集</h3><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvajh4e1wbj20tz09eacb.jpg" alt=""></p><h3 id="VQA的评价标准"><a href="#VQA的评价标准" class="headerlink" title="VQA的评价标准"></a>VQA的评价标准</h3><ul><li>Open-ended(OE): 开放式的</li><li>Multiple Choice(MC): 选择式的</li></ul><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvb6r1njlkj20kp0dmjuj.jpg" alt=""></p><h4 id="流行的评价标准"><a href="#流行的评价标准" class="headerlink" title="流行的评价标准"></a>流行的评价标准</h4><p>选择式任务的评价标准直接使用正确率即可。但是开放式任务的评价标准呢?</p><ol><li>Simple accuracy:<ol><li>Q: What animals are in the photo<br>若<code>dogs</code>是正确答案,那么<code>dog</code>和<code>zebra</code>的惩罚竟然是一样的</li><li>Q: What is in the tree<br>若<code>bald eagle</code>是正确答案,<code>eagle</code>或是<code>bird</code> 与 <code>yes</code>的惩罚竟然也是一样的</li></ol></li><li><p>Wu-Palmer Similarity</p><ol><li>语义相似度<br><code>Black</code>、<code>White</code>两个单词的<code>WUPS score</code>是0.91。所以这可能会给错误答案一个相当高的分数。</li><li>只可以评价单词,句子不可使用</li></ol></li><li><p>$Accuracy_{VQA}=min(\frac{n}{3}, 1)$<br>同样是语义相似度,大致正确就ok: 人为构造一个答案集合,$n$是算法和人类拥有的相同的答案数量。</p></li></ol><h3 id="VQA的算法"><a href="#VQA的算法" class="headerlink" title="VQA的算法"></a>VQA的算法</h3><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvbaa7e7d4j20qr098n45.jpg" 
alt=""><br>存在的算法大致结构均包括:</p><ol><li>提取图像特征</li><li>提取问题特征</li><li>利用特征产生结果的算法</li></ol><h4 id="Baseline和模型性能"><a href="#Baseline和模型性能" class="headerlink" title="Baseline和模型性能"></a>Baseline和模型性能</h4><ol><li>瞎猜最有可能的答案。“yes”/“no”</li><li>MLP(multi-layer percepton)</li></ol><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvbapjz9zkj20ky0lradq.jpg" alt=""></p><h4 id="模型架构一览"><a href="#模型架构一览" class="headerlink" title="模型架构一览"></a>模型架构一览</h4><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvbasmehclj20qg0l278r.jpg" alt=""></p><ol><li>基于贝叶斯和问题导向的模型</li><li>基于注意力机制的模型<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvbb0ttq7cj20r60dj7bs.jpg" alt=""></li><li>非线性池化方法</li></ol><ul><li>MULTI-WORLD: A multi-world approach to question answering about real- world scenes based on uncertain input, NIPS2014</li><li>ASK-NEURon: Ask your neurons: A neural-based ap- proach to answering questions about images, ICCV2015</li><li>ENSEMSBLE: Exploring models and data for image question answering, NIPS2015</li><li>LSTM Q+I: VQA: Visual question answering, ICCV2015</li><li>iBOWIMG: Simple baseline for visual question answering, arxiv</li><li>DPPNET: Image question answering using convolutional neural network with dynamic parameter prediction, CVPR2016</li><li>SMem: Ask, attend and answer: Exploring question-guided spatial attention for visual question answering, ECCV2016</li><li>SAN: Stacked attention networks for image question answering, CVPR2016</li><li>NMN: Deep compositional question answering with neural module networks, CVPR2016</li><li>FDA: A focused dynamic attention model for visual question answering, arxiv2016</li><li>HYBRID: Answer-type prediction for visual question answering, CVPR2016</li><li>DMN+: Dynamic memory networks for visual and textual question answering, ICML2016</li><li>MRN: Multimodal residual learning for visual qa, NIPS2016</li><li>HieCoAtten: Hierarchical question-image co-attention for visual question answering, NIPS2016</li><li>RAU_ResNet: Training recurrent answering units with joint loss minimization for VQA, arxiv2016</li><li>DAN: Dual attention networks for multimodal reasoning and matching, arxiv2016</li><li>MCB+Att: Multi-modal compact bilinear pooling for visual question answering and visual grounding, EMNLP2016</li><li>MLB: Hadamard product for low-rank bilinear pooling, arxiv2016</li><li>AMA: Ask me anything: Free-form visual question answering based on knowledge from external sources, CVPR2016</li><li>MCB-ensemble: Multi-modal compact bilinear pooling for visual question answering and visual grounding, EMNLP2016</li></ul><h3 id="VQA仍然存在很多问题"><a href="#VQA仍然存在很多问题" class="headerlink" title="VQA仍然存在很多问题"></a>VQA仍然存在很多问题</h3><p>虽然VQA已经取得了长足的进步,但是现有的算法仍然距离人类有巨大的差距。<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fvbbxjxinej20n10gvmzr.jpg" alt=""></p><p>现有问题有:</p><ol><li>现有的VQA系统太依赖于问题而不是图片内容,并且语言的偏差会严重影响VQA系统性能。<ol><li>只需要问题或者图片就能猜出来答案,甚至一个差的数据集(通常包含具有偏差的问题)会降低VQA系统的性能。也即越具体的问题越好![do->play->sport play]</li></ol></li><li>算法性能的提升是否真的来自于注意力机制?<ol><li>通过多全局图片特征(预训练的VGG-19,ResNet-101)也能达到很好的效果。</li><li>注意力机制有时候会误导VQA系统。</li></ol></li></ol><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>可以回答任意关于图片的问题的算法将会是人工智能的里程碑。</p><h4 id="研究方向潜力股"><a href="#研究方向潜力股" class="headerlink" title="研究方向潜力股"></a>研究方向潜力股</h4><ol><li>更<strong>大</strong>更<strong>无偏</strong>更<strong>丰富</strong>的数据集:每个问题权重不应该一样;问题的质量应该更高;答案不应该是二元的;多选题应当被淘汰</li><li>更加巧妙地模型评估方式</li><li>重点:可以对图片内容进行<strong>推理</strong>的算法!<ol><li>常识推理。</li><li>空间位置。</li><li>根据不同粒度回复问题。</li></ol></li></ol><!-- 
TODO: ### 批注版论文 > 1. 黄色表示研究领域的问题> 2. 紫色表示论文叙述内容的重点> 3. 绿色表示该论文的解决思路> 4. 蓝色表示该论文的公式以及定义 --><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><!--Ex:1. https://www.paperweekly.site/papers/notes/2212. https://scholar.google.com/-->]]></content>
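To make the consensus metric discussed above concrete, here is a tiny sketch of $Accuracy_{VQA}=min(\frac{n}{3}, 1)$: an answer gets full credit when at least three of the ten human annotators gave exactly that answer, and partial credit otherwise. (The official evaluation additionally averages over subsets of annotators; this sketch only shows the basic formula quoted in the note.)

```python
def vqa_accuracy(predicted, human_answers):
    """min(n/3, 1), where n is how many annotators gave exactly this answer."""
    n = sum(1 for ans in human_answers if ans == predicted)
    return min(n / 3.0, 1.0)

# 10 annotators, 2 of whom answered "eagle"
answers = ["bird"] * 7 + ["eagle"] * 2 + ["bald eagle"]
print(vqa_accuracy("eagle", answers))   # 0.666...
print(vqa_accuracy("bird", answers))    # 1.0
```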
<categories>
<category> VQA </category>
</categories>
<tags>
<tag> summarize </tag>
</tags>
</entry>
<entry>
<title>深度学习模块文档备忘录</title>
<link href="/2018/08/23/colab-tensorflow-usage/"/>
<url>/2018/08/23/colab-tensorflow-usage/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Colab-study-notes"><a href="#Colab-study-notes" class="headerlink" title="Colab study notes"></a>Colab study notes</h2><h3 id="Install-commonly-used-packages"><a href="#Install-commonly-used-packages" class="headerlink" title="Install commonly used packages"></a>Install commonly used packages</h3><p>Although Colab has already installed some packages such as Tensorflow Matplotlib .etc, there are lots of commonly ised packages:</p><ul><li>Keras:<code>pip install keras</code></li><li>OpenCV:<code>!apt-get -qq install -y libsm6 libxext6 && pip install -q -U opencv-python</code></li><li>Pytorch:<code>!pip install -q http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl torchvision</code></li><li>tqdm:<code>!pip install tqdm</code><h3 id="Authorized-to-log-in"><a href="#Authorized-to-log-in" class="headerlink" title="Authorized to log in"></a>Authorized to log in</h3><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 安装 PyDrive 操作库,该操作每个 notebook 只需要执行一次</span></span><br><span class="line">!pip install -U -q PyDrive</span><br><span class="line"><span class="keyword">from</span> pydrive.auth <span class="keyword">import</span> GoogleAuth</span><br><span class="line"><span class="keyword">from</span> pydrive.drive <span class="keyword">import</span> GoogleDrive</span><br><span class="line"><span class="keyword">from</span> google.colab <span class="keyword">import</span> auth</span><br><span class="line"><span class="keyword">from</span> oauth2client.client <span class="keyword">import</span> GoogleCredentials</span><br><span class="line"></span><br><span class="line"><span class="comment"># 授权登录,仅第一次的时候会鉴权</span></span><br><span class="line">auth.authenticate_user()</span><br><span class="line">gauth = GoogleAuth()</span><br><span class="line">gauth.credentials = GoogleCredentials.get_application_default()</span><br><span class="line">drive = GoogleDrive(gauth)</span><br></pre></td></tr></table></figure><h3 id="File-IO"><a href="#File-IO" class="headerlink" title="File IO"></a>File IO</h3><h4 id="Read-file-from-Google-Drive"><a href="#Read-file-from-Google-Drive" class="headerlink" title="Read file from Google Drive"></a>Read file from Google Drive</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Get the file by id</span></span><br><span class="line">downloaded = drive.CreateFile({<span class="string">'id'</span>:<span class="string">'yourfileID'</span>}) <span class="comment"># replace the id with id of file you want to access</span></span><br><span class="line"><span class="comment"># Download file to colab</span></span><br><span class="line">downloaded.GetContentFile(<span class="string">'yourfileName'</span>) 
</span><br><span class="line"><span class="comment"># Read file as panda dataframe</span></span><br><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line">xyz = pd.read_csv(<span class="string">'yourfileName'</span>)</span><br></pre></td></tr></table></figure><h4 id="Write-file-to-Google-Drive"><a href="#Write-file-to-Google-Drive" class="headerlink" title="Write file to Google Drive"></a>Write file to Google Drive</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Create a Content file as Cache</span></span><br><span class="line">xyz.to_csv(<span class="string">'over.csv'</span>)</span><br><span class="line"><span class="comment"># Create & upload a text file.</span></span><br><span class="line">uploaded = drive.CreateFile({<span class="string">'title'</span>: <span class="string">'OK.csv'</span>})</span><br><span class="line"><span class="comment"># You will have a file named 'OK.csv' which has content of 'over.csv'</span></span><br><span class="line">uploaded.SetContentFile(<span class="string">'over.csv'</span>)</span><br><span class="line">uploaded.Upload()</span><br><span class="line"><span class="comment"># checkout your upload file's ID</span></span><br><span class="line"><span class="built_in">print</span>(<span class="string">'Uploaded file with ID {}'</span>.<span class="built_in">format</span>(uploaded.get(<span class="string">'id'</span>)))</span><br></pre></td></tr></table></figure></li></ul><h2 id="Tensorflow-commonly-used"><a href="#Tensorflow-commonly-used" class="headerlink" title="Tensorflow commonly used"></a>Tensorflow commonly used</h2><h3 id="tf"><a href="#tf" class="headerlink" title="tf"></a>tf</h3><h4 id="cast"><a href="#cast" class="headerlink" title="cast"></a>cast</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># cast a tensor[x] to a new type[dtype]</span></span><br><span class="line">tf.cast(</span><br><span class="line"> x,</span><br><span class="line"> dtype,</span><br><span class="line"> name=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><h4 id="expand-dims"><a href="#expand-dims" class="headerlink" title="expand_dims"></a>expand_dims</h4><p>Inserts a dimension of 1 into a tensor’s shape.<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line">tf.expand_dims(</span><br><span class="line"> <span 
class="built_in">input</span>,</span><br><span class="line"> axis=<span class="literal">None</span></span><br><span class="line">)</span><br><span class="line"><span class="comment"># 't' is a tensor of shape [2]</span></span><br><span class="line">tf.shape(tf.expand_dims(t, <span class="number">0</span>)) <span class="comment"># [1, 2]</span></span><br><span class="line">tf.shape(tf.expand_dims(t, <span class="number">1</span>)) <span class="comment"># [2, 1]</span></span><br><span class="line">tf.shape(tf.expand_dims(t, -<span class="number">1</span>)) <span class="comment"># [2, 1]</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 't2' is a tensor of shape [2, 3, 5]</span></span><br><span class="line">tf.shape(tf.expand_dims(t2, <span class="number">0</span>)) <span class="comment"># [1, 2, 3, 5]</span></span><br><span class="line">tf.shape(tf.expand_dims(t2, <span class="number">2</span>)) <span class="comment"># [2, 3, 1, 5]</span></span><br><span class="line">tf.shape(tf.expand_dims(t2, <span class="number">3</span>)) <span class="comment"># [2, 3, 5, 1]</span></span><br></pre></td></tr></table></figure></p><h4 id="read-file"><a href="#read-file" class="headerlink" title="read_file"></a>read_file</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">tf.read_file(</span><br><span class="line"> filename,</span><br><span class="line"> name=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><h4 id="device"><a href="#device" class="headerlink" title="device"></a>device</h4><ol><li>manual mode<ul><li><code>with tf.device('/cpu:0')</code>: cpu</li><li><code>with tf.device('/gpu:0')</code>or<code>with tf.device('/device:GPU:0')</code> </li></ul></li><li>GPU config<ul><li><code>import os</code></li><li><code>os.environ['CUDA_VISIBLE_DEVICES']='0, 1'</code></li></ul></li></ol><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">tf.device(device_name_or_function)</span><br><span class="line"></span><br><span class="line"><span class="keyword">with</span> tf.device(<span class="string">'/cpu:0'</span>):</span><br><span class="line"><span class="keyword">with</span> tf.device(<span class="string">'/gpu:0'</span>):</span><br></pre></td></tr></table></figure><h4 id="random-normal"><a href="#random-normal" class="headerlink" title="random_normal"></a>random_normal</h4><p>Outputs random values from a normal distribution.<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">tf.random_normal(</span><br><span class="line"> shape,</span><br><span class="line"> mean=<span class="number">0.0</span>,</span><br><span class="line"> stddev=<span class="number">1.0</span>,</span><br><span class="line"> dtype=tf.float32,</span><br><span class="line"> seed=<span class="literal">None</span>,</span><br><span 
class="line"> name=<span class="literal">None</span></span><br><span class="line">)</span><br><span class="line">tf.random_normal((<span class="number">100</span>, <span class="number">100</span>, <span class="number">100</span>, <span class="number">3</span>))</span><br></pre></td></tr></table></figure></p><h4 id="ConfigProto"><a href="#ConfigProto" class="headerlink" title="ConfigProto"></a>ConfigProto</h4><p>allowing GPU memory growth by the process.<br><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">config = tf.ConfigProto()</span><br><span class="line">config.gpu_options.allow_growth = <span class="literal">True</span></span><br><span class="line">sess = tf.Session(config=config)</span><br></pre></td></tr></table></figure></p><h4 id="reduce-sum-reduce-mean"><a href="#reduce-sum-reduce-mean" class="headerlink" title="reduce_sum/reduce_mean"></a>reduce_sum/reduce_mean</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">tf.reduce_sum(</span><br><span class="line"> input_tensor,</span><br><span class="line"> axis=<span class="literal">None</span>,</span><br><span class="line"> keepdims=<span class="literal">None</span>,</span><br><span class="line"> name=<span class="literal">None</span>,</span><br><span class="line"> reduction_indices=<span class="literal">None</span>,</span><br><span class="line"> keep_dims=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><p>Returns: The reduced tensor</p><h3 id="tf-app"><a href="#tf-app" class="headerlink" title="tf.app"></a>tf.app</h3><p>Generic entry point</p><h4 id="flag-module"><a href="#flag-module" class="headerlink" title="flag module"></a><code>flag</code> module</h4><p>process command line parameters. 
Just like <code>argparse</code></p><h4 id="run"><a href="#run" class="headerlink" title="run(...)"></a><code>run(...)</code></h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># run program with an optional 'main' function and 'argv' list</span></span><br><span class="line">tf.app.run(</span><br><span class="line"> main=<span class="literal">None</span>,</span><br><span class="line"> argv=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><h3 id="tf-contrib"><a href="#tf-contrib" class="headerlink" title="tf.contrib"></a>tf.contrib</h3><h4 id="eager"><a href="#eager" class="headerlink" title="eager"></a>eager</h4><ul><li>Saver: A tf.train.Saver adapter for use when eager execution is enabled.</li></ul><h3 id="tf-data"><a href="#tf-data" class="headerlink" title="tf.data"></a>tf.data</h3><h4 id="Dataset"><a href="#Dataset" class="headerlink" title="Dataset"></a>Dataset</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># usage example</span></span><br><span class="line">tf.data.Dataset.from_tensor_slices(encode_train).<span class="built_in">map</span>(load_image).batch(<span class="number">16</span>)</span><br></pre></td></tr></table></figure><ul><li>from_tensor_slices(tensors): Creates a Dataset whose elements are slices of the given tensors. Returns: A dataset</li><li>map(map_func,num_parallel_calls=None) </li><li>batch(batch_size,drop_remainder=False)</li><li>prefetch(buffersize): Creates a Dataset that prefetches elements from this dataset.</li></ul><h3 id="tf-image"><a href="#tf-image" class="headerlink" title="tf.image"></a>tf.image</h3><h4 id="decode-jpeg"><a href="#decode-jpeg" class="headerlink" title="decode_jpeg"></a>decode_jpeg</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">tf.image.decode_jpeg(</span><br><span class="line"> contents,</span><br><span class="line"> channels=<span class="number">0</span>, <span class="comment"># 3: output an RGB image.</span></span><br><span class="line"> ratio=<span class="number">1</span>,</span><br><span class="line"> fancy_upscaling=<span class="literal">True</span>,</span><br><span class="line"> try_recover_truncated=<span class="literal">False</span>,</span><br><span class="line"> acceptable_fraction=<span class="number">1</span>,</span><br><span class="line"> dct_method=<span class="string">''</span>,</span><br><span class="line"> name=<span class="literal">None</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><h4 id="resize-images"><a href="#resize-images" class="headerlink" title="resize_images"></a>resize_images</h4><h3 id="tf-layers"><a href="#tf-layers" class="headerlink" title="tf.layers"></a>tf.layers</h3><h4 id="conv2d"><a href="#conv2d" class="headerlink" 
title="conv2d"></a>conv2d</h4><figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line">tf.layers.conv2d(</span><br><span class="line"> inputs,</span><br><span class="line"> filters,</span><br><span class="line"> kernel_size,</span><br><span class="line"> strides=(<span class="number">1</span>, <span class="number">1</span>),</span><br><span class="line"> padding=<span class="string">'valid'</span>,</span><br><span class="line"> data_format=<span class="string">'channels_last'</span>,</span><br><span class="line"> dilation_rate=(<span class="number">1</span>, <span class="number">1</span>),</span><br><span class="line"> activation=<span class="literal">None</span>,</span><br><span class="line"> use_bias=<span class="literal">True</span>,</span><br><span class="line"> kernel_initializer=<span class="literal">None</span>,</span><br><span class="line"> bias_initializer=tf.zeros_initializer(),</span><br><span class="line"> kernel_regularizer=<span class="literal">None</span>,</span><br><span class="line"> bias_regularizer=<span class="literal">None</span>,</span><br><span class="line"> activity_regularizer=<span class="literal">None</span>,</span><br><span class="line"> kernel_constraint=<span class="literal">None</span>,</span><br><span class="line"> bias_constraint=<span class="literal">None</span>,</span><br><span class="line"> trainable=<span class="literal">True</span>,</span><br><span class="line"> name=<span class="literal">None</span>,</span><br><span class="line"> reuse=<span class="literal">None</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line">random_image_gpu = tf.random_normal((<span class="number">100</span>, <span class="number">100</span>, <span class="number">100</span>, <span class="number">3</span>))</span><br><span class="line">net_gpu = tf.layers.conv2d(random_image_gpu, <span class="number">32</span>, <span class="number">7</span>)</span><br></pre></td></tr></table></figure><p>Returns: Output tensor.</p><h3 id="tf-test"><a href="#tf-test" class="headerlink" title="tf.test"></a>tf.test</h3><ul><li>gpu_device_name(): Check out GPU whether can be found.</li></ul><h3 id="tf-train"><a href="#tf-train" class="headerlink" title="tf.train"></a>tf.train</h3><ul><li>Saver</li></ul><h2 id="scikit-learn-sklearn"><a href="#scikit-learn-sklearn" class="headerlink" title="scikit-learn(sklearn)"></a>scikit-learn(sklearn)</h2><h3 id="utils"><a href="#utils" class="headerlink" title="utils"></a>utils</h3><ul><li>shuffle(*array):Shuffle arrays or sparse matrices in a consistent way<h3 id="model-selection"><a href="#model-selection" class="headerlink" title="model_selection"></a>model_selection</h3></li><li>train_test_split(*array): Split 
arrays or matrices into random train and test subsets<ul><li>Parameters<ul><li>arrays_data</li><li>arrays_label</li><li>test_size</li><li>random_state</li></ul></li></ul></li></ul><h2 id="Keras"><a href="#Keras" class="headerlink" title="Keras"></a>Keras</h2><p>A high-API to build and train deep learning models.</p><h3 id="applications"><a href="#applications" class="headerlink" title="applications"></a>applications</h3><h4 id="inception-v3"><a href="#inception-v3" class="headerlink" title="inception_v3"></a>inception_v3</h4><ul><li>InceptionV3(…): Instantiates the Inception v3 architecture. <figure class="highlight py"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">tf.keras.applications.InceptionV3(</span><br><span class="line">include_top=<span class="literal">True</span>, <span class="comment"># whether to include the fully-connected layer at the top of the network.</span></span><br><span class="line">weights=<span class="string">'imagenet'</span>,</span><br><span class="line">input_tensor=<span class="literal">None</span>,</span><br><span class="line">input_shape=<span class="literal">None</span>,</span><br><span class="line">pooling=<span class="literal">None</span>,</span><br><span class="line">classes=<span class="number">1000</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure></li><li>decode_predictions(…): Decodes the prediction of an ImageNet model.</li><li>preprocess_input(…): Preprocesses a numpy array encoding a batch of images.</li></ul><h3 id="backend"><a href="#backend" class="headerlink" title="backend"></a>backend</h3><h3 id="layers"><a href="#layers" class="headerlink" title="layers"></a>layers</h3><ul><li>Dense: regular densely-connected NN layer<ul><li>Arguments:<ul><li>units:</li><li>input_shape: </li></ul></li></ul></li><li>GRU/CuDNNGRU<ul><li>Arguments:<ul><li>units: Positive integer, dimensionality of the output space.</li><li>return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence.</li><li>return_state: Boolean. Whether to return the last state in addition to the output.</li><li>recurrent_activation: Default is <code>hard sigmoid</code>.<code>sigmoid</code> is avaliable. <ul><li>hard sigmoid: a combination of sigmoid and relu</li></ul></li><li>recurrent_initializer: Defaul is <code>orthogonal</code>.</li></ul></li></ul></li></ul><h3 id="preprocessing"><a href="#preprocessing" class="headerlink" title="preprocessing"></a>preprocessing</h3><h4 id="image"><a href="#image" class="headerlink" title="image"></a>image</h4><h4 id="sequence"><a href="#sequence" class="headerlink" title="sequence"></a>sequence</h4><ul><li>pad_sequences:<ul><li>Arguments:<ul><li>sequences: List of lists, where each element is a sequence.</li><li>padding: String, ‘pre’ or ‘post’: pad either before or after each sequence.<h4 id="text"><a href="#text" class="headerlink" title="text"></a>text</h4></li></ul></li></ul></li><li>hashing_trick</li><li>one_hot</li><li>text_to_word_sequence</li><li>Tokenizer(vetorize a text corpus)<ul><li>Arguments:<ul><li>num_words: the maximum number of words to keep, based on <strong>word frequency</strong>. 
</li><li>oov_token: if given, it will be added to word_index and used to replace out-of-vocabulary words during text_to_sequence calls</li><li>filters: a string where each element is a character that will be filtered from the texts. </li></ul></li><li>Methods:<ul><li>fit_on_texts: Updates internal vocabulary based on a list of texts.</li><li>texts_to_sequences: Transforms each text in texts in a sequence of integers.</li><li></li></ul></li></ul></li></ul><h3 id="utils-1"><a href="#utils-1" class="headerlink" title="utils"></a>utils</h3><ul><li>get_file: Downloads a file from a URL if it not already in the cache.</li></ul><blockquote><p>Reference:</p><ol><li><a href="https://segmentfault.com/a/1190000012731724">https://segmentfault.com/a/1190000012731724</a></li><li><a href="https://tensorflow.google.cn/api_docs/">https://tensorflow.google.cn/api_docs/</a></li><li><a href="https://www.jianshu.com/p/d7283bc427b1">https://www.jianshu.com/p/d7283bc427b1</a></li><li><a href="http://scikit-learn.org/stable/modules">http://scikit-learn.org/stable/modules</a></li></ol></blockquote>]]></content>
<categories>
<category> DeepLearning </category>
</categories>
<tags>
<tag> colab </tag>
<tag> tensorflow </tag>
<tag> sklearn </tag>
<tag> Keras </tag>
</tags>
</entry>
<entry>
<title>Zero-shot Learning学习笔记</title>
<link href="/2018/08/22/zero-shot-learning/"/>
<url>/2018/08/22/zero-shot-learning/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="Graph-Convolutional-Networks"><a href="#Graph-Convolutional-Networks" class="headerlink" title="Graph Convolutional Networks"></a>Graph Convolutional Networks</h2><p><img src="http://tkipf.github.io/graph-convolutional-networks/images/gcn_web.png" alt="Multi-layer Graph Convolutional Network (GCN) with first-order filters."></p><h3 id="Problem"><a href="#Problem" class="headerlink" title="Problem"></a>Problem</h3><p>Generalizing well-stablished neural models like RNNs or CNNs to work on arbitrarily structured graphs is a challenging problem.</p><h2 id="Introduction"><a href="#Introduction" class="headerlink" title="Introduction"></a>Introduction</h2><p>Zero-shot Learning is a concept from Transfer-Learning. In traditional machine learning method, Generalization is difficult since big data and time-consuming training are needed in general. Therefore more and more researchers pay attention to <strong>Zero-shot Learning</strong>/<strong>One-shot Learning</strong>/<strong>Few-shot Learning</strong></p><h3 id="types-of-Learning"><a href="#types-of-Learning" class="headerlink" title="types of Learning"></a>types of Learning</h3><h4 id="Zero-shot-Learning"><a href="#Zero-shot-Learning" class="headerlink" title="Zero-shot Learning"></a>Zero-shot Learning</h4><p>A model can create a map $X\rightarrowY$ automatically for the categories which have not appeared in a training set.</p><h4 id="One-shot-Learning"><a href="#One-shot-Learning" class="headerlink" title="One-shot Learning"></a>One-shot Learning</h4><p>One-shot learning is an object categorization problem in computer vision. Whereas most machine learning based object categorization algorithms require training on hundreds or thousands of images and very large datasets, one-shot learning aims to learn information about object categories from one, or only a few, training images.</p><h4 id="Few-shot-Leaning"><a href="#Few-shot-Leaning" class="headerlink" title="Few-shot Leaning"></a>Few-shot Leaning</h4><h2 id="Papers"><a href="#Papers" class="headerlink" title="Papers"></a>Papers</h2><h3 id="DeVise-A-Deep-Visual-Semantic-Embedding-Model"><a href="#DeVise-A-Deep-Visual-Semantic-Embedding-Model" class="headerlink" title="DeVise: A Deep Visual-Semantic Embedding Model"></a>DeVise: A Deep Visual-Semantic Embedding Model</h3><h4 id="Core-idea"><a href="#Core-idea" class="headerlink" title="Core idea"></a>Core idea</h4><p>Combine <strong>feature vector</strong> from Computer Vision and <strong>semantic vector</strong> from NLP to realize zero-shot learning.</p><h3 id="Zero-shot-Learning-by-Convex-Combination-of-Semantic-Embeddings"><a href="#Zero-shot-Learning-by-Convex-Combination-of-Semantic-Embeddings" class="headerlink" title="Zero-shot Learning by Convex Combination of Semantic Embeddings"></a>Zero-shot Learning by Convex Combination of Semantic Embeddings</h3><h3 id="Objects2action-Classifying-and-localizing-actions-without-any-video-example"><a href="#Objects2action-Classifying-and-localizing-actions-without-any-video-example" class="headerlink" title="Objects2action: Classifying and localizing actions without any video example"></a>Objects2action: Classifying and localizing actions without any video example</h3><blockquote><p>Reference:</p><ol><li><a href="https://en.wikipedia.org/wiki/One-shot_learning">https://en.wikipedia.org/wiki/One-shot_learning</a></li><li><a 
href="https://blog.csdn.net/jningwei/article/details/79235019">https://blog.csdn.net/jningwei/article/details/79235019</a></li><li><a href="http://tkipf.github.io/graph-convolutional-networks/">http://tkipf.github.io/graph-convolutional-networks/</a></li></ol></blockquote>]]></content>
<categories>
<category> Zero-shot </category>
</categories>
<tags>
<tag> graphConvolutionalNetwork </tag>
</tags>
</entry>
<entry>
<title>Paper Notes: Deep Reinforcement Learning for Dialogue Generation</title>
<link href="/2018/08/18/paper-note-deep-reinfocement-learning-for-Dialogue-Generation/"/>
<url>/2018/08/18/paper-note-deep-reinfocement-learning-for-Dialogue-Generation/</url>
<content type="html"><![CDATA[<script src="/assets/js/APlayer.min.js"> </script><h2 id="论文基本信息"><a href="#论文基本信息" class="headerlink" title="Basic information"></a>Basic information</h2><ol><li>Title: Deep Reinforcement Learning for Dialogue Generation</li><li>Paper: <a href="https://arxiv.org/abs/1606.01541">https://arxiv.org/abs/1606.01541</a></li><li>Code:<ul><li><a href="https://github.com/liuyuemaicha/Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow">https://github.com/liuyuemaicha/Deep-Reinforcement-Learning-for-Dialogue-Generation-in-tensorflow</a></li><li><a href="https://github.com/agsarthak/Goal-oriented-Dialogue-Systems">https://github.com/agsarthak/Goal-oriented-Dialogue-Systems</a></li><li><a href="https://github.com/jiweil/Neural-Dialogue-Generation">https://github.com/jiweil/Neural-Dialogue-Generation</a></li></ul></li><li>About the authors:<ul><li>Jiwei Li: PhD graduate of Stanford University; citations at the time of writing: 2156</li><li>Will Monroe: PhD student at Stanford University; citations at the time of writing: 562</li><li>Alan Ritter: professor at The Ohio State University; citations at the time of writing: 4608</li><li>Michel Galley: senior researcher at Microsoft; citations at the time of writing: 4529</li><li>Jianfeng Gao: Microsoft Research, Redmond (headquarters); citations at the time of writing: 11944</li><li>Dan Jurafsky: professor at Stanford University; citations at the time of writing: 32973</li></ul></li><li>About the author of this note:<ul><li>Zhu Zhengyuan, graduate student at Beijing University of Posts and Telecommunications, working on multimodal and cognitive computing.</li></ul></li></ol><h2 id="论文推荐理由与摘要"><a href="#论文推荐理由与摘要" class="headerlink" title="Recommendation and abstract"></a>Recommendation and abstract</h2><p>Recent neural models of dialogue generation have been very helpful for generating responses for conversational agents, but their results tend to be short-sighted: predicting one utterance at a time ignores its influence on future outcomes. Modeling the future direction of a dialogue is crucial for generating coherent and interesting conversations, and doing so requires applying reinforcement learning on top of traditional NLP dialogue models. In this paper the authors show how to integrate these goals by applying deep reinforcement learning to model future reward in chatbot dialogue. The model simulates conversations between two virtual agents and uses policy gradient methods to reward sequences that exhibit three useful conversational properties: informativity, coherence, and ease of answering (related to a forward-looking function). The model is evaluated on diversity, length, and human judgments, showing that the proposed algorithm generates more interactive responses and manages to sustain longer conversations in dialogue simulation. This work marks a first step towards learning neural dialogue models based on the long-term success of a conversation.</p><h2 id="对话系统的缺点不再致命:深度强化学习带来的曙光"><a href="#对话系统的缺点不再致命:深度强化学习带来的曙光" class="headerlink" title="Dialogue systems' weaknesses are no longer fatal: the promise of deep reinforcement learning"></a>Dialogue systems' weaknesses are no longer fatal: the promise of deep reinforcement learning</h2><h3 id="引言"><a href="#引言" class="headerlink" title="Introduction"></a>Introduction</h3><h4 id="论文的写作动机"><a href="#论文的写作动机" class="headerlink" title="Motivation"></a>Motivation</h4><blockquote><p>Seq2Seq model: transforms a sequence from one domain (e.g. an English sentence) into a sequence in another domain (e.g. a Chinese sentence). In this paper it is a neural generative model that maximizes the probability of a response given the preceding dialogue.</p></blockquote><p>Although Seq2Seq models have had some success in dialogue generation, two problems remain:</p><ol><li><p>Seq2Seq models are trained with a maximum-likelihood (MLE) objective to predict the next utterance given the context. They therefore tend to produce highly generic responses such as "I don't know", which is clearly not a good reply.</p></li><li><p>MLE-based Seq2Seq models cannot resolve repetition, so dialogue systems often fall into an infinite loop of repetitive responses.</p></li></ol><p>Both problems are illustrated in the figure below:</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf3bujstcj20b509zgmm.jpg" alt=""></p><h4 id="论文思路的亮点"><a href="#论文思路的亮点" class="headerlink" title="Key ideas"></a>Key ideas</h4><p>The paper first argues that a dialogue system should have two abilities:</p><ol><li>Incorporate developer-defined reward functions that better capture the true goal of building a chatbot.</li><li>Model the long-term influence of a generated response on the ongoing conversation.</li></ol><p>It then proposes a reinforcement-learning-based generation method to improve the dialogue system:</p><blockquote><p>Encoder-decoder architecture: the standard neural machine translation approach, a recurrent network for solving seq2seq problems.<br>Policy gradient: a method that optimizes the policy directly by following the gradient of the expected reward.</p></blockquote><p>The model uses an encoder-decoder as its backbone and simulates a conversation between two agents, exploring the space of possible actions (possible replies) while learning to maximize the expected return. Each agent learns its policy by optimizing a long-term reward over the ongoing dialogue, and learning is done with policy gradients rather than maximum likelihood.</p><p>The improved model is shown below:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf47elgdkj20af09yjse.jpg" alt=""></p><h3 id="论文模型的细节"><a href="#论文模型的细节" class="headerlink" title="Model details"></a>Model details</h3><h4 id="符号以及定义"><a href="#符号以及定义" class="headerlink" title="Notation and definitions"></a>Notation and definitions</h4><ol><li>$p$: a sentence generated by the first agent.</li><li>$q$: a sentence generated by the second agent.</li><li>$p_1,q_1,p_2,q_2,\dots,p_i,q_i$: a dialogue, i.e. the context.</li><li>$[p_i,q_i]$: the state of the agent, namely the previous two dialogue turns.</li><li>$p_{RL}(p_{i+1}|p_i,q_i)$: the policy, realized in the paper as an LSTM encoder-decoder.</li><li>$r$: the reward of each action (each dialogue turn).</li><li>$\mathbb{S}$: a hand-built set of "dull responses" such as "I don't know what you are talking about".</li><li>$N_{\mathbb{S}}$: the cardinality of $\mathbb{S}$.</li><li>$N_{s}$: the number of tokens in the dull response $s$.</li><li>$p_{seq2seq}$: the likelihood output of the Seq2Seq model.</li><li>$h_{p_i}$ and $h_{p_{i+1}}$: the encoder representations of two consecutive turns $p_i$ and $p_{i+1}$ of the same agent.</li></ol><h4 id="Reward的定义和作用:"><a href="#Reward的定义和作用:" class="headerlink" title="Definition and role of the rewards"></a>Definition and role of the rewards</h4><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fug2dvizhhj209o01wdfq.jpg" alt=""></p><blockquote><p>$N_{\mathbb{S}}$: the cardinality of $\mathbb{S}$.<br>$N_{s}$: the number of tokens in the dull response $s$.<br>$p_{seq2seq}$: the likelihood output of the Seq2Seq model.</p><ul><li>$r_1$ rewards responses that are easy to answer. It is inspired by a forward-looking function: compute the probability that the model outputs a dull response $s$ when the generated response $a$ is fed back in, summed over every sentence in $\mathbb{S}$. Since $p_{seq2seq}(s|a)$ is always smaller than 1, its logarithm is negative, and the leading minus sign in the definition makes $r_1$ larger as dull replies become less likely. Under this reward the actions the model produces gradually move away from dull responses, and because the reward looks one turn ahead it also makes the response easier for the other party to answer.</li></ul></blockquote><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fug5a6naokj20b302d747.jpg" alt=""></p><blockquote><p>$h_{p_i}$ and $h_{p_{i+1}}$: the encoder representations of two consecutive turns $p_i$ and $p_{i+1}$ of the same agent.</p><ul><li>$r_2$ enriches the information flow by avoiding the case where two consecutive turns of the same agent are highly similar. It is based on the cosine similarity between $h_{p_i}$ and $h_{p_{i+1}}$ (through its negative logarithm), so semantically similar consecutive turns are penalized.</li></ul></blockquote><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fug5av9bkgj20bp028mx4.jpg" alt=""></p><blockquote><p>$p_{seq2seq}(a|p_i, q_i)$: the probability of generating response $a$ given the dialogue context $[p_i,q_i]$.<br>$p^{backward}_{seq2seq}(q_i|a)$: the probability of generating the previous utterance $q_i$ given the response $a$.</p><ul><li>$r_3$ enforces semantic coherence, preventing the model from producing responses that earn a high reward but are neither adequate nor coherent. This is achieved through mutual information. The backward Seq2Seq is a second model trained with source and target swapped; its purpose is to strengthen the mutual dependence between $q$ and $a$ so the conversation stays sustainable. Both terms of $r_3$ are length-normalized log-likelihoods.</li></ul></blockquote><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fug5beurspj209y01dt8k.jpg" alt=""></p><ul><li>The final reward is a weighted sum of $r_1$, $r_2$ and $r_3$; the paper sets $\lambda_1=0.25, \lambda_2=0.25, \lambda_3=0.5$ (see the sketch right after this list). During training, a base model is first pre-trained with Seq2Seq, and the rewards are then used for policy-gradient training on top of it to improve the model.</li></ul>
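<p>To make the three rewards and their weighting concrete, here is a minimal, hypothetical Python sketch. The functions <code>seq2seq_logprob</code> and <code>encode</code> are fake placeholders for the trained forward/backward Seq2Seq likelihoods and the encoder; only the way $r_1$, $r_2$ and $r_3$ are combined follows the description above.</p>
<pre><code class="python">import numpy as np

def seq2seq_logprob(target_tokens, source_tokens):
    """Fake stand-in for log p(target | source); a real system would query the Seq2Seq model."""
    return -2.5 * len(target_tokens)

def encode(tokens):
    """Fake sentence encoder: a deterministic pseudo-random vector per sentence."""
    seed = sum(ord(c) for c in " ".join(tokens))
    return np.random.default_rng(seed).normal(size=64)

DULL_RESPONSES = [["i", "don't", "know"], ["i", "have", "no", "idea"]]

def reward(action, p_i, q_i, prev_turn, lambdas=(0.25, 0.25, 0.5)):
    # r1 (ease of answering): penalize actions after which dull replies are likely.
    r1 = -np.mean([seq2seq_logprob(s, action) / len(s) for s in DULL_RESPONSES])
    # r2 (information flow): penalize turns similar to the same agent's previous turn.
    h_prev, h_now = encode(prev_turn), encode(action)
    cos = h_prev @ h_now / (np.linalg.norm(h_prev) * np.linalg.norm(h_now))
    r2 = -np.log(max(cos, 1e-6))          # clip so the log stays defined when cos <= 0
    # r3 (coherence): forward and backward length-normalized log-likelihoods.
    r3 = (seq2seq_logprob(action, p_i + q_i) / len(action)
          + seq2seq_logprob(q_i, action) / len(q_i))
    l1, l2, l3 = lambdas
    return l1 * r1 + l2 * r2 + l3 * r3

print(reward(action=["see", "you", "tomorrow"],
             p_i=["hello"], q_i=["how", "are", "you"],
             prev_turn=["fine", "thanks"]))
</code></pre>
<p>In the actual system these scores come from the pre-trained forward and backward Seq2Seq models, and the combined reward is what the policy-gradient training described next tries to maximize.</p>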
<h4 id="强化学习模型细节"><a href="#强化学习模型细节" class="headerlink" title="Reinforcement-learning details"></a>Reinforcement-learning details</h4><blockquote><p>Fully supervised setting: a pre-trained Seq2Seq model used to initialize the reinforcement-learning model.<br>Attention: while producing each output token, the model also produces an "attention range" indicating which parts of the input sequence to focus on next; the next output is generated from the attended region, and the process repeats until the sequence ends.</p></blockquote><p>The paper follows an AlphaGo-style recipe: the reinforcement-learning model is initialized with a general response-generation policy trained in a fully supervised setting. The Seq2Seq model uses attention and is trained on the <strong>OpenSubtitles dataset</strong>.</p><p>Rather than initializing the policy with the plain pre-trained Seq2Seq model, the paper uses the first author's 2016 encoder-decoder model that generates maximum-mutual-information responses: $p_{SEQ2SEQ}(a|p_i, q_i)$ is used to initialize $p_{RL}$. For a candidate $\hat{a}$ from the generated candidate set $A=\{\hat{a} \mid \hat{a}\sim p_{RL}\}$, a mutual-information score $m(\hat{a}, [p_i, q_i])$ is obtained, and the expected reward of a sequence is:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuhd5cu9doj208j01ddfo.jpg" alt=""></p><p>The gradient estimated via the likelihood-ratio trick is:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuhd6s09lyj20ax01awed.jpg" alt=""></p><p>The encoder-decoder parameters can then be updated with stochastic gradient descent. The paper refines this gradient by borrowing a curriculum learning strategy.</p><p>The final gradient is:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuhdbxz4kij20b401tjra.jpg" alt=""></p><p>During optimization, policy gradients are used to find the parameters that maximize the reward:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuief5dqqxj209j01n0sn.jpg" alt=""></p><h3 id="仿真实验细节"><a href="#仿真实验细节" class="headerlink" title="Simulation details"></a>Simulation details</h3><h4 id="对话仿真流程:"><a href="#对话仿真流程:" class="headerlink" title="Dialogue simulation procedure"></a>Dialogue simulation procedure</h4><ol><li>Pick a message from the training set and feed it to Agent A.</li><li>Agent A encodes the message and decodes a response as its output.</li><li>Agent B takes Agent A's output as input and generates its own response through its encoder-decoder, and the two agents keep taking turns in this way.</li></ol><p>The policy is the probability distribution over responses produced by the Seq2Seq model: the dialogue history is fed into the network and the output is a distribution over responses, $p_{RL}(p_{i+1}|p_i,q_i)$. Acting under the policy means sampling from this distribution to choose the reply, and the network parameters are finally trained with policy gradients.</p><p>The reward obtained from the conversation between the two agents is used to adjust the parameters of the base model; a minimal code sketch of this loop follows the figure below.</p><p><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf4u98teyj20lv0aatav.jpg" alt=""></p>
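<p>The following is a deliberately tiny, hypothetical REINFORCE-style sketch of the simulation loop described above. The "policy" is just a softmax over a few canned replies and <code>fake_reward</code> stands in for the combined reward $\lambda_1 r_1+\lambda_2 r_2+\lambda_3 r_3$; the real system samples word by word from an LSTM encoder-decoder and scores whole simulated dialogues.</p>
<pre><code class="python">import numpy as np

rng = np.random.default_rng(0)

# Canned replies stand in for the decoder's output space; theta are the policy logits.
REPLIES = [["how", "are", "you"], ["fine", "thanks"], ["i", "don't", "know"], ["bye"]]
theta = np.zeros(len(REPLIES))

def sample_reply(theta):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    idx = rng.choice(len(REPLIES), p=probs)
    return idx, probs

def fake_reward(reply):
    # Stand-in for the combined reward of the previous section.
    return -1.0 if reply == ["i", "don't", "know"] else 1.0

# REINFORCE-style updates over short simulated dialogues between two agents sharing the policy.
lr = 0.1
for episode in range(200):
    grad = np.zeros_like(theta)
    for turn in range(4):
        idx, probs = sample_reply(theta)
        r = fake_reward(REPLIES[idx])
        grad += (np.eye(len(REPLIES))[idx] - probs) * r   # grad of log pi(a|theta) times reward
    theta += lr * grad / 4

print("learned reply distribution:", np.round(np.exp(theta) / np.exp(theta).sum(), 2))
</code></pre>
<p>Even in this toy setting the update $\nabla_\theta \log \pi_\theta(a)\,r$ pushes probability mass away from the dull reply, which is exactly the effect the paper is after at the scale of full dialogues.</p>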
title="实验结果分析"></a>实验结果分析</h3><h4 id="评估指标"><a href="#评估指标" class="headerlink" title="评估指标"></a>评估指标</h4><blockquote><p>BLEU: bilingual evaluation understudy,一个评估机器翻译准确度的算法。<br>论文并没有使用 广泛应用的BLEU作为评价标准。</p></blockquote><ol><li><p>对话的长度,作者认为当对话出现dull response的时候就算做对话结束,所以使用对话的轮次来作为了评价指标:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf0a67z3dj20fi051q3c.jpg" alt=""></p></li><li><p>不同unigrams、bigrams元组的数量和多样性,用于评测模型产生回答的丰富程度:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf0boign2j20e104a3z0.jpg" alt=""></p></li><li><p>人类评分:<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf0gegq71j20gc03sgm9.jpg" alt=""></p></li><li><p>最终对话效果<br><img src="http://ww1.sinaimg.cn/large/ca26ff18ly1fuf0hq4jzaj20g203k3z5.jpg" alt=""></p></li></ol><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>作者使用深度强化学习的方法来改善多轮对话的效果,并提出了三种reward的定义方式。可以算是DRL与NLP结合的一个比较不错的例子。但是从最后的结果部分也可以看得出,作者无论是在reward的定义、还是最后的评价指标都没有采用使用比较广泛的BLUE指标。这种手工定义的reward函数不可能涵盖一段理想对话所具有特点的的方方面面。</p><h3 id="引用与参考"><a href="#引用与参考" class="headerlink" title="引用与参考"></a>引用与参考</h3><ol><li><a href="https://www.paperweekly.site/papers/notes/221">https://www.paperweekly.site/papers/notes/221</a></li><li><a href="https://scholar.google.com/">https://scholar.google.com/</a></li><li><a href="https://blog.csdn.net/u014595019/article/details/52826423">https://blog.csdn.net/u014595019/article/details/52826423</a></li></ol>]]></content>
<categories>
<category> Paper </category>
</categories>
<tags>
<tag> note </tag>
</tags>