<!DOCTYPE html>
<html lang="en">
<head>
<title>Language Grounding World Scopes</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-KyZXEAg3QhqLMpG8r+8fhAXLRk2vvoC2f3B09zVXn8CA5QIVfZOJ3BCsw2P0p/We" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script>
<base href="/">
<!-- General CSS for this page -->
<style>
#container {
text-align:center;
padding: 2em 2em 4em;
margin: 0 auto;
font-size:16px;
min-width: 900px;
max-width: 900px;
}
.table {
margin: 0 auto;
}
.table>tbody>tr>td,
.table>tbody>tr>th {
border-top: none;
border-bottom: none;
}
.centering {
text-align: center;
vertical-align: middle;
}
a:link {
text-decoration: none;
}
h6 a:link {
color: #990000;
}
.table-sm{
font-size:20px;
}
h1{
font-size:66px;
}
h2{
font-size:54px;
}
h3{
font-size:42px;
}
h4{
font-size:36px;
}
h5{
font-size:24px;
}
h6{
font-size:20px;
}
.btn{
font-size:18px !important;
}
.fixed-td {
width: 120px;
}
.glamor-thumbnail {
height: 70px;
}
.tiny-thumbnail {
height: 32px;
padding: 2px;
}
.img-thumbnail {
width: 100px;
height: 100px;
}
.face-thumbnail {
width: 200px;
height: 200px;
padding: 10px;
}
.face-card {
width: 14rem;
height: 16rem;
}
</style>
<!-- Defines .bib block -->
<style>
.bib {
font-family: 'Courier New', Courier, 'Lucida Sans Typewriter', 'Lucida Typewriter', monospace;
font-size: 12px;
padding: 8px 8px 8px 8px;
border-style: ridge;
background-color: aliceblue;
text-align: left;
}
</style>
</head>
<body>
<div id="container">
<div class="row">
<div class="col-md-12">
<h4 style="color:#990000;text-align:left">Language Grounding World Scopes</h4>
<p align="justify" style="font-size:20px">
"Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates. Despite the incredible effectiveness of language processing models to tackle tasks after being trained on text alone, successful linguistic communication relies on a shared experience of the world. It is this shared experience that makes utterances meaningful. Natural language processing is a diverse field, and progress throughout its development has come from new representational theories, modeling techniques, data collection paradigms, and tasks. We posit that the present success of representation learning approaches trained on large, text-only corpora requires the parallel tradition of research on the broader physical and social context of language to address the deeper questions of communication."
</p>
<p align="justify" style="font-size:20px">
<b>Experience Grounds Language</b><br/><a href="https://yonatanbisk.com/" style="color:#000000;" target="_blank">Yonatan Bisk</a>, <a href="https://ari-holtzman.github.io/" style="color:#000000;" target="_blank">Ari Holtzman</a>, Jesse Thomason, <a href="https://www.mit.edu/~jda/" style="color:#000000;" target="_blank">Jacob Andreas</a>, <a href="https://yoshuabengio.org/" style="color:#000000;" target="_blank">Yoshua Bengio</a>, <a href="https://web.eecs.umich.edu/~chaijy/" style="color:#000000;" target="_blank">Joyce Chai</a>, <a href="https://homepages.inf.ed.ac.uk/mlap/" style="color:#000000;" target="_blank">Mirella Lapata</a>, <a href="https://angelikilazaridou.github.io/" style="color:#000000;" target="_blank">Angeliki Lazaridou</a>, <a href="https://viterbi.usc.edu/directory/faculty/May/Jonathan" style="color:#000000;" target="_blank">Jonathan May</a>, <a href="http://alex.nisnevich.com/portfolio/" style="color:#000000;" target="_blank">Aleksandr Nisnevich</a>, <a href="https://scholar.google.com/citations?hl=en&user=eMwEEXUAAAAJ" style="color:#000000;" target="_blank">Nicolas Pinto</a>, and <a href="https://scholar.google.com/citations?user=eQ1uJ6UAAAAJ&hl=en" style="color:#000000;" target="_blank">Joseph Turian</a>.<br/><i>Empirical Methods in Natural Language Processing (EMNLP)</i>, 2020.<br/><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://arxiv.org/abs/2004.10151">paper</a><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://www.youtube.com/watch?v=cQasYLUC_00&feature=youtu.be">video</a><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://youtu.be/-niprVHNrgI">coverage</a><button type="button" class="btn btn-light" style="padding:5px;" data-bs-toggle="collapse" data-bs-target="#bisk_emnlp20" aria-expanded="false" aria-controls="bisk_emnlp20">bib</button><div id="bisk_emnlp20" class="collapse bib">@inproceedings{egl,<br/> title={Experience Grounds Language},<br/> author={Yonatan Bisk and Ari Holtzman and Jesse Thomason and Jacob Andreas and Yoshua Bengio and Joyce Chai and Mirella Lapata and Angeliki Lazaridou and Jonathan May and Aleksandr Nisnevich and Nicolas Pinto and Joseph Turian},<br/> booktitle={Empirical Methods in Natural Language Processing (EMNLP)},<br/> year={2020},<br/> url={https://arxiv.org/abs/2004.10151}<br/>}</div>
</p>
<p align="justify" style="font-size:20px">
The paper defines five World Scopes for language grounding, a taxonomy of the kinds of input and output a model of language considers during learning and inference. We summarize each World Scope briefly below.
</p>
</div>
</div>
<div class="row" style="text-align:left">
<div class="col-md-12">
<div class="card mb-12"><a id="ws1"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws1.png" class="img-fluid rounded-start" title="World Scope 1">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 1<br/>Corpora and Representations</h5>
<p class="card-text">Annotated corpora such as the Penn Treebank and the Brown Corpus as well as structured resources like WordNet.
This world scope is curated and mostly consists of English.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws2"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws2.png" class="img-fluid rounded-start" title="World Scope 2">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 2<br/>The Written Word</h5>
<p class="card-text">Internet-scale, unstructured, multi-domain, multilingual natural language <i>text</i>.
This scope learns language "from the radio": through text alone, most often unstructured web text.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws3"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws3.png" class="img-fluid rounded-start" title="World Scope 3">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 3<br/>The World of Sights and Sounds</h5>
<p class="card-text">Language paired with sensory perception like vision, audio, and haptics.
This scope includes audio-visual speech recognition, visual question answering, and recognizing that <i>heavy</i> implies increased physical weight.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws4"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws4.png" class="img-fluid rounded-start" title="World Scope 4">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 4<br/>Embodiment and Action</h5>
<p class="card-text">Language paired with or leading to world actions.
This scope includes learning that <i>left</i> corresponds to a spatial orientation change, and that <i>it's hot</i> is a pragmatic warning against physically touching an object.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws5"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws5.png" class="img-fluid rounded-start" title="World Scope 5">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 5<br/>The Social World</h5>
<p class="card-text">Language is what language does, and so language use in social contexts to cause changes in others' behavior and states of mind is the highest scope for grounded natural langauge use.</p>
</div>
</div>
</div>
</div>
</div></div>
<div class="row">
<div class="col-md-12">
<br/><hr/>
<p align="left">
World scope robot images were commissioned from the talented <a href="https://www.isoplod.com/" target="_blank">Isoplod</a> and may not be used or distributed without permission from <a href="https://jessethomason.com/" target="_blank">Jesse Thomason</a> or <a href="https://yonatanbisk.com/">Yonatan Bisk</a>.
</p>
</div>
</div>
</div>
</body>
</html>