<!DOCTYPE html>
<html lang="en">
<head>
<title>Language Grounding World Scopes</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-KyZXEAg3QhqLMpG8r+8fhAXLRk2vvoC2f3B09zVXn8CA5QIVfZOJ3BCsw2P0p/We" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.0/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script>
<base href="/">
<!-- General CSS for this page -->
<style>
#container {
text-align:center;
padding: 2em 2em 4em;
margin: 0 auto;
font-size:16px;
min-width: 900px;
max-width: 900px;
}
.table {
margin: 0 auto;
}
.table>tbody>tr>td,
.table>tbody>tr>th {
border-top: none;
border-bottom: none;
}
.centering {
text-align: center;
vertical-align: middle;
}
a:link {
text-decoration: none;
}
h6 a:link {
color: #990000;
}
.table-sm{
font-size:20px;
}
h1{
font-size:66px;
}
h2{
font-size:54px;
}
h3{
font-size:42px;
}
h4{
font-size:36px;
}
h5{
font-size:24px;
}
h6{
font-size:20px;
}
.btn{
font-size:18px !important;
}
.fixed-td {
width: 120px;
}
.glamor-thumbnail {
height: 70px;
}
.tiny-thumbnail {
height: 32px;
padding: 2px;
}
.img-thumbnail {
width: 100px;
height: 100px;
}
.face-thumbnail {
width: 200px;
height: 200px;
padding: 10px;
}
.face-card {
width: 14rem;
height: 16rem;
}
</style>
<!-- Defines .bib block -->
<style>
.bib {
font-family: 'Courier New', Courier, 'Lucida Sans Typewriter', 'Lucida Typewriter', monospace;
font-size: 12px;
padding: 8px 8px 8px 8px;
border-style: ridge;
background-color: aliceblue;
text-align: left;
}
</style>
</head>
<body>
<div id="container">
<div class="row">
<div class="col-md-12">
<h4 style="color:#990000;text-align:left">Language Grounding World Scopes</h4>
<p align="justify" style="font-size:20px">
"Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates. Despite the incredible effectiveness of language processing models to tackle tasks after being trained on text alone, successful linguistic communication relies on a shared experience of the world. It is this shared experience that makes utterances meaningful. Natural language processing is a diverse field, and progress throughout its development has come from new representational theories, modeling techniques, data collection paradigms, and tasks. We posit that the present success of representation learning approaches trained on large, text-only corpora requires the parallel tradition of research on the broader physical and social context of language to address the deeper questions of communication."
</p>
<p align="justify" style="font-size:20px">
<b>Experience Grounds Language</b><br/><a href="https://yonatanbisk.com/" style="color:#000000;" target="_blank">Yonatan Bisk</a>, <a href="https://ari-holtzman.github.io/" style="color:#000000;" target="_blank">Ari Holtzman</a>, Jesse Thomason, <a href="https://www.mit.edu/~jda/" style="color:#000000;" target="_blank">Jacob Andreas</a>, <a href="https://yoshuabengio.org/" style="color:#000000;" target="_blank">Yoshua Bengio</a>, <a href="https://web.eecs.umich.edu/~chaijy/" style="color:#000000;" target="_blank">Joyce Chai</a>, <a href="https://homepages.inf.ed.ac.uk/mlap/" style="color:#000000;" target="_blank">Mirella Lapata</a>, <a href="https://angelikilazaridou.github.io/" style="color:#000000;" target="_blank">Angeliki Lazaridou</a>, <a href="https://viterbi.usc.edu/directory/faculty/May/Jonathan" style="color:#000000;" target="_blank">Jonathan May</a>, <a href="http://alex.nisnevich.com/portfolio/" style="color:#000000;" target="_blank">Aleksandr Nisnevich</a>, <a href="https://scholar.google.com/citations?hl=en&user=eMwEEXUAAAAJ" style="color:#000000;" target="_blank">Nicolas Pinto</a>, and <a href="https://scholar.google.com/citations?user=eQ1uJ6UAAAAJ&hl=en" style="color:#000000;" target="_blank">Joseph Turian</a>.<br/><i>Empirical Methods in Natural Language Processing (EMNLP)</i>, 2020.<br/><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://arxiv.org/abs/2004.10151">paper</a><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://www.youtube.com/watch?v=cQasYLUC_00&feature=youtu.be">video</a><a class="btn btn-light" style="padding:5px;" target="_blank" href="https://youtu.be/-niprVHNrgI">coverage</a><button type="button" class="btn btn-light" style="padding:5px;" data-bs-toggle="collapse" data-bs-target="#bisk_emnlp20" aria-expanded="false" aria-controls="bisk_emnlp20">bib</button><div id="bisk_emnlp20" class="collapse bib">@inproceedings{egl,<br/> title={Experience Grounds Language},<br/> author={Yonatan Bisk and Ari Holtzman and Jesse Thomason and Jacob Andreas and Yoshua Bengio and Joyce Chai and Mirella Lapata and Angeliki Lazaridou and Jonathan May and Aleksandr Nisnevich and Nicolas Pinto and Joseph Turian},<br/> booktitle={Empirical Methods in Natural Language Processing (EMNLP)},<br/> year={2020},<br/> url={https://arxiv.org/abs/2004.10151}<br/>}</div>
</p>
<p align="justify" style="font-size:20px">
The paper defines five World Scopes for language grounding, a taxonomy of the kinds of input and output a model of language considers during learning and inference. We summarize each World Scope briefly below.
</p>
</div>
</div>
<div class="row" style="text-align:left">
<div class="col-md-12">
<div class="card mb-12"><a id="ws1"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws1.png" class="img-fluid rounded-start" title="World Scope 1">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 1<br/>Corpora and Representations</h5>
<p class="card-text">Annotated corpora such as the Penn Treebank and the Brown Corpus as well as structured resources like WordNet.
This world scope is curated and mostly consists of English.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws2"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws2.png" class="img-fluid rounded-start" title="World Scope 2">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 2<br/>The Written Word</h5>
<p class="card-text">Internet-scale, unstructured, multi-domain, multilingual natural language <i>text</i>.
This scope learns language "from the radio": through text alone, most often unstructured web text.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws3"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws3.png" class="img-fluid rounded-start" title="World Scope 3">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 3<br/>The World of Sights and Sounds</h5>
<p class="card-text">Language paired with sensory perception like vision, audio, and haptics.
This scope includes audio-visual speech recognition, visual question answering, and recognizing that <i>heavy</i> implies increased physical weight.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws4"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws4.png" class="img-fluid rounded-start" title="World Scope 4">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 4<br/>Embodiment and Action</h5>
<p class="card-text">Language paired with or leading to world actions.
This scope includes learning that <i>left</i> corresponds to a spatial orientation change, and that <i>it's hot</i> is a pragmatic warning against physically touching an object.</p>
</div>
</div>
</div>
</div>
<div class="card mb-12"><a id="ws5"></a>
<div class="row g-0">
<div class="col-md-4">
<img src="thumbnails/ws5.png" class="img-fluid rounded-start" title="World Scope 5">
</div>
<div class="col-md-8">
<div class="card-body">
<h5 class="card-title">World Scope 5<br/>The Social World</h5>
<p class="card-text">Language is what language does, and so language use in social contexts to cause changes in others' behavior and states of mind is the highest scope for grounded natural langauge use.</p>
</div>
</div>
</div>
</div>
</div></div>
<div class="row">
<div class="col-md-12">
<br/><hr/>
<p align="left">
World scope robot images were commissioned from the talented <a href="https://www.isoplod.com/" target="_blank">Isoplod</a> and may not be used or distributed without permission from <a href="https://jessethomason.com/" target="_blank">Jesse Thomason</a> or <a href="https://yonatanbisk.com/">Yonatan Bisk</a>.
</p>
</div>
</div>
</div>
</body>
</html>