<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification</title>
<!-- Bootstrap -->
<link href="css/bootstrap-4.4.1.css" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Open+Sans" rel="stylesheet" type="text/css">
<style>
body {
background: rgb(255, 255, 255) no-repeat fixed top left;
font-family:'Open Sans', sans-serif;
}
</style>
</head>
<body>
<!-- cover -->
<section>
<div class="jumbotron text-center mt-0">
<div class="container-fluid">
<div class="row">
<div class="col">
<h2 style="font-size:30px;">3D-Aware Object Goal Navigation <br>via Simultaneous Exploration and Identification</h2>
<h4 style="color:#6e6e6e;"> CVPR 2023 </h4>
<hr>
<h6> <a href="https://jzhzhang.github.io/" target="_blank">Jiazhao Zhang</a><sup>1,2*</sup>,
<a href="https://liudai-homepage.github.io/" target="_blank">Liu Dai</a><sup>3*</sup>,
<a href="https://mfp0610.github.io/" target="_blank">Fanpeng Meng</a><sup>4</sup>,
<a href="https://fqnchina.github.io/" target="_blank">Qingnan Fan</a><sup>5</sup>,
<a href="https://xuelin-chen.github.io/" target="_blank">Xuelin Chen</a><sup>5</sup>,
<a href="https://kevinkaixu.net/" target="_blank">Kai Xu</a><sup>6</sup>,
<a href="https://hughw19.github.io/" target="_blank">He Wang</a><sup>1†</sup>
<br>
<br>
<p> <sup>1</sup> CFCS, Peking University
<sup>2</sup> BAAI
<sup>3</sup> CEIE, Tongji University
<sup>4</sup> Huazhong University of Science and Technology
<br>
<sup>5</sup> Tencent AI Lab
<sup>6</sup> National University of Defense Technology
<br>
</p>
<p> <sup>*</sup> equal contributions
<sup>†</sup> corresponding author
<br>
</p>
<!-- <p> <a class="btn btn-secondary btn-lg" href="" role="button">Paper</a>
<a class="btn btn-secondary btn-lg" href="" role="button">Code</a>
<a class="btn btn-secondary btn-lg" href="" role="button">Data</a> </p> -->
<div class="row justify-content-center">
<div class="column">
<p class="mb-5"><a class="btn btn-large btn-light" href="https://arxiv.org/abs/2212.00338" role="button" target="_blank">
<i class="fa fa-file"></i> Paper </a> </p>
</div>
<div class="column">
<p class="mb-5"><a class="btn btn-large btn-light" id="code_soon" href="https://github.com/jzhzhang/3DAwareNav" role="button" target="_blank" disabled=1>
<i class="fa fa-github-alt"></i> Code </a> </p>
</div>
<!-- <div class="column">
<p class="mb-5"><a class="btn btn-large btn-light" href="files/" role="button" target="_blank">
<i class="fa fa-file"></i> Supplementary(Coming soon)</a> </p>
</div> -->
</div>
</div>
</div>
</div>
</div>
</section>
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12 text-center">
<hr style="margin-top:0px">
<div class="row justify-content-center" style="align-items:center; display:flex;">
<img src="images/teaser.png" alt="pipeline" class="img-responsive" width="95%"/>
<br>
</div>
<!-- <div class="row justify-content-center" style="display:flex;"></div> -->
<br>
<p class="text-justify">
<strong>Teaser</strong>.
We present a 3D-aware ObjectNav framework with simultaneous exploration and identification policies. <b>A$\rightarrow$B</b>: the agent is guided by the exploration policy to look for its target. <b>B$\rightarrow$C</b>: the agent consistently identifies a target object and finally calls STOP.
</p>
<!-- </div> -->
</div>
</div>
</div>
</section>
<br>
<!-- abstract -->
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12">
<h2><strong>Abstract</strong></h2>
<hr style="margin-top:0px">
<!-- <div class="row justify-content-center" style="align-items:center; display:flex;">
<img src="images/teaser.png" alt="input" class="img-responsive" width="95%"/>
<br>
</div> -->
<p class="text-justify">
Object goal navigation (ObjectNav) in unseen environments is a fundamental task for Embodied AI.
Agents in existing works learn ObjectNav policies based on 2D maps, scene graphs, or image sequences.
Considering this task happens in 3D space, a 3D-aware agent can advance its ObjectNav capability via learning from fine-grained spatial information.
However, leveraging 3D scene representations can be prohibitively impractical for policy learning in this floor-level task, due to low sample efficiency and high computational cost.
In this work, we propose a framework for the challenging 3D-aware ObjectNav based on two straightforward sub-policies.
The two sub-policies, namely the corner-guided exploration policy and the category-aware identification policy, run simultaneously, using online-fused 3D points as observations.
</p>
</div>
</div>
</div>
</section>
<br>
<!-- Methods -->
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12">
<h2><strong>Methods</strong></h2>
<hr style="margin-top:0px">
<!-- <h3 style="margin-top:20px; margin-bottom:20px; color:#717980"><b>Part 1</b></h3> -->
<div class="row justify-content-center" style="align-items:center; display:flex;">
<img src="images/overview.png" alt="input" class="img-responsive" width="85%"/>
</div>
<p class="text-justify">
<strong>We take in a posed RGB-D image at time step $t$ and run a point-based construction algorithm to fuse a 3D scene representation ($M_{3D}^{(t)}$) online, along with $M_{2D}^{(t)}$ obtained by semantic projection. We then run two policies simultaneously, a <i>corner-guided exploration policy</i> $\pi_e$ and a <i>category-aware identification policy</i> $\pi_f$, which predict a discrete corner goal $g_e^{(t)}$ and a target goal $g_f^{(t)}$ (if one exists), respectively. Finally, the local planning module drives the agent to the target goal $g_f^{(t)}$ when one is given (top priority) and to the corner goal $g_e^{(t)}$ otherwise.</strong>
</p>
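<p class="text-justify">
As a compact restatement of the goal priority described above, the rule can be sketched as follows. Here $g^{(t)}$ is a symbol introduced only for this illustration (it does not appear in the figure) and denotes the goal handed to the local planning module at time step $t$:
$$g^{(t)} = \begin{cases} g_f^{(t)} &amp; \text{if } \pi_f \text{ predicts a target goal at step } t \\ g_e^{(t)} &amp; \text{otherwise} \end{cases}$$
</p>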
<br>
<h3 style="margin-top:20px; margin-bottom:20px; color:#717980"><b>Online points fusion*</b></h3>
<div class="row justify-content-center" style="align-items:center; display:flex;">
<img src="images/points_fusion.png" alt="input" class="img-responsive" width="55%"/>
</div>
<p class="text-justify">
<strong><b>Left:</b> A robot takes multi-view observations during navigation. <b>Right:</b> The points $p$ are organized by dynamically allocated blocks $B$ and per-point octrees $O$, which can be used to query the neighboring points of any given point.</strong>
</p>
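<p class="text-justify">
One common way to realize this kind of block allocation, given here only as an illustrative assumption and not necessarily the scheme used in the paper, is to index a point $p = (x, y, z)$ by the integer coordinates of the block that contains it. With a hypothetical block edge length $s$:
$$\mathrm{idx}(p) = \left( \left\lfloor x/s \right\rfloor, \left\lfloor y/s \right\rfloor, \left\lfloor z/s \right\rfloor \right)$$
Under this assumption, a block $B$ would be allocated the first time a point falls into its index, and a neighborhood query for $p$ with a radius no larger than $s$ would only need to visit that block and the blocks adjacent to it.
</p>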
<br>
</div>
</div>
</div>
</section>
<br>
<br>
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12">
<h2><strong>Visualization</strong></h2>
<hr style="margin-top:0px">
<div class="row justify-content-center" style="align-items:center; display:flex;">
<img src="images/vis_seq.png" alt="input" class="img-responsive" width="95%"/>
<br>
</div>
<p class="text-justify">
<center>More results can be found in our paper.</center>
</p>
</div>
</div>
</div>
</section>
<br>
<!-- Video -->
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12">
<h2><strong>Video</strong></h2>
<hr style="margin-top:0px">
<div class="row justify-content-center" style="align-items:center; display:flex;">
<iframe width="560" height="315" src="https://www.youtube.com/embed/-50kIfOYTBM" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
</div>
</div>
</div>
</div>
</section>
<br>
<br><br>
<!-- Team -->
<section>
<div class="container" style="width:58%">
<div class="row">
<div class="col-12">
<h2><strong>Team</strong></h2>
<hr style="margin-top:0px">
<table style="width:100%">
<tr>
<td> <center> <a href="https://jzhzhang.github.io/" target="_blank"> <img alt="Jiazhao Zhang" src="images/team_jiazhao.png" height="100"/> </a> </center></td>
<td> <center> <a href="https://liudai-homepage.github.io/" target="_blank"> <img alt="Liu Dai" src="images/team_liu.jpg" height="100"/> </a> </center></td>
<td> <center> <a href="https://mfp0610.github.io/" target="_blank"> <img alt="Fanpeng Meng" src="images/team_fanpeng.png" height="100"/> </a> </center></td>
<td> <center> <a href="https://fqnchina.github.io/" target="_blank"> <img alt="Qingnan Fan" src="images/team_qingnan.png" height="100"/> </a> </center></td>
<td> <center> <a href="https://xuelin-chen.github.io/" target="_blank"> <img alt="Xuelin Chen" src="images/team_xuelin.png" height="100"/> </a> </center></td>
<td> <center> <a href="https://kevinkaixu.net/" target="_blank"> <img alt="Kai Xu" src="images/team_kai.png" height="100"/> </a> </center></td>
<td> <center> <a href="https://hughw19.github.io/" target="_blank"> <img alt="He Wang" src="images/team_he.png" height="100"/> </a> </center></td>
</tr>
<tr>
<td> <center> <font size= "2">Jiazhao Zhang<sup>1,2*</sup></font> </center></td>
<td> <center> <font size= "2">Liu Dai<sup>3*</sup></font> </center></td>
<td> <center> <font size= "2">Fanpeng Meng<sup>4</sup></font> </center></td>
<td> <center> <font size= "2">Qingnan Fan<sup>5</sup></font> </center></td>
<td> <center> <font size= "2">Xuelin Chen<sup>5</sup></font> </center></td>
<td> <center> <font size= "2">Kai Xu<sup>6</sup></font> </center></td>
<td> <center> <font size= "2">He Wang<sup>1†</sup></font> </center></td>
</tr>
</table>
</div>
<div class="col-12">
<p> <font size= "2"><sup>1</sup> CFCS, Peking University</font>
<font size= "2"><sup>2</sup> BAAI</font>
<font size= "2"><sup>3</sup> CEIE, Tongji University</font>
<font size= "2"><sup>4</sup> Huazhong University of Science and Technology</font>
<font size= "2"><sup>5</sup> Tencent AI Lab</font>
<font size= "2"><sup>6</sup> National University of Defense Technology</font>
<br>
<font size= "2"><sup>*</sup> equal contributions</font>
<font size= "2"><sup>†</sup> corresponding author</font>
</p>
</div>
</div>
</div>
</section>
<br>
<!-- citing -->
<div class="container" style="width:58%">
<div class="row ">
<div class="col-12">
<h2><strong>Citation</strong></h2>
<hr style="margin-top:0px">
<pre style="background-color: #e9eeef;padding: 1.25em 1.5em">
<code>
@article{zhang20223d,
title={3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification},
author={Zhang, Jiazhao and Dai, Liu and Meng, Fanpeng and Fan, Qingnan and Chen, Xuelin and Xu, Kai and Wang, He},
journal={arXiv preprint arXiv:2212.00338},
year={2022}
}</code></pre>
</div>
</div>
</div>
<br>
<!-- Contact -->
<div class="container" style="width:58%">
<div class="row ">
<div class="col-12">
<h2><strong>Contact</strong></h2>
<hr style="margin-top:0px">
<p>If you have any questions, please feel free to contact <b>Jiazhao Zhang</b> at zhngjizh_at_gmail_dot_com, <b>Liu Dai</b> at dailiu_dot_cndl_at_gmail_dot_com, or <b>He Wang</b> at hewang_at_pku_dot_edu_dot_cn.</p>
</div>
</div>
</div>
<footer class="text-center" style="margin-bottom:10px; font-size: medium;">
<hr>
Thanks to <a href="https://lioryariv.github.io/" target="_blank">Lior Yariv</a> for the <a href="https://lioryariv.github.io/idr/" target="_blank">website template</a>.
</footer>
<script>
MathJax = {
tex: {inlineMath: [['$', '$'], ['\\(', '\\)']]}
};
</script>
<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js"></script>
</body>
</html>