Hi! Thanks for your great work! I am curious about how you obtain multi-view object images in the "Object Caption" step of your annotation pipeline. It seems that only a 3D point cloud and an object bounding box are needed? But how do you decide the camera pose for each view, and how do you render the images so they look realistic?
I am also wondering whether the code implementation for the entire SceneVerse annotation pipeline will be released :)
Thanks for the interest. Most of the datasets considered (e.g., ScanNet) contain RGB-D videos from the original capture; you can use their camera poses to project the 3D bounding boxes into 2D image crops for VLMs to caption.
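For reference, here is a minimal sketch (not the released SceneVerse code) of how a 3D box could be projected into a 2D image using the camera-to-world poses and intrinsics shipped with ScanNet-style RGB-D captures. The names `bbox_center`, `bbox_size`, `cam_to_world`, and `K` are illustrative assumptions.

```python
import itertools
import numpy as np

def project_bbox_to_2d(bbox_center, bbox_size, cam_to_world, K, img_hw):
    """Return the enclosing 2D pixel box (xmin, ymin, xmax, ymax) of a 3D
    axis-aligned box, or None if the box is behind the camera."""
    # Eight corners of the axis-aligned 3D box in world coordinates.
    offsets = np.array(list(itertools.product([-0.5, 0.5], repeat=3)))
    corners_world = bbox_center + offsets * bbox_size            # (8, 3)

    # World -> camera: invert the camera-to-world extrinsic (ScanNet poses
    # are stored as camera-to-world).
    world_to_cam = np.linalg.inv(cam_to_world)
    corners_h = np.hstack([corners_world, np.ones((8, 1))])      # homogeneous
    corners_cam = (world_to_cam @ corners_h.T).T[:, :3]          # (8, 3)

    # Keep only corners in front of the camera (positive depth).
    in_front = corners_cam[:, 2] > 0.05
    if not in_front.any():
        return None

    # Pinhole projection with intrinsics K.
    uv = (K @ corners_cam[in_front].T).T
    uv = uv[:, :2] / uv[:, 2:3]

    # Clamp to image bounds and return the enclosing 2D box.
    h, w = img_hw
    xmin, ymin = np.clip(uv.min(axis=0), 0, [w - 1, h - 1])
    xmax, ymax = np.clip(uv.max(axis=0), 0, [w - 1, h - 1])
    if xmax - xmin < 1 or ymax - ymin < 1:
        return None
    return xmin, ymin, xmax, ymax
```

A crop of the RGB frame around the returned 2D box (plus a visibility check, e.g. against the depth map) could then be passed to a VLM for captioning.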
As for the second question, we are wrapping up the code release; stay tuned.
Hi, since HM3D does not originally provide multi-view images, we did not generate RGB-D videos for it; instead we synthesized viewpoints around each object for captioning.
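As a rough illustration (an assumption, not the released pipeline), one way to synthesize such viewpoints is to place cameras on a circle around the object's bounding-box center at a fixed elevation, all looking at the object; the resulting camera-to-world matrices could then be handed to any off-screen renderer (Open3D, pyrender, Habitat, ...) to produce the multi-view images.

```python
import numpy as np

def look_at(cam_pos, target, up=np.array([0.0, 0.0, 1.0])):
    """Build a 4x4 camera-to-world matrix whose -z axis looks at `target`
    (OpenGL camera convention, z-up world)."""
    forward = target - cam_pos
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    cam_to_world = np.eye(4)
    cam_to_world[:3, 0] = right
    cam_to_world[:3, 1] = true_up
    cam_to_world[:3, 2] = -forward   # camera looks down its own -z axis
    cam_to_world[:3, 3] = cam_pos
    return cam_to_world

def orbit_viewpoints(bbox_center, bbox_size, n_views=8, elev_deg=30.0, dist_scale=2.0):
    """Place n_views cameras on a circle around the object bounding box,
    each looking at the box center."""
    radius = dist_scale * np.linalg.norm(bbox_size) / 2.0
    elev = np.deg2rad(elev_deg)
    poses = []
    for az in np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False):
        offset = radius * np.array([np.cos(az) * np.cos(elev),
                                    np.sin(az) * np.cos(elev),
                                    np.sin(elev)])
        poses.append(look_at(bbox_center + offset, bbox_center))
    return poses
```

The number of views, elevation, and distance scale here are arbitrary illustrative defaults; views that end up occluded or outside the room would still need to be filtered before captioning.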