
How to get multi-view images #9

Open
ZJHTerry18 opened this issue Jun 9, 2024 · 4 comments
Comments

@ZJHTerry18

Hi! Thanks for your great work! I am curious about how the multi-view object images are obtained in the "Object Caption" step of your annotation pipeline. It seems that only a 3D point cloud and an object bounding box are needed. But how do you decide the camera pose for each view, and how do you render the images so that they look realistic?

I am also wondering whether the code implementation for the entire SceneVerse annotation pipeline will be released :)

@Buzz-Beater
Contributor

Thanks for the interest. Most of the datasets we considered (e.g., ScanNet) contain RGB-D videos from the original capture; you can use those to project the 3D bounding boxes to 2D and feed the resulting views to VLMs for caption generation.
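
Roughly, that projection step could look like the sketch below. This is only an illustrative example, assuming the capture provides per-frame camera intrinsics `K` and a world-to-camera extrinsic matrix; the function and variable names are made up for clarity and are not from the SceneVerse codebase:

```python
import numpy as np

def project_bbox_to_2d(bbox_corners_world, K, T_world_to_cam):
    """Project the 8 corners of a 3D box (world frame) into one RGB-D frame.

    bbox_corners_world: (8, 3) corner coordinates in the world frame.
    K: (3, 3) camera intrinsics.
    T_world_to_cam: (4, 4) extrinsics mapping world -> camera coordinates.
    Returns (8, 2) pixel coordinates, or None if the box is behind the camera.
    """
    # Lift to homogeneous coordinates and transform into the camera frame.
    corners_h = np.hstack([bbox_corners_world, np.ones((8, 1))])   # (8, 4)
    corners_cam = (T_world_to_cam @ corners_h.T).T[:, :3]          # (8, 3)

    # Skip frames where any corner lies behind the camera.
    if np.any(corners_cam[:, 2] <= 0):
        return None

    # Perspective projection with the intrinsics, then divide by depth.
    pix = (K @ corners_cam.T).T
    return pix[:, :2] / pix[:, 2:3]

# The 2D crop is then the min/max over the projected corners, which can be
# checked against the image bounds and the depth map (occlusion) before
# handing the crop to a VLM for captioning.
```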

As for the second question, we are wrapping up the code release; stay tuned.

@Hoyyyaard

Could you please provide a link to the RGB-D videos for HM3D?

@Buzz-Beater
Contributor

Hi, since HM3D does not originally provide multi-view images, we did not generate RGB-D videos; instead, we tried synthesizing viewpoints for objects for captioning.
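
For illustration only, one simple way to synthesize such object-centric viewpoints is to place look-at cameras on a ring around the object's bounding box, as in the sketch below. This is an assumption about the general idea, not necessarily the exact procedure used in SceneVerse, and all names are hypothetical:

```python
import numpy as np

def look_at(eye, target, up=np.array([0.0, 0.0, 1.0])):
    """Build a 4x4 world-to-camera matrix for a camera at `eye` looking at `target`
    (OpenCV convention: x right, y down, z forward)."""
    eye, target = np.asarray(eye, float), np.asarray(target, float)
    forward = target - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    R = np.stack([right, -true_up, forward])  # world -> camera rotation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = -R @ eye
    return T

def synthesize_viewpoints(bbox_center, bbox_size, n_views=8, elevation=0.5):
    """Place `n_views` cameras on a ring around the object, all looking at its center."""
    bbox_center = np.asarray(bbox_center, float)
    radius = 1.5 * np.linalg.norm(bbox_size)  # far enough to keep the whole box in view
    poses = []
    for i in range(n_views):
        theta = 2.0 * np.pi * i / n_views
        eye = bbox_center + radius * np.array([np.cos(theta), np.sin(theta), elevation])
        poses.append(look_at(eye, bbox_center))
    return poses
```

Each resulting pose, together with chosen intrinsics, can then be passed to a point-cloud or mesh renderer (e.g. Open3D or PyTorch3D) to produce an image of the object for captioning.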

@Hoyyyaard

Could you please share the code for synthesizing these viewpoints?
