
Hello I have some question about model colored #19

Open
bjy12 opened this issue Nov 25, 2024 · 1 comment

bjy12 commented Nov 25, 2024

Hello, I have some questions about point cloud coloring in your code. I noticed that, when coloring point clouds, you still add noise to the point cloud coordinates and have the network predict colors directly. I would like to know: during the coloring stage, is the noised point cloud one whose geometric shape has already been fully generated?


bjy12 commented Nov 26, 2024

Dear Author,
I have been studying your paper on projection-conditioned point cloud diffusion with great interest and would like to express my sincere admiration for this outstanding work. While implementing and analyzing your code, I have encountered some aspects that I would greatly appreciate your clarification on.
Specifically, I noticed an interesting design pattern in both your PointCloudTransformerModel and PVCNN2 implementations. In the PointCloudTransformerModel, the input points undergo an initial transformation through self.input_projection:
```python
def forward(self, inputs: Tensor) -> Tensor:
    x = self.input_projection(inputs)
    x = self.blocks(x)
    x = self.output_projection(x)
    return x
```
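For intuition, here is a minimal numpy sketch of what such an input projection does to the channel dimension (the shapes and the linear map are illustrative stand-ins, not the actual model):

```python
import numpy as np

# Hypothetical sizes: batch B, N points, embedding width D
B, N, D = 2, 1024, 128

points = np.random.randn(B, N, 3)        # raw xyz coordinates
W = np.random.randn(3, D) / np.sqrt(3)   # stand-in for input_projection

x = points @ W                           # (B, N, D) embedded features
# After projection, no individual channel of x is an xyz coordinate:
# each of the D channels is a learned mixture of all three coordinates.
print(x.shape)
```

This is exactly why the question arises: once the points pass through `input_projection`, the per-point channels are learned features, and any "first three dimensions are coordinates" convention no longer holds inside the embedded space.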
Similarly, in your PVCNN2 implementation, even after processing the input features, the first three dimensions are explicitly treated as spatial coordinates:
```python
def forward(self, inputs: torch.Tensor, t: torch.Tensor):
    # ...
    # Separate input coordinates and features
    coords = inputs[:, :3, :].contiguous()  # (B, 3, N)
    features = inputs                       # (B, 3 + S, N)
```
I would be very grateful if you could help me understand the theoretical foundation behind this design choice. My main concern is that after the input is mapped to a higher-dimensional space by input_projection, the first three dimensions no longer necessarily preserve their interpretation as spatial coordinates. Could you please explain:

1. The reasoning behind maintaining this spatial interpretation after dimensional expansion
2. How the model ensures the first three dimensions remain meaningful as spatial coordinates throughout the network's operations

Thank you very much for your time and consideration. I look forward to your insights on this matter.
Best regards,
