Our labels are defined as follows. At pixel p, the calibrated depth D(p) allows us to compute the 3D camera space coordinate x. Using homogeneous coordinates, this camera position can be transformed into the scene’s world coordinate frame as m = Hx. Our labels are simply defined as these scene world positions, m.
The labels are the depth image values projected into camera space and then transformed by the camera pose. The pose comes from KinectFusion, a motion tracker, or some other ground-truth source. For the TUM datasets, it comes from an external mocap system. The code you provided does the camera space projection.
What is acquired from KinectFusion is not the pixels but the poses used to transform the camera space coordinates.
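To make the two steps concrete, here is a minimal sketch of how a single label could be computed. It assumes a standard pinhole camera model, metric depth, and a 4x4 camera-to-world pose H. The function names and the intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical and not taken from this repository.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera space.

    Assumes a pinhole model; returns a homogeneous point [X, Y, Z, 1].
    """
    return [(u - cx) * depth / fx, (v - cy) * depth / fy, depth, 1.0]


def scene_coordinate(u, v, depth, intrinsics, H):
    """Compute the label m = H x for one pixel.

    intrinsics: (fx, fy, cx, cy), hypothetical values supplied by the caller.
    H: 4x4 camera-to-world pose as nested lists (e.g. from KinectFusion or mocap).
    """
    x = backproject(u, v, depth, *intrinsics)
    # Plain 4x4 matrix times 4-vector product.
    m = [sum(H[r][c] * x[c] for c in range(4)) for r in range(4)]
    return m[:3]  # drop the homogeneous component


# Example: a pose that only translates the camera by (1, 2, 3).
H = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 2.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 1.0]]
label = scene_coordinate(420, 240, 2.0, (500.0, 500.0, 320.0, 240.0), H)
```

Here `label` is approximately (1.4, 2.0, 5.0). The first function is the part your code already covers; the pose H is the only piece that has to come from the tracker or ground truth.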
I'm not sure what happened to your question, but I saw it in the email.