
Unable to decompress data in image_client.py #19

Closed · BBBBBBob opened this issue Dec 4, 2024 · 19 comments

BBBBBBob commented Dec 4, 2024

Thank you for the open-source project!

When I was trying to receive the image from the camera using image_client, an error occurred:
An error occurred while receiving data: Error -3 while decompressing data: incorrect header check

It seems the error comes from zlib.decompress().
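
For reference, the same message can be reproduced by handing zlib.decompress() bytes that were never zlib-compressed, which makes me suspect the client is interpreting the stream differently from how the server produces it:

# Minimal reproduction (not the project's code): decompressing bytes that are
# not zlib data raises the same "incorrect header check" error.
import zlib

raw = b"\xff\xd8\xff\xe0 not zlib data"   # e.g. the start of a raw JPEG buffer
try:
    zlib.decompress(raw)
except zlib.error as e:
    print(e)   # Error -3 while decompressing data: incorrect header check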

Can you give a solution to this issue? Thank you!

BBBBBBob changed the title from "Unable to detect camera when running image_server.py" to "Unable to decompress data in image_client.py" on Dec 4, 2024
silencht (Collaborator) commented Dec 4, 2024

Please try setting the Unit_Test parameters for both the image client and server to False.

BBBBBBob (Author) commented Dec 4, 2024

> Please try setting the Unit_Test parameters for both the image client and server to False.

Thank you for your reply. Where can I find Unit_Test in both files? Is it a parameter of a function?

silencht (Collaborator) commented Dec 4, 2024

This is image_server.py's parameter location, and this is image_client.py's parameter location.
They are initialization parameters of the class, used for unit testing. When running the normal teleoperation program, they can be set to False.

BBBBBBob (Author) commented Dec 4, 2024

> This is image_server.py's parameter location, and this is image_client.py's parameter location. They are initialization parameters of the class, used for unit testing. When running the normal teleoperation program, they can be set to False.

Thank you! I am currently using the h1_2 branch, but the links you provided come from the g1 branch. Should I switch to g1 for further H1_2 teleoperation development?

silencht (Collaborator) commented Dec 4, 2024

OK, I misunderstood your robot type. There is no need to switch branches currently.
Could you please provide more information to help resolve the issue? For example, the terminal output messages (from both the client and the server).

BBBBBBob (Author) commented Dec 4, 2024

> OK, I misunderstood your robot type. There is no need to switch branches currently. Could you please provide more information to help resolve the issue? For example, the terminal output messages (from both the client and the server).

Here is a screenshot from the server. I changed cap = cv2.VideoCapture(0, cv2.CAP_V4L2) to cap = cv2.VideoCapture(2, cv2.CAP_V4L2) so that the camera could be connected:
[screenshot 20241204-190044: server terminal output]
When starting the client, the first error occurs:
[screenshot 20241204-190050: first client error]
and another error appears after restarting the client:
[screenshot 20241204-190054: second client error]

silencht (Collaborator) commented Dec 5, 2024

Currently, in the H1_2 branch and the original TeleVision project, the default head camera device is a stereo RGB camera (its resolution is currently set to (1280*2)*720).

I noticed that the device you connected is a RealSense D435i, which might be causing the issue. The inconsistency in devices requires some modifications in the code.

Here are some possible solutions:

  1. Use software like Cheese to confirm that the /dev/video<number> device (currently set to 2) is indeed an RGB camera (and not a depth or IR camera).

  2. Try adjusting the image resolution code on the server side:

     cap.set(cv2.CAP_PROP_FRAME_WIDTH, 2560)
     cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

     so that it matches a resolution suitable for the RealSense (e.g., height 480 * width 640). A minimal sketch of this change appears at the end of this comment.

  3. Modify the TeleVision.py file to include a function similar to main_image_monocular in the G1 branch:

     async def main_image_monocular(self, session, fps=60):

     and use it. The current implementation in the H1_2 branch is similar to the main_image_binocular function in the G1 branch.

  4. Remove all width coefficients related to stereo camera resolutions. For instance, this line of code:

     self.img_shape = (img_shape[0], 2*img_shape[1], 3)

     should be modified to:

     self.img_shape = (img_shape[0], img_shape[1], 3)

Due to the inconsistency in camera hardware, the current image server code may not meet the needs of all users.
If you have further questions, feel free to update your issue here.
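
For point 2 above (together with the single-image shape in point 4), here is a minimal sketch of what the server-side capture could look like for a monocular RealSense RGB stream. The device index and the 640x480 resolution are assumptions for a D435i color stream, not the project's defaults:

# Minimal sketch, not the project's exact server code: open a single RealSense
# RGB stream via V4L2 instead of the 2560x720 stereo default.
import cv2

cap = cv2.VideoCapture(2, cv2.CAP_V4L2)      # /dev/video2, as in your setup
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)       # single image width, no *2 factor
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

ret, frame = cap.read()
if ret:
    print(frame.shape)                       # expect (480, 640, 3)
cap.release()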

BBBBBBob (Author) commented Dec 5, 2024

Thank you for your answer! Since our H1_2 also uses a D435i, can I modify image_server.py to be similar to the G1 version?

silencht (Collaborator) commented Dec 5, 2024

Sure, but some modifications will be needed to adapt it to files like TeleVision.py.
To be honest, the code in the G1 branch is more refined and more readable. However, adapting it to H1_2 will take some effort on your part. If you encounter any issues, feel free to reach out to us in the issues section.

BBBBBBob (Author) commented Dec 5, 2024

> Sure, but some modifications will be needed to adapt it to files like TeleVision.py. To be honest, the code in the G1 branch is more refined and more readable. However, adapting it to H1_2 will take some effort on your part. If you encounter any issues, feel free to reach out to us in the issues section.

I am currently testing whether our station can receive the image from the camera, but TeleVision.py is not relevant to that part, if I understand correctly?

silencht (Collaborator) commented Dec 5, 2024

Yes. The image_receiver/image_client is responsible for receiving images and then passing them to TeleVision.
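
In rough outline, the client's receive path looks like the sketch below. This is only a hedged illustration, assuming a ZeroMQ subscriber and zlib-compressed JPEG frames; the address, topic, and framing in the actual image_client may differ.

# Hedged sketch of a receive/decompress/decode loop; the real image_client's
# transport and framing may differ (address and compression are assumptions).
import cv2
import numpy as np
import zlib
import zmq

ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.connect("tcp://192.168.123.162:5555")   # server address: assumption
sock.setsockopt(zmq.SUBSCRIBE, b"")

while True:
    payload = sock.recv()                    # one compressed frame per message
    jpeg_bytes = zlib.decompress(payload)    # this is where your error is raised
    frame = cv2.imdecode(np.frombuffer(jpeg_bytes, dtype=np.uint8), cv2.IMREAD_COLOR)
    if frame is not None:
        cv2.imshow("image_client sketch", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break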

BBBBBBob (Author) commented Dec 12, 2024

We are using the newest generation of Inspire hands, whose finger angle range is 0 to 1000, but the code in the H1_2 branch uses a different hand's range. How can I map the angles calculated from IK to the Inspire hand's range?

BBBBBBob (Author) commented Dec 12, 2024

Another issue is that the IK computation becomes quite slow (around 0.5 seconds to find a solution) when using the Vision Pro. Is that normal?

silencht (Collaborator) commented:

> We are using the newest generation of Inspire hands, whose finger angle range is 0 to 1000, but the code in the H1_2 branch uses a different hand's range. How can I map the angles calculated from IK to the Inspire hand's range?

We have not yet used or adapted the new generation of Inspire dexterous hands. The hand mapping uses the dex-retargeting library; perhaps you can start by referring to and debugging with that library.
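
If you only need to project the retargeted joint angles onto the new hand's 0-1000 command range, a plain linear remap is a reasonable starting point. The sketch below is hedged: the per-joint limits and the open/close direction are placeholders that you should take from your hand's documentation or from the URDF used by dex-retargeting.

# Hedged sketch: linearly map joint angles (radians) into [0, 1000] commands.
# q_min/q_max and the command direction are placeholders -- check whether 0 or
# 1000 means "fully open" for your Inspire hand generation and flip if needed.
import numpy as np

def to_inspire_range(q, q_min, q_max):
    """Clip q to [q_min, q_max] and rescale it linearly to integers in 0..1000."""
    q = np.clip(np.asarray(q, dtype=float), q_min, q_max)
    ratio = (q - q_min) / (q_max - q_min)
    return np.rint(ratio * 1000).astype(int)

# example with made-up limits for six finger joints
q_min = np.zeros(6)
q_max = np.full(6, 1.7)                      # rad, placeholder
print(to_inspire_range([0.0, 0.85, 1.7, 0.4, 1.2, 2.0], q_min, q_max))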

silencht (Collaborator) commented:

> Another issue is that the IK computation becomes quite slow (around 0.5 seconds to find a solution) when using the Vision Pro. Is that normal?

Inverse kinematics solving and the Apple Vision Pro data transmission/reception are not closely related.

Did you enable Meshcat visualization? Meshcat visualization is quite time-consuming, so enabling it is recommended only during debugging.

BBBBBBob (Author) commented Dec 12, 2024

> Another issue is that the IK computation becomes quite slow (around 0.5 seconds to find a solution) when using the Vision Pro. Is that normal?
>
> Inverse kinematics solving and the Apple Vision Pro data transmission/reception are not closely related.
>
> Did you enable Meshcat visualization? Meshcat visualization is quite time-consuming, so enabling it is recommended only during debugging.

Hi, I have adapted the if __name__ == '__main__' part of robot_arm.py to our H1_2 code, and there the IK runs in around 0.01-0.02 seconds per loop, but it takes around 0.5 seconds in teleop_hand_and_arm.py without any modification to the IK code. Also, we did not activate MeshcatVisualizer.

silencht (Collaborator) commented:

I am not sure how you are measuring the elapsed time, but please note the time.sleep(0.01) call in the robot_arm.py example program.

On my device (Intel i7-11700 and NVIDIA RTX 3090), the solve_ik() function typically takes about 5 milliseconds. If I change Visualization to True in arm_ik = G1_29_ArmIK(Unit_Test = True, Visualization = False), the elapsed time goes up to 10 ms or more.

Here is my timing measurement of the robot_arm_ik.py program, along with the terminal output:

  • code:
import time
import numpy as np
import pinocchio as pin
# (imports needed by this snippet; they appear at the top of robot_arm_ik.py,
#  and G1_29_ArmIK is defined earlier in the same file)

if __name__ == "__main__":
    arm_ik = G1_29_ArmIK(Unit_Test = True, Visualization = False)

    # initial position
    L_tf_target = pin.SE3(
        pin.Quaternion(1, 0, 0, 0),
        np.array([0.25, +0.25, 0.1]),
    )

    R_tf_target = pin.SE3(
        pin.Quaternion(1, 0, 0, 0),
        np.array([0.25, -0.25, 0.1]),
    )

    rotation_speed = 0.005
    noise_amplitude_translation = 0.001
    noise_amplitude_rotation = 0.1

    user_input = input("Please enter the start signal (enter 's' to start the subsequent program):\n")
    if user_input.lower() == 's':
        step = 0
        while True:
            # Apply rotation noise with bias towards y and z axes
            rotation_noise_L = pin.Quaternion(
                np.cos(np.random.normal(0, noise_amplitude_rotation) / 2),0,np.random.normal(0, noise_amplitude_rotation / 2),0).normalized()  # y bias

            rotation_noise_R = pin.Quaternion(
                np.cos(np.random.normal(0, noise_amplitude_rotation) / 2),0,0,np.random.normal(0, noise_amplitude_rotation / 2)).normalized()  # z bias
            
            if step <= 120:
                angle = rotation_speed * step
                L_tf_target.rotation = (rotation_noise_L * pin.Quaternion(np.cos(angle / 2), 0, np.sin(angle / 2), 0)).toRotationMatrix()  # y axis
                R_tf_target.rotation = (rotation_noise_R * pin.Quaternion(np.cos(angle / 2), 0, 0, np.sin(angle / 2))).toRotationMatrix()  # z axis
                L_tf_target.translation += (np.array([0.001,  0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
                R_tf_target.translation += (np.array([0.001, -0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
            else:
                angle = rotation_speed * (240 - step)
                L_tf_target.rotation = (rotation_noise_L * pin.Quaternion(np.cos(angle / 2), 0, np.sin(angle / 2), 0)).toRotationMatrix()  # y axis
                R_tf_target.rotation = (rotation_noise_R * pin.Quaternion(np.cos(angle / 2), 0, 0, np.sin(angle / 2))).toRotationMatrix()  # z axis
                L_tf_target.translation -= (np.array([0.001,  0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
                R_tf_target.translation -= (np.array([0.001, -0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
            time1 = time.time()
            arm_ik.solve_ik(L_tf_target.homogeneous, R_tf_target.homogeneous)
            time2 = time.time()
            print(f"ik time: {time2-time1}")

            step += 1
            if step > 240:
                step = 0
            time.sleep(0.1)
  • output:
(tv) unitree@unitree:~/avp_teleoperate/teleop/robot_control$ python robot_arm_ik.py 
Please enter the start signal (enter 's' to start the subsequent program):
s

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

ik time: 0.07529926300048828
ik time: 0.0051305294036865234
ik time: 0.005105018615722656
ik time: 0.0052225589752197266
ik time: 0.005151033401489258
ik time: 0.00518345832824707
ik time: 0.005174398422241211
ik time: 0.005191326141357422
ik time: 0.005120515823364258
ik time: 0.005214214324951172
ik time: 0.005171775817871094
ik time: 0.010285377502441406
ik time: 0.005167245864868164
ik time: 0.005070209503173828
ik time: 0.005173444747924805
ik time: 0.005149126052856445
ik time: 0.005143880844116211

Also, if your input data is very jittery (in other words, the actual data provides unreliable initial values for the IK), the elapsed time will again go up significantly, as shown by this code and its results:

  • code
if __name__ == "__main__":
    arm_ik = G1_29_ArmIK(Unit_Test = True, Visualization = False)

    # initial position
    L_tf_target = pin.SE3(
        pin.Quaternion(1, 0, 0, 0),
        np.array([0.25, +0.25, 0.1]),
    )

    R_tf_target = pin.SE3(
        pin.Quaternion(1, 0, 0, 0),
        np.array([0.25, -0.25, 0.1]),
    )

    rotation_speed = 0.005
    noise_amplitude_translation = 1.0
    noise_amplitude_rotation = 1.0

    user_input = input("Please enter the start signal (enter 's' to start the subsequent program):\n")
    if user_input.lower() == 's':
        step = 0
        while True:
            # Apply rotation noise with bias towards y and z axes
            rotation_noise_L = pin.Quaternion(
                np.cos(np.random.normal(0, noise_amplitude_rotation) / 2),0,np.random.normal(0, noise_amplitude_rotation / 2),0).normalized()  # y bias

            rotation_noise_R = pin.Quaternion(
                np.cos(np.random.normal(0, noise_amplitude_rotation) / 2),0,0,np.random.normal(0, noise_amplitude_rotation / 2)).normalized()  # z bias
            
            if step <= 120:
                angle = rotation_speed * step
                L_tf_target.rotation = (rotation_noise_L * pin.Quaternion(np.cos(angle / 2), 0, np.sin(angle / 2), 0)).toRotationMatrix()  # y axis
                R_tf_target.rotation = (rotation_noise_R * pin.Quaternion(np.cos(angle / 2), 0, 0, np.sin(angle / 2))).toRotationMatrix()  # z axis
                L_tf_target.translation += (np.array([0.001,  0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
                R_tf_target.translation += (np.array([0.001, -0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
            else:
                angle = rotation_speed * (240 - step)
                L_tf_target.rotation = (rotation_noise_L * pin.Quaternion(np.cos(angle / 2), 0, np.sin(angle / 2), 0)).toRotationMatrix()  # y axis
                R_tf_target.rotation = (rotation_noise_R * pin.Quaternion(np.cos(angle / 2), 0, 0, np.sin(angle / 2))).toRotationMatrix()  # z axis
                L_tf_target.translation -= (np.array([0.001,  0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
                R_tf_target.translation -= (np.array([0.001, -0.001, 0.001]) + np.random.normal(0, noise_amplitude_translation, 3))
	    
            time1 = time.time()
            arm_ik.solve_ik(L_tf_target.homogeneous, R_tf_target.homogeneous)
            time2 = time.time()
            print(time2-time1)

            step += 1
            if step > 240:
                step = 0
            time.sleep(0.1)
  • output
(tv) unitree@unitree:~/avp_teleoperate/teleop/robot_control$ python robot_arm_ik.py 
Please enter the start signal (enter 's' to start the subsequent program):
s

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

0.06438446044921875
0.009028434753417969
0.012813329696655273
0.01125478744506836
0.011279582977294922
0.011170148849487305
0.011179447174072266
0.010428905487060547
0.00896596908569336
0.010314464569091797
0.009021282196044922
0.010303497314453125
0.011131763458251953
0.009010553359985352
0.008991003036499023
0.010547161102294922
0.009634017944335938
0.009776830673217773
0.009676218032836914
0.008214712142944336
0.009602546691894531
0.009057283401489258
0.009717941284179688
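
As an aside, time.perf_counter() is a monotonic, higher-resolution clock than time.time(), which matters when the intervals being measured are only a few milliseconds. A drop-in sketch (the sleep stands in for the solver call):

# Optional refinement: use time.perf_counter() instead of time.time() when
# timing intervals of a few milliseconds.
import time

t0 = time.perf_counter()
time.sleep(0.005)                            # stand-in for arm_ik.solve_ik(...)
t1 = time.perf_counter()
print(f"ik time: {t1 - t0:.6f} s")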

BBBBBBob (Author) commented Dec 24, 2024

Hi, I am reading the comments about coordinate systems in tv_wrapper, and I am a little confused by the wrist coordinate conventions stated in the comments:

 (Left Wrist) XR/AppleVisionPro Convention:
    - the x-axis pointing from wrist toward middle.
    - the y-axis pointing from index toward pinky.
    - the z-axis pointing from palm toward back of the hand.

 (Right Wrist) XR/AppleVisionPro Convention:
    - the x-axis pointing from wrist toward middle.
    - the y-axis pointing from pinky toward index.
    - the z-axis pointing from palm toward back of the hand.
    
 p.s. From website: https://registry.khronos.org/OpenXR/specs/1.1/man/html/openxr.html.
     You can find **(Left/Right Wrist) XR/AppleVisionPro Convention** related information like this below:
       "The wrist joint is located at the pivot point of the wrist, which is location invariant when twisting the hand without moving the forearm. 
        The backward (+Z) direction is parallel to the line from wrist joint to middle finger metacarpal joint, and points away from the finger tips. 
        The up (+Y) direction points out towards back of the hand and perpendicular to the skin at wrist. 
        The X direction is perpendicular to the Y and Z directions and follows the right hand rule."
     Note: The above context is of course under **(basis) OpenXR Convention**.

The coordinate conventions for the two hands do not align with the statement from the website. Can you explain the difference?

silencht (Collaborator) commented:


To ensure that each issue only addresses one specific problem, this issue is being closed, and a new issue #25 has been created.
