Clarification on Landmark Scaling and Adjustments for Different Image Sizes #2718

Open
bejujEbtNfubGvtjp opened this issue Jan 13, 2025 · 3 comments

bejujEbtNfubGvtjp commented Jan 13, 2025

Hello! First of all, congratulations on such amazing work so far.

I have a few questions regarding the logic used for adapting landmarks to different image sizes in the following code snippet:

src = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041]
], dtype=np.float32)

if image_size[1] == 112:
    src[:, 0] += 8.0

As I understand it, this adjustment (src[:, 0] += 8.0) is applied when the target image size is 112x112 rather than the default 112x96. However, I have the following questions:

  1. Why is the horizontal adjustment specifically +8.0? Since 96 + 8 = 104, it doesn’t fully align with 112. Could you clarify the reasoning behind this choice?
  2. If I need to adapt this logic for images of size 512x512, how should the src values and adjustments be scaled? Is it simply a linear scaling up to 512x512, or is there additional logic to consider?
  3. How were the original src values determined? Were they derived from statistical averages of keypoints across a specific dataset, or do they originate from a standard alignment template? I would greatly appreciate an explanation of the mathematical principles involved. Additionally, if possible, could you provide a reference or source for this information?
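To make question 2 concrete, this is roughly what "simply a linear scaling" would look like. It is only a sketch of the idea being asked about: it assumes the 112x112 variant of the template (after the +8.0 shift) is the starting point, and `scale_template` is a hypothetical helper name, not something from the repository. Whether this is the approach the maintainers recommend is not confirmed anywhere in this thread.

```python
import numpy as np

# Five-point template from the snippet above (112x96 layout).
src = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
], dtype=np.float32)

# Assumption: start from the 112x112 variant, i.e. after the +8.0 x shift.
src_112 = src.copy()
src_112[:, 0] += 8.0


def scale_template(points, from_size=112, to_size=512):
    """Hypothetical helper: scale square-template landmarks uniformly."""
    return points * (to_size / from_size)


src_512 = scale_template(src_112)
print(src_512)
```

Because the 112x112 layout is square, a single factor applies to both x and y. Scaling the 112x96 layout directly to 512x512 would need different factors per axis and would distort the face proportions.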

Any clarification or guidance on these points would be greatly appreciated. Thank you for your time and for maintaining this repository!

gremlation commented Jan 14, 2025

> Why is the horizontal adjustment specifically +8.0? Since 96 + 8 = 104, it doesn’t fully align with 112.

Translating it right by 8 means that a 96-wide image is centered within 112 with padding of 8 on the left and padding of 8 on the right. Or in other words, left(8) + image(96) + right(8) = 112.
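The same arithmetic as a small sketch, in case it helps. The template values are copied from the snippet in the question; `center_offset` is a hypothetical helper name used only for illustration, not a function from the repository.

```python
import numpy as np

# Five-point template from the snippet quoted in this issue (112x96 layout).
src = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
], dtype=np.float32)


def center_offset(template_width, target_width):
    """Padding on each side when a narrower template is centered in a wider image."""
    return (target_width - template_width) / 2.0


offset = center_offset(96, 112)  # (112 - 96) / 2 = 8.0

shifted = src.copy()
shifted[:, 0] += offset  # only x moves; both layouts are 112 tall, so y is unchanged
print(offset)
print(shifted)
```

So the +8.0 is the per-side padding, not the full width difference of 16.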

@bejujEbtNfubGvtjp
Author

> Why is the horizontal adjustment specifically +8.0? Since 96 + 8 = 104, it doesn’t fully align with 112.
>
> Translating it right by 8 means that a 96-wide image is centered within 112 with padding of 8 on the left and padding of 8 on the right. Or in other words, left(8) + image(96) + right(8) = 112.

Okay, I get it now. Thank you so much.

@bejujEbtNfubGvtjp
Author

I would really appreciate it if someone could answer the remaining questions. Thank you!
