Clarification on Landmark Scaling and Adjustments for Different Image Sizes #2718

bejujEbtNfubGvtjp · 2025-01-13T07:25:50Z

Hello! First of all, congratulations on such amazing work so far.

I have a few questions regarding the logic used for adapting landmarks to different image sizes in the following code snippet:

src = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041]
], dtype=np.float32)

if image_size[1] == 112:
    src[:, 0] += 8.0

I have come to understand that this adjustment (src[:, 0] += 8.0) is applied when the target image size is 112x112, as opposed to the default size of 112x96. However, I have the following questions:

Why is the horizontal adjustment specifically +8.0? Since 96 + 8 = 104, it doesn’t fully align with 112. Could you clarify the reasoning behind this choice?
If I need to adapt this logic for images of size 512x512, how should the src values and adjustments be scaled? Is it simply a linear scaling of the coordinates up to 512x512, or is there additional logic to consider?
How were the original src values determined? Were they derived from statistical averages of keypoints across a specific dataset, or do they originate from a standard alignment template? I would greatly appreciate an explanation of the mathematical principles involved. Additionally, if possible, could you provide a reference or source for this information?

Any clarification or guidance on these points would be greatly appreciated. Thank you for your time and for maintaining this repository!

gremlation · 2025-01-14T18:04:34Z

Why is the horizontal adjustment specifically +8.0? Since 96 + 8 = 104, it doesn’t fully align with 112.

Translating it right by 8 means that a 96-wide image is centered within 112 with padding of 8 on the left and padding of 8 on the right. Or in other words, left(8) + image(96) + right(8) = 112.
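The centering arithmetic above can be sketched in a few lines. This is only an illustration, not the repository's own code: `center_horizontally` and `scale_template` are hypothetical helper names, and the uniform scaling to 512 is an assumption that nobody in this thread has confirmed.

```python
import numpy as np

# Five-point template from the question, defined for a 112x96 (height x width) crop.
SRC_112X96 = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
], dtype=np.float32)


def center_horizontally(template, base_width, target_width):
    """Shift x-coordinates so a base_width-wide template is centered in target_width."""
    pad = (target_width - base_width) / 2.0  # equal padding on the left and on the right
    shifted = template.copy()                # leave the original template untouched
    shifted[:, 0] += pad
    return shifted


def scale_template(template, base_size, target_size):
    """Hypothetical helper: scale a square template uniformly (an assumption, see above)."""
    return template * (float(target_size) / float(base_size))


# (112 - 96) / 2 = 8, which is where the +8.0 comes from.
src_112 = center_horizontally(SRC_112X96, base_width=96, target_width=112)

# Assumed extension to 512x512: multiply every coordinate by 512 / 112.
src_512 = scale_template(src_112, base_size=112, target_size=512)
```

Only the x-coordinates move in the centering step, because the height is 112 in both layouts. Whether a larger target such as 512x512 should be handled by plain uniform scaling is exactly the open question above, so treat `src_512` as a starting point to check against the library rather than a confirmed answer.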

bejujEbtNfubGvtjp · 2025-01-15T05:14:54Z

Okay, I get it now. Thank you so much.

bejujEbtNfubGvtjp · 2025-01-15T05:16:07Z

I would really appreciate it if someone could answer the remaining questions. Thank you!

bejujEbtNfubGvtjp closed this as completed Jan 15, 2025

bejujEbtNfubGvtjp reopened this Jan 15, 2025
