Image Optimizer can't detect duplicate images #77

Open
austinwendt-wp opened this issue Dec 20, 2022 · 1 comment

austinwendt-wp commented Dec 20, 2022

What’s not working: if you upload the same image twice, Image Optimizer will only detect it once.

  • Reason: Images are considered unique based on a hash of the image file content, so if you upload the same file twice, the second copy is not detected because its hash duplicates the first.
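
To illustrate the reason above, here is a minimal sketch of how indexing by content hash collapses two uploads of the same file into one entry. The `ScannedImage` shape, the `indexByContentHash` helper, and the file paths are assumptions made up for this example; they are not taken from the addon's source.

```typescript
import { createHash } from "crypto";

// Hypothetical shape; the addon's real types may differ.
interface ScannedImage {
  filePath: string;
  fileHash: string;
}

// md5 of the file content only, as described in the issue.
function contentHash(content: Buffer): string {
  return createHash("md5").update(content).digest("hex");
}

// Index images by content hash. A second file with the same bytes
// overwrites the first entry, so only one of them is tracked.
function indexByContentHash(
  files: Array<{ filePath: string; content: Buffer }>
): Record<string, ScannedImage> {
  const index: Record<string, ScannedImage> = {};
  for (const { filePath, content } of files) {
    const fileHash = contentHash(content);
    index[fileHash] = { filePath, fileHash };
  }
  return index;
}

// Two hypothetical uploads with identical bytes but different paths.
const sameBytes = Buffer.from("pretend these are JPEG bytes");
const collapsed = indexByContentHash([
  { filePath: "uploads/2022/12/cool-image.jpg", content: sameBytes },
  { filePath: "uploads/2022/12/cool-image-1.jpg", content: sameBytes },
]);

console.log(Object.keys(collapsed).length); // 1, although two files were scanned
```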

What should be happening: if you upload the same image twice, you should be able to optimize both of them.

  • Possible fixes: Look at the last-modified time, the creation time, or the file path.

Steps to recreate:

  • Upload an image (cool-image.jpg) to the WordPress media library.
  • Scan for images with Image Optimizer.
  • Optimize all found images.
  • Upload the same image (cool-image.jpg) to the WordPress media library again.
  • Scan for images. The overview tab will report that there are 10 new images found to optimize.
  • Click "view images" and you won't see any images available for optimization.

veryspry (Contributor) commented Dec 20, 2022

Can't believe this didn't come up sooner!

This is something that I made a mental note of when first writing the addon but then never had the time to implement a fix. My two cents: I think using a file path is the best route here.

Off the top of my head, there are two ways a file path could be used to accomplish this:

  1. Append the file path to the file buffer that is passed to the md5 hash function.
  2. Store a hash map of processed image file paths and use it in conjunction with the file md5 hashes to determine whether an image has been processed yet (e.g. an image has been processed only if the image's md5 hash is found in imageData and the image path is in the hash map).
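
As a rough sketch of the first option, the file path can be fed into the md5 digest after the file content, so identical bytes at different paths produce different keys. The `pathAwareHash` name and the example paths are hypothetical, not part of the addon.

```typescript
import { createHash } from "crypto";

// Hypothetical helper: hashes the file content followed by the file path,
// which is equivalent to appending the path to the buffer before hashing.
function pathAwareHash(filePath: string, content: Buffer): string {
  return createHash("md5")
    .update(content)
    .update(filePath, "utf8")
    .digest("hex");
}

// Identical bytes at two hypothetical paths now get distinct keys.
const bytes = Buffer.from("pretend these are JPEG bytes");
const first = pathAwareHash("uploads/2022/12/cool-image.jpg", bytes);
const second = pathAwareHash("uploads/2022/12/cool-image-1.jpg", bytes);

console.log(first === second); // false
```

One trade-off of this approach: renaming or moving a file changes its key, so an already optimized image would look new on the next scan.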

The second option (path hash map) seems more favorable to me as I think it will be more flexible, less prone to bugs and easier to maintain.
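
A minimal sketch of that second option might look like the following. It assumes `imageData` is a plain object keyed by md5 digest and that processed paths are kept in a `Set`; both shapes, the `isProcessed` name, and the sample values are assumptions for illustration only.

```typescript
// Hypothetical stand-in for the addon's stored image data, keyed by md5 digest.
type ImageData = Record<string, { fileHash: string }>;

// An image counts as processed only if its hash is known AND its path
// has been recorded as processed.
function isProcessed(
  imageData: ImageData,
  processedPaths: Set<string>,
  filePath: string,
  fileHash: string
): boolean {
  const hashKnown = Object.prototype.hasOwnProperty.call(imageData, fileHash);
  return hashKnown && processedPaths.has(filePath);
}

// Made-up sample state: one optimized image at one path.
const imageData: ImageData = { abc123: { fileHash: "abc123" } };
const processedPaths = new Set<string>(["uploads/2022/12/cool-image.jpg"]);

// The original upload is processed; a duplicate at a new path is not.
console.log(isProcessed(imageData, processedPaths, "uploads/2022/12/cool-image.jpg", "abc123")); // true
console.log(isProcessed(imageData, processedPaths, "uploads/2022/12/cool-image-1.jpg", "abc123")); // false
```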

Edit:

This will be a bit tricky as the images are indexed in a hash map with their md5 hash digest as the key. This occurs during the image scanning process: https://github.com/getflywheel/local-addon-image-optimizer/blob/master/src/main/scanImagesProcess.ts#L12-L40

That said, the second approach would need to change a bit, since I had forgotten exactly how images were indexed when I first wrote it up.
