fix: rotate images on source acquisition #51
If the first thing we do after lazily opening the image is call exif_transpose, we end up causing some issues with that function's logic. Specifically, TIFF images with a TIFF orientation tag will be double rotated. The exif_transpose function reads the TIFF orientation tag (in Pillow, EXIF also includes these TIFF tags) before the image is loaded into memory, but once the image is loaded into memory the TIFF plugin rotates the image based on that same tag, and then exif_transpose rotates it again. To avoid this, do something else which will force the image to be loaded into memory before calling exif_transpose. Bit hacky but hey tis what it is.
OK! Who is ready for some fun explaining all this? We have a couple of examples which show the issues we're trying to resolve with this rotation problem:

https://data.nhm.ac.uk/media/c11e58f9-e0ef-4d5e-8171-9cd51d345565 a lovely picture of a bee
https://data.nhm.ac.uk/media/7aeb4a93-26df-467d-91c7-555826c92885 a lovely picture of a moth

In the examples above, the moth is shown upside down and the bee is the correct way up. In this case we're using the EMu derivative images to serve up both of these images. The bee has no EXIF orientation data but the moth does; we just ignore it during source acquisition.

This PR is primarily attempting to solve the problem with the upside down moth. However, in doing so we needed (or at least thought we did) to keep the bee the right way up. The change in this PR, both commits in fact, fixes the issue with the moth picture. However, both commits break any images created based on the original image of the bee (not ones from the derivatives, they are fine).

The TL;DR explanation of this is that the bee's original TIFF image has an incorrectly applied TIFF orientation tag. You can actually see this currently on live now (21/07/24), where https://data.nhm.ac.uk/media/c11e58f9-e0ef-4d5e-8171-9cd51d345565/full/max/0/default.jpg shows the bee on its side.

The details:

The JPEG derivative images created by EMu all have no EXIF orientation tag present. The current code and the new code both just do nothing to rotate these images and you end up with a correctly orientated image.

The original TIFF image has a TIFF orientation tag (not EXIF) which instructs any code reading the raw image data in the file to rotate it counter-clockwise 270°. This causes the image to be presented on its side (as at the link above), as the raw image data is actually the correct way up. This is why I believe the problem with this bee original image is actually with the original TIFF, not our processing of it.
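To make the rotation arithmetic concrete, here is a minimal pure-Python sketch. It does not use Pillow: the grid of letters and the `rotate_ccw` helpers are hypothetical stand-ins for an image and for whatever code honours the orientation tag. It assumes raw data that is already upright and a tag asking for a 270° counter-clockwise rotation, as described for the bee TIFF.

```python
def rotate_ccw_90(grid):
    """Rotate a grid (list of rows, top row first) 90 degrees counter-clockwise."""
    # Transposing and then reversing the row order turns the right-hand
    # column into the top row.
    return [list(row) for row in zip(*grid)][::-1]


def rotate_ccw(grid, degrees):
    """Rotate a grid counter-clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    result = [list(row) for row in grid]
    for _ in range((degrees // 90) % 4):
        result = rotate_ccw_90(result)
    return result


# Stand-in for the raw pixel data, which is already the correct way up.
UPRIGHT = [
    ["a", "b"],
    ["c", "d"],
]

# Honouring the tag once puts the upright data on its side.
once = rotate_ccw(UPRIGHT, 270)
print(once)   # [['c', 'a'], ['d', 'b']]

# Honouring it a second time gives 540 degrees in total, i.e. 180: upside down.
twice = rotate_ccw(once, 270)
print(twice)  # [['d', 'c'], ['b', 'a']]
```

So a single application of an incorrect 270° tag gives an image on its side, and applying the same tag twice gives an image rotated by 180°.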
Now to explain why we have a second commit in this PR with a subtle change where 2 lines are reordered. I've explained this a bit in the commit message but didn't want to overload that with information, so I'll do a more thorough explanation here.

We use Pillow to load images from their source and convert them to JPEG before they enter the IIIF processing pipeline. Pillow uses lazy loading and won't actually load the image's data into memory until you try to do something with it (like rotate it, or manipulate it in some way).

In the first commit on this PR, we remove the EXIF stripping stuff and insert a call to the exif_transpose function. That bee is upside down 🙁. How has this happened? Turns out the ... The fix for this is in the second commit where we call the ...

So overall, this PR fixes the moth, and the bee original image is just incorrectly rotated at source, so we need to get a curator to reupload the image to fix that. I may raise this as an issue on Pillow but it may just be expected behaviour. I'm no TIFF, TIFF tag, or EXIF expert so I can't say for sure.
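For anyone wanting to see the ordering in code, here is a minimal sketch of "force a load, then transpose". It is not the code from this PR. `open_upright` and `make_rotated_jpeg` are hypothetical names, and the explicit `image.load()` call is an assumption standing in for whatever operation the second commit uses to force the image into memory. The test image is an in-memory JPEG, which only exercises the general path. The TIFF double rotation described above depends on the TIFF plugin and may vary between Pillow versions.

```python
import io

from PIL import Image, ImageOps

ORIENTATION = 0x0112  # orientation tag ID, shared by TIFF and EXIF


def open_upright(fp):
    """Hypothetical helper: open an image and apply its orientation tag once."""
    image = Image.open(fp)  # lazy: pixel data has not been read yet
    image.load()  # assumption: force the pixel data into memory first
    # exif_transpose returns a rotated copy with the orientation tag removed.
    return ImageOps.exif_transpose(image)


def make_rotated_jpeg():
    """Build a 40x20 in-memory JPEG tagged with orientation 6, for illustration."""
    exif = Image.Exif()
    exif[ORIENTATION] = 6
    buffer = io.BytesIO()
    Image.new("RGB", (40, 20), "white").save(
        buffer, format="JPEG", exif=exif.tobytes()
    )
    buffer.seek(0)
    return buffer


upright = open_upright(make_rotated_jpeg())
print(upright.size, upright.getexif().get(ORIENTATION))
```

With the orientation applied and the tag removed from the returned image, anything downstream that also honours orientation tags has nothing left to apply a second time.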
It would be good to have some more test cases for this, but from testing with the one example I have, this works fine.
Staging (running this branch): https://data-nlb-stg-01.nhm.ac.uk/media/7aeb4a93-26df-467d-91c7-555826c92885
Live (running v0.16.1): https://data.nhm.ac.uk/media/7aeb4a93-26df-467d-91c7-555826c92885
Essentially this change rotates the images on entry to the server (i.e. when we get the image from source) and strips any orientation value from the EXIF data. This puts the onus on the person putting the file into EMu and on EMu to do the right thing™️ and be consistent with any orientation tag that may or may not exist.