ICC profiles from file stream in pyvips #475

Open
jonasteuwen opened this issue May 27, 2024 · 5 comments

Comments

@jonasteuwen

jonasteuwen commented May 27, 2024

Problem

Currently pyvips only supports reading ICC profiles from a file as far as I can see. OpenSlide gives an io.BytesIO output.
I have modified openslide-python to output pyvips.Image.

Code example

So right now you can do this:

import io

import openslide.lowlevel as openslide_lowlevel

owsi = openslide_lowlevel.open(str(filename))
profile = openslide_lowlevel.read_icc_profile(owsi)
color_profile = io.BytesIO(profile)

With PIL you can now do this:

import PIL.ImageCms

# wsi is an already opened openslide.OpenSlide object
pil_region = wsi.read_region(coordinates, level, size)
to_profile = PIL.ImageCms.createProfile("sRGB")
intent = PIL.ImageCms.getDefaultIntent(color_profile)
color_transform = PIL.ImageCms.buildTransform(color_profile, to_profile, "RGBA", "RGBA", intent, 0)
PIL.ImageCms.applyTransform(pil_region, color_transform, inPlace=True)

This does not seem to be possible with pyvips. Do I need to dump color_profile to disk first?
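A note on the stream itself: an `io.BytesIO` object is only a file-like view over the raw profile bytes, so the bytes can be recovered in memory without writing anything to disk. A minimal sketch, assuming a hypothetical helper name `profile_to_bytes` (it is not part of pyvips or OpenSlide), that normalizes either form to plain `bytes`:

```python
import io


def profile_to_bytes(profile):
    """Return an ICC profile as plain bytes.

    Accepts bytes-like objects or a binary file-like object such as io.BytesIO.
    Hypothetical helper for illustration; not part of pyvips or OpenSlide.
    """
    if isinstance(profile, (bytes, bytearray, memoryview)):
        return bytes(profile)
    if isinstance(profile, io.BytesIO):
        # getvalue() returns the whole buffer regardless of the read position
        return profile.getvalue()
    if hasattr(profile, "read"):
        # Generic binary stream: read from the current position to the end
        return profile.read()
    raise TypeError(f"unsupported profile type: {type(profile).__name__}")
```

Using `getvalue()` rather than `read()` for `BytesIO` avoids getting an empty result when something else has already consumed the stream.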

@jcupitt
Member

jcupitt commented May 27, 2024

Hi @jonasteuwen,

pyvips lets you fetch any libvips metadata with get. For example:

image = pyvips.Image.new_from_file("CMU-1.svs")
profile = image.get("icc-profile-data")

You can see all the metadata that libvips can read for a file with vipsheader, for example:

$ vipsheader -a CMU-1.svs | grep icc
openslide.icc-size: 141992
icc-profile-data: 141992 bytes of binary data
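The byte count reported above can be cross-checked against the profile itself: an ICC profile begins with a 128-byte header that stores the total profile size as a big-endian 32-bit integer at offset 0 and the signature `acsp` at offset 36. A minimal sketch using only the standard library; `parse_icc_header` is a hypothetical helper name, not a pyvips or libvips function:

```python
import struct


def parse_icc_header(profile: bytes) -> dict:
    """Read a few fixed-position fields from an ICC profile header."""
    if len(profile) < 128:
        raise ValueError("ICC profiles start with a 128-byte header")
    # Bytes 0-3: total profile size, big-endian unsigned 32-bit
    (size,) = struct.unpack(">I", profile[0:4])
    # Bytes 36-39: the fixed profile file signature
    if profile[36:40] != b"acsp":
        raise ValueError("missing 'acsp' signature; not an ICC profile")
    return {
        "size": size,
        "device_class": profile[12:16].decode("ascii", "replace"),
        "color_space": profile[16:20].decode("ascii", "replace").strip(),
        "size_matches": size == len(profile),
    }
```

Passing the bytes returned by `image.get("icc-profile-data")` to this function should, for a well-formed profile, give a `size` equal to the length that vipsheader prints.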

The icc_transform operation in pyvips can pick up the metadata profile, so you could write:

image = pyvips.Image.new_from_file("CMU-1.svs")
srgb = image.icc_transform("srgb")

And it'll combine the slide profile with a standard srgb profile to generate a corrected sRGB image.

openslide makes RGBA images by default, though the A is almost always just 255. If you pass the rgb option to new_from_file it'll read plain RGB instead, which can give a very useful speedup.

image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True)

@jcupitt
Member

jcupitt commented May 27, 2024

Ah you want to just fetch and process a small region, is that right? You could write:

image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True).icc_transform("srgb")
for y in range(0, image.height, 256):
    for x in range(0, image.width, 256):
        tile = image.crop(x, y, min(256, image.width - x), min(256, image.height - y))
        rgb_pixel_array = tile.numpy()
        do_something_with_the_tile_data(rgb_pixel_array)

libvips is threaded and demand-driven, so it'll be efficient.
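The edge handling in the loop above (the `min(256, ...)` terms) can be factored into a small generator so it can be checked without opening a slide. A minimal sketch; `iter_tiles` is a hypothetical helper name and is not part of pyvips:

```python
from typing import Iterator, Tuple


def iter_tiles(
    width: int, height: int, tile_size: int = 256
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) rectangles covering the image in row-major order.

    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield x, y, min(tile_size, width - x), min(tile_size, height - y)
```

Each tuple can be passed straight to `image.crop(x, y, w, h)`, as in `for x, y, w, h in iter_tiles(image.width, image.height): tile = image.crop(x, y, w, h)`.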

@jonasteuwen
Author

jonasteuwen commented May 27, 2024

Hi @jcupitt,

Thank you for your prompt reply. In my code, I have two backends: pyvips directly, which will work as in your example (thanks, that's much more efficient!), and a fork of openslide-python which, instead of outputting a PIL Image, passes the data to a pyvips image. See here:

https://github.com/NKI-AI/dlup/blob/feature/libvips/dlup/backends/openslide_backend.py
https://github.com/NKI-AI/dlup/blob/feature/libvips/dlup/experimental_backends/pyvips_backend.py

When using the openslide C library, you can get the ICC profile as a BytesIO stream as shown above, and I want to use it to create an icc_transform that I can apply to your rgb_pixel_array.

I would imagine something like this:

owsi = openslide_lowlevel.open(str(filename))
profile = openslide_lowlevel.read_icc_profile(owsi)
color_profile = io.BytesIO(profile)

for y in range(0, image.height, 256):
    for x in range(0, image.width, 256):
        tile = owsi.read_region((x, y), level, (min(256, image.width - x), min(256, image.height - y))).icc_transform("srgb", input_profile=color_profile)
        rgb_pixel_array = tile.numpy()
        do_something_with_the_tile_data(rgb_pixel_array)

Note that I modified the .read_region() of the openslide library to output a pyvips.Image.

OpenSlide attaches it to the PIL image when reading the region: https://github.com/openslide/openslide-python/blob/22978715366db4ef1a3ebaab49c514131617fe66/openslide/__init__.py#L255

Can we do the same using this profile BytesIO?

@jcupitt
Member

jcupitt commented May 27, 2024

You can attach the profile from openslide_lowlevel as metadata to the pyvips image. Something like (untested):

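import io
import openslide.lowlevel as openslide_lowlevel
import pyvips

owsi = openslide_lowlevel.open(str(filename))
profile = openslide_lowlevel.read_icc_profile(owsi)
color_profile = io.BytesIO(profile).read()  # raw bytes, not the BytesIO object

# x, y, level and image come from the surrounding tile loop;
# read_region is the modified version that returns a pyvips.Image
tile = owsi.read_region((x, y), level, (min(256, image.width - x), min(256, image.height - y)))
# attach profile to image as metadata
tile.set_type(pyvips.GValue.blob_type, "icc-profile-data", color_profile)
tile = tile.icc_transform("srgb")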

Though performance might not be that great -- image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True).icc_transform("srgb") will probably be a lot quicker (but I've not benchmarked it).

Why do you need two backends?

@jonasteuwen
Author

jonasteuwen commented May 27, 2024

@jcupitt Thank you! I will give it a try!

Different backends: I found there are some minor differences between how pyvips and openslide read the images (one of them being RGB versus RGBA output, and maybe some interpolation). I don't know why: the SSIM is > 0.999, but np.allclose(a, b) is not true. While I use pyvips for new projects, I want to make sure that our older projects based on openslide keep producing the same outputs for the same data when they update the library.
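One likely reason the two checks disagree: `np.allclose` defaults to `rtol=1e-05` and `atol=1e-08`, so on 8-bit data a single value that is off by one level is enough to make it return `False`, while SSIM barely moves. A minimal sketch for quantifying the mismatch, assuming NumPy is installed; `compare_tiles` is a hypothetical helper and the arrays are synthetic, not slide data:

```python
import numpy as np


def compare_tiles(a: np.ndarray, b: np.ndarray) -> dict:
    """Summarize per-channel differences between two H x W x C uint8 tiles."""
    # Keep only the first three channels so RGBA and RGB tiles can be mixed,
    # and widen to int16 first because uint8 subtraction wraps around.
    a_rgb = a[..., :3].astype(np.int16)
    b_rgb = b[..., :3].astype(np.int16)
    diff = np.abs(a_rgb - b_rgb)
    return {
        "max_abs_diff": int(diff.max()),
        "fraction_differing": float((diff > 0).mean()),
        "allclose_default": bool(np.allclose(a_rgb, b_rgb)),
        "allclose_atol_1": bool(np.allclose(a_rgb, b_rgb, rtol=0, atol=1)),
    }


# Synthetic example: an RGBA tile and an RGB tile that differ by one level
rgba = np.full((4, 4, 4), 200, dtype=np.uint8)
rgb = rgba[..., :3].copy()
rgb[0, 0, 0] = 201
report = compare_tiles(rgba, rgb)
```

If `max_abs_diff` stays at 1 or 2 on real tiles, rounding in the two decode paths would be a plausible explanation; larger or spatially clustered differences would point more toward resampling.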
