
Image Preprocessing techniques for selfie segmentation #5730

Open
codewarrior26 opened this issue Nov 14, 2024 · 3 comments
Assignees
Labels
os:linux-non-arm Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices. platform:python MediaPipe Python issues stat:awaiting response Waiting for user response task:image segmentation Issues related to image segmentation: Locate objects and create image masks with labels

Comments

@codewarrior26

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Linux Ubuntu

MediaPipe Tasks SDK version

No response

Task name (e.g. Image classification, Gesture recognition etc.)

Selfie Segmentation

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

Need help to preprocess input images

Describe the expected behaviour

Currently, the model card doesn't specify the preprocessing techniques to be used.

Standalone code/steps you may have used to try to get what you need

I am using the selfie segmentation tflite model (float 32) in Python without the mediapipe library, because I need to deploy the model on an edge device with compute and memory constraints.
The model card specifies that images have to be normalized to [0.0, 1.0]. I tried normalizing my input image by dividing it by 255, but that didn't help with the accuracy of the model. I would appreciate some help in understanding what preprocessing techniques to use on input images so that I can reproduce the accurate results I see when I use the mediapipe library directly. Thanks!

Other info / Complete Logs

No response

@kuaashish kuaashish added os:linux-non-arm Issues on linux distributions which run on x86-64 architecture. DOES NOT include ARM devices. platform:python MediaPipe Python issues task:image segmentation Issues related to image segmentation: Locate objects and create image masks with labels labels Nov 15, 2024
@kuaashish
Collaborator

Hi @codewarrior26,

Could you please share the complete instructions you are following from our documentation to help us better understand the issue? Additionally, providing the standalone code you are using would be very helpful.

Thank you!!

@kuaashish kuaashish added the stat:awaiting response Waiting for user response label Nov 15, 2024
@codewarrior26
Author

Hi @kuaashish, thank you for responding. This issue is not a bug report, but rather a request for help in understanding what steps to follow to preprocess the image before making an inference with the selfie segmentation model through tensorflow-lite (and not through mediapipe directly). Let me elaborate:

Method-1 - directly using mediapipe library to make the inference

import cv2
import numpy as np
import mediapipe as mp

mp_selfie_segmentation = mp.solutions.selfie_segmentation

with mp_selfie_segmentation.SelfieSegmentation() as selfie_segmentation:
    # Convert the BGR image to RGB and process it with MediaPipe Selfie Segmentation.
    results = selfie_segmentation.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    # Composite: keep the person, blur the background.
    blurred_image = cv2.GaussianBlur(image, (55, 55), 0)
    condition = np.stack((results.segmentation_mask,) * 3, axis=-1) > 0.1
    output_image = np.where(condition, image, blurred_image)

    # Plot the result (resize_and_show is a helper defined elsewhere in my notebook)
    print(f'Blurred background of {name}:')
    resize_and_show(output_image)

This works perfectly fine as expected

Method-2 using pretrained tflite to make the inference

  1. Download the mediapipe_selfie.tflite model
  2. Make the inference

import copy
import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

image = cv2.imread(image_path)
image = cv2.resize(image, (256, 256))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
original_image = copy.deepcopy(image)

# Preprocess the image: normalize pixel values to [0.0, 1.0],
# cast to float32, and add the batch dimension the model expects.
image = (image / 255.0).astype(np.float32)
image = np.expand_dims(image, axis=0)

# Load the tflite interpreter and make the inference
interpreter = tf.lite.Interpreter(model_path='/content/mediapipe_selfie.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
interpreter.set_tensor(input_details['index'], image)
interpreter.invoke()

raw_prediction = interpreter.get_tensor(output_details['index'])
raw_prediction = np.squeeze(raw_prediction)
blurred_image = cv2.GaussianBlur(original_image, (55, 55), 0)
condition = np.stack((raw_prediction,) * 3, axis=-1) > 0.1
output_image = np.where(condition, original_image, blurred_image)

# Plot the result
plt.imshow(output_image)
plt.axis('off')
plt.show()

Method-2 works, but the results are not as good as Method-1, primarily because in Method-1 MediaPipe's selfie_segmentation.process(...) function takes care of preprocessing the image under the hood, i.e. it applies some techniques and finally normalizes pixel values to [0.0, 1.0]. If I followed those same preprocessing steps in Method-2 (instead of just doing image = image / 255.0), I could improve the results I see with Method-2.

Hence my ask here is for you to please shed some light on what steps I have to take to preprocess the image I feed to the pretrained tflite model in Method-2.
(Please note that due to the constraints of the edge device, I cannot use the mediapipe library directly, hence Method-2 above.)
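[Editor's note] One thing worth experimenting with here: some TFLite graphs scale inputs to [-1.0, 1.0] rather than [0.0, 1.0]. This is an assumption, not something the model card confirms, but the two scalings are cheap to compare. A minimal numpy sketch of both:

```python
import numpy as np

def normalize_01(img):
    """Scale uint8 pixels to [0.0, 1.0], as the model card states."""
    return img.astype(np.float32) / 255.0

def normalize_pm1(img):
    """Scale uint8 pixels to [-1.0, 1.0] -- an alternative normalization
    used by some TFLite graphs (assumption; worth trying as an experiment)."""
    return img.astype(np.float32) / 127.5 - 1.0

pixels = np.array([0, 128, 255], dtype=np.uint8)
print(normalize_01(pixels))   # values in [0.0, 1.0]
print(normalize_pm1(pixels))  # values in [-1.0, 1.0]
```

Checking `interpreter.get_input_details()[0]` (its `'shape'`, `'dtype'`, and `'quantization'` fields) is also a quick way to confirm what the model actually expects before guessing at preprocessing.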

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Waiting for user response label Nov 15, 2024
@kuaashish
Collaborator

Hi @codewarrior26,

The solution or package you are currently using is part of our legacy solutions, which have been upgraded and integrated into the Image Segmenter. You can find the updated information in our documentation.

Please note that support for all legacy solutions has been discontinued. We recommend using our new Image Segmenter Task API; see the overview page and the implementation guide for Python.

We encourage you to try the new solution. If you encounter any issues or have similar concerns, please let us know and we will be happy to assist. Unfortunately, further support for legacy solutions is no longer available.

Thank you!!
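[Editor's note] For readers landing here, the suggested migration looks roughly like the sketch below, following the Image Segmenter Task API's Python guide. It is not runnable without the mediapipe package and a downloaded model; the model and image filenames are placeholders.

```python
# Sketch of the Image Segmenter Task API (placeholders: model and image paths).
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

base_options = python.BaseOptions(model_asset_path='selfie_segmenter.tflite')
options = vision.ImageSegmenterOptions(
    base_options=base_options,
    output_confidence_masks=True)

with vision.ImageSegmenter.create_from_options(options) as segmenter:
    mp_image = mp.Image.create_from_file('selfie.jpg')
    result = segmenter.segment(mp_image)
    # confidence_masks[0] is a float mask playing the role of the legacy
    # solution's results.segmentation_mask.
    mask = result.confidence_masks[0].numpy_view()
```

The Task API handles resizing and normalization internally, which sidesteps the preprocessing question that prompted this issue, at the cost of depending on the mediapipe package.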

@kuaashish kuaashish added the stat:awaiting response Waiting for user response label Nov 29, 2024