ArrowInvalid: Could not convert <PIL.Image.Image image mode=RGB when adding image to Dataset #4796
Comments
@mariosasko I'm getting a similar issue when creating a Dataset from a Pandas dataframe, like so:
results in
Will the PR linked above also fix that? |
I would expect this to work, but it doesn't. Shouldn't be too hard to fix tho (in a subsequent PR). |
Hi @mariosasko just wanted to check in if there is a PR to follow for this. I was looking to create a demo app using this. If it's not working I can just use byte encoded images in the dataset which are not displayed. |
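As a rough sketch of the byte-encoding fallback mentioned in the comment above: a PIL image can be serialized to PNG bytes and decoded again later. The helper names image_to_png_bytes and png_bytes_to_image are hypothetical and are not part of datasets or Pillow; only Pillow and the standard library are assumed.

```python
import io

from PIL import Image as PILImage


def image_to_png_bytes(image: PILImage.Image) -> bytes:
    """Serialize a PIL image to PNG bytes (hypothetical helper, lossless)."""
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()


def png_bytes_to_image(data: bytes) -> PILImage.Image:
    """Decode PNG bytes back into a PIL image (hypothetical helper)."""
    image = PILImage.open(io.BytesIO(data))
    image.load()  # force decoding while the in-memory buffer is still alive
    return image


# Stand-in image; a real dataset would use its own images.
original = PILImage.new("RGB", (4, 4), color=(255, 0, 0))
encoded = image_to_png_bytes(original)
restored = png_bytes_to_image(encoded)
```

Bytes stored this way sit in a plain binary column, which is presumably why the commenter notes that such images are not displayed.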
Hi @darraghdog! No PR yet, but I plan to fix this before the next release. |
I was just pointed here by @mariosasko, meanwhile I found a workaround using
|
Hmm, interesting. If I create the dataset on the fly:
it doesn't fail with the error in the OP. However, if I try to use this dataset, it now fails with:
but if I create that same dataset one item at a time as in the previous comment's code snippet it doesn't fail. The features of this dataset are set to:
|
It looks like the problem still exists. Thank you |
There is a workaround. Here is an example of how to do that: https://huggingface.co/datasets/jamescalam/image-text-demo/tree/main. Here are videos with explanations: https://www.youtube.com/watch?v=lqK4ocAKveE and https://www.youtube.com/watch?v=ODdKC30dT8c |
cc @mariosasko gentle ping for a fix :) |
Any update on this? I'm still facing this issue. Any workaround? |
I was facing the same issue. Downgrading datasets from 2.11.0 to 2.4.0 solved the issue. |
I was able to resolve my issue with a quick workaround:
Hope it helps! |
It works!! |
how did this work, how to use this script or where to paste it? |
I had a similar issue to @NielsRogge where I was unable to create a dataset from a Pandas DataFrame containing PIL.Images. I found another workaround that works in this case which involves converting the DataFrame to a python dictionary, and then creating a dataset from said python dictionary. This is a generic example of my workaround. The example assumes that you have your data in a Pandas DataFrame variable called "dataframe" plus a dictionary of your data's features in a variable called "features".
|
cc @mariosasko this issue has been open for 2 years, would be great to resolve it :) |
I have the same issue. My current workaround is saving the dataframe to a CSV and then loading the dataset from the CSV. Would also appreciate a fix :) |
awesome, it really works~ |
Describe the bug
When adding a Pillow image to an existing Dataset on the hub, add_item fails because the Pillow image is not automatically converted into the Image feature.
Steps to reproduce the bug
Expected results
The image should be automatically cast to the Image feature when using add_item. For now, this can be fixed by using encode_example.
Actual results