bugfix Idefics3 processor - handle gracefully cases with text and no images #35363

mfarre · 2024-12-20T13:32:04Z

What does this PR do?

Fixes the Idefics3 processor so that it works with batches that do not include images.

Who can review?

HuggingFaceDocBuilderDev · 2024-12-20T14:08:18Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

andimarafioti

LGTM!

andimarafioti · 2024-12-20T15:33:19Z

Let's wait for @yonigozlan to be sure!

yonigozlan

Yes, that looks much better! I just suggested a check to avoid hallucinations like this one, where the prompt contains an image placeholder but no image is passed to the processor:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What do we see in this image?"},
        ]
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, return_tensors="pt")
inputs = {k: v.to(DEVICE) for k, v in inputs.items()}


# Generate
generated_ids = model.generate(**inputs, max_new_tokens=50)
generated_texts = processor.batch_decode(generated_ids, skip_special_tokens=True)

print(generated_texts)

['User:What do we see in this image?\nAssistant: The image depicts a scene from a historical or fictional setting, likely from the medieval period, given the attire and architecture. The central focus is on a large, ornate gate, which appears to be the entrance to a castle or a fortified structure.']

Also, in Idefics3ProcessorTest, could we add a test for text-only inference, and one to check that an error is raised if we have an image in the conversation but no images are passed to the processor?
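For illustration, the requested check could look roughly like the sketch below. This is not the code merged in this PR: `validate_text_only_batch` and `IMAGE_TOKEN` are hypothetical names, and the `<image>` placeholder string is an assumption about what the chat template emits.

```python
from typing import Any, List, Optional, Sequence, Union

# Assumed placeholder string; the real processor defines its own image token.
IMAGE_TOKEN = "<image>"


def validate_text_only_batch(
    text: Union[str, Sequence[str]],
    images: Optional[Any] = None,
    image_token: str = IMAGE_TOKEN,
) -> List[int]:
    """Count image placeholders per prompt and reject prompts that
    reference an image when no images were passed.

    Hypothetical helper for illustration only.
    """
    # Accept a single prompt or a batch of prompts.
    texts = [text] if isinstance(text, str) else list(text)
    counts = [t.count(image_token) for t in texts]

    # Text-only batches are fine; placeholders without images are not.
    if images is None and any(counts):
        raise ValueError(
            f"Found {sum(counts)} {image_token} token(s) in the text "
            "but no images were passed."
        )
    return counts
```

With a helper like this, the two requested tests reduce to one call that returns normally for a text-only batch, and one call that is expected to raise `ValueError` when a prompt contains the placeholder and `images` is `None`.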

src/transformers/models/idefics3/processing_idefics3.py

Co-authored-by: Yoni Gozlan <[email protected]>

mfarre · 2024-12-20T21:11:40Z

Thanks @andimarafioti.
Thanks @yonigozlan. If you give me your LGTM, I will merge the changes: I followed your suggestions, adding some tests and your code proposal.

yonigozlan · 2024-12-20T22:39:17Z

Looks great, thanks @mfarre for fixing this! LGTM for me, but let's maybe wait for @ArthurZucker's review before merging :)

bugfix processing empty images

184a215

mfarre requested a review from andimarafioti December 20, 2024 13:32

mfarre added 2 commits December 20, 2024 13:34

fix

f2c808c

fix

f994edc

andimarafioti approved these changes Dec 20, 2024

andimarafioti requested a review from yonigozlan December 20, 2024 15:33

yonigozlan reviewed Dec 20, 2024

src/transformers/models/idefics3/processing_idefics3.py (outdated, resolved)

mfarre and others added 5 commits December 20, 2024 19:54

Update src/transformers/models/idefics3/processing_idefics3.py

f83689c

Co-authored-by: Yoni Gozlan <[email protected]>

adding tests

3c1fc53

fix

b83ffc0

fix

a82bc25

fix

2fd930a

yonigozlan requested a review from ArthurZucker December 20, 2024 22:39
