[Inference API] Add image-text-to-text task and fix generate script #1440
This is not expected (i.e. having `import requests ...` before `from huggingface_hub import InferenceClient`). I realized that https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct?inference_api=true has a problem: the model doesn't have a chat template and is therefore not tagged as `"conversational"`, which creates this weird side effect. So I see 3 independent things to correct here:

1. `meta-llama/Llama-3.2-11B-Vision-Instruct` first on the image-text-to-text task page (to update here).
2. `"conversational"` detection: at the moment, it's based only on the presence of a chat template. However, for idefics chatty 8b it seems it's using `"use_default_system_prompt": true` instead. @Rocketknight1 is it safe to assume that a model with no chat template but this parameter set to `true` is in fact a conversational model? And if not, which parameter could we check?
3. For non-conversational `image-text-to-text` models (does that even exist?), we should fix the snippet generator so that only the `requests`-based snippet is displayed instead of this weird combination.

cc @osanseviero as well for viz'
just to add, `HuggingFaceM4/idefics2-8b-chatty` has the `chat_template` defined in `processor_config.json`. The `tokenizer.chat_template` attribute is supposed to be saved in the `tokenizer_config.json` file. I guess the template was set using `transformers.ProcessorMixin` instead.
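A minimal sketch of the detection heuristic being discussed, assuming we look for a `chat_template` key in either config file and fall back to the `use_default_system_prompt` flag. The helper name and the fallback behavior are hypothetical, pending the question to @Rocketknight1 above:

```python
def is_conversational(tokenizer_config: dict, processor_config: dict) -> bool:
    """Hypothetical detection sketch: treat a model as conversational if a
    chat template appears in either config file; whether the
    `use_default_system_prompt` flag alone is a safe signal is the open
    question in this thread."""
    if "chat_template" in tokenizer_config:  # usual location: tokenizer_config.json
        return True
    if "chat_template" in processor_config:  # e.g. idefics2-8b-chatty: processor_config.json
        return True
    # Open question: is this flag a reliable signal on its own?
    return tokenizer_config.get("use_default_system_prompt", False) is True

# idefics2-8b-chatty-like case: the template lives in the processor config
print(is_conversational({}, {"chat_template": "..."}))
```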
for the 3rd point, pinging @mishig25 since it's related to huggingface.js/pull/938. do you think it's okay to map `image-text-to-text` to `snippetBasic` instead and define the task input here?
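As a rough illustration of that mapping (written in Python rather than the actual huggingface.js TypeScript; `snippet_basic` and the registry dict are hypothetical stand-ins for `snippetBasic` and the real generator table):

```python
# Hypothetical sketch of the snippet-generator dispatch discussed above;
# names are illustrative, not the huggingface.js API.
def snippet_basic(model_id: str) -> str:
    """Stand-in for snippetBasic: render a requests-style snippet string."""
    return f'requests.post(".../{model_id}", json={{"inputs": ...}})'

SNIPPET_GENERATORS = {
    "text-generation": snippet_basic,
    "image-text-to-text": snippet_basic,  # the mapping proposed in this comment
}

snippet = SNIPPET_GENERATORS["image-text-to-text"]("some/model")
```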