Fine-tuning with vision is working great, but there's currently a limitation for models such as Qwen2-VL, where `{"type": "image"}` must be used when formatting the chat template, as discussed in the Colab examples. This works fine with vLLM offline inference, but the format can't be used with vLLM's OpenAI client, since the OpenAI API requires `{"type": "image_url"}` entries with Base64-encoded images, as outlined in the OpenAI API documentation here and in vLLM's example here.
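For reference, here is a minimal sketch of the request format the OpenAI-compatible path expects; the server URL, model name, and image path are placeholder assumptions:

```python
import base64

from openai import OpenAI

# Assumes a vLLM server is already running with an OpenAI-compatible
# endpoint, e.g.: vllm serve Qwen/Qwen2-VL-7B-Instruct
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Encode a local image as Base64 for embedding in a data URL.
with open("example.jpg", "rb") as f:  # hypothetical image path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",  # assumed model name
    messages=[
        {
            "role": "user",
            "content": [
                # The OpenAI API requires "image_url" with a URL (here a
                # Base64 data URL); the bare {"type": "image"} placeholder
                # used for offline inference is not accepted.
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```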
If this is already possible and I missed it, please let me know!