
[Feature Request] Add {"type": "image_url"} for vision fine-tuning to support OpenAI API integration (e.g., vLLM) #1348

Open
davedgd opened this issue Nov 28, 2024 · 2 comments


davedgd commented Nov 28, 2024

Fine-tuning with vision is working great, but there's currently a limitation for models such as Qwen2-VL: `{"type": "image"}` must be used when formatting the chat template, as discussed in the Colab examples. This works fine with vLLM offline inference, but the format cannot be used with vLLM's OpenAI-compatible server, since the OpenAI API requires `{"type": "image_url"}` with Base64-encoded images, as outlined in the OpenAI API documentation here and in vLLM's example here.
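For reference, a minimal sketch of the two message formats (the helper function name is illustrative, not part of any library):

```python
import base64

# Format used by the fine-tuning templates (vLLM offline inference):
# the image is passed separately, so the content only carries a placeholder.
offline_message = {
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ],
}

# Format required by the OpenAI API (and vLLM's OpenAI-compatible server):
# the image is inlined as a Base64 data URL under "image_url".
def to_image_url_message(image_bytes: bytes, prompt: str) -> dict:
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
            },
            {"type": "text", "text": prompt},
        ],
    }
```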

If this is already possible and I missed it, please let me know!

@danielhanchen (Contributor) commented

Oh I have not yet added it in! Will do!


davedgd commented Nov 28, 2024

Awesome, thank you @danielhanchen!
