
[Feature Request] Add {"type": "image_url"} for vision fine-tuning to support OpenAI API integration (e.g., vLLM) #1348

Open
davedgd opened this issue Nov 28, 2024 · 2 comments


davedgd commented Nov 28, 2024

Fine-tuning with vision is working great, but there's currently a limitation for models such as Qwen2-VL: `{"type": "image"}` must be used when formatting the chat template, as discussed in the Colab examples. This works fine with vLLM offline inference, but the format cannot be used with vLLM's OpenAI-compatible server, since the OpenAI API requires `{"type": "image_url"}` with Base64-encoded images, as outlined in the OpenAI API documentation here and in vLLM's example here.
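For reference, a minimal sketch of the two message formats (the helper function name is illustrative, not part of any library):

```python
import base64

# Format used by the fine-tuning templates (vLLM offline inference):
# the image is passed separately, so the content only carries a placeholder.
offline_message = {
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image."},
    ],
}

# Format required by the OpenAI API (and vLLM's OpenAI-compatible server):
# the image is inlined as a Base64 data URL under "image_url".
def to_image_url_message(image_bytes: bytes, prompt: str) -> dict:
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
            },
            {"type": "text", "text": prompt},
        ],
    }
```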

If this is already possible and I missed it, please let me know!

@danielhanchen (Contributor) commented

Oh I have not yet added it in! Will do!


davedgd commented Nov 28, 2024

Awesome, thank you @danielhanchen!
