Implement Vision Language Model chapter #49

Open
3 tasks
burtenshaw opened this issue Dec 5, 2024 · 6 comments

Comments

@burtenshaw
Collaborator

We need to implement the section on VLMs. It should be based on existing content from the Hugging Face ecosystem, adapted for SmolVLM and to the course structure, and should offer exercises.

Material

Steps to do

  • add prose that explains VLMs in the simplest possible terms
  • add references to all pages
  • add an exercise notebook on fine-tuning SmolVLM
@duydl
Collaborator

duydl commented Dec 6, 2024

Hi, could the issue be assigned?

@burtenshaw
Collaborator Author

> Hi, could the issue be assigned?

Of course. I expect others will contribute too, but go ahead and open a draft PR with the outline; LGTM.

@jungnerd

jungnerd commented Dec 7, 2024

Hi there 👋🏻
I'd like to contribute to this issue by providing an exercise notebook demonstrating how to fine-tune a VLM with TRL.
The notebook will be based on "How to Fine-Tune Multimodal Models or VLMs with Hugging Face TRL", but using SmolVLM instead.

@burtenshaw
Collaborator Author

burtenshaw commented Dec 7, 2024

That's great, @jungnerd. Thanks.

We're already working on the text for the VLM chapter here: #59. There's a first draft of the notebook from @duydl.

Why don't you open a PR onto #59 with an exercise notebook?

@duydl
Collaborator

duydl commented Dec 7, 2024

@jungnerd I had been working on SFT for VLMs and just finished it. There is still the DPO notebook to do, which could be based on the Hugging Face blog post "Preference Optimization for VLMs", and there are probably lots of fixes and optimizations needed to complete the markdown and notebooks for the VLM chapter.

@jungnerd

jungnerd commented Dec 7, 2024

@duydl Am I understanding correctly that you’re suggesting creating a notebook to fine-tune SmolVLM using DPO? If so, that sounds great! I think it would be really exciting to work on a fine-tuning notebook with DPO!
