[MODULE] Implement Chapter 5: Vision Language Model #59

duydl · 2024-12-06T14:29:47Z

PR for Issue #49: Implement Vision Language Model chapter

Description

This PR adds the Vision Language Model (VLM) chapter.

Changes Introduced

Overview of VLMs
- Defined Vision Language Models and their capabilities.
- Highlighted applications like image captioning, visual question answering, and multimodal reasoning.
- Linked to the detailed VLM Usage page.
- Completed the VLM Usage page.
Fine-Tuning Guide
- Explained the process and importance of fine-tuning for specific tasks.
- Linked to the detailed VLM Fine-Tuning page.
- Completed the VLM Fine-Tuning page.
Exercise Notebooks
- Added two Jupyter Notebooks:
  - vlm_usage_sample.ipynb: Demonstrates pre-trained VLM usage for tasks such as image and video processing.
  - vlm_finetune_sample.ipynb: Walks through fine-tuning a VLM on various datasets and with advanced methods.
- Includes examples and tiered exercises for learners.

burtenshaw

Thanks for the PR!

This looks like a great structure. I've reviewed the text but not the notebook. Let's get the text in place then move on to the notebook last.

5_vision_language_models/README.md

5_vision_language_models/vlm_finetuning.md

5_vision_language_models/vlm_usage.md

Co-authored-by: burtenshaw <[email protected]>

duydl · 2024-12-07T08:22:20Z

@burtenshaw I think it is ready for a review.

burtenshaw · 2024-12-07T08:26:13Z

> @burtenshaw I think it is ready for a review.

Nice work! Let's get the notebook work from here in, and then I'll get a reviewer on it.

duydl · 2024-12-07T11:06:16Z

@burtenshaw I got the notebook working, though the training would take some time on my hardware.

duydl · 2024-12-15T16:22:48Z

@burtenshaw Seems like this should be merged by tomorrow. Sorry, I got unexpectedly busy and could not work on this. Let's see what I can add before the deadline...

burtenshaw · 2024-12-15T17:07:34Z

> @burtenshaw Seems like this should be merged by tomorrow. Sorry, I got unexpectedly busy and could not work on this. Let's see what I can add before the deadline...

No worries. I am currently merging modules on their release day, so I'll do this tomorrow.

duydl and others added 7 commits December 6, 2024 19:22

Init notebook files

e1a1667

Init markdown files

446eb2f

Add draft of README for vlm

dcd8639

Adjust markdowns guide

8caef10

Draft of vlm_usage notebook

decaaff

Draft of vlm_finetune notebook with lots of filler

2f095f2

Delete mistakenly added 1_instruction_tuning/tiny-code/README.md

172d215

burtenshaw reviewed Dec 6, 2024

duydl and others added 3 commits December 7, 2024 12:18

Update 5_vision_language_models/README.md

9cddb7d

Co-authored-by: burtenshaw <[email protected]>

Update 5_vision_language_models/README.md

da5e102

Co-authored-by: burtenshaw <[email protected]>

Update 5_vision_language_models/README.md

ea027df

burtenshaw mentioned this pull request Dec 7, 2024

Implement Vision Language Model chapter #49

Open

3 tasks

duydl added 7 commits December 7, 2024 14:45

Draft of VLM Fine-Tuning

65023ac

Complete VLM Fine Tuning page

87f13ae

Complete vlm usage page

75f6772

Fix README.md

7a4f016

Add images

f524c79

Fix links in vlm finetuning resources

666e4d5

Update notebooks

a2a4016

duydl marked this pull request as ready for review December 7, 2024 08:21

duydl added 4 commits December 7, 2024 15:30

Update fine tune notebook

7630f72

Rename finetune notebook and add sft ,dpo notebook

7b22889

Running sft vlm

d5bbe14

Complete vlm_sft notebook

ec68534

burtenshaw mentioned this pull request Dec 12, 2024

Let's work on a RAG module! #85

Open

burtenshaw approved these changes Dec 16, 2024

burtenshaw merged commit f43a962 into huggingface:main Dec 16, 2024

zcasanova pushed a commit to zcasanova/smol-course that referenced this pull request Dec 27, 2024

Merge pull request huggingface#59 from duydl/feature/vlm-module

e5039dc

[MODULE] Implement Chapter 5: Vision Language Model
