This repository provides resources for exploring multimodal interaction and object detection using a combination of visual data (images) and text. It is designed for educational purposes, particularly for hands-on tutorials. References are included in each notebook.
In this notebook the focus is on exploring the basics of pre-trained object detection models (i.e. YOLOv8). The tutorial includes the following steps:
- inference
- fine-tuning
- open vocabulary extension
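The inference and fine-tuning steps above can be sketched with the `ultralytics` package (the library that ships YOLOv8); this is a minimal sketch, assuming `ultralytics` is installed and the dataset YAML path is a placeholder you would replace with your own:

```python
def run_yolo_demo(image_path: str = "image.jpg") -> None:
    """Sketch: inference and fine-tuning with a pre-trained YOLOv8 model.

    Assumes `pip install ultralytics`; "my_dataset.yaml" is a hypothetical
    dataset config, not a file provided by this repository.
    """
    from ultralytics import YOLO  # deferred import: heavy optional dependency

    model = YOLO("yolov8n.pt")  # downloads the pre-trained weights on first use

    # Inference: returns one Results object per input image
    results = model(image_path)
    for box in results[0].boxes:
        print(box.xyxy, box.cls, box.conf)  # coordinates, class id, confidence

    # Fine-tuning on a custom dataset described by a YAML file
    model.train(data="my_dataset.yaml", epochs=10, imgsz=640)
```

The open-vocabulary extension step follows the same pattern with an open-vocabulary checkpoint instead of the fixed-class `yolov8n.pt` weights.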
Extra:
In this notebook the focus is on learning how to use two VLMs, GPT-4 and Gemini, programmatically.
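A programmatic GPT-4 call with an image input can be sketched as follows; this assumes the `openai` package and an `OPENAI_API_KEY` environment variable, and the model name `"gpt-4o"` and the prompt text are assumptions, not fixed by this tutorial:

```python
def describe_image(image_url: str) -> str:
    """Sketch of asking a GPT-4 model to detect objects in an image.

    Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
    """
    from openai import OpenAI  # deferred import: requires the openai package

    client = OpenAI()  # reads the API key from the environment
    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical choice of vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "List the objects in this image with bounding boxes."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content
```

Gemini is driven the same way through its own client library (`google-generativeai`), swapping the client and model name.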
This tutorial contains some utilities to help you plot bounding boxes on images and parse the output of a VLM. You are asked to use the material from the previous two notebooks to perform object detection with three different methods (YOLO, GPT-4, and Gemini) and compare their outcomes.
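Parsing the VLM output typically means pulling structured detections out of free-form text. A minimal sketch, assuming the model was prompted to answer with a JSON list of `{"label": ..., "box": [x1, y1, x2, y2]}` entries (a prompting convention, not a guaranteed GPT-4/Gemini format):

```python
import json
import re


def parse_vlm_boxes(response_text: str) -> list:
    """Extract a JSON list of detections from a VLM text reply.

    Tolerates surrounding prose and optional ```json fences by grabbing the
    outermost [...] span; returns an empty list if no JSON list is found.
    """
    match = re.search(r"\[.*\]", response_text, re.DOTALL)
    if match is None:
        return []
    return json.loads(match.group(0))


reply = (
    "Here are the detections:\n"
    "```json\n"
    '[{"label": "dog", "box": [10, 20, 110, 220]}]\n'
    "```"
)
print(parse_vlm_boxes(reply)[0]["label"])  # → dog
```

The extracted boxes can then be handed to the plotting utilities to draw them on the image alongside YOLO's predictions.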
Install the dependencies with:
```
pip install -r requirements.txt
```