add yolov8 support

sipeed · Jun 12, 2024 · 2bcb963 · 2bcb963
1 parent 4330fe1
commit 2bcb963
Showing 5 changed files with 237 additions and 7 deletions.
diff --git a/docs/doc/en/sidebar.yaml b/docs/doc/en/sidebar.yaml
@@ -56,7 +56,7 @@ items:
     -   file: vision/classify.md
         label: AI object classify
     -   file: vision/yolov5.md
-        label: YOLOv5 object detect
+        label: YOLOv5/v8 object detect
     -   file: vision/face_detection.md
         label: Face detect and keypoints
     -   file: vision/face_recognition.md

diff --git a/docs/doc/en/vision/yolov5.md b/docs/doc/en/vision/yolov5.md
@@ -1,5 +1,5 @@
 ---
-title: Using YOLOv5 Model for Object Detection with MaixPy
+title: Using YOLOv5 / YOLOv8 Model for Object Detection with MaixPy
 ---
 
 ## Concept of Object Detection
@@ -10,12 +10,14 @@ Unlike classification, object detection includes positional information, so the
 
 ## Using Object Detection in MaixPy
 
-MaixPy comes with the `YOLOv5` model by default, which can be used directly:
+MaixPy comes with the `YOLOv5` and `YOLOv8` models by default, which can be used directly:
+> YOLOv8 requires MaixPy >= 4.3.0.
 
 ```python
 from maix import camera, display, image, nn, app
 
 detector = nn.YOLOv5(model="/root/models/yolov5s.mud")
+# detector = nn.YOLOv8(model="/root/models/yolov8n.mud")
 
 cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
 dis = display.Display()
@@ -36,8 +38,114 @@ Demonstration video:
 
 This setup uses a camera to capture images, which are then sent to the `detector` for detection. The results (classification name and position) are displayed on the screen.
 
+To switch between `v5` and `v8`, replace `YOLOv5` with `YOLOv8` in the code above. Note that the model file path must also be changed.
+
+For the list of 80 object categories supported by the model, see the appendix of this article.
+
 For more API usage, refer to the documentation of the [maix.nn](/api/maix/nn.html) module.
 
+## More input resolutions
+
+The default model input resolution is `320x224`, because its aspect ratio is close to that of the default screen. You can also manually download models with other input resolutions and use them instead:
+
+YOLOv5s: https://maixhub.com/model/zoo/365
+YOLOv8n: https://maixhub.com/model/zoo/400
+
+The larger the resolution, the higher the accuracy, but the longer the running time. Just choose the appropriate one according to your application scenario.
+
+## Can the camera resolution and model resolution be different?
+
+When calling `detector.detect(img)` as shown above, if the resolution of `img` differs from the model input resolution, the function automatically calls `img.resize` to scale the image to the model input resolution. `resize` uses the `image.Fit.FIT_CONTAIN` method by default: the aspect ratio is preserved and the surrounding area is filled with black. The detected coordinates are also automatically mapped back to the coordinates of the original `img`.
+
 ## Training Your Own Object Detection Model
 
 Please visit [MaixHub](https://maixhub.com) to learn and train object detection models. When creating a project, select `Object Detection Model`.
+
+## Appendix: 80 Categories
+
+The 80 object categories in the COCO dataset are:
+
+
+```txt
+person
+bicycle
+car
+motorcycle
+airplane
+bus
+train
+truck
+boat
+traffic light
+fire hydrant
+stop sign
+parking meter
+bench
+bird
+cat
+dog
+horse
+sheep
+cow
+elephant
+bear
+zebra
+giraffe
+backpack
+umbrella
+handbag
+tie
+suitcase
+frisbee
+skis
+snowboard
+sports ball
+kite
+baseball bat
+baseball glove
+skateboard
+surfboard
+tennis racket
+bottle
+wine glass
+cup
+fork
+knife
+spoon
+bowl
+banana
+apple
+sandwich
+orange
+broccoli
+carrot
+hot dog
+pizza
+donut
+cake
+chair
+couch
+potted plant
+bed
+dining table
+toilet
+tv
+laptop
+mouse
+remote
+keyboard
+cell phone
+microwave
+oven
+toaster
+sink
+refrigerator
+book
+clock
+vase
+scissors
+teddy bear
+hair drier
+toothbrush
+```
+
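Review note: the new "Can the camera resolution and model resolution be different?" section describes a `FIT_CONTAIN` resize followed by mapping the detected coordinates back to the original image. The arithmetic behind that can be sketched in plain Python as follows. This is only an illustration, not MaixPy's implementation: the function names `contain_transform` and `box_to_source` are hypothetical, and the sketch assumes the black padding is split evenly on both sides, which the documentation does not state explicitly.

```python
def contain_transform(src_w, src_h, dst_w, dst_h):
    """Return (scale, pad_x, pad_y) that fit a src image inside dst.

    The aspect ratio is preserved; the padding is assumed to be
    centered (an assumption of this sketch, not confirmed by the docs).
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2
    pad_y = (dst_h - src_h * scale) / 2
    return scale, pad_x, pad_y


def box_to_source(box, scale, pad_x, pad_y):
    """Map an (x, y, w, h) box from model-input coordinates
    back to the coordinates of the original image."""
    x, y, w, h = box
    return ((x - pad_x) / scale, (y - pad_y) / scale, w / scale, h / scale)


if __name__ == "__main__":
    # A square 448x448 frame fed to the default 320x224 model input.
    scale, pad_x, pad_y = contain_transform(448, 448, 320, 224)
    print(scale, pad_x, pad_y)
    # A box covering the whole non-padded area maps back to the full frame.
    print(box_to_source((48, 0, 224, 224), scale, pad_x, pad_y))
```

According to the documentation text, `detector.detect(img)` already performs this mapping internally, so user code normally does not need it; the sketch only shows why a mismatched camera resolution still yields boxes in the original image's coordinates.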
diff --git a/docs/doc/zh/sidebar.yaml b/docs/doc/zh/sidebar.yaml
@@ -56,7 +56,7 @@ items:
     -   file: vision/classify.md
         label: AI object classify
     -   file: vision/yolov5.md
-        label: YOLOv5 object detect
+        label: YOLOv5/v8 object detect
     -   file: vision/face_detection.md
         label: Face detect and keypoints
     -   file: vision/face_recognition.md

diff --git a/docs/doc/zh/vision/yolov5.md b/docs/doc/zh/vision/yolov5.md
@@ -1,5 +1,5 @@
 ---
-title: Using YOLOv5 Model for Object Detection with MaixPy
+title: Using YOLOv5 / YOLOv8 Model for Object Detection with MaixPy
 ---
 
 
@@ -11,12 +11,14 @@ title: Using YOLOv5 Model for Object Detection with MaixPy
 
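 ## Using Object Detection in MaixPy
 
-MaixPy comes with the `YOLOv5` model by default, which can be used directly:
+MaixPy comes with the `YOLOv5` and `YOLOv8` models by default, which can be used directly:
+> YOLOv8 requires MaixPy >= 4.3.0.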
 
 ```python
 from maix import camera, display, image, nn, app
 
 detector = nn.YOLOv5(model="/root/models/yolov5s.mud")
+# detector = nn.YOLOv8(model="/root/models/yolov8n.mud")
 
 cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
 dis = display.Display()
@@ -38,9 +40,114 @@ while not app.need_exit():
 
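 This setup uses a camera to capture images, which are then passed to the `detector` for detection. The results (classification name and position) are displayed on the screen.
 
+To switch between `v5` and `v8`, replace `YOLOv5` with `YOLOv8` in the code above. Note that the model file path must also be changed.
+
+For the list of 80 object categories supported by the model, see the appendix of this article.
+
 For more API usage, refer to the documentation of the [maix.nn](/api/maix/nn.html) module.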
 
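-## Training Your Own Object Detection Model
+## More input resolutions
+
+The default model input resolution is `320x224`, because its aspect ratio is close to that of the default screen. You can also manually download models with other input resolutions and use them instead:
+
+YOLOv5s: https://maixhub.com/model/zoo/365
+YOLOv8n: https://maixhub.com/model/zoo/400
+
+The higher the resolution, the higher the accuracy, but the longer the running time. Choose whichever suits your application scenario.
+
+## Can the camera resolution and model resolution be different?
+
+When calling `detector.detect(img)` as shown above, if the resolution of `img` differs from the model input resolution, the function automatically calls `img.resize` to scale the image to the model input resolution. `resize` uses the `image.Fit.FIT_CONTAIN` method by default: the aspect ratio is preserved and the surrounding area is filled with black. The detected coordinates are also automatically mapped back to the coordinates of the original `img`.
+
+
+## Training Your Own Object Detection Model Online
 
 Please visit [MaixHub](https://maixhub.com) to learn and train object detection models. When creating a project, select `Object Detection Model`.
 
+
+## Appendix: 80 Categories
+
+The 80 object categories in the COCO dataset are: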
+
+```txt
+person
+bicycle
+car
+motorcycle
+airplane
+bus
+train
+truck
+boat
+traffic light
+fire hydrant
+stop sign
+parking meter
+bench
+bird
+cat
+dog
+horse
+sheep
+cow
+elephant
+bear
+zebra
+giraffe
+backpack
+umbrella
+handbag
+tie
+suitcase
+frisbee
+skis
+snowboard
+sports ball
+kite
+baseball bat
+baseball glove
+skateboard
+surfboard
+tennis racket
+bottle
+wine glass
+cup
+fork
+knife
+spoon
+bowl
+banana
+apple
+sandwich
+orange
+broccoli
+carrot
+hot dog
+pizza
+donut
+cake
+chair
+couch
+potted plant
+bed
+dining table
+toilet
+tv
+laptop
+mouse
+remote
+keyboard
+cell phone
+microwave
+oven
+toaster
+sink
+refrigerator
+book
+clock
+vase
+scissors
+teddy bear
+hair drier
+toothbrush
+```
diff --git a/examples/vision/ai_vision/nn_yolov8.py b/examples/vision/ai_vision/nn_yolov8.py
@@ -0,0 +1,15 @@
+from maix import camera, display, image, nn, app
+
+detector = nn.YOLOv8(model="/root/models/yolov8n.mud")
+
+cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
+dis = display.Display()
+
+while not app.need_exit():
+    img = cam.read()
+    objs = detector.detect(img, conf_th = 0.5, iou_th = 0.45)
+    for obj in objs:
+        img.draw_rect(obj.x, obj.y, obj.w, obj.h, color = image.COLOR_RED)
+        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}'
+        img.draw_string(obj.x, obj.y, msg, color = image.COLOR_RED)
+    dis.show(img)
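
Review note: the example passes `conf_th = 0.5` and `iou_th = 0.45` to `detect` without explaining them. By the usual YOLO convention, `conf_th` would be the minimum confidence score and `iou_th` the intersection-over-union (IoU) threshold used when suppressing overlapping boxes; the commit itself does not confirm this, so treat that reading as an assumption. Under that assumption, the IoU of two `(x, y, w, h)` boxes can be sketched in plain Python as follows (the helper name `iou` is hypothetical and is not part of `maix.nn`):

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes.

    Illustrative helper only; not MaixPy's implementation.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (0 if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 10x10 boxes, the second shifted right by half a box width.
    print(iou((0, 0, 10, 10), (5, 0, 10, 10)))
```

With a threshold of 0.45, two boxes of the same class whose IoU exceeds 0.45 would typically be treated as duplicates, with only the higher-scoring one kept.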