[doc] update model-zoo test for bm1690 in quick_start
Change-Id: I2b620198a8b52cfedfec1312d8102a6877f51c81
ningsen-sophgo authored and charlesxzb committed Apr 15, 2024
1 parent 846b5ca commit 8323eea
Showing 6 changed files with 169 additions and 203 deletions.
2 changes: 1 addition & 1 deletion docs/quick_start/source_en/03_onnx.rst
@@ -187,7 +187,7 @@ The main parameters of ``model_deploy`` are as follows (for a complete introduct
- Quantization type (F32/F16/BF16/INT8)
* - processor
- Y
- - The platform that the model will use. Support bm1688/bm1684x/bm1684/cv186x/cv183x/cv182x/cv181x/cv180x.
+ - The platform that the model will use. Support bm1690/bm1688/bm1684x/bm1684/cv186x/cv183x/cv182x/cv181x/cv180x.
* - calibration_table
- N
- The calibration table path. Required for INT8 quantization
6 changes: 3 additions & 3 deletions docs/quick_start/source_en/07_quantization.rst
@@ -179,7 +179,7 @@ Use ``run_qtable`` to gen qtable, parameters as below:
- Name of calibration table file
* - processor
- Y
- - The platform that the model will use. Support bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
+ - The platform that the model will use. Support bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
* - fp_type
- N
- Specifies the float type used for mixed precision. Supports auto, F16, F32, BF16. Default is auto, meaning the program selects it automatically
@@ -491,7 +491,7 @@ Use ``run_sensitive_layer`` and bad cases to search sensitive layers, parameters
- Name of calibration table file
* - processor
- Y
- - The platform that the model will use. Support bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
+ - The platform that the model will use. Support bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
* - fp_type
- N
- Specifies the float type used for mixed precision. Supports auto, F16, F32, BF16. Default is auto, meaning the program selects it automatically
@@ -749,7 +749,7 @@ Parameter Description
- mlir file
* - processor
- Y
- - The platform that the model will use. Support bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
+ - The platform that the model will use. Support bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x.
* - fpfwd_inputs
- N
- Specify layers (including this layer) to skip quantization before them. Multiple inputs are separated by commas.
174 changes: 79 additions & 95 deletions docs/quick_start/source_en/Appx.04_bm168x_test.rst
@@ -1,4 +1,4 @@
- Appendix.04: Test SDK release package with TPU-PERF
+ Appendix.04: Model-zoo test
===================================================


@@ -16,25 +16,7 @@ If you are using Docker for the first time, use the methods in :ref:`Environment
Get the ``model-zoo`` model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- In your working directory, use the following command to clone the ``model-zoo`` project:
-
- .. code-block:: shell
-    $ git clone --depth=1 https://github.com/sophgo/model-zoo
-    $ cd model-zoo
-    $ git lfs pull --include "*.onnx,*.jpg,*.JPEG,*.npz" --exclude=""
-
- If you have cloned ``model-zoo``, you can execute the following command to synchronize the model to the latest state:
-
- .. code-block:: shell
-    $ cd model-zoo
-    $ git pull
-    $ git lfs pull --include "*.onnx,*.jpg,*.JPEG" --exclude=""
-
- This process downloads a large amount of data from ``GitHub``. Due to differences in specific network environments, this process may take a long time.
-
- If you get the ``model-zoo`` test package provided by SOPHGO, you can do the following to create and set up the ``model-zoo``. After completing this step, go directly to the next section :ref:`get tpu-perf`.
+ In your working directory, get the ``model-zoo`` test package from the SDK package provided by SOPHGO, then create and set up ``model-zoo`` as follows:

.. code-block:: shell
@@ -67,18 +49,70 @@ Install the dependencies needed to run ``model-zoo`` on your system (outside of
.. code-block:: shell
# for ubuntu operating system
- sudo apt install build-essential
- sudo apt install python3-dev
- sudo apt install -y libgl1
+ $ sudo apt install build-essential
+ $ sudo apt install python3-dev
+ $ sudo apt install -y libgl1
+ $ sudo apt install patchelf
# for centos operating system
- sudo yum install make automake gcc gcc-c++ kernel-devel
- sudo yum install python-devel
- sudo yum install mesa-libGL
+ $ sudo yum install make automake gcc gcc-c++ kernel-devel
+ $ sudo yum install python-devel
+ $ sudo yum install mesa-libGL
+ $ sudo yum install patchelf
# Accuracy tests require the following steps; performance tests do not. It is recommended to use Anaconda to create a virtual environment with Python 3.7 or above.
- cd path/to/model-zoo
- pip3 install -r requirements.txt
+ $ cd path/to/model-zoo
+ $ pip3 install -r requirements.txt
+ In addition, tpu hardware needs to be invoked for performance and accuracy tests, so please install the runtime environment for the TPU hardware.
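The comment above recommends an isolated Python 3.7+ environment for the accuracy-test dependencies. A minimal sketch using the standard-library ``venv`` module (the docs suggest Anaconda, which works equivalently; the environment name ``mz-env`` is an arbitrary choice for illustration):

```shell
# Sketch: create and activate an isolated environment before running
# "pip3 install -r requirements.txt" (env name "mz-env" is arbitrary;
# an Anaconda environment, as suggested above, works the same way).
python3 -m venv mz-env
. mz-env/bin/activate
# Confirm the interpreter meets the documented 3.7+ requirement.
python3 -c 'import sys; assert sys.version_info >= (3, 7)'
```

After activation, the ``pip3 install -r requirements.txt`` step installs into this environment instead of the system Python.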


+ Configure SOC device
+ ~~~~~~~~~~~~~~~~~~~~

+ Note: If your device is a PCIE board, you can skip this section directly.

+ The performance test depends only on the runtime environment for the TPU hardware, so once the models compiled in the toolchain environment have been packaged together with ``model-zoo``, the performance test can be carried out in the SOC environment with ``tpu_perf``. However, the complete ``model-zoo`` and the compiled outputs may not fit on the SOC device, since its storage is limited. Here is a method to run tests on SOC devices through a Linux NFS remote file-system mount.

+ First, install the NFS service on the toolchain environment server (the "host system"):

+ .. code-block:: shell
+    $ sudo apt install nfs-kernel-server

+ Add the following content to ``/etc/exports`` (configure the shared directory):

+ .. code-block:: shell
+    /the/absolute/path/of/model-zoo *(rw,sync,no_subtree_check,no_root_squash)

+ Here ``*`` means that everyone can access the shared directory. Access can also be restricted to a specific network segment or IP, for example:

+ .. code-block:: shell
+    /the/absolute/path/of/model-zoo 192.168.43.0/24(rw,sync,no_subtree_check,no_root_squash)

+ Then execute the following commands to make the configuration take effect:

+ .. code-block:: shell
+    $ sudo exportfs -a
+    $ sudo systemctl restart nfs-kernel-server

+ In addition, you need to add read permissions to the images in the dataset directory:

+ .. code-block:: shell
+    $ chmod -R +r path/to/model-zoo/dataset

+ Install the client on the SOC device and mount the shared directory:

+ .. code-block:: shell
- In addition, tpu hardware needs to be invoked for performance and accuracy tests, so please install libsophon according to the libsophon manual.
+    $ mkdir model-zoo
+    $ sudo apt-get install -y nfs-common
+    $ sudo mount -t nfs <IP>:/path/to/model-zoo ./model-zoo

+ In this way, the test directory is accessible in the SOC environment. The rest of the SOC test procedure is basically the same as for PCIE; refer to the content below. Differences in where commands are executed and in the operating environment are noted where they occur.
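Before launching a test run on the SOC it is worth confirming that the share is actually mounted. A small sketch (``model-zoo`` is the mount point created above; ``mountpoint`` ships with util-linux on most distributions):

```shell
# Sketch: report whether ./model-zoo is an active mount point.
# mountpoint exits 0 when the directory is a mount point, non-zero otherwise.
mkdir -p model-zoo
if mountpoint -q model-zoo; then
    echo "model-zoo is mounted"
else
    echo "model-zoo is NOT mounted - run the mount command above first"
fi
```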


Prepare dataset
@@ -93,8 +127,8 @@ After unzipping, move the data under ``Data/CLS_LOC/val`` to a directory like mo

.. code-block:: shell
- cd path/to/sophon/model-zoo
- mv path/to/imagenet-object-localization-challenge/Data/CLS_LOC/val dataset/ILSVRC2012/ILSVRC2012_img_val
+ $ cd path/to/sophon/model-zoo
+ $ mv path/to/imagenet-object-localization-challenge/Data/CLS_LOC/val dataset/ILSVRC2012/ILSVRC2012_img_val
# It is also possible to map the dataset directory to dataset/ILSVRC2012/ILSVRC2012_img_val through the soft link ln -s
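The soft-link alternative mentioned in the comment above can be sketched as follows (the source path is a placeholder for wherever the downloaded archive was unpacked):

```shell
# Map the extracted validation set into the expected dataset location
# without copying it (paths are placeholders for illustration).
mkdir -p dataset/ILSVRC2012
ln -sfn /path/to/imagenet-object-localization-challenge/Data/CLS_LOC/val \
        dataset/ILSVRC2012/ILSVRC2012_img_val
```

``-n`` makes the command safe to re-run if the link already exists; the link target only needs to exist when the precision test actually reads the images.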
@@ -105,11 +139,11 @@ If the precision test uses the coco dataset (networks trained with coco such as

.. code-block:: shell
- cd path/to/model-zoo/dataset/COCO2017/
- wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
- wget http://images.cocodataset.org/zips/val2017.zip
- unzip annotations_trainval2017.zip
- unzip val2017.zip
+ $ cd path/to/model-zoo/dataset/COCO2017/
+ $ wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
+ $ wget http://images.cocodataset.org/zips/val2017.zip
+ $ unzip annotations_trainval2017.zip
+ $ unzip val2017.zip
Vid4 (optional)
@@ -147,7 +181,7 @@ After running the command, it will be in a Docker container, install tpu_mlir py
Install ``tpu-perf`` tool
~~~~~~~~~~~~~~~~~~~~~~~~~

- Download the latest ``tpu-perf`` wheel installation package from https://github.com/sophgo/tpu-perf/releases. For example, ``tpu_perf-x.x.x-py3-none-manylinux2014_x86_64`` .whl.
+ Get the latest ``tpu-perf`` wheel installer from the SDK package provided by SOPHGO. For example, ``tpu_perf-x.x.x-py3-none-manylinux2014_x86_64.whl``.

You need to install ``tpu-perf`` both inside and outside of Docker:

@@ -157,56 +191,6 @@ You need to install ``tpu-perf`` both inside and outside of Docker:
$ pip3 install path/to/tpu_perf-x.x.x-py3-none-manylinux2014_x86_64.whl
- Configure SOC device
- ~~~~~~~~~~~~~~~~~~
-
- Note: If your device is a PCIE board, you can skip this section directly.
-
- The performance test only depends on the ``libsophon`` runtime environment, so after packaging models, compiled in the toolchain compilation environment, and ``model-zoo``, the performance test can be carried out in the SOC environment by ``tpu_perf``. However, the complete ``model-zoo`` as well as compiled output contents may not be fully copied to the SOC since the storage on the SOC device is limited. Here is a method to run tests on SOC devices through linux nfs remote file system mounts.
-
- First, install the nfs service on the toolchain environment server "host system":
-
- .. code-block:: shell
-    $ sudo apt install nfs-kernel-server
-
- Add the following content to ``/etc/exports`` (configure the shared directory):
-
- .. code-block:: shell
-    /the/absolute/path/of/model-zoo *(rw,sync,no_subtree_check,no_root_squash)
-
- Where ``*`` means that everyone can access the shared directory. Moreover, it can be configured to be accessible by a specific network segment or IP, such as:
-
- .. code-block:: shell
-    /the/absolute/path/of/model-zoo 192.168.43.0/24(rw,sync,no_subtree_check,no_root_squash)
-
- Then execute the following command to make the configuration take effect:
-
- .. code-block:: shell
-    $ sudo exportfs -a
-    $ sudo systemctl restart nfs-kernel-server
-
- In addition, you need to add read permissions to the images in the dataset directory:
-
- .. code-block:: shell
-    $ chmod -R +r path/to/model-zoo/dataset
-
- Install the client on the SOC device and mount the shared directory:
-
- .. code-block:: shell
-    $ mkdir model-zoo
-    $ sudo apt-get install -y nfs-common
-    $ sudo mount -t nfs <IP>:/path/to/model-zoo ./model-zoo
-
- In this way, the test directory is accessible in the SOC environment. The rest of the SOC test operation is basically the same as that of PCIE. Please refer to the following content for operation. The difference in command execution position and operating environment has been explained in the execution place.


.. _test_main:

Model performance and accuracy testing process
@@ -226,7 +210,7 @@ Execute the following command to compile the ``resnet18-v2`` model:
$ cd ../model-zoo
$ python3 -m tpu_perf.build --target BM1684X --mlir vision/classification/resnet18-v2
- where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688`` and ``CV186X``.
+ where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688``, ``BM1690`` and ``CV186X``.

Execute the following command to compile all test samples:

@@ -261,7 +245,7 @@ After the command is finished, you will see the newly generated ``output`` folde
Performance test
----------------

- The test must be run in an environment outside Docker. It is assumed that you have installed and configured the 1684X device and driver, so you can exit the Docker environment:
+ The test must be run in an environment outside Docker. It is assumed that you have installed and configured the runtime environment for the TPU hardware, so you can exit the Docker environment:

.. code-block:: shell
@@ -277,7 +261,7 @@ Run the following commands under the PCIE board to test the performance of the g
$ cd model-zoo
$ python3 -m tpu_perf.run --target BM1684X --mlir -l full_cases.txt
- where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688`` and ``CV186X``.
+ where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688``, ``BM1690`` and ``CV186X``.

Note: If multiple SOPHGO accelerator cards are installed on the host, you can
specify the running device of ``tpu_perf`` by adding ``--devices id`` when using
@@ -291,7 +275,7 @@ specify the running device of ``tpu_perf`` by adding ``--devices id`` when using

The SOC device uses the following steps to test the performance of the generated ``bmodel``.

- Download the latest ``tpu-perf``, ``tpu_perf-x.x.x-py3-none-manylinux2014_aarch64.whl``, from https://github.com/sophgo/tpu-perf/releases to the SOC device and execute the following operations:
+ Get the latest ``tpu-perf`` wheel installer from the SDK package provided by SOPHGO, for example ``tpu_perf-x.x.x-py3-none-manylinux2014_aarch64.whl``, then transfer the file to the SOC device and execute the following operations:

.. code-block:: shell
@@ -317,11 +301,11 @@ After that, performance data is available in ``output/stats.csv``, in which the
Precision test
--------------

- The test must be run in an environment outside Docker. It is assumed that you have installed and configured the 1684X device and driver, so you can exit the Docker environment:
+ The test must be run in an environment outside Docker. It is assumed that you have installed and configured the runtime environment for the TPU hardware, so you can exit the Docker environment:

.. code-block:: shell
- exit
+ $ exit
Run the following commands under the PCIE board to test the precision of the generated ``bmodel`` :

@@ -331,7 +315,7 @@ Run the following commands under the PCIE board to test the precision of the gen
$ cd model-zoo
$ python3 -m tpu_perf.precision_benchmark --target BM1684X --mlir -l full_cases.txt
- where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688`` and ``CV186X``.
+ where ``--target`` specifies the processor model, which currently supports ``BM1684``, ``BM1684X``, ``BM1688``, ``BM1690`` and ``CV186X``.

Note: If multiple SOPHGO accelerator cards are installed on the host, you can
specify the running device of ``tpu_perf`` by adding ``--devices id`` when using
@@ -345,7 +329,7 @@ Specific parameter descriptions can be obtained with the following commands:

.. code-block:: shell
- python3 -m tpu_perf.precision_benchmark --help
+ $ python3 -m tpu_perf.precision_benchmark --help
The output precision data is available in ``output/topk.csv``. The precision results for ``resnet18-v2``:

2 changes: 1 addition & 1 deletion docs/quick_start/source_zh/03_onnx.rst
@@ -186,7 +186,7 @@ MLIR to F16 model
* - processor
- Yes
- The platform that the model will use,
- supports bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
+ supports bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
* - calibration_table
- No
- The calibration table path; a calibration table is required when INT8 quantization is involved
6 changes: 3 additions & 3 deletions docs/quick_start/source_zh/07_quantization.rst
@@ -183,7 +183,7 @@
* - processor
- Yes
- The platform that the model will use,
- supports bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
+ supports bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
* - fp_type
- No
- The float type used for mixed precision; supports auto, F16, F32, BF16. Default is auto, meaning the program selects it automatically
@@ -493,7 +493,7 @@ INT8 symmetric quantization model:
* - processor
- Yes
- The platform that the model will use,
- supports bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
+ supports bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
* - fp_type
- No
- The float type used for mixed precision; supports auto, F16, F32, BF16. Default is auto, meaning the program selects it automatically
@@ -745,7 +745,7 @@ INT8 model mAP: 34.70%
- mlir file
* - processor
- Yes
- - The platform that the model will use; supports bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
+ - The platform that the model will use; supports bm1690, bm1688, bm1684x, bm1684, cv186x, cv183x, cv182x, cv181x, cv180x
* - fpfwd_inputs
- No
- Specify layers (including this layer) before which quantization is skipped; multiple inputs are separated by commas