Skip to content

Latest commit

 

History

History
66 lines (57 loc) · 2.4 KB

README.md

File metadata and controls

66 lines (57 loc) · 2.4 KB

Rosetta Benchmark on ZCU102

The code here is verified in ZCU102 with Vitis 2021.1. Original Rosetta benchmark repo is here. When compiling optical flow benchmark, please adjust xf_video_mem.hpp location.

About benchmarks

  • Optical Flow, 64: In ./optical-flow_64/src/typedefs.h

      typedef ap_fixed<32,27> outer_pixel_t;
      typedef ap_fixed<64,56> calc_pixel_t;

    And in flow_cal function,

    static outer_pixel_t buf[2];
  • Optical Flow, 96, float: In ./optical-flow_96/src/typedefs.h

      typedef ap_fixed<48,27> outer_pixel_t;
      typedef ap_fixed<96,56> calc_pixel_t;

    And in flow_cal function,

    // static outer_pixel_t buf[2];
    static float buf[2];
  • PAR_FACTOR in typedefs.h are different for Digit Recognition, PAR_FACTOR==* and Spam Filter, PAR_FACTOR==*.

Compile Benchmarks for ZCU102

Adjust the PLATFORM_REPO_PATHS, ROOTFS, etc in ./3d-rendering/zcu102/build.sh.

cd to ./3d-rendering/zcu102 and do below:

./build.sh

The command is the same for other benchmarks.

If you do

python report.py

it will parse compile time for the benchmarks that have been built. Note that rtdgen, cf2sw, xclbinutil, etc, and v++ -p times are not included.

Run Benchmarks on ZCU102

Write the <BENCHMARK_DIR>/package/sd_card.img to the SD card and boot the board.

ssh to ZCU102 and do below:

cd /media/sd-mmcblk0p1/
./run_app.sh

Below are the application latencies for the current version of the code.

Benchmark Kernel Frequency Application Latency
Optlical Flow, 64 200MHz 19.1ms
Optlical Flow, 96, float 200MHz 19.4ms
3D Rendering 200MHz 2.3ms
Digit Recognition, PAR_FACTOR==40 200MHz 12.2ms
Digit Recognition, PAR_FACTOR==80 200MHz 11.5ms
Spam Filter, PAR_FACTOR==32 200MHz 35.7ms
Spam Filter, PAR_FACTOR==64 200MHz 30.4ms