the calculation of the FullyConnected layer takes a lot of time #7273

Closed
zhaohb opened this issue Aug 27, 2021 · 57 comments
Assignees
Labels
bug Something isn't working category: build OpenVINO cmake script / infra PSE support_request

Comments

@zhaohb
Contributor

zhaohb commented Aug 27, 2021

System information (version)
  • OpenVINO => 2021.4
  • Compiler => gcc

I have a model that I benchmarked with benchmark_app, and I found that the FullyConnected layers take a lot of time, about 25% of the total inference time:

dense/BiasAdd                 EXECUTED       layerType: FullyConnected     realTime: 4481      cpu: 4481           execType: jit_gemm_FP32
...
dense_2/BiasAdd               EXECUTED       layerType: FullyConnected     realTime: 4490      cpu: 4490           execType: jit_gemm_FP32
...

In this link https://toscode.gitee.com/vinsonSpace/openvino/blob/master/build-instruction.md I saw that GEMM can be accelerated through OpenBLAS or MKL:

[image attachment]

I want to use MKL:

cmake ..     -DENABLE_CLDNN=OFF     -DENABLE_OPENCV=OFF     -DENABLE_VPU=OFF     -DENABLE_PYTHON=ON    -DNGRAPH_ONNX_IMPORT_ENABLE=ON -DNGRAPH_ONNX_FRONTEND_ENABLE=ON  -DNGRAPH_ONNX_EDITOR_ENABLE=ON  -DGEMM=MKL -DMKLROOT=/work/compile_ov_20214/mklml_lnx_2019.0.5.20190502  -DCMAKE_INSTALL_PREFIX=/work/compile_ov_20214/openvino_dist

but I am warned that this GEMM macro is not available:

CMake Warning:
  Manually-specified variables were not used by the project:

    GEMM
    DMKLROOT
......

Doesn't OpenVINO support this macro anymore? How can I accelerate the FullyConnected layer?

@zhaohb zhaohb added bug Something isn't working support_request labels Aug 27, 2021
@zhaohb
Contributor Author

zhaohb commented Aug 27, 2021

My bad, the GEMM macro could be set in version 2021.1, but now I'm using 2021.4.
So how do I choose the GEMM implementation? Or will it automatically choose the best-performing implementation, MKL or OpenBLAS?

@Iffa-Intel

Iffa-Intel commented Aug 30, 2021

Hi,
according to the documentation, the default build uses an internal JIT GEMM implementation.
So if you didn't specify -DGEMM=OPENBLAS or -DGEMM=MKL, etc., in the build, it would automatically use the internal JIT GEMM.

@Iffa-Intel Iffa-Intel added category: build OpenVINO cmake script / infra and removed bug Something isn't working labels Aug 30, 2021
@zhaohb
Contributor Author

zhaohb commented Aug 30, 2021

But in 2021.4 I cannot find this macro. Why was it removed?

@zhaohb
Contributor Author

zhaohb commented Aug 31, 2021

@Iffa-Meah can you help me?

@Iffa-Intel Iffa-Intel added the PSE label Sep 1, 2021
@jgespino
Contributor

jgespino commented Sep 1, 2021

Hi @zhaohb

I see GEMM was removed starting with the OpenVINO 2021.2 release; I would have to check with the development team. Could you provide your model? I want to reproduce the behavior and get the development team's input as well.

Regards,
Jesus

@jgespino jgespino self-assigned this Sep 1, 2021
@zhaohb
Contributor Author

zhaohb commented Sep 2, 2021

OK, I will share my model with you later, but I want to know why GEMM was removed. Is the performance similar between the different implementations?
My model: https://drive.google.com/drive/folders/10FfO_AgJtJMJx5bcSEd-p0S6oeDWI1k-?usp=sharing (you can download it there).

@zhaohb
Contributor Author

zhaohb commented Sep 2, 2021

@jgespino I have tried to add GEMM back in 2021.4 but failed, so I hope you can add GEMM and test my model to see whether the performance of these methods is the same. Thank you very much.

@zhaohb
Contributor Author

zhaohb commented Sep 6, 2021

@jgespino Is there any progress now?

@jgespino
Contributor

jgespino commented Sep 7, 2021

Hi @zhaohb

Not yet, I have to check with the development team. I see GEMM was removed by pull request #5642.

Regards,
Jesus

Ref. 65047

@jgespino jgespino added the bug Something isn't working label Sep 7, 2021
@zhaohb
Contributor Author

zhaohb commented Sep 13, 2021

@jgespino I have added back the code that PR #5642 deleted, but I still can't compile successfully.
So when will this feature be available officially?

@zhaohb
Contributor Author

zhaohb commented Sep 13, 2021

@jgespino How can I tell whether GEMM=MKL was compiled in successfully?
By default, without GEMM=MKL, benchmark_app.py -pc 1 shows:

dense_1/BiasAdd               EXECUTED       layerType: FullyConnected     realTime: 3055      cpu: 3055           execType: jit_gemm_FP32

But after I added GEMM=MKL and compiled successfully, benchmark_app.py -pc still shows:

dense_1/BiasAdd               EXECUTED       layerType: FullyConnected     realTime: 3055      cpu: 3035           execType: jit_gemm_FP32

Should execType change after using MKL? The execution time does not seem to have changed.
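For reference, the same per-layer information that -pc prints can also be read programmatically. A minimal sketch, assuming the 2021.x openvino.inference_engine Python API with placeholder model paths; the counter field names (layer_type, exec_type, real_time) follow that API:

import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

# Run one synchronous inference with dummy data so the counters get populated.
input_name = next(iter(net.input_info))
input_shape = net.input_info[input_name].input_data.shape
exec_net.infer({input_name: np.zeros(input_shape, dtype=np.float32)})

# get_perf_counts() maps each executed layer to its status, layer_type,
# exec_type and timings, the same data benchmark_app prints with -pc.
for layer_name, counters in exec_net.requests[0].get_perf_counts().items():
    if counters["layer_type"] == "FullyConnected":
        print(layer_name, counters["exec_type"], counters["real_time"], "us")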

@zhaohb
Contributor Author

zhaohb commented Sep 26, 2021

Who can give me some advice?

@jgespino
Contributor

Hi @zhaohb

I appreciate your patience, I've reached out to the development team for additional assistance.
I will let you know what I find out.

Regards,
Jesus

@zhaohb
Contributor Author

zhaohb commented Sep 28, 2021

@jgespino Thank you very much. Let me know if you find anything!

@zhaohb
Contributor Author

zhaohb commented Oct 14, 2021

@jgespino How is the progress now? I really need a solution to this problem.

@zhaohb
Contributor Author

zhaohb commented Oct 29, 2021

Hi, who can help me?

@dmitry-gorokhov
Contributor

Hi @zhaohb.
As you correctly mentioned before, we used to have alternative implementations for the matrix multiplication routines: MKL and OpenBLAS. By default we use oneDNN for such operations. We performed a huge amount of performance checks, which showed that oneDNN provides the best performance for matrix multiplication operations (layerType: FullyConnected) in all cases. That is the justification for the decision to drop support for the MKL and OpenBLAS options. In other words, OpenVINO should provide the best MatMul performance with the default options.
BTW, which HW are you using for benchmarking?

@zhaohb
Contributor Author

zhaohb commented Nov 3, 2021

@dmitry-gorokhov thank you for your reply.
[image attachment]

This is part of my model. There are many combinations of HW, such as 1490x256, 1490x4 and 256x256, and the slowest one should be 1490x256.
If it is not possible to accelerate FC at the operator level, can we optimize FC in other ways?
I also tried increasing the number of CPU cores, but nothing changed.

@zhaohb
Contributor Author

zhaohb commented Nov 4, 2021

@dmitry-gorokhov This part of the model is a bit wide. How can we increase parallelism in this part? I think that should improve performance.

@dmitry-gorokhov
Contributor

@zhaohb By HW I actually meant hardware :). It is important to know which system you are using for benchmarking because it affects the possible ways to improve performance.

@dmitry-gorokhov
Contributor

@zhaohb Glad to hear that. I expect the PR to be merged within 2 weeks.
It also seems you can try different values for the -nstreams parameter. For example, using 12 streams instead of 8 might improve throughput while preserving 100 ms latency. On the other hand, there might be cases where a large number of streams hurts performance because the L3 cache gets exhausted.
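If it helps, the same experiment can be expressed through the API instead of benchmark_app. A minimal sketch, assuming the 2021.x Python API and placeholder model paths; the stream and request counts are illustrative values to sweep, and CPU_THROUGHPUT_STREAMS is the CPU config key that -nstreams maps to:

from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")

# Ask the CPU plugin for 12 execution streams and create one infer request per
# stream; throughput improves when enough async requests keep all streams busy.
exec_net = ie.load_network(
    network=net,
    device_name="CPU",
    config={"CPU_THROUGHPUT_STREAMS": "12"},
    num_requests=12,
)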

@zhaohb
Contributor Author

zhaohb commented Nov 10, 2021

@dmitry-gorokhov OK, I will try to find the optimal number of nstreams.
But I have another question: should I compile OpenVINO from the reduce_node_extension branch of https://github.com/xuchen-intel/openvino.git, or wait for this branch to be merged into the master branch and then compile OpenVINO? Is there a difference between the two approaches?
Which one do you recommend?

Thank you very much.

@dmitry-gorokhov
Contributor

@zhaohb There shouldn't be much difference in terms of performance, so you can use the feature branch for benchmarking.

@zhaohb
Contributor Author

zhaohb commented Nov 10, 2021

@dmitry-gorokhov It's more than just benchmarking: I want to add this branch to the Model Server so that the Model Server gets the best inference performance.

@zhaohb
Contributor Author

zhaohb commented Nov 12, 2021

@dmitry-gorokhov I compiled the reduce_node_extension branch, but found that I could not generate the OpenCV library. This is my compile command:

cmake .. -DCMAKE_BUILD_TYPE=Release -DENABLE_CLDNN=OFF -DENABLE_OPENCV=OFF -DTHREADING=TBB -DENABLE_GNA=OFF -DENABLE_VPU=OFF -DENABLE_PYTHON=ON -DNGRAPH_ONNX_FRONTEND_ENABLE=ON -DENABLE_OPENCV=ON -DCMAKE_INSTALL_PREFIX=/work/6686_openvino/out_6686_opencv/

but the output shows:

-- OpenVINO version is 2022.1.0
-- CMAKE_BUILD_TYPE: Release
CMake Warning at cmake/developer_package/clang_format/clang_format.cmake:21 (message):
  Supported clang-format version is not found!
Call Stack (most recent call first):
  cmake/developer_package/IEDevScriptsConfig.cmake:294 (include)
  CMakeLists.txt:11 (find_package)


CMake Warning at cmake/developer_package/ncc_naming_style/ncc_naming_style.cmake:26 (message):
  Please, install libclang-[N]-dev package (required for ncc naming style
  check)
Call Stack (most recent call first):
  cmake/developer_package/IEDevScriptsConfig.cmake:295 (include)
  CMakeLists.txt:11 (find_package)


-- clang package is installed, but may have different version (5.0). Please use "/usr/bin/python3 -m pip install clang==9.0".
-- Inference Engine enabled features:
--
--     CI_BUILD_NUMBER: custom_reduce_node_extension_24b77d73c44f7058f4b0d05b59e079a7b80ab467
--     ENABLE_LTO = OFF
--     OS_FOLDER = OFF
--     USE_BUILD_TYPE_SUBFOLDER = ON
--     TREAT_WARNING_AS_ERROR = ON
--     ENABLE_INTEGRITYCHECK = OFF
--     ENABLE_SANITIZER = OFF
--     ENABLE_UB_SANITIZER = OFF
--     ENABLE_THREAD_SANITIZER = OFF
--     ENABLE_COVERAGE = OFF
--     ENABLE_SSE42 = ON
--     ENABLE_AVX2 = ON
--     ENABLE_AVX512F = ON
--     BUILD_SHARED_LIBS = ON
--     ENABLE_FASTER_BUILD = OFF
--     ENABLE_CPPLINT = ON
--     ENABLE_CPPLINT_REPORT = OFF
--     ENABLE_CLANG_FORMAT = OFF
--     ENABLE_NCC_STYLE = OFF
--     VERBOSE_BUILD = OFF
--     ENABLE_UNSAFE_LOCATIONS = OFF
--     ENABLE_FUZZING = OFF
--     ENABLE_MKL_DNN = ON
--     ENABLE_TESTS = OFF
--     ENABLE_STRICT_DEPENDENCIES = ON
--     ENABLE_CLDNN = OFF
--     ENABLE_PROFILING_ITT = OFF
--     ENABLE_PROFILING_FILTER = ALL
--     ENABLE_PROFILING_FIRST_INFERENCE = ON
--     SELECTIVE_BUILD = OFF
--     ENABLE_ERROR_HIGHLIGHT = OFF
--     ENABLE_PYTHON = ON
--     ENABLE_DOCS = OFF
--     ENABLE_GNA = OFF
--     ENABLE_CLDNN_TESTS = OFF
--     THREADING = TBB
--     ENABLE_VPU = OFF
--     ENABLE_MYRIAD = OFF
--     ENABLE_MYRIAD_NO_BOOT = OFF
--     ENABLE_GAPI_TESTS = OFF
--     GAPI_TEST_PERF = OFF
--     ENABLE_MYRIAD_MVNC_TESTS = OFF
--     ENABLE_DATA = OFF
--     ENABLE_BEH_TESTS = OFF
--     ENABLE_FUNCTIONAL_TESTS = OFF
--     ENABLE_SAMPLES = 0
--     ENABLE_OPENCV = ON
--     ENABLE_V7_SERIALIZE = OFF
--     ENABLE_TBB_RELEASE_ONLY = ON
--     ENABLE_SYSTEM_PUGIXML = OFF
--     ENABLE_DEBUG_CAPS = OFF
--     ENABLE_GPU_DEBUG_CAPS = OFF
--     ENABLE_CPU_DEBUG_CAPS = OFF
--     NGRAPH_ONNX_FRONTEND_ENABLE = ON
--     NGRAPH_PDPD_FRONTEND_ENABLE = ON
--     NGRAPH_IR_FRONTEND_ENABLE = ON
--     NGRAPH_USE_PROTOBUF_LITE = ON
--     NGRAPH_USE_SYSTEM_PROTOBUF = OFF
--     OPENVINO_DEBUG_ENABLE = OFF
--     ENABLE_REQUIREMENTS_INSTALL = ON
--
-- MODELS_PATH=
-- PROJECT ............................... OpenVINO
-- CMAKE_BINARY_DIR ...................... /work/6686_openvino/openvino/build
-- OpenVINO_SOURCE_DIR ................... /work/6686_openvino/openvino
-- CMAKE_GENERATOR ....................... Unix Makefiles
-- CMAKE_C_COMPILER_ID ................... GNU
-- CMAKE_BUILD_TYPE ...................... Release
-- The name pugixml::static is an ALIAS for pugixml-static. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- The name gflags is an ALIAS for gflags_nothreads_static. It will be exported to the InferenceEngineDeveloperPackage with the original name.
--
-- 3.9.2.0
-- Found PythonInterp: /usr/bin/python3 (found version "3.8.10")
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.8.so (found version "3.8.10")
Generated: /work/6686_openvino/openvino/build/thirdparty/onnx/onnx/onnx/onnx_ngraph_onnx-ml.proto
Generated: /work/6686_openvino/openvino/build/thirdparty/onnx/onnx/onnx/onnx-operators_ngraph_onnx-ml.proto
Generated: /work/6686_openvino/openvino/build/thirdparty/onnx/onnx/onnx/onnx-data_ngraph_onnx.proto
--
-- ******** Summary ********
--   CMake version             : 3.16.3
--   CMake command             : /usr/bin/cmake
--   System                    : Linux
--   C++ compiler              : /usr/bin/c++
--   C++ compiler version      : 9.3.0
--   CXX flags                 : -Wsuggest-override  -D_GLIBCXX_USE_CXX11_ABI=1 -Wno-error=parentheses  -Wformat -Wformat-security -D_FORTIFY_SOURCE=2 -fstack-protector-strong -s -fsigned-char -Werror -ffunction-sections -fdata-sections -fdiagnostics-show-option -Wundef -Wreturn-type -Wunused-variable -Wuninitialized -Winit-self -Wmaybe-uninitialized -Wno-suggest-override -Wnon-virtual-dtor
--   Build type                : Release
--   Compile definitions       : IE_BUILD_POSTFIX="";ENABLE_MKL_DNN=1
--   CMAKE_PREFIX_PATH         :
--   CMAKE_INSTALL_PREFIX      : /work/6686_openvino/out_6686_opencv
--   CMAKE_MODULE_PATH         :
--
--   ONNX version              : 1.9.0
--   ONNX NAMESPACE            : ngraph_onnx
--   ONNX_USE_LITE_PROTO       : ON
--   USE_PROTOBUF_SHARED_LIBS  : OFF
--   ONNX_DISABLE_EXCEPTIONS   : OFF
--   ONNX_WERROR               : OFF
--   ONNX_BUILD_TESTS          : OFF
--   ONNX_BUILD_BENCHMARKS     : OFF
--   ONNXIFI_DUMMY_BACKEND     : OFF
--   ONNXIFI_ENABLE_EXT        : OFF
--
--   Protobuf compiler         :
--   Protobuf includes         :
--   Protobuf libraries        :
--   BUILD_ONNX_PYTHON         : OFF
-- The name openvino::pp is an ALIAS for openvino_preprocessor. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- The name openvino::itt is an ALIAS for itt. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- The name openvino::conditional_compilation is an ALIAS for conditional_compilation. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- The name ngraph::builder is an ALIAS for ngraph_builders. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- The name ngraph::reference is an ALIAS for ngraph_reference. It will be exported to the InferenceEngineDeveloperPackage with the original name.
-- nGraph unit tests disabled
-- pybind11 v2.8.0 dev1
-- Python version=python3.8
-- TBB: /work/6686_openvino/openvino/inference-engine/temp/tbb
-- GPU support is disabled
-- Primitive cache is disabled
-- Static tbbbind_2_4 package was found
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.8.so (found suitable version "3.8.10", minimum required is "3")
-- Found Cython version 0.29.24
CMake Warning at inference-engine/samples/common/format_reader/CMakeLists.txt:21 (message):
  OPENCV is disabled or not found, format_reader will be built without OPENCV
  support


-- Register template_plugin to be built in build-modules/template_plugin
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.8.10", minimum required is "3")
-- Configuring done
-- Generating done
-- Build files have been written to: /work/6686_openvino/openvino/build

and the directory structure of the output has changed, like this:

install_dependencies  python  runtime  samples  setupvars.sh  tools

The previous directory structure looked like this:

bin              deployment_tools  inference_engine      licensing  python
data_processing  documentation     install_dependencies  opencv

The new directory structure is problematic when I recompile the Model Server. What should I do?
Thank you very much.

@zhaohb
Contributor Author

zhaohb commented Nov 12, 2021

Maybe I should use the Model Server develop branch.

@zhaohb
Contributor Author

zhaohb commented Nov 16, 2021

@dmitry-gorokhov It's my fault: the width of the model is not the bottleneck for OpenVINO. The root problem is FC; if you have a lot of FC layers, performance degrades a lot.

@zhaohb
Contributor Author

zhaohb commented Nov 16, 2021

@dmitry-gorokhov Which file contains the operator implementation of FC? I want to try to optimize it.
Thank you very much.

@jgespino
Contributor

@zhaohb Just following up on this discussion, is this something you are still working on?

@zhaohb
Contributor Author

zhaohb commented Jan 11, 2022

@jgespino Yes, I am trying, but I also need some help with how to optimize FC; I don't have a particularly good method right now.

@jgespino
Contributor

@dmitry-gorokhov @zhaohb Could you provide some guidance on a possible approach to optimizing FC?

@jgespino
Contributor

@zhaohb Apologies for the delay in our response. Could you please grant me access to the original model that was converted to IR format? Is it included in the link below?

https://drive.google.com/drive/folders/10FfO_AgJtJMJx5bcSEd-p0S6oeDWI1k-?usp=sharing

@zhaohb
Contributor Author

zhaohb commented Sep 1, 2022

Yes, of course. I've whitelisted the [email protected] mailbox so it can access the model file.

@jgespino
Contributor

jgespino commented Sep 1, 2022

@zhaohb Received the invite, thank you! I don't see the original blue_c_concat_end.onnx model; is that something you can share?

@zhaohb
Contributor Author

zhaohb commented Sep 2, 2022

@jgespino Yes, it can be shared; I've uploaded it.
By the way, are you going to optimize it? Thank you very much.

@jgespino
Contributor

jgespino commented Sep 2, 2022

@zhaohb Thanks! Yes, I want to test it on the latest OpenVINO release and see if the performance improved. I'll need to find a system with a processor similar to Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz.

We have a pre-release version of OpenVINO 2022.2 release on PyPI in case you want to try it from your side as well.
https://pypi.org/project/openvino-dev/2022.2.0.dev20220829/

Regards,
Jesus

@zhaohb
Contributor Author

zhaohb commented Sep 5, 2022

@jgespino What MatMul/GEMM optimizations does 2022.2 have compared to the previous version? I have tested 2022.1, but there was no improvement over the previous version.

@avitial avitial self-assigned this Oct 5, 2022
@zhaohb
Contributor Author

zhaohb commented Oct 14, 2022

@jgespino @avitial Have you made any progress?

@avitial
Contributor

avitial commented Oct 17, 2022

@zhaohb I don't have access to the same Xeon Skylake processor as you do, but testing on an Ice Lake Intel® Xeon® Platinum 8368 CPU I can see some improvement in the FullyConnected layers between OpenVINO versions (2021.4.1 vs 2022.2). This test used the model you shared with us.

In the 2022.2 release, 5 of the 19 FullyConnected layers run as brgemm_avx512_FP32 and 14 of 19 as jit_gemm_FP32, whereas in the 2021.4.1 release all 19 FC layers execute as jit_gemm_FP32.

The cumulative time, roughly, for all FullyConnected layers is 11.72 ms in 2021.4.1, whereas in 2022.2 it is 1.40e-5 ms.

I'm not sure this type of improvement is expected in your environment/configuration, but it might be worthwhile trying it out with 2022.2. Note that in the table below, jit_gemm_FP32;1.215 represents exec_type and exec_time in ms.

$ benchmark_app -m 2022.2/blue_c_concat_end.xml -d CPU -niter 10000 -api async -nstreams 8 -hint none

 

Layer             2021.4.1-3926-14e67d86634-releases/2021/4   2022.2.0-7713-af16ea1d79a-releases/2022/2
dense_1/BiasAdd   jit_gemm_FP32; 1.215                        brgemm_avx512_FP32; 0:00:00.000002
dense_2/BiasAdd   jit_gemm_FP32; 1.195                        brgemm_avx512_FP32; 0:00:00.000002

@avitial
Contributor

avitial commented Oct 27, 2022

Closing this, I hope previous responses were sufficient to help you proceed. Feel free to reopen and ask additional questions related to this topic.

@avitial avitial closed this as completed Oct 27, 2022
@akote123

akote123 commented May 22, 2024

Hi @avitial,
On Graviton3 (aarch64), does the FullyConnected op use OpenBLAS or libxsmm, and how can we check which library is being used?

@dmitry-gorokhov
Contributor

dmitry-gorokhov commented May 22, 2024

Hi @akote123.
OpenVINO uses the ACL library on all ARM platforms.
If you are running benchmark_app you can add -pc, which provides additional details about the executed operations. The output includes a primType field, which should help clarify which backend is used.
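The same details can also be pulled through the Python API rather than benchmark_app. A minimal sketch, assuming an OpenVINO 2022.1+ runtime and a placeholder model path; profiling is enabled via the PERF_COUNT config key, and the exec_type field then names the selected kernel/backend:

import numpy as np
from openvino.runtime import Core

core = Core()
compiled = core.compile_model("model.xml", "CPU", {"PERF_COUNT": "YES"})
request = compiled.create_infer_request()

# One inference with dummy data so per-node profiling data is collected.
port = compiled.input(0)
request.infer({port: np.zeros(port.shape, dtype=np.float32)})

for info in request.profiling_info:
    # node_type is e.g. FullyConnected/MatMul; exec_type names the chosen primitive.
    print(info.node_name, info.node_type, info.exec_type, info.real_time)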

@NishantPrabhuFujitsu
Contributor

NishantPrabhuFujitsu commented May 22, 2024

@dmitry-gorokhov Continuing on @akote123's question: is ACL used through oneDNN or independently? I noticed that nodes of type FullyConnected are not being executed through oneDNN (calls to those nodes don't show up in the ONEDNN_VERBOSE logs) but MatMul nodes are.

@dmitry-gorokhov
Contributor

"Is ACL used through oneDNN or independently?"

It depends on the operation. For Convolution, MatMul and FullyConnected we use oneDNN, which falls back on ACL internally. This actually gives us the ability to leverage SVE kernels as well: https://github.com/openvinotoolkit/oneDNN/blob/v3.3_for_ie_master/src/cpu/cpu_convolution_list.cpp#L118-L120

I cannot say for sure why FC is not visible in the ONEDNN_VERBOSE log.

@NishantPrabhuFujitsu
Contributor

NishantPrabhuFujitsu commented May 22, 2024

@dmitry-gorokhov I see... I should probably give a bigger picture of what I'm trying to do.

I've been trying to run this script for LLaMA-2 from openvino.genai and determine the backend path followed for aarch64. I found an issue on the same repo which demonstrated how to collect profiling information with the primitives used for each operation during inference. I transferred that script to samples/ in this repo (with necessary changes) and built it along with the other samples.

For aarch64, I observed that FullyConnected layers fell back to a reference implementation (the ref_any_f16 primitive) while MatMul layers used GEMM kernels from ACL. When I overrode this function, all FullyConnected layers remained as MatMul layers and were executed using gemm:acl as expected. However, their execution times were ~10x slower than OSS oneDNN (v3.3.3, benchmarked with benchDNN).

A bit more investigation into the ONEDNN_VERBOSE logs revealed that matmuls in benchDNN were using the blocked implementation (wei_f16:a:blocked:aCb16c::f0) while calls to oneDNN from OpenVINO used the plain implementation (wei_f16:a:blocked:acb::f0). Forcing benchDNN to use the plain implementation makes its execution as slow as OpenVINO's, leading me to believe that the slowdown on aarch64 is due to the blocked layout not being used for matmul.
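(For anyone reproducing this: the verbose logs above come from oneDNN's standard ONEDNN_VERBOSE environment variable. A minimal sketch, assuming a 2022.1+ OpenVINO Python runtime and a placeholder model path with static input shapes:)

import os
os.environ["ONEDNN_VERBOSE"] = "1"  # enable oneDNN primitive logging before inference runs

import numpy as np
from openvino.runtime import Core

core = Core()
compiled = core.compile_model("model.xml", "CPU")
request = compiled.create_infer_request()
port = compiled.input(0)
request.infer({port: np.zeros(port.shape, dtype=np.float32)})
# oneDNN then prints one "onednn_verbose,exec,cpu,..." line per executed primitive,
# including the weight memory descriptor (blocked vs. plain layout).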

My question: Do you know why gemm:acl matmul doesn't use the blocked implementation? Are there any flags/modifications to be done during build to enable it?

Please note:

  • I also ran this on an x86 (Sapphire Rapids) machine; matmul executions aren't blocked there either. However, the brgemm_avx512_bf16 kernel gets called and it provides good execution times even without blocking.
  • I've also raised this on openvino.genai (this and this issue) but I'm waiting for any resolution there. I decided to ask here since I'm building my script with openvino source now.

@dmitry-gorokhov
Contributor

dmitry-gorokhov commented May 22, 2024

@NishantPrabhuFujitsu
We basically have two different operations to describe matrix multiplication math. FullyConnected is used when the second input contains constant values (weights), while MatMul is used when the second input is dynamic. The blocked layout is applied to FC weights only; MatMul does not apply the blocked layout, to avoid a data reorder on each iteration (since the data is dynamic). So by disabling the ConvertMatMulToFullyConnected pass you prevent the runtime from using the dedicated FullyConnected operation.
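To make the split concrete, here is a minimal sketch (recent OpenVINO Python API; the model is a toy example built in code) of the two cases: the same MatMul op with constant weights typically becomes a FullyConnected node on CPU, while a MatMul with a dynamic second input stays a MatMul. The resulting node types can then be checked with benchmark_app -pc or the profiling sketch earlier in the thread.

import numpy as np
from openvino.runtime import Core, Model
from openvino.runtime import opset8 as ops

a = ops.parameter([1, 256], np.float32, name="act")
w = ops.constant(np.random.rand(256, 256).astype(np.float32))  # constant weights
b = ops.parameter([256, 256], np.float32, name="dyn")          # dynamic second input

fc_like = ops.matmul(a, w, transpose_a=False, transpose_b=False)  # constant weights -> usually FullyConnected on CPU
mm_like = ops.matmul(a, b, transpose_a=False, transpose_b=False)  # dynamic input -> stays MatMul on CPU

model = Model([fc_like, mm_like], [a, b], "fc_vs_matmul")
compiled = Core().compile_model(model, "CPU")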

The issue with FC (which falls back on the ref impl) was caused by a bug on the ACL side. The team already shared the related patches with us and @alvoron incorporated them into the OV runtime: openvinotoolkit/openvino.genai#438 (comment). So with that custom OV version I would expect the correct oneDNN/ACL impls to be chosen.

@NishantPrabhuFujitsu
Contributor

NishantPrabhuFujitsu commented May 23, 2024

@dmitry-gorokhov Thank you for providing clarity on how MatMul and FullyConnected nodes work. I tried out the patches shipped by @alvoron, and the issue I was facing has been resolved. Thanks again for your support.
