Releases: ARM-software/ComputeLibrary

v24.11

18 Nov 11:53

v24.11 Public Major Release

Feat

  • Add SVE SoftmaxLayer kernel for BF16
  • Provide stateless API for CpuGemmLowpMatrixMultiplyCore, CpuQuantize, and DequantizationLayer
  • Extend static quantization interface for both matmul and convolution operations

Fix

  • Clarify Third-Party IP licenses
  • Check if CpuGemmAssemblyDispatch is configured in CpuMatMul before continuing
  • Add BF16 support for CpuGemmAssemblyDispatchWrapper
  • Detect SVE support on Windows® to run the available kernels
  • Fix missing cstdint include, which causes build errors with GCC 15
  • Disable -O2 when building for Windows® as this crashes when certain compiler versions are used
  • Make cast on CPU truncate float to int instead of round to be consistent with other ML frameworks
  • Return error in validate() for CpuGemmLowpMatrixMultiplyCore if pretransposed A or B is true, as this is not supported
  • Avoid implicit conversion from __fp16 to arm_compute::bfloat16 to avoid illegal instructions in hardware with FP16 but no BF16 support
  • Softmax SME2 kernel selection now correctly detects if SME2 is supported
  • Requantization rounding issues in CPU/GPU Quantize
  • Scale normalising coefficient in GPU LogSoftmax
  • Apply consistent rounding policy in NEReduceMean
  • Revert default memory manager for NEQLSTMLayer
  • Create default memory manager when none is provided
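
The cast fix in the list above changes float-to-int conversion on CPU from rounding to truncation. The following is a minimal sketch of the difference between the two policies in plain C++. It is not the library's kernel code, and the helper names cast_truncate and cast_round are hypothetical, introduced only for illustration.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: truncating conversion.
// static_cast from float to an integer type discards the fractional part,
// i.e. it rounds toward zero.
inline int32_t cast_truncate(float x)
{
    return static_cast<int32_t>(x);
}

// Hypothetical helper: round-to-nearest conversion.
// std::lround rounds halfway cases away from zero.
inline int32_t cast_round(float x)
{
    return static_cast<int32_t>(std::lround(x));
}
```

With these helpers, 2.7f becomes 2 under truncation and 3 under rounding, while -2.7f becomes -2 and -3 respectively. Truncation matches the behaviour of a plain C++ static_cast.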

Refactor

  • Turn duplicated code in the elementwise_binary kernel into templates to reduce code size
  • Move CpuSoftmaxKernel LUT to LUTManager to consolidate location of all LUTs

Perf

v24.09

27 Sep 13:56

v24.09 Public Major Release

Feat

  • Provide a wrapper class to expose cpu::CpuSoftmaxGeneric

  • Detect number of cores in Windows®

  • Add Optimized SME kernel for QASYMM8_SIGNED elementwise addition operation

Fix

  • LogSoftmax Int8/UInt8 mismatches in Cpu

  • Rounding of negative integers in pooling 2d/3d gpu kernels

  • OpenMP® linker error on Windows®

  • Rounding of negative integers in pooling 2d/3d kernels

  • Patches linker failure for cpu::CpuSoftmaxGeneric in partial builds

  • Cpu/Gpu Reverse data type support

  • QSYMM16 broadcasted subtraction failures

  • CpuMulKernel validation when there is x-broadcasting for some types

  • Data type validation in depthwise op in Cpu

  • Update macOS® build instructions

  • Validation tests compute reference and target on each iteration

  • Reset permuted input and weights on configure in NEDepthwiseConvolutionLayer

  • Selectively enable CL job chaining
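
Two of the fixes above concern rounding of negative integers in pooling kernels. The release notes do not state the root cause, so the sketch below only illustrates one common way such mismatches arise, as an assumption: C++ integer division truncates toward zero, which differs from round-to-nearest for negative sums in an integer average. The function names are hypothetical and this is not the library's kernel code.

```cpp
#include <cstdint>

// Hypothetical helper: plain integer division, which truncates toward zero.
inline int32_t div_truncate(int32_t sum, int32_t count)
{
    return sum / count;
}

// Hypothetical helper: division rounded to the nearest integer,
// with halfway cases rounded away from zero. Assumes count > 0.
inline int32_t div_round_nearest(int32_t sum, int32_t count)
{
    const int32_t half = count / 2;
    // Bias toward the sign of the sum so truncation lands on the nearest value.
    return (sum >= 0) ? (sum + half) / count : (sum - half) / count;
}
```

For a window sum of -7 over 2 elements, truncation gives -3 while round-to-nearest gives -4. For positive sums the two policies already differ in the same way (7 over 2 gives 3 versus 4), but applying the positive-only bias to a negative sum is what typically produces an off-by-one result.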

Refactor

  • Generate only one shared library when building with CMake

  • Add BF16 LUT for Softmax Layer with tests

  • Move heuristic logic of activation kernel into separate class

  • Remove unused CommandBuffer

Perf

  • Allocate Persistent and Prepare tensors at start of prepare()

  • Use mws in OMPScheduler for better thread throttling

  • Enable FP16 winograd in CpuConv2d for v8a multi_isa builds.

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.09/index.xhtml

v24.08.1

28 Aug 12:51

v24.08.1 Public Patch Release

Fix

  • Change inheritance qualifiers of experimental Cpu operator interface classes to public for cpu-wrappers.
  • Mismatches in static quantization updated after configure tests
  • CpuSoftmax configure ignores is_log on validation
  • Linker errors in armv8.2a Windows® builds

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.08.1/index.xhtml

v24.08

16 Aug 16:16

v24.08 Public Major Release

Feature

  • Expose CpuAdd functionality using the experimental operators api
  • Expose CpuDepthwiseConv2d functionality using the experimental operators api
  • Expose CpuElementwiseDivision functionality using the experimental operators api
  • Expose CpuElementwiseMax functionality using the experimental operators api
  • Expose CpuElementwiseMin functionality using the experimental operators api
  • Expose CpuGemmAssemblyDispatch functionality using the experimental operators low-level api
  • Expose CpuMul functionality using the experimental operators api
  • Expose CpuSub functionality using the experimental operators api

Performance

  • Solve performance issue on Arm® Mali™-G78

Fix

  • Illegal instruction in multi_isa armv8a
  • Set num_threads in ThreadInfo correctly in OMPScheduler
  • Fix Alexnet graph example giving incorrect results

Documentation (API, build guide, contribution guide, errata, etc.) available here:
https://artificial-intelligence.sites.arm.com/computelibrary/v24.08/index.xhtml

v24.07

26 Jul 21:03

Public major release
Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.07

v24.06

18 Jun 18:46

Public minor release
Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.06

v24.05

30 May 15:18

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.05

v24.04

02 May 09:05

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.04

v24.02.1

19 Mar 17:40

Public patch release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.02.1/

v24.02

22 Feb 14:13

Public major release

Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here:
https://arm-software.github.io/ComputeLibrary/v24.02