    Repositories list

    • MegCC

      Public
      MegCC is a deep learning model compiler with an ultra-lightweight runtime, high efficiency, and easy portability
      C++
      Apache License 2.0
      57000 Updated May 8, 2024
    • 🎉CUDA notes: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
      Cuda
      GNU General Public License v3.0
      207000 Updated Mar 27, 2024
    • mlu-ops

      Public
      Efficient operation implementation based on the Cambricon Machine Learning Unit (MLU).
      C++
      MIT License
      106000 Updated Sep 21, 2023
    • HIPIFY

      Public
      HIPIFY: Convert CUDA to Portable C++ Code
      C++
      MIT License
      78000 Updated Sep 20, 2023
    • DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
      C++
      MIT License
      244000 Updated Jan 18, 2023
    • A benchmark suited especially for deep learning operators
      Python
      Apache License 2.0
      4000 Updated Aug 31, 2022
    • Yinghan's Code Sample
      Cuda
      GNU General Public License v3.0
      54000 Updated Jul 25, 2022
    • A sparse BLAS lib supporting multiple backends
      C
      MIT License
      6000 Updated Jul 12, 2022
    • Benchmarks on SIMD instructions: SSE, AVX, AVX512
      C
      MIT License
      2000 Updated Feb 1, 2021
    • spmv

      Public
      SpMV optimizers
      C++
      1300 Updated Feb 15, 2019