write-your-own-operator-library

Write high-performance operators for LLMs with CUDA, OpenCL, and Triton

CUDA

CUDA kernels are implemented with PyCUDA, and Google Colab is recommended for trying them out.
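As a rough illustration of what launching a kernel through PyCUDA can look like, here is a minimal sketch. The `vec_add` kernel, the helper names, and the NumPy fallback are all hypothetical and are not taken from this repository; the fallback is only there so the snippet still runs on a machine without a GPU or without PyCUDA installed.

```python
import numpy as np

# CUDA C source for an elementwise add kernel.
# Hypothetical example kernel, not one of the repository's operators.
KERNEL_SRC = r"""
__global__ void vec_add(const float *a, const float *b, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}
"""


def vec_add_reference(a, b):
    """CPU reference used to check the kernel and as a fallback."""
    return a + b


def vec_add_gpu(a, b, block_size=256):
    """Compile KERNEL_SRC with PyCUDA and launch it on 1-D float32 arrays."""
    # Imported lazily so the module still loads without PyCUDA.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule(KERNEL_SRC)
    kernel = mod.get_function("vec_add")

    out = np.empty_like(a)
    n = a.size
    # One thread per element, rounded up to a whole number of blocks.
    grid = ((n + block_size - 1) // block_size, 1)
    kernel(
        drv.In(a), drv.In(b), drv.Out(out), np.int32(n),
        block=(block_size, 1, 1), grid=grid,
    )
    return out


def vec_add(a, b):
    """Run on the GPU when possible, otherwise fall back to NumPy."""
    a = np.ascontiguousarray(a, dtype=np.float32)
    b = np.ascontiguousarray(b, dtype=np.float32)
    try:
        return vec_add_gpu(a, b)
    except Exception:
        # Assumption: any failure here means PyCUDA or a GPU is unavailable.
        return vec_add_reference(a, b)


if __name__ == "__main__":
    x = np.arange(1000, dtype=np.float32)
    y = np.ones(1000, dtype=np.float32)
    result = vec_add(x, y)
    print(np.allclose(result, vec_add_reference(x, y)))
```

On Colab, a GPU runtime and `pip install pycuda` are typically needed before the GPU path is taken. Comparing the result against a NumPy reference, as above, is a simple way to sanity-check a new kernel.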
