Commit
Enable XPU and optimize cpu/xpu op (#1418)
* enable new ipex API ipex weight is 4D so we cannot transpose fix dequant check require grad
* use ipex op in backward
* enable backward
* Multi backend refactor (#8)
* AMD: Clarify diagnostic messages; free up disk space for CI build
* Add build job for rocm
* Add rocm build script
* Copy shared obj file into output_dir
* upload build artifacts and enable wheels build
* Remove cuda build temporarily
* Add ROCm version to .so filename
* Add rocm_version to whls build
* Revert "Remove cuda build temporarily"
  This reverts commit 1413c5f.
* Add rocm_version env var
* Remove thrush header files
* Print node info
* print cuda node info
* Revert "print cuda node info"
  This reverts commit cdb209a.
* Revert "Print node info"
  This reverts commit 7e9a65c.
* Add rocm arch to compile command
* Rename .so files to rocm
* Update default gpu arch
* Skip cpu based igemmlt int tests on ROCm
* Update Documentation
* Update upstream repo name
* Update docs
* Update string format
  Co-authored-by: Aarni Koskela <[email protected]>
* Remove pre-release option for torch install
* Update pytorch install path
  Co-authored-by: Titus <[email protected]>
* Add messages for Heuristics error
* Remove toolcache for disk space
* print disk usage
* Clean disk space for linux
* Fix for ubuntu
* Add sudo for apt clean
* Update clean up disk list
* remove disk usage print
* Add BNB_BACKEND variable
* Update diagnostic functions for ROCm
* Fix tuple error
* Fix library detection bug for recursive and symlink cases
* fix pre-commit errors
* Remove recursive path lib search
* Create function for runtime lib patterns
* Update logger format
  Co-authored-by: Aarni Koskela <[email protected]>
* Update error reporting
  Co-authored-by: Aarni Koskela <[email protected]>
* Remove commented code
  Co-authored-by: Aarni Koskela <[email protected]>
* Update error reporting
  Co-authored-by: Aarni Koskela <[email protected]>
* Update error reporting
* Create hip diagnostics functions
* Fix Typo
* Fix pre-commit checks
---------
Co-authored-by: Aarni Koskela <[email protected]>
Co-authored-by: Titus <[email protected]>
* check grad before using ipex (#1358)
* Enable packaging for ROCm 6.2 (#1367)
* Enable 6.2 build
* Update documentation for 6.2.0 pip install
* Update for VS2022 17.11 compatibility with CUDA < 12.4 (#1341)
* Update for VS2022 17.11 compatibility with CUDA < 12.4
* Try again
* Enable continuous releases for multi-backend-refactor branch
* Update release workflow
* Publish continuous release for multi-backend
* continuous release: revert wheel renaming due to install err
* Revert "continuous release: revert wheel renaming due to install err"
  This reverts commit 0a2b539.
* add dynamic tag-based versioning + git hash for dev vers
* docs: update w/ changes from `main`
* get tags for dynamic versioning
* fine-tune continuous release params
* reduce the pkg size + build times for the preview release
* refine docs for multi-backend alpha release (#1380)
* refine docs for multi-backend alpha release
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: add multi-backend feedback links
* docs: add request for contributions
* docs: small fixes
* docs: small fixes
* docs: add info about `main` continuous build
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: remove 2 obsolete lines
---------
Co-authored-by: pnunna93 <[email protected]>
Co-authored-by: Aarni Koskela <[email protected]>
Co-authored-by: Titus <[email protected]>
Co-authored-by: Matthew Douglas <[email protected]>
* Revert "enable backward"
  This reverts commit cd7bf21.
* Revert "use ipex op in backward"
  This reverts commit b8df1aa.
* fix finetune
* check training
* fix gemv check
* reformat
* avoid double quant in backward if not needed
* Zh/xpu support (#9)
* Add xpu support
* Add xpu support for int8
* Add xpu dequant kernel support
* update code
* remove debug comments
* remove redundant comments
* Add xpu integration for woqlinear
* correct the comments
* Update cpu_xpu_common.py
---------
Co-authored-by: zhuhong61 <[email protected]>
Co-authored-by: zhuhong61 <[email protected]>
* avoid import triton if CPU and XPU backend
* fix setup in docker without git config
* xpu do not support compile for now
  Signed-off-by: jiqing-feng <[email protected]>
* update xpu
  Signed-off-by: jiqing-feng <[email protected]>
* update 4bit compute dtype
* fix xpu int8 path
  Signed-off-by: jiqing-feng <[email protected]>
* optimize 4bit dequant
  Signed-off-by: jiqing-feng <[email protected]>
* fix xpu dequant
  Signed-off-by: jiqing-feng <[email protected]>
* add empty cache in each xpu op
* add nf4 dequant ipex kernel
* fix dequant 4bit op
* empty cache has negative effect on 4bit gemv
* fix xpu save
* fix save
* xpu use float16 default
  Signed-off-by: jiqing-feng <[email protected]>
* rm empty cache as it cause slower perf
  Signed-off-by: jiqing-feng <[email protected]>
* fix xpu save
  Signed-off-by: jiqing-feng <[email protected]>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <[email protected]>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <[email protected]>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <[email protected]>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <[email protected]>
* fix format
* update readme for Intel CPU and XPU do not need make csrc codes
* fix format
* fix import
---------
Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: pnunna93 <[email protected]>
Co-authored-by: Aarni Koskela <[email protected]>
Co-authored-by: Titus <[email protected]>
Co-authored-by: Matthew Douglas <[email protected]>
Co-authored-by: zhuhong61 <[email protected]>
Co-authored-by: zhuhong61 <[email protected]>