
Releases: gtygo/llama.cpp

b4150

22 Nov 05:24
a5e4759
cuda : optimize argmax (#10441)

* cuda : optimize argmax

* remove unused parameter

ggml-ci

* fixup : use full warps

ggml-ci

* Apply suggestions from code review

Co-authored-by: Johannes Gäßler <[email protected]>

* fix UB (undefined behavior)

* ggml : check ne00 <= INT32_MAX in argmax and argsort

---------

Co-authored-by: Johannes Gäßler <[email protected]>

b3562

09 Aug 19:54
Reuse querybatch to reduce frequent memory allocation
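The idea behind reusing a query batch is to keep one buffer alive across queries and let it grow monotonically, instead of allocating a fresh one per call. A minimal sketch of that pattern, with hypothetical names (not the fork's actual types):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: a batch whose storage is reused across queries.
// Capacity only grows; shrinking the logical size does not reallocate.
struct QueryBatch {
    std::vector<float> data;

    void prepare(size_t n) {
        if (data.capacity() < n) {
            data.reserve(n); // allocate only when the batch must grow
        }
        data.resize(n);      // no reallocation when n <= capacity
    }
};
```

After a large query, subsequent smaller queries reuse the same allocation, which avoids the frequent allocation/free churn the commit message describes.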

b3561

09 Aug 19:14
retrieval

b3560

09 Aug 19:13
6afd1a9
llama : add support for lora adapters in T5 model (#8938)

Co-authored-by: Stanisław Szymczyk <[email protected]>

b3559

09 Aug 17:30
272e3bd
make : fix llava obj file race (#8946)

ggml-ci

b3556

09 Aug 09:52
4305b57
sync : ggml

b3538

07 Aug 06:48
506122d
llama-bench : add support for getting cpu info on Windows (#8824)

* Add support for getting cpu info on Windows for llama_bench

* refactor

---------

Co-authored-by: slaren <[email protected]>