feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache #7537
| Job | Run time |
|---|---|
| | 6m 5s |
| | 2m 48s |
| | 6m 45s |
| | 2m 1s |
| | 2m 25s |
| | 1m 52s |
| | 4m 43s |
| | 3m 47s |
| | 3m 24s |
| Total | 33m 50s |