Issues: vllm-project/vllm

[Roadmap] vLLM Roadmap Q4 2024
#9006 opened Oct 1, 2024 by simon-mo
Open 22
vLLM's V1 Engine Architecture
#8779 opened Sep 24, 2024 by simon-mo
Open 9

Issues list

[RFC]: Make any model an embedding model (labels: RFC)
#10674 opened Nov 26, 2024 by DarkLight1337
1 task done
[Usage]: Llama-2-7b-chat-hf as embedding model (labels: usage)
#10673 opened Nov 26, 2024 by ra-MANUJ-an
1 task done
[Usage]: how to get every output token score? (labels: usage)
#10670 opened Nov 26, 2024 by TonyUSTC
[Usage]: Cannot use xformers with old GPU (labels: usage)
#10662 opened Nov 26, 2024 by baimushan
1 task done
[Bug]: No available block found in 60 second. (labels: bug)
#10661 opened Nov 26, 2024 by Went-Liang
1 task done
[Feature]: add macos installation script (labels: feature request, good first issue)
#10658 opened Nov 26, 2024 by youkaichao
1 task done
[Bug]: Qwen2.5-32B-GPTQ-Int4 inference !!!!! (labels: bug)
#10656 opened Nov 26, 2024 by jklj077
1 task done
[Bug]: AMD GPU RX 7900XT: Failed to infer device type (labels: bug)
#10653 opened Nov 26, 2024 by githust66
1 task done
[Bug]: Inference is exceptionally slow on the L20 GPU (labels: bug)
#10652 opened Nov 26, 2024 by joey9503
1 task done
[Bug]: vllm infer for Qwen2-VL-72B-Instruct-GPTQ-Int8 (labels: bug)
#10650 opened Nov 26, 2024 by DoctorTar
1 task done
[Feature]: Mixtral manual head_dim (labels: feature request)
#10649 opened Nov 26, 2024 by wavy-jung
1 task done
[Bug]: Llama 3.2 90b crash (labels: bug)
#10648 opened Nov 26, 2024 by yessenzhar
1 task done
[RFC]: Support KV Cache Compaction (labels: RFC)
#10646 opened Nov 25, 2024 by YaoJiayi
1 task done
[Bug]: The parameter gpu_memory_utilization does not take effect (labels: bug)
#10637 opened Nov 25, 2024 by liutao053877
1 task done
[Bug]: GPU memory leak when using bad_words feature (labels: bug)
#10630 opened Nov 25, 2024 by wsp317
1 task done
[Bug]: Crash with Qwen2-Audio Model in vLLM During Audio Processing (labels: bug)
#10627 opened Nov 25, 2024 by jiahansu
1 task done