Does vllm support pytorch/xla ? #3424

dinghaodhd · 2024-03-15T06:21:33Z

Anything you want to discuss about vllm.

Hi,
We have adapted our hardware to PyTorch/XLA. What should we do to run vLLM via PyTorch/XLA?

rkooo567 · 2024-03-15T08:14:39Z

I don't think it works with PyTorch/XLA right now.

The core attention algorithm that supports paged attention is written in CUDA (and there are some other custom CUDA kernels).
vllm/vllm/config.py

Line 520 in 429284d

def __init__(self, device: str = "auto") -> None:

There are only neuron and cuda device options.
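To illustrate what adding a device option could look like, here is a minimal sketch of a device resolver that accepts an extra "xla" value. It is not vLLM's actual code: `SUPPORTED_DEVICES`, `resolve_device`, and the probing order are hypothetical, and the assumption that the presence of the `torch_xla` or `transformers_neuronx` packages indicates the backend is only a heuristic.

```python
import importlib.util
from typing import Callable

# Hypothetical list of device strings; "xla" is the assumed new option.
SUPPORTED_DEVICES = ("auto", "cuda", "neuron", "xla")


def _module_available(name: str) -> bool:
    """Return True if a top-level package can be imported (without importing it)."""
    return importlib.util.find_spec(name) is not None


def resolve_device(
    device: str = "auto",
    is_available: Callable[[str], bool] = _module_available,
) -> str:
    """Turn a user-facing device string into a concrete device type."""
    if device not in SUPPORTED_DEVICES:
        raise ValueError(f"Unsupported device: {device!r}")
    if device != "auto":
        # An explicit choice always wins.
        return device
    # Assumption: an installed backend package signals the intended target.
    if is_available("torch_xla"):
        return "xla"
    if is_available("transformers_neuronx"):
        return "neuron"
    return "cuda"
```

With an explicit setting, `resolve_device("xla")` returns `"xla"` regardless of what is installed; the rest of the engine would still need to act on that value.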

I think to make this work:

1. We should pass the correct device settings. I don't know if XLA requires additional changes beyond providing the correct device, but if so, you need to change model_runner.py (which prepares the tensors) and the general model path to make sure they use the XLA config.
2. The core attention should use a kernel that works with paged attention.
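For readers unfamiliar with what a paged-attention kernel has to handle, the sketch below shows the block-table lookup in plain Python. It is only an illustration under stated assumptions: `locate_token` and `gather_keys` are hypothetical names, the cache is modeled as nested lists, and a real backend would do this indexing inside a fused kernel rather than a Python loop.

```python
from typing import List, Sequence, Tuple


def locate_token(
    block_table: Sequence[int], token_pos: int, block_size: int
) -> Tuple[int, int]:
    """Map a logical token position to (physical_block, offset_in_block)."""
    logical_block = token_pos // block_size
    return block_table[logical_block], token_pos % block_size


def gather_keys(
    key_cache: Sequence[Sequence[object]],
    block_table: Sequence[int],
    context_len: int,
    block_size: int,
) -> List[object]:
    """Collect one sequence's cached keys in logical order.

    key_cache[b][i] holds the key stored at slot i of physical block b.
    block_table[j] is the physical block backing logical block j.
    """
    keys = []
    for pos in range(context_len):
        block, offset = locate_token(block_table, pos, block_size)
        keys.append(key_cache[block][offset])
    return keys


# Three physical blocks of size 2; the sequence uses block 2, then block 0.
cache = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
print(gather_keys(cache, block_table=[2, 0], context_len=3, block_size=2))
# prints ['c0', 'c1', 'a0']
```

The point is that the physical blocks of a sequence are not contiguous, so any attention implementation for a new backend has to support this indirection.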

richardliaw · 2024-03-15T16:54:21Z

I believe there will be some exploration from the Google side (cc @allenwang28).

allenwang28 · 2024-03-26T16:31:30Z

We are exploring TPU compatibility and support through PyTorch/XLA in #3620.

Once that work is done, it should be clearer how to adapt PyTorch/XLA in vLLM for different hardware backends. CC @WoosukKwon @miladm @shauheen

yiakwy-xpu-ml-framework-team · 2024-08-07T09:30:09Z

Google has made efforts to support TPU. We will work on using the canonical IR optimization, auto-tuning, auto-fusion, and layout optimization from XLA and StableHLO on GPU devices.

github-actions · 2024-11-07T01:59:03Z

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions · 2024-12-07T02:06:15Z

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!

github-actions bot added the stale label Nov 7, 2024

github-actions bot closed this as not planned Dec 7, 2024
