-
-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CI/Build] improve python-only dev setup #9621
[CI/Build] improve python-only dev setup #9621
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
dc819d0
to
cf7383d
Compare
I think you might be interested in this @youkaichao |
cf7383d
to
44ea494
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really helpful @dtrifiro, just offering a few typo catches and suggestions.
|
||
If you update the vLLM wheel and rebuild from the source to make further edits, you will need to repeat the `Python-only build <#python-only-build>`_ steps again. | ||
Python code changes, will reflected when you run vLLM thanks to pip's ``--editable`` flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line makes more sense at the beginning of this section. I offered a suggestion there.
Python code changes, will reflected when you run vLLM thanks to pip's ``--editable`` flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick edit on the installation instructions. I still need to take the time to give it try.
58d9e35
to
6d8ac74
Compare
6d8ac74
to
2acec04
Compare
Nice! Looking forward to getting rid of our janky dev setups 😉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the idea looks good to me
setup.py
Outdated
with zipfile.ZipFile(wheel_filename) as wheel: | ||
for lib in filter(lambda file: file.filename.endswith(".so"), | ||
wheel.filelist): | ||
package_name = os.path.dirname(lib.filename).replace("/", ".") | ||
if package_name not in package_data: | ||
package_data[package_name] = [] | ||
|
||
wheel.extract(lib) | ||
package_data[package_name].append(lib.filename) | ||
print(f"Added {lib.filename} to package_data[\"{package_name}\"]") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to copy some python file from vllm flash attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the previous version was only version because of a previously extracted vllm_flash_attn
module.
With this patch, building from scratch results in all the required files being present:
$ unzip -l dist/vllm-0.6.3.post2.dev62+g2acec044e.d20241028.cu126.precompiled-py3-none-any.whl | grep -E flash_attn\|\\\.so
200404640 2024-10-28 15:45 vllm/_C.abi3.so
82008256 2024-10-28 15:45 vllm/_moe_C.abi3.so
31950 2024-10-28 09:11 vllm/attention/backends/flash_attn.py
27715 2024-10-28 09:11 vllm/attention/backends/rocm_flash_attn.py
347 2024-10-28 15:45 vllm/vllm_flash_attn/__init__.py
47716 2024-10-28 15:45 vllm/vllm_flash_attn/flash_attn_interface.py
242221240 2024-10-28 15:45 vllm/vllm_flash_attn/vllm_flash_attn_c.abi3.so
76a3131
to
c768569
Compare
Last push changed a few things:
|
975a3d6
to
95508b2
Compare
@youkaichao anything holding this back? |
will take a look when i have time this week. |
95508b2
to
e529f86
Compare
This pull request has merge conflicts that must be resolved before it can be |
e529f86
to
82f2535
Compare
why this pr is blocked? |
Signed-off-by: Daniele Trifirò <[email protected]>
Signed-off-by: Daniele Trifirò <[email protected]>
Signed-off-by: Daniele Trifirò <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
setup.py
Outdated
|
||
import tempfile | ||
|
||
# create a temporary directory to store the wheel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we create a temp directory to hold the file, because right now all wheels have the same name, while people can try to install different versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will redownload the wheel on every install, as well as filling up /tmp
, I think this will cause users issues in the long term.
Also, we should not have to have a workaround due to the choice of using the same wheel name for all releases, which breaks other scenarios (#9244)
I don't know why, but i can see the wheel file is downloaded twice:
|
Signed-off-by: youkaichao <[email protected]>
@dtrifiro I made several changes and locally tested it. please take a look and let me know if they look good to you. |
e74d9bf
to
971f164
Compare
@youkaichao I fixed the issue in the latest push. It made little sense to use a custom function, using a custom The reason for the duplicate calls was that
I have some concerns regarding the usage of temporary directories, but I'm fine with merging it as is, we can revisit/improve this setup when set up the pip index (#9831 (comment)) |
I think we can use a fixed tmp dir name and check wheel existence to avoid redownloading. This solution will also resolve #9621 (comment) by @dtrifiro |
971f164
to
73c07fc
Compare
Signed-off-by: Daniele Trifirò <[email protected]>
Signed-off-by: Daniele Trifirò <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
@dtrifiro I removed the I will merge the PR as-is right now after ci tests. but we should also add some tests to make sure it works as expected, e.g. removing all compilers and try to run python-only install, and check if it can be installed successfully. that can be a followup PR though. |
The current process for python-only development and/or builds is kinda painful.
With this PR I'm trying to resurrect the
VLLM_USE_PRECOMPILED
env var to skip extension compilation and make the overall process easier.The overall idea is to include the compiled libraries from the nightly wheel in the built package (when the env var is set), so that it can be installed as a regular package (e.g with
pip install -e .
).I'm including the possibility of setting the wheel location using the
VLLM_PRECOMPILED_WHEEL_LOCATION
, which can be:https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
)VLLM_PRECOMPILED_WHEEL_LOCATION=path/to/vllm.whl
)The wheel is downloaded, the compiled libraries are extracted and added to
package_data
so they are included in the wheel build and/or install.Note that by default, the wheel is only downloaded if not present in the current working directory, to avoid to have to do multiple downloads on each install.
This will make it possible to install specific branches quite easily. For example, to test out a branch based on
v0.6.3.post1
one could do the following:Quickly starting work with the dev branch without having to rebuild from scratch.
As part of this PR:
setup.py
to allow for downloading of the prebuilt wheel (only works for the CUDA build)python_only_dev
scriptsccache
section to the install guide right after theccache
section