Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Try experimental commit making tensorizer less intrusive #3

Draft
wants to merge 1 commit into
base: sangstar/integrate-tensorizer-new
Choose a base branch
from

Conversation

sangstar
Copy link

@sangstar sangstar commented Dec 20, 2024

I feel my current feature branch was proposing fairly "ambitious" changes for exllamav2, or at least ones that would ask the maintainer(s) to add fairly significant properties to their inference engine (like a state_dict as a public attribute for their config class).

I tried in this commit to see if I could integrate tensorizer in a way that minimizes modifications plainly in the source code of exllamav2's model loading machinery by applying the tensorizer_context decorator to functions where tensorizer hooks are needed.

This is unfinished and shoddy. I honestly do not need to go this overboard to make my PR "less intrusive" with this proposal of changes, but felt it might be interesting to consider. Some motivations for this weren't necessarily made "better" either (for instance, I still create a private _state_dict attribute for their config class, but only if the decorator is called).

I figured I'd get initial feedback before going off in this direction, and scrapping it otherwise, while this change is still unfinished.

This change should (hopefully) make for a smaller line count in their core logic. The diff here is for my feature branch, but this can be made clear when comparing to this fork's master branch. It also likely might make the integration less readable as it plays with things like attributes for function objects. It also tries to modularize certain parts of exl2's loading logic in to functions to apply hooks in to, which goes against the unintrusiveness idealism.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant