
Upstream Main: Fused Ops and Kernels, FSDP and Memory Fixes #35

Merged: 8 commits into main on Jun 7, 2024

Conversation

achew010 and others added 8 commits May 28, 2024 15:18
* linting and formatting changes

* removed AutoGPTQ dep in linting

* added additional comments in tox
* workaround low-mem patch

* resolve conflicts and define patch function

* resolve conflicts and define patch function

* Apply suggestions from code review

Co-authored-by: Yu Chin Fabian Lim <[email protected]>

* revert hack to avoid low memory bug in HF memory metrics calculation

* reversed formatting

* reverse more formatting

---------

Co-authored-by: Yu Chin Fabian Lim <[email protected]>
* group memory field names with a prefix and minor fixes

* change to drop index on index reset
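The "drop index on index reset" change above refers to a common pandas idiom. A minimal sketch of the idea, assuming the benchmark results are collected in a `pandas.DataFrame` (the actual column names in the repo are not shown here and are made up):

```python
import pandas as pd

# Hypothetical benchmark table; the real columns in the repo may differ.
df = pd.DataFrame({"model": ["llama", "mistral"], "tokens_per_sec": [1500, 1700]})
filtered = df[df["tokens_per_sec"] > 1600]

# reset_index() alone would keep the old row labels as a new "index" column;
# drop=True discards them instead, which is what "drop index on index reset" means.
filtered = filtered.reset_index(drop=True)
print(list(filtered.index))         # → [0]
print("index" in filtered.columns)  # → False
```

Dropping the stale index keeps filtered or concatenated result tables clean when they are later written out or merged.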
* initial commit

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add fast quantized plugin

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add mistral and fix plugin

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add licensing notices and instructions for adding new plugin.

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* handle linting, formatting

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* 2nd round of linting

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* activate workflow and some more lint fixes

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add sample config

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* updates to benchmark, scenarios

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* fix tests

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
* refactor

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* fixes

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* refactor mistral

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add mixtral

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* some refactoring after introducing mlp

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* remove extraneous files

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add bnb

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* lint + fmt and improvements to readme

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* bench fixes

* need to handle lora adapters device due to #26

* allow replay of failed benches, addressing comment in #14

* update benches (remove l40)

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
#31)

* properly ignore lora adapters

* handle qlora quant state

* improve fix

* further simplification of fix

* updated benchmark reference (#34)

---------

Co-authored-by: achew010 <[email protected]>
* shift gpu mem computation to gather_report

* addressed comments
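Two of the commits above ("group memory field names with a prefix", "shift gpu mem computation to gather_report") describe reshaping how memory metrics appear in the benchmark report. A minimal pure-Python sketch of that idea, with made-up field names, prefix, and `gather_report` signature (none of these are taken from the actual repo):

```python
# Hypothetical raw metrics as each benchmark run might emit them.
RAW_METRICS = {
    "alloc_gpu_memory": 1024,
    "peak_gpu_memory": 4096,
    "train_loss": 1.23,
}

# Which fields count as memory metrics; illustrative only.
MEMORY_FIELDS = {"alloc_gpu_memory", "peak_gpu_memory"}

def gather_report(metrics: dict, prefix: str = "mem_") -> dict:
    """Group memory-related fields under a common prefix so they sort
    together in the final report; non-memory fields pass through unchanged."""
    report = {}
    for key, value in metrics.items():
        if key in MEMORY_FIELDS:
            report[prefix + key] = value
        else:
            report[key] = value
    return report

print(gather_report(RAW_METRICS))
# → {'mem_alloc_gpu_memory': 1024, 'mem_peak_gpu_memory': 4096, 'train_loss': 1.23}
```

Doing this grouping once, at report-gathering time, also matches the "shift gpu mem computation to gather_report" commit: per-run code stays simple and the report layer owns the naming convention.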
@fabianlim fabianlim merged commit 40aad46 into main Jun 7, 2024
8 checks passed
@fabianlim fabianlim added the main Merged dev to main label Jun 7, 2024
@fabianlim fabianlim changed the title Fused Ops and Kernels, FSDP and Memory Fixes Upstream Main: Fused Ops and Kernels, FSDP and Memory Fixes Jun 7, 2024
2 participants