Commit
params partition for skip_init (#4722)
Some models use `skip_init` to initialize weights. `skip_init` first constructs the module on the meta device inside the module's `__init__` and then calls `to_empty()` to allocate uninitialized storage. This conflicts with DeepSpeed's `module.__init__` hook mechanism: `_post_init_method` must not run until `skip_init` has finished. However, `from ... import skip_init` typically happens outside the context, so there seems to be no good way to hook `skip_init` directly. Hence, the approach here is to delay the execution of `_post_init_method`. Known affected models include HuggingFace models such as chatglm2 and chatglm3.

---------

Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Masahiro Tanaka <[email protected]>
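The deferral idea described in the commit message can be illustrated with a minimal, pure-Python sketch. This is not DeepSpeed's code: `FakeModule`, `hook_init`, `post_init`, and the `_post_init_pending` flag are hypothetical stand-ins, and the local `skip_init` only mimics the "build on meta, then `to_empty()`" shape of the PyTorch helper. It shows one way a post-init step could be postponed until a module built on a placeholder device has received real storage.

```python
import functools


class FakeModule:
    """Hypothetical stand-in for a module; `device` marks meta vs. real storage."""

    def __init__(self, device="cpu"):
        self.device = device

    def to_empty(self, device):
        # Pretend to allocate uninitialized storage on the target device.
        self.device = device
        return self


post_init_log = []  # device observed by each post-init call


def post_init(module):
    # Stand-in for a step (such as parameter partitioning) that needs real storage.
    post_init_log.append(module.device)


def hook_init(cls):
    """Wrap __init__ and to_empty so post_init is delayed for meta-built modules."""
    original_init = cls.__init__
    original_to_empty = cls.to_empty

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        if self.device == "meta":
            # Too early: storage does not exist yet, so only record the debt.
            self._post_init_pending = True
        else:
            post_init(self)

    @functools.wraps(original_to_empty)
    def wrapped_to_empty(self, *args, **kwargs):
        result = original_to_empty(self, *args, **kwargs)
        if getattr(self, "_post_init_pending", False):
            # Storage now exists, so the delayed step can run.
            self._post_init_pending = False
            post_init(self)
        return result

    cls.__init__ = wrapped_init
    cls.to_empty = wrapped_to_empty
    return cls


def skip_init(module_cls, *args, **kwargs):
    """Simplified imitation: construct on 'meta', then materialize with to_empty."""
    device = kwargs.pop("device", "cpu")
    return module_cls(*args, device="meta", **kwargs).to_empty(device=device)


hook_init(FakeModule)
eager = FakeModule(device="cpu")            # post_init runs right after __init__
lazy = skip_init(FakeModule, device="cpu")  # post_init runs after to_empty
```

In this sketch both construction paths end up calling `post_init` exactly once, and never while the module is still on the placeholder device. The actual change in the commit may detect and defer differently.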
1 parent 870ae04, commit 3110c38
Showing 2 changed files with 134 additions and 3 deletions.