This repository has been archived by the owner on Oct 19, 2024. It is now read-only.
[FEATURE] Reduce the required peak RAM on a single node while converting weights #792
Labels: good first issue
System information
Describe the new feature and the current behavior/state
As described here, weight conversion for OPT-175B currently requires peak RAM as large as twice the model size. It would be great to do this in a distributed way to reduce the required peak RAM on a single node.
Will this change the current API? How?
Changes will mostly happen in `step_2_consolidate_992_shards_to_singleton.py`.
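One way to cut peak RAM on a single node (a sketch only, not the actual logic of `step_2_consolidate_992_shards_to_singleton.py`) is to stream shards one at a time into a preallocated memory-mapped output file, so that at most one shard plus the on-disk output is resident instead of two full copies of the model. The function name and the assumption that shards concatenate along axis 0 are hypothetical, chosen for illustration:

```python
import numpy as np

def consolidate_shards(shard_paths, out_path):
    # Hypothetical sketch: merge many .npy shards into one array on disk.
    # Peek at shard shapes via mmap so nothing is fully loaded yet.
    shapes = [np.load(p, mmap_mode="r").shape for p in shard_paths]
    dtype = np.load(shard_paths[0], mmap_mode="r").dtype
    total_rows = sum(s[0] for s in shapes)

    # Preallocate the consolidated array directly on disk; writes go
    # through the OS page cache rather than a second in-RAM copy.
    out = np.lib.format.open_memmap(
        out_path, mode="w+", dtype=dtype,
        shape=(total_rows,) + shapes[0][1:])

    offset = 0
    for p in shard_paths:
        shard = np.load(p)              # only one shard in RAM at a time
        n = shard.shape[0]
        out[offset:offset + n] = shard  # copy into the mapped output
        offset += n
        del shard                       # free before loading the next shard
    out.flush()
    return out_path
```

A distributed variant could go further by assigning disjoint shard ranges (and disjoint output offsets) to different nodes, so no single node ever touches the whole model.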
Describe alternatives you've considered
Additional context