This repository has been archived by the owner on Oct 19, 2024. It is now read-only.
[FEATURE] Reduce the required peak RAM on a single node while converting weights #792
Labels: good first issue
System information
Describe the new feature and the current behavior/state
As described here, weight conversion for OPT-175B currently requires peak RAM as large as twice the model size. It would be great to do this in a distributed way to reduce the required peak RAM on a single node.
Will this change the current API? How?
Changes will mostly happen in `step_2_consolidate_992_shards_to_singleton.py`.
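One way to cut peak RAM on a single node (a sketch only, not the actual logic of `step_2_consolidate_992_shards_to_singleton.py`) is to stream shards one at a time into a preallocated memory-mapped output file, so that at most one shard plus the on-disk output is resident instead of two full copies of the model. The function name and the assumption that shards concatenate along axis 0 are hypothetical, chosen for illustration:

```python
import numpy as np

def consolidate_shards(shard_paths, out_path):
    # Hypothetical sketch: merge many .npy shards into one array on disk.
    # Peek at shard shapes via mmap so nothing is fully loaded yet.
    shapes = [np.load(p, mmap_mode="r").shape for p in shard_paths]
    dtype = np.load(shard_paths[0], mmap_mode="r").dtype
    total_rows = sum(s[0] for s in shapes)

    # Preallocate the consolidated array directly on disk; writes go
    # through the OS page cache rather than a second in-RAM copy.
    out = np.lib.format.open_memmap(
        out_path, mode="w+", dtype=dtype,
        shape=(total_rows,) + shapes[0][1:])

    offset = 0
    for p in shard_paths:
        shard = np.load(p)              # only one shard in RAM at a time
        n = shard.shape[0]
        out[offset:offset + n] = shard  # copy into the mapped output
        offset += n
        del shard                       # free before loading the next shard
    out.flush()
    return out_path
```

A distributed variant could go further by assigning disjoint shard ranges (and disjoint output offsets) to different nodes, so no single node ever touches the whole model.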
Describe alternatives you've considered
Additional context