Reward model (RM) inference needs to support the case where the RM is a sizable model. The current implementation attempts to place a copy of the RM on each GPU. This is a problem because in many cases the RM is too big to fit alongside the denoiser model. In the LLM setting, the usual solution is to serve the RM from a Triton Inference Server, or to put the RM on one GPU while the main model uses the remaining GPUs. This should be explored further.
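The second option (one GPU for the RM, the rest for the main model) could start from a device split like the one below. This is only a minimal sketch, not code from this repository: `DevicePlan` and `split_devices` are hypothetical names, and reserving the last GPU index for the RM is an arbitrary assumption.

```python
from typing import List, NamedTuple


class DevicePlan(NamedTuple):
    """Hypothetical container describing which GPU does what."""
    denoiser_devices: List[int]  # GPU indices that run the denoiser
    reward_device: int           # single GPU index reserved for the RM


def split_devices(num_gpus: int) -> DevicePlan:
    """Reserve the last GPU for the reward model; the rest run the denoiser.

    Assumption: GPUs are indexed 0..num_gpus-1 and the RM fits on one GPU.
    """
    if num_gpus < 2:
        raise ValueError("need at least 2 GPUs to dedicate one to the reward model")
    return DevicePlan(
        denoiser_devices=list(range(num_gpus - 1)),
        reward_device=num_gpus - 1,
    )


plan = split_devices(8)
print(plan.denoiser_devices, plan.reward_device)
```

With a split like this, the denoiser processes would still need to send generated samples to the reward device and receive the scores back. That transfer is not shown here and would depend on how the training loop is distributed.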