I encountered an issue while trying to use the gpuOffload configuration as described in the README:
```js
const llama3 = await client.llm.load(modelPath, { config: { gpuOffload: "max" } });
```
However, this resulted in an error indicating that the gpuOffload parameter was expected to be an object, not a string. It turns out gpuOffload actually requires three parameters (a corrected call is sketched after the list):
- `ratio`: Specifies the proportion of the workload to be offloaded to the GPU. A value of 1.0 means the entire workload will be handled by the GPU.
- `mainGpu`: Indicates which GPU ID to use as the primary one. For example, 0 refers to the first GPU in the system.
- `tensorSplit`: An array that specifies how to split the tensors among the GPUs. `[1.0]` means the entire workload will be handled by the primary GPU.
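For reference, here is a minimal sketch of the load call with gpuOffload passed as an object. The field names follow the error described above; the concrete values and the model path are only illustrative placeholders, so please verify them against the SDK version you have installed.

```typescript
import { LMStudioClient } from "@lmstudio/sdk";

const client = new LMStudioClient();

// Placeholder: substitute whichever model identifier/path you are loading.
const modelPath = "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF";

const llama3 = await client.llm.load(modelPath, {
  config: {
    gpuOffload: {
      ratio: 1.0,         // offload the entire workload to the GPU
      mainGpu: 0,         // use GPU 0 as the primary device
      tensorSplit: [1.0], // keep all tensors on the primary GPU
    },
  },
});
```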