Error in GPU Offload Configuration in the example #67

Open
vegax87 opened this issue Sep 7, 2024 · 0 comments
vegax87 commented Sep 7, 2024

I encountered an issue while trying to use the gpuOffload configuration as shown in the README:

const llama3 = await client.llm.load(modelPath, { config: { gpuOffload: "max" } });

However, this resulted in an error indicating that the gpuOffload parameter was expected to be an object, not a string. In fact, gpuOffload requires an object with three parameters:

const llama3 = await client.llm.load(modelPath, { 
  config: { 
    gpuOffload: { 
      ratio: 1.0, 
      mainGpu: 0, 
      tensorSplit: [1.0] 
    } 
  } 
});

ratio: the proportion of the workload to offload to the GPU. A value of 1.0 means the entire workload is handled by the GPU.
mainGpu: the ID of the GPU to use as the primary one. For example, 0 refers to the first GPU in the system.
tensorSplit: an array specifying how to split the tensors among the GPUs. [1.0] means the entire workload is handled by the primary GPU.
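For a machine with more than one GPU, the same object can presumably distribute the work via tensorSplit. Here is a minimal sketch, reusing the client and modelPath from the snippet above and assuming two GPUs; the split values are illustrative, not taken from the README:

const llama3 = await client.llm.load(modelPath, {
  config: {
    gpuOffload: {
      ratio: 1.0,             // offload the full workload to the GPUs
      mainGpu: 0,             // GPU 0 acts as the primary device
      tensorSplit: [0.5, 0.5] // illustrative even split between GPU 0 and GPU 1
    }
  }
});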
