Error in GPU Offload Configuration in the example #67

Open
vegax87 opened this issue Sep 7, 2024 · 0 comments
vegax87 commented Sep 7, 2024

I encountered an issue while trying to use the gpuOffload configuration as shown in the README:

const llama3 = await client.llm.load(modelPath, { config: { gpuOffload: "max" } });

However, this resulted in an error indicating that the gpuOffload parameter was expected to be an object, not a string. In fact, gpuOffload requires an object with three parameters:

const llama3 = await client.llm.load(modelPath, { 
  config: { 
    gpuOffload: { 
      ratio: 1.0, 
      mainGpu: 0, 
      tensorSplit: [1.0] 
    } 
  } 
});

ratio: the proportion of the workload to offload to the GPU. A value of 1.0 means the entire workload is handled by the GPU.
mainGpu: the ID of the GPU to use as the primary one. For example, 0 refers to the first GPU in the system.
tensorSplit: an array specifying how to split the tensors among the GPUs. [1.0] means the entire workload is handled by the primary GPU.
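For a machine with more than one GPU, the same object can presumably distribute the work via tensorSplit. Here is a minimal sketch, reusing the client and modelPath from the snippet above and assuming two GPUs; the split values are illustrative, not taken from the README:

const llama3 = await client.llm.load(modelPath, {
  config: {
    gpuOffload: {
      ratio: 1.0,             // offload the full workload to the GPUs
      mainGpu: 0,             // GPU 0 acts as the primary device
      tensorSplit: [0.5, 0.5] // illustrative even split between GPU 0 and GPU 1
    }
  }
});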
