How to specify the number of CPU threads? #3215

Closed Answered by imtuyethan
raingart asked this question in Get Help

"n_gl" or "n_gpu_layers" is a setting that controls how many of the model's layers are loaded into GPU memory; any layers that are not offloaded run on the CPU. It is a trade-off between speed and GPU memory usage:

  • Higher n_gpu_layers = more GPU memory used, faster processing
  • Lower n_gpu_layers = less GPU memory used, potentially slower processing

Suggestions

  • For 7B models: Try starting with 32 layers (n_gpu_layers=32)
  • For 13B models: Start with around 20-24 layers
  • For larger models: You may need to go even lower, perhaps 16 or fewer layers
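
As a rough illustration of the suggestions above, here is a minimal Python sketch that picks a starting value and steps it down if the GPU runs out of memory. The function names, the size thresholds, and the step size of 4 are assumptions made for this example; they are not part of any library or of the application's settings.

```python
def suggested_gpu_layers(params_billion: float) -> int:
    """Return a starting n_gpu_layers value from the rough suggestions above.

    Hypothetical helper: the cut-offs simply mirror the 7B / 13B / larger
    buckets in the list and are not tuned for any particular GPU.
    """
    if params_billion <= 7:
        return 32  # 7B models: start with 32 layers
    if params_billion <= 13:
        return 20  # 13B models: low end of the suggested 20-24 range
    return 16      # larger models: 16 or fewer


def next_lower(n_gpu_layers: int, step: int = 4) -> int:
    """Step the value down after an out-of-memory error, never below 0."""
    return max(0, n_gpu_layers - step)


if __name__ == "__main__":
    for size in (7, 13, 34):
        print(f"{size}B -> n_gpu_layers={suggested_gpu_layers(size)}")
```

In practice you would start from the suggested value, load the model, and call something like `next_lower` only if loading fails or GPU memory is exhausted.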

Answer selected by imtuyethan