setup of ChatRWKV #29
Python 3.8/3.9/3.10: `pip install numpy tokenizers prompt_toolkit ninja` :)
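A quick way to confirm the environment matches that line — a minimal sketch that just mirrors the Python versions and packages listed above:

```python
# Sanity check for the dependencies listed above; run after pip install.
import sys

# The comment above targets Python 3.8-3.10.
assert (3, 8) <= sys.version_info[:2] <= (3, 10), "use Python 3.8/3.9/3.10"

import numpy
import tokenizers
import prompt_toolkit

print("numpy", numpy.__version__)
print("tokenizers", tokenizers.__version__)
print("prompt_toolkit", prompt_toolkit.__version__)
```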
There is no need to toss the environment; just use the container, @bello7777.
Thanks mate, I will do it. Now I'm hitting:

> torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 14.62 GiB total capacity; 13.77 GiB already allocated; 163.94 MiB free; 13.97 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Can I try reducing the batch sizes to smaller values? If yes, where are they?
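For reference, the allocator option the error message mentions is set through an environment variable before the first CUDA allocation, and (as the next reply suggests) memory use here is driven mostly by the model-loading strategy rather than a batch size. A minimal sketch — the 128 MiB value is an arbitrary starting point to tune, not a recommendation:

```python
import os

# Must be set before the first CUDA tensor is allocated, i.e. before the
# model is loaded; 128 is an assumed starting value, tune as needed.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Reports (free_bytes, total_bytes) for the current GPU, useful for seeing
# how close you are to the 14.62 GiB capacity from the error above.
free_b, total_b = torch.cuda.mem_get_info()
print(f"free {free_b / 2**30:.2f} GiB of {total_b / 2**30:.2f} GiB")
```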
@soulteary
@bello7777 You probably need to adjust the strategy. If you're using the pull request:

`model = RWKV(model=model_path, strategy='cuda fp16i8 *20 -> cuda fp16')`

That's around line 28. But the most likely solution is to find whatever is running, see how it sets the strategy, and reduce the number of layers it sends to the GPU. For example, in the line above you could try a smaller layer count than `*20`.

After you get it going, you can use other tools to see how much GPU memory you have available and adjust the setting accordingly.
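A minimal sketch of that adjustment, assuming the ChatRWKV v2 `rwkv` package; the model path and the `*10` layer count are placeholder values to tune against your GPU:

```python
from rwkv.model import RWKV

model_path = "/path/to/RWKV-4-model.pth"  # placeholder; use your checkpoint

# Keep only the first 10 layers on the GPU as int8 and run the rest on the
# CPU in fp32: slower than the all-GPU strategy above, but far less VRAM.
model = RWKV(model=model_path, strategy='cuda fp16i8 *10 -> cpu fp32')
```

From there, re-check free memory with `torch.cuda.mem_get_info()` (as in the earlier sketch) and raise or lower the layer count until it fits.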
Hey guys, great stuff! Could we have a very easy step-by-step process to install ChatRWKV on an Ubuntu server, for example?