
Mac/Metal thread #3760

Closed
oobabooga opened this issue Aug 30, 2023 · 22 comments

oobabooga commented Aug 30, 2023

This thread is dedicated to discussing the setup of the webui on Metal GPUs and Mac computers in general.

You are welcome to ask questions as well as share your experiences, tips, and insights to make the process easier for all Mac users.

GV43 commented Sep 7, 2023

Has anyone been able to get GGUF models to load in the webui? I've updated llama-cpp-python, but I'm still getting traceback errors.

dmi commented Sep 10, 2023

It loaded fine without issues when installed via the zip file, but it does not use the GPU. A standalone compiled llama.cpp does use the GPU.

cfmbrand commented Sep 11, 2023

Hi,

Fairly new to all of this, so I may be making very basic errors, but:

I installed everything through the terminal per the instructions in the ReadMe - this still caused me to get an error relating to cumsum and PyTorch when I tried to run the model (I can always load the 7B FP16 CodeLlama from TheBloke). I'm running a Mac Pro 16GB (14 core) with Ventura 13.5.2 and Python 3.10.9 per the ReadMe instructions in a clean environment, and I installed packages per requirements_nocuda.txt.

The error was:

To create a public link, set share=True in launch().
/Users/appe/works/one-click-installers/installer_files/env/lib/python3.10/site-packages/transformers/generation/utils.py:690: UserWarning: MPS: no support for int64 repeats mask, casting it to int32 (Triggered internally at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1678454852765/work/aten/src/ATen/native/mps/operations/Repeat.mm:236.)
input_ids = input_ids.repeat_interleave(expand_size, dim=0)
Traceback (most recent call last):
File "/Users/appe/works/one-click-installers/text-generation-webui/modules/callbacks.py", line 71, in gentask
ret = self.mfunc(callback=_callback, **self.kwargs)
File "/Users/appe/works/one-click-installers/text-generation-webui/modules/text_generation.py", line 290, in generate_with_callback
shared.model.generate(**kwargs)
File "/Users/appe/works/one-click-installers/installer_files/env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/appe/works/one-click-installers/installer_files/env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1485, in generate
return self.sample(
File "/Users/appe/works/one-click-installers/installer_files/env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2521, in sample
model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
File "/Users/appe/works/one-click-installers/installer_files/env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 736, in prepare_inputs_for_generation
position_ids = attention_mask.long().cumsum(-1) - 1
RuntimeError: MPS does not support cumsum op with int64 input

I would also get this warning when running the server.py file to start up the GUI:

UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.

I then tried changing the torch/torchaudio/torchvision installation to the 'nightly' version (not the stable release). This successfully got rid of the error message, but the warning remained, and although I could run the model through the 'Default' prompt window, it now runs extremely slowly - around 0.01 tokens/second.
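For reference, the usual way to switch to the nightly builds on macOS is something like this (the exact index URL may differ depending on when you install):

# install the PyTorch nightly wheels, which include newer MPS operator coverage
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu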

This is all summarised in this other thread: #1686 (comment)

Does anyone have any ideas on what is going wrong here? Thanks in advance.

oobabooga commented Sep 24, 2023

The updated one-click installer now installs llama.cpp wheels with Metal acceleration. They are obtained from these files:

https://github.com/oobabooga/text-generation-webui/blob/main/requirements_apple_intel.txt
https://github.com/oobabooga/text-generation-webui/blob/main/requirements_apple_silicon.txt

llama.cpp with GGUF models and n-gpu-layers set to greater than 0 should in principle work now.
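As a quick sanity check (the model filename below is just a placeholder; use any GGUF file from your models folder), launching with at least one offloaded layer should print Metal initialization lines from llama.cpp:

# placeholder model name - substitute your own GGUF file
python server.py --model mistral-7b-instruct.Q4_K_M.gguf --loader llama.cpp --n-gpu-layers 1
# if Metal is active, the startup log should contain a line like:
# ggml_metal_init: allocating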

danch99 commented Sep 27, 2023

Hi,

I'm using the updated one-click installer and am not able to install it. I'm on a Mac M2 and I get this error:

Building wheels for collected packages: exllamav2
Building wheel for exllamav2 (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [168 lines of output]
No CUDA runtime is found, using CUDA_HOME='/Users/dan/LLMs/text-generation-webui/installer_files/env'
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching 'dni_*' found anywhere in distribution
/Users/dan/LLMs/text-generation-webui/installer_files/env/lib/python3.10/site-packages/setuptools/command/build_py.py:201: _Warning: Package 'exllamav2.exllamav2_ext' is absent from the packages configuration.
!!

Thanks for your help.

@oobabooga

@danch99 what version of exllamav2 is written in your requirements_apple_silicon.txt file?

danch99 commented Sep 27, 2023

@oobabooga
exllamav2==0.0.4

@oobabooga

@jllllll do you see a reason why the new exllamav2==0.0.4 wheel would refuse to install on mac?

@philippjbauer
I ran into similar trouble installing for Apple Silicon with its requirements_apple_silicon.txt.

raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

Collecting exllamav2==0.0.4 (from -r requirements_apple_silicon.txt (line 11))
  Using cached exllamav2-0.0.4.tar.gz (56 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/d4/xd3y159j1d5dcz3kdg1xqz6m0000gn/T/pip-install-e3vtw5el/exllamav2_d424aa9a25b7472682fcc4b4587265ea/setup.py", line 25, in <module>
          cpp_extension.CUDAExtension(
        File "/Users/philippbauer/.miniforge3/envs/oobabooga/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1048, in CUDAExtension
          library_dirs += library_paths(cuda=True)
        File "/Users/philippbauer/.miniforge3/envs/oobabooga/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1179, in library_paths
          if (not os.path.exists(_join_cuda_home(lib_dir)) and
        File "/Users/philippbauer/.miniforge3/envs/oobabooga/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
          raise EnvironmentError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

jllllll commented Sep 27, 2023

Probably my fault: turboderp/exllamav2#61
It completely slipped my mind that this might cause issues with the exllamav2 sdist.

4 options for resolving this:

  • Add ; platform_system != "Darwin" to the rest of the requirements.txt files.
    • It's probably sufficient to just remove exllamav2 from the Mac files since it can't be used there anyway.
  • I make a PR to have JIT compiling be the default in exllamav2 like it used to be.
  • Use the JIT compile wheel instead:
    https://github.com/turboderp/exllamav2/releases/download/v0.0.4/exllamav2-0.0.4-py3-none-any.whl
    • This option may not be viable long-term as turboderp may stop building that wheel.
  • Ask turboderp to upload the JIT compile wheel to PyPI. Currently, it is just the sdist that is uploaded there.

An immediate, temporary solution is to set the EXLLAMA_NOCOMPILE environment variable before installing on Mac.
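Applied to the Mac install, that temporary workaround would look roughly like this:

# skip building the CUDA extension for exllamav2 while installing the requirements
EXLLAMA_NOCOMPILE=1 pip install -r requirements_apple_silicon.txt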

philippjbauer commented Sep 27, 2023

The only way I got it to run was to remove exllamav2 from the requirements_nowheels.txt file and install llama-cpp-python with CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python.
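Spelled out, the steps were roughly as follows (the --no-cache-dir flag just makes sure a previously cached non-Metal wheel isn't reused):

# after removing the exllamav2 line from the requirements file:
pip uninstall -y llama-cpp-python
CMAKE_ARGS="-DLLAMA_METAL=on" pip install --no-cache-dir llama-cpp-python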

The requirements_apple_silicon.txt file does not work on macOS 14.0 (Sonoma) yet.

I suppose you will need to create a new file for the new OS version? Like this one for macOS 13.x:

https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/download/metal/llama_cpp_python-0.2.6-cp310-cp310-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0"

Nevermind, the model itself couldn't be executed when I tried the apple_silicon install file and I only noticed after trying the nowheels install file.

turboderp commented Sep 27, 2023

I'm definitely keeping the JIT version around, if nothing else because it makes development a whole lot easier.

But it doesn't really matter to me which version is the default. I just want it to be as unsurprising as possible to the most users.

For now I have uploaded the JIT version to PyPI.

danch99 commented Sep 27, 2023

Thanks a lot for your fast support, gentlemen. The new install works.

One small thing: I can load GGUF models, but I've had no success with GGML models.

Falenos commented Oct 4, 2023

Just a tip: if someone is running miniconda on M1 and has issues, check this.

goranapivis commented Oct 29, 2023

It didn't work; I tried every trick and lost a whole day. I'm running LM Studio instead, which works out of the box with the M1 GPU! :-)

@github-actions github-actions bot added the stale label Dec 10, 2023

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

@b8kings0ga

TypeError: BFloat16 is not supported on MPS

@leedrake5

Metal no longer seems to be working? Setting n-gpu-layers > 0 always used to be the required step for GPU use on M1-M3 Macs, but now it no longer has any effect. Is there another setting that must be enabled? llama-cpp-python has been installed with the Metal flags on.

@PeterFujiyu

The Edge browser on macOS shows no option when using the webui (screenshot omitted).
Microsoft Edge, version 121.0.2277.83 (official version) (arm64)

mkhia commented Apr 2, 2024

Hi! I have a problem with MPS. I'm a noob and have no idea what to do with this error: MPS backend out of memory (MPS allocated: 15.16 GB, other allocations: 104.67 MB, max allowed: 18.13 GB). Tried to allocate 6.28 GB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure). Can anyone help?
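From what I can tell, the workaround the message itself suggests would be something like the following before launching, but I don't know whether it's safe:

# reportedly disables the MPS allocation limit; the message warns it may cause system failure
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
python server.py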

robik72 commented Apr 25, 2024

Hi, I am getting the infamous "OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root." when I try to load a GPTQ model on my MacBook Pro M3 with Sonoma 14.3. During setup, I chose Apple Silicon GPU, but it does not seem to have an effect. I also tried adding a line with the following to the requirements_apple_silicon.txt file, with no result:
https://github.com/turboderp/exllamav2/releases/download/v0.0.4/exllamav2-0.0.4-py3-none-any.whl; platform_system != "Darwin"
What did I miss? I'd be glad for your help - I'm really lost...

@stefanbeeman-em

I'm having this issue as well (the same CUDA_HOME error described in the comment above). It seems like it's a problem detecting my graphics card, despite my specifying Apple Silicon in the install script.
