Webui prototype #13

Merged
merged 8 commits into SkunkworksAI:v1 on Sep 14, 2023

Conversation

fearnworks
Contributor

#7

  • Adds a Gradio interface to run the Base Model and the MoE model side by side
  • Includes an expert-selection table showing the weights and IDs of the selected experts
  • Includes parameter selection for max tokens, expertsK, and method
  • Adds a container build/run option via Docker Compose
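For the container build/run option, a compose file along these lines would suffice. This is an illustrative sketch, not taken from the repository: the service name, build context, and port are assumptions (7860 is Gradio's default port), and the `deploy` stanza is the standard Compose syntax for exposing NVIDIA GPUs to a service.

```yaml
services:
  webui:
    build: .                 # assumes a Dockerfile at the repo root
    ports:
      - "7860:7860"          # Gradio's default listen port
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

With a file like this, `docker compose up --build` builds the image and starts the web UI in one step.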

@fearnworks fearnworks changed the base branch from main to v1 September 13, 2023 23:37
Contributor Author


I was getting a lot of warnings about `max_tokens` being set but not passed in explicitly. I saw an improvement in runtime performance once it was passed in.
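The fix described above can be sketched as follows. The helper and default names here are hypothetical, not from the repository; the commented `model.generate` call assumes the usual Hugging Face transformers API:

```python
# Hedged sketch: collect generation parameters in one place so that
# max_new_tokens is always forwarded to generate() explicitly, instead of
# being set somewhere but never passed in (the source of the warnings).

DEFAULT_GEN_KWARGS = {
    "max_new_tokens": 256,   # illustrative default
    "do_sample": True,
    "temperature": 0.7,
}

def build_gen_kwargs(**overrides):
    """Merge caller overrides over the defaults, dropping None values."""
    cleaned = {k: v for k, v in overrides.items() if v is not None}
    return {**DEFAULT_GEN_KWARGS, **cleaned}

# Usage (assuming a transformers-style model and tokenized inputs):
# output_ids = model.generate(**inputs, **build_gen_kwargs(max_new_tokens=512))
```

Funneling every call through one kwargs builder also makes the UI's parameter-selection values (max tokens, etc.) trivial to wire in as overrides.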

@pharaouk
Contributor

pharaouk commented Sep 14, 2023

I am facing this error when building the Docker container with the web UI:

```
Traceback (most recent call last):
  File "/hydra-moe/server.py", line 127, in <module>
    moe.initialize_model()
  File "/hydra-moe/moe.py", line 66, in initialize_model
    model, tokenizer = get_inference_model(args, checkpoint_dirs)
  File "/hydra-moe/moe_utils.py", line 52, in get_inference_model
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3260, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 725, in _load_state_dict_into_meta_model
    set_module_quantized_tensor_to_device(
  File "/usr/local/lib/python3.10/dist-packages/transformers/utils/bitsandbytes.py", line 99, in set_module_quantized_tensor_to_device
    new_value = bnb.nn.Params4bit(new_value, requires_grad=False, **kwargs).to(device)
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/nn/modules.py", line 178, in to
    return self.cuda(device)
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/nn/modules.py", line 156, in cuda
    w_4bit, quant_state = bnb.functional.quantize_4bit(w, blocksize=self.blocksize, compress_statistics=self.compress_statistics, quant_type=self.quant_type)
  File "/usr/local/lib/python3.10/dist-packages/bitsandbytes/functional.py", line 799, in quantize_4bit
    absmax = torch.zeros((blocks,), device=A.device)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
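For reference, "no kernel image is available for execution on the device" usually means the torch/bitsandbytes binaries inside the image were not compiled for the host GPU's compute capability. A minimal, hedged diagnostic sketch one could run on the failing machine; the helper names are ours, and only `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` (shown in the trailing comment) are real torch calls:

```python
# A cubin built for arch (major, minor) runs natively on devices with the
# same major version and an equal-or-higher minor. This check ignores PTX
# forward-JIT ("compute_XX" entries), so it is deliberately conservative.

def parse_arch(name):
    """Parse a torch arch string like 'sm_75' or 'compute_80' into (7, 5)."""
    digits = name.split("_")[1]
    return int(digits[:-1]), int(digits[-1])

def kernel_image_compatible(device_capability, compiled_archs):
    """True if any compiled arch can run natively on the device."""
    dev_major, dev_minor = device_capability
    return any(
        major == dev_major and minor <= dev_minor
        for major, minor in map(parse_arch, compiled_archs)
    )

# On the failing machine:
# import torch
# kernel_image_compatible(torch.cuda.get_device_capability(0),
#                         torch.cuda.get_arch_list())
```

If the check fails, the usual fixes are rebuilding the image against torch/bitsandbytes wheels that include the GPU's arch, or targeting the correct `TORCH_CUDA_ARCH_LIST` in the build.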

@pharaouk pharaouk merged commit 0994e5c into SkunkworksAI:v1 Sep 14, 2023