device_map="balanced" not supported on NPU when using the FLUX.1-dev model #3271

Open · 2 of 4 tasks

baymax591 opened this issue Dec 4, 2024 · 1 comment
baymax591 commented Dec 4, 2024

System Info

- `Accelerate` version: 1.0.1
- Platform: Linux-4.19.90-vhulk2211.3.0.h1543.eulerosv2r10.aarch64-aarch64-with-glibc2.28
- `accelerate` bash location: /root/miniconda3/envs/baymax/bin/accelerate
- Python version: 3.10.0
- Numpy version: 1.24.4
- PyTorch version (GPU?): 2.1.0 (False)
- PyTorch XPU available: False
- PyTorch NPU available: True
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 1511.10 GB
- CANN version: 8.0.RC3
- `Accelerate` default config:
        Not found

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

When I run the official Hugging Face example with device_map set to "balanced", I get the error below.

Code example:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "models/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    device_map="balanced",  # this is the setting that triggers the error
)

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")

Error:

Loading pipeline components...:  14%|██████                                    | 1/7 [00:00<00:02,  2.18it/s]
Traceback (most recent call last):
  File "/data/baymax/test_diffusers.py", line 4, in <module>
    pipe = FluxPipeline.from_pretrained("/data/baymax/models/FLUX.1-dev", 
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 896, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 704, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 886, in from_pretrained
    accelerate.load_checkpoint_and_dispatch(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/big_modeling.py", line 613, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1749, in load_checkpoint_in_model
    loaded_checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1471, in load_state_dict
    return safe_load_file(checkpoint_file, device=target_device)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/safetensors/torch.py", line 315, in load_file
    result[k] = f.get_tensor(k)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/torch/cuda/__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
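
For context, the last frames show where this appears to go wrong: accelerate's load_state_dict hands the device-map entry for each checkpoint file straight to safetensors, and safetensors interprets a bare integer index (or a "cuda:N" string) as a CUDA device, which triggers CUDA lazy init even on a build without CUDA support. A minimal sketch that reproduces the same assertion outside of diffusers ("weights.safetensors" is a placeholder for any local safetensors file):

from safetensors.torch import load_file

# An integer device index is interpreted as a CUDA device by safetensors,
# so this raises on a PyTorch build without CUDA support:
state_dict = load_file("weights.safetensors", device=0)
# AssertionError: Torch not compiled with CUDA enabled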

Expected behavior

The pipeline should load and run without errors.
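
In the meantime, the model does load on a single NPU if device_map is dropped and the whole pipeline is moved explicitly. This is a workaround sketch, not equivalent to balanced sharding across devices, and it assumes torch_npu is installed:

import torch
import torch_npu  # registers the "npu" device type with PyTorch
from diffusers import FluxPipeline

# Load on CPU first, then move the whole pipeline to one NPU.
pipe = FluxPipeline.from_pretrained(
    "models/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("npu")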

@baymax591 (Author) commented:

The PR created by @statelesshz resolves my issue; I hope it can be pushed forward and merged.
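
For anyone who wants the idea behind the fix without waiting for the merge: the device entry needs to be normalized to whichever accelerator backend is actually present before it reaches safetensors, rather than letting an integer index default to CUDA. A rough sketch of that idea (resolve_device is a hypothetical helper, not the actual patch):

import torch

def resolve_device(target_device):
    # Hypothetical helper: map a bare integer index onto the backend
    # that is actually available instead of assuming CUDA.
    if isinstance(target_device, int):
        if torch.cuda.is_available():
            return f"cuda:{target_device}"
        if hasattr(torch, "npu") and torch.npu.is_available():
            return f"npu:{target_device}"
        return "cpu"
    return target_device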
