device_map="balanced" not supported on NPU when using the FLUX.1-dev model #3271

Open · 2 of 4 tasks

baymax591 opened this issue Dec 4, 2024 · 1 comment
baymax591 commented Dec 4, 2024

System Info

- `Accelerate` version: 1.0.1
- Platform: Linux-4.19.90-vhulk2211.3.0.h1543.eulerosv2r10.aarch64-aarch64-with-glibc2.28
- `accelerate` bash location: /root/miniconda3/envs/baymax/bin/accelerate
- Python version: 3.10.0
- Numpy version: 1.24.4
- PyTorch version (GPU?): 2.1.0 (False)
- PyTorch XPU available: False
- PyTorch NPU available: True
- PyTorch MLU available: False
- PyTorch MUSA available: False
- System RAM: 1511.10 GB
- CANN version: 8.0.RC3
- `Accelerate` default config:
        Not found

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

When I run the official Hugging Face example with device_map set to "balanced", I get the error below.

Code example:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "models/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    device_map="balanced",  # this is the setting that triggers the error
)

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")

Error:

Loading pipeline components...:  14%|██████                                    | 1/7 [00:00<00:02,  2.18it/s]
Traceback (most recent call last):
  File "/data/baymax/test_diffusers.py", line 4, in <module>
    pipe = FluxPipeline.from_pretrained("/data/baymax/models/FLUX.1-dev", 
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 896, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 704, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 886, in from_pretrained
    accelerate.load_checkpoint_and_dispatch(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/big_modeling.py", line 613, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1749, in load_checkpoint_in_model
    loaded_checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1471, in load_state_dict
    return safe_load_file(checkpoint_file, device=target_device)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/safetensors/torch.py", line 315, in load_file
    result[k] = f.get_tensor(k)
  File "/root/miniconda3/envs/baymax/lib/python3.10/site-packages/torch/cuda/__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
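
For context, the last frames show where this appears to go wrong: accelerate's load_state_dict hands the device-map entry for each checkpoint file straight to safetensors, and safetensors interprets a bare integer index (or a "cuda:N" string) as a CUDA device, which triggers CUDA lazy init even on a build without CUDA support. A minimal sketch that reproduces the same assertion outside of diffusers ("weights.safetensors" is a placeholder for any local safetensors file):

from safetensors.torch import load_file

# An integer device index is interpreted as a CUDA device by safetensors,
# so this raises on a PyTorch build without CUDA support:
state_dict = load_file("weights.safetensors", device=0)
# AssertionError: Torch not compiled with CUDA enabled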

Expected behavior

The pipeline should load and run without errors.
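
In the meantime, the model does load on a single NPU if device_map is dropped and the whole pipeline is moved explicitly. This is a workaround sketch, not equivalent to balanced sharding across devices, and it assumes torch_npu is installed:

import torch
import torch_npu  # registers the "npu" device type with PyTorch
from diffusers import FluxPipeline

# Load on CPU first, then move the whole pipeline to one NPU.
pipe = FluxPipeline.from_pretrained(
    "models/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("npu")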

@baymax591 (Author) commented:

The PR created by @statelesshz resolves my issue; I hope it can be pushed forward and merged.
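
For anyone who wants the idea behind the fix without waiting for the merge: the device entry needs to be normalized to whichever accelerator backend is actually present before it reaches safetensors, rather than letting an integer index default to CUDA. A rough sketch of that idea (resolve_device is a hypothetical helper, not the actual patch):

import torch

def resolve_device(target_device):
    # Hypothetical helper: map a bare integer index onto the backend
    # that is actually available instead of assuming CUDA.
    if isinstance(target_device, int):
        if torch.cuda.is_available():
            return f"cuda:{target_device}"
        if hasattr(torch, "npu") and torch.npu.is_available():
            return f"npu:{target_device}"
        return "cpu"
    return target_device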
