Help: all the models are downloaded, so how do I run inference on an NPU? I followed the README for a long time and it reports errors. #551

Open
sguo112 opened this issue Dec 4, 2024 · 2 comments


sguo112 commented Dec 4, 2024

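I am using a 910B.

I need help: running bash scripts/text_condition/npu/sample_t2v_v1_3.sh reports the error shown in the screenshot above.

I modified the script configuration as follows: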
--model_path "//Open-Sora-Plan/Open-Sora-Plan-v1.3.0/any93x640x640"
--ae WFVAEModel_D8_4x8x8 --ae_path "Open-Sora-Plan/Open-Sora-Plan-v1.3.0/vae"
--text_encoder_name_1 "/Open-Sora-Plan/mt5" --rescale_betas_zero_snr

yunyangge (Collaborator) commented

This is a warning emitted by the NPU, which is normal. Are you seeing an actual error?

sguo112 (Author) commented Dec 4, 2024

> This is a warning emitted by the NPU, which is normal. Are you seeing an actual error?

Yes. The following error is reported:

ImportError: cannot import name 'cached_download' from 'huggingface_hub' (/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/huggingface_hub/__init__.py)
[ERROR] 2024-12-04-19:08:02 (PID:3663810, Device:-1, RankID:-1) ERR99999 UNKNOWN applicaiton exception
/root/anaconda3/envs/deepspeed/lib/python3.10/tempfile.py:869: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpga4jvq_c'>
  _warnings.warn(warn_message, ResourceWarning)
E1204 19:08:04.672000 281473368985632 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 3663810) of binary: /root/anaconda3/envs/deepspeed/bin/python
Traceback (most recent call last):
  File "/root/anaconda3/envs/deepspeed/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
opensora.sample.sample FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-12-04_19:08:04
  host      : devserver-314b-0
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3663810)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
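
One way to narrow down an `ImportError` like the one above is to check which `huggingface_hub` version is installed in the environment and whether it still exposes `cached_download`. The sketch below is only a diagnostic; `describe_symbol` is a hypothetical helper written for this illustration, not part of Open-Sora-Plan or `huggingface_hub`.

```python
import importlib
from importlib import metadata


def describe_symbol(module_name: str, symbol: str) -> str:
    """Report whether `symbol` can be found on `module_name` (hypothetical helper)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return f"{module_name} is not installed"
    try:
        # Distribution metadata may be absent (e.g. for stdlib modules).
        version = metadata.version(module_name)
    except metadata.PackageNotFoundError:
        version = "unknown"
    status = "available" if hasattr(module, symbol) else "missing"
    return f"{module_name} {version}: {symbol} is {status}"


if __name__ == "__main__":
    # The module and symbol names are taken from the ImportError in the log.
    print(describe_symbol("huggingface_hub", "cached_download"))
```

If the symbol is reported as missing, the installed `huggingface_hub` is probably newer than what one of the other installed packages expects, so the versions would need to be aligned. Which versions the project requires is not stated in this thread.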

In addition, how do I install the complete set of packages required for Open-Sora-Plan 1.3.0? When I run the inference script in another environment, it always reports that something is missing.
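
As a rough way to see what an environment lacks before launching the script, the importable modules can be checked up front. This is a minimal sketch: `find_missing` is a hypothetical helper, and the module list is an assumption for illustration (`torch` and `huggingface_hub` appear in the traceback above; `diffusers` and `transformers` are guesses), not the project's actual requirements.

```python
from importlib import util


def find_missing(modules):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in modules if util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list for illustration only; replace it with the project's real requirements.
    candidates = ["torch", "huggingface_hub", "diffusers", "transformers"]
    missing = find_missing(candidates)
    print("missing modules:", missing if missing else "none")
```

This only confirms that the modules can be located. It does not check that their versions are compatible with each other.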
