I tried to run h2ogpt with this command:
python generate.py --base_model=meta-llama/Meta-Llama-3.1-8B-Instruct --use_auth_token=...
and it triggered the following errors:
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
thread exception: Traceback (most recent call last):
  File "/Users/.../h2ogpt/src/utils.py", line 524, in run
    self._return = self._target(*self._args, **self._kwargs)
  File "/Users/.../h2ogpt/src/gen.py", line 4288, in generate_with_exceptions
    func(*args, **kwargs)
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/generation/utils.py", line 1727, in generate
    model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/generation/utils.py", line 493, in _prepare_attention_mask_for_generation
    raise ValueError(
ValueError: Can't infer missing attention mask on `mps` device. Please provide an `attention_mask` or use a different device.
It worked fine when I tried to run older models, such as Llama 2.
Do you know what could be the source of this issue?
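For reference, the ValueError spells out its own workaround: provide an explicit `attention_mask` (and a `pad_token_id`) when calling `generate()`. A minimal plain-transformers sketch along those lines, outside of h2ogpt and with an illustrative prompt, would look roughly like this:

```python
# Rough sketch of what the error message asks for, not h2ogpt's own code path:
# tokenize with return_tensors so an attention_mask is produced, then pass it
# (plus an explicit pad_token_id) to generate() so nothing has to be inferred on mps.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"  # same gated model; needs a valid HF token
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(device)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(device)  # illustrative prompt

output = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # explicit mask avoids the mps ValueError
    pad_token_id=tokenizer.eos_token_id,      # Llama 3 models ship without a dedicated pad token
    max_new_tokens=64,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```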
It worked fine with GGUF; I had to install a different package version than the recommended one, though. I will propose a pull request to be able to start h2ogpt with Llama 3.1.
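For context, Llama 3.1 support was added to the Hugging Face transformers library in version 4.43, so the "different package version" above presumably amounts to an upgrade along the lines of `pip install "transformers>=4.43.0"`.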