Speculative Decoding: TypeError: list indices must be integers or slices, not tuple (Apple M1 MacOS Sonoma 14.6.1) #153

Open
solitaryangler opened this issue Oct 1, 2024 · 0 comments

Hi,

I am trying to run speculative decoding using the example given here: huggingface.co/distil-whisper/distil-large-v2#speculative-decoding. I'm using the following code:

from transformers import pipeline, AutoModelForCausalLM, AutoModelForSpeechSeq2Seq, AutoProcessor
import torch
from datasets import load_dataset


device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

assistant_model_id = "distil-whisper/distil-large-v2"

assistant_model = AutoModelForCausalLM.from_pretrained(
    assistant_model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
assistant_model.to(device)

model_id = "openai/whisper-large-v2"

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    generate_kwargs={"assistant_model": assistant_model},
    torch_dtype=torch_dtype,
    device=device,
)

dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
sample = dataset[0]["audio"]

result = pipe(sample, return_timestamps=True)
print(result["text"])

My environment is Python 3.10.13 with the following packages (non-exhaustive list):

torch==2.6.0.dev20240925
torchaudio==2.5.0.dev20240925
torchvision==0.20.0.dev20240925
ffmpeg-python==0.2.0
future==1.0.0
librosa==0.10.2.post1
transformers==4.45.0
accelerate==0.34.2
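
For anyone trying to reproduce this, a version list like the one above can be collected with a small script. This is only a sketch: the `PACKAGES` list and the `installed_versions` helper are names made up for illustration, and it relies on nothing beyond the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical selection of distributions relevant to this report.
PACKAGES = ["torch", "torchaudio", "torchvision", "transformers", "accelerate", "librosa"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in the active environment.
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```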

I am running everything on an Apple M1 chip with macOS Sonoma 14.6.1.

I am getting the following error:

miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.
  warnings.warn(
Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.
Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.
From v4.47 onwards, when a model cache is to be returned, `generate` will return a `Cache` instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set `return_legacy_cache=True`.
Traceback (most recent call last):
  File "test_specdec.py", line 41, in <module>
    result = pipe(sample, return_timestamps=True)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 284, in __call__
    return super().__call__(inputs, **kwargs)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1260, in __call__
    return next(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 124, in __next__
    item = next(self.iterator)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/pt_utils.py", line 269, in __next__
    processed = self.infer(next(self.iterator), **self.params)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1175, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/pipelines/automatic_speech_recognition.py", line 512, in _forward
    tokens = self.model.generate(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 671, in generate
    ) = self.generate_with_fallback(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py", line 834, in generate_with_fallback
    seek_outputs = super().generate(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/generation/utils.py", line 1992, in generate
    result = self._assisted_decoding(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/generation/utils.py", line 4015, in _assisted_decoding
    candidate_input_ids, candidate_logits = candidate_generator.get_candidates(input_ids)
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/generation/candidate_generator.py", line 207, in get_candidates
    self.assistant_kwargs["past_key_values"] = _crop_past_key_values(
  File "miniconda3/envs/py3.10.13/lib/python3.10/site-packages/transformers/generation/candidate_generator.py", line 404, in _crop_past_key_values
    past_key_values[idx][0][:, :, :max_length, :],
TypeError: list indices must be integers or slices, not tuple
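
Reading the last frame, the message itself is what plain Python raises when a `list` is indexed with a tuple such as `[:, :, :max_length, :]`. That suggests the object being cropped in `_crop_past_key_values` was a list rather than a tensor at that point, though this is an inference from the error text alone and not something I have confirmed in the library. The sketch below reproduces only the message, using a hypothetical nested list (`key_states`) as a stand-in for the cached value:

```python
max_length = 2

# Hypothetical stand-in for a cached key: a nested Python list, not a tensor.
key_states = [[[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]]]

try:
    # Multi-axis slicing passes a tuple of slices to __getitem__,
    # which built-in lists do not accept.
    key_states[:, :, :max_length, :]
except TypeError as exc:
    message = str(exc)

print(message)
```

A tensor accepts the same multi-axis index, so the failure depends only on the type of the object being sliced.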

Kindly help!
Thanks.

solitaryangler changed the title from "Speculative Decoding: TypeError: list indices must be integers or slices, not tuple" to "Speculative Decoding: TypeError: list indices must be integers or slices, not tuple (Apple M1 MacOS Sonoma 14.6.1)" on Oct 1, 2024