
Cookbook "Receipt Data Extraction with VLMs" is not reproducible #1282

Open
hoesler opened this issue Nov 25, 2024 · 4 comments · May be fixed by #1306
hoesler commented Nov 25, 2024

Describe the issue as clearly as possible:

I tried to run the cookbook "Receipt Data Extraction with VLMs" using "Qwen/Qwen2-VL-2B-Instruct" but ran into the error described below. In short, the model seems to output just a ! character, which leads to a JSON parsing error.

A colleague of mine was able to reproduce the error on his machine. Both our machines are MacBooks.

What is also interesting: downgrading to transformers 4.45 leads to the same error, but with the output ``` instead.
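The failure mode can be seen without any model in the loop: the raw output is a single "!", which is not valid JSON, so Pydantic's `model_validate_json` fails before any schema validation happens. A minimal stdlib check (not part of the cookbook) reproduces the same parser message:

```python
import json

# The model's raw output in the failing run is just "!".
raw_output = "!"
try:
    json.loads(raw_output)
    parse_error = None
except json.JSONDecodeError as e:
    parse_error = f"{e.msg} at line {e.lineno} column {e.colno}"

print(parse_error)  # Expecting value at line 1 column 1
```

This matches the `Invalid JSON: expected value at line 1 column 1` in the traceback below.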

Steps/code to reproduce the bug:

# As described in the cookbook

# LLM stuff
import outlines
import torch
from transformers import AutoProcessor, AutoModelForCausalLM
from pydantic import BaseModel, Field
from typing import Literal, Optional, List

# Image stuff
from PIL import Image
import requests

# Rich for pretty printing
from rich import print

import outlines.caching as cache

cache.disable_cache()

# To use Qwen-2-VL:
from transformers import Qwen2VLForConditionalGeneration
model_name = "Qwen/Qwen2-VL-2B-Instruct"
model_class = Qwen2VLForConditionalGeneration

model = outlines.models.transformers_vision(
    model_name,
    model_class=model_class,
    model_kwargs={
        "device_map": "auto",
        "torch_dtype": torch.bfloat16,
        "trust_remote_code": True
    },
    processor_kwargs={
        "device": "mps", # set to "cpu" if you don't have a GPU
    },
)

def load_and_resize_image(image_path, max_size=1024):
    """
    Load and resize an image while maintaining aspect ratio

    Args:
        image_path: Path to the image file
        max_size: Maximum dimension (width or height) of the output image

    Returns:
        PIL Image: Resized image
    """
    image = Image.open(image_path)

    # Get current dimensions
    width, height = image.size

    # Calculate scaling factor
    scale = min(max_size / width, max_size / height)

    # Only resize if image is larger than max_size
    if scale < 1:
        new_width = int(width * scale)
        new_height = int(height * scale)
        image = image.resize((new_width, new_height), Image.Resampling.LANCZOS)

    return image

# Path to the image
image_path = "https://github.com/dottxt-ai/outlines/raw/main/docs/cookbook/images/trader-joes-receipt.jpg"

# Download the image
response = requests.get(image_path)
with open("receipt.png", "wb") as f:
    f.write(response.content)

# Load + resize the image
image = load_and_resize_image("receipt.png")

class Item(BaseModel):
    name: str
    quantity: Optional[int]
    price_per_unit: Optional[float]
    total_price: Optional[float]

class ReceiptSummary(BaseModel):
    store_name: str
    store_address: str
    store_number: Optional[int]
    items: List[Item]
    tax: Optional[float]
    total: Optional[float]
    # Date is in the format YYYY-MM-DD. We can apply a regex pattern to ensure it's formatted correctly.
    date: Optional[str] = Field(pattern=r'\d{4}-\d{2}-\d{2}', description="Date in the format YYYY-MM-DD")
    payment_method: Literal["cash", "credit", "debit", "check", "other"]

# Set up the content you want to send to the model
messages = [
    {
        "role": "user",
        "content": [
            {
                # The image is provided as a PIL Image object
                "type": "image",
                "image": image,
            },
            {
                "type": "text",
                "text": f"""You are an expert at extracting information from receipts.
                Please extract the information from the receipt. Be as detailed as possible --
                missing or misreporting information is a crime.

                Return the information in the following JSON schema:
                {ReceiptSummary.model_json_schema()}
            """},
        ],
    }
]

# Convert the messages to the final prompt
processor = AutoProcessor.from_pretrained(model_name)
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Prepare a function to process receipts
receipt_summary_generator = outlines.generate.json(
    model,
    ReceiptSummary,

    # Greedy sampling is a good idea for numeric
    # data extraction -- no randomness.
    sampler=outlines.samplers.greedy()
)

# Generate the receipt summary
result = receipt_summary_generator(prompt, [image])
print(result)

Expected result:

# As described in the cookbook

Error message:

The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:04<00:00,  2.07s/it]
.venv/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:590: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
.venv/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:590: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Traceback (most recent call last):
  File "./receipt-digitization.py", line 132, in <module>
    result = receipt_summary_generator(prompt, [image])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "outlines/generate/api.py", line 565, in __call__
    return self._format(completions)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "outlines/generate/api.py", line 488, in _format
    return self.format_sequence(sequences)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "outlines/generate/json.py", line 50, in <lambda>
    generator.format_sequence = lambda x: schema_object.model_validate_json(x)
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/pydantic/main.py", line 651, in model_validate_json
    return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for ReceiptSummary
  Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='!', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/json_invalid

Outlines/Python version information:

Version information

```
0.1.5.dev0+gc406da8.d20241121
Python 3.11.10 (main, Oct 16 2024, 08:56:36) [Clang 18.1.8 ]
accelerate==1.1.1
aiohappyeyeballs==2.4.3
aiohttp==3.11.6
aiosignal==1.3.1
airportsdata==20241001
annotated-types==0.7.0
anyio==4.6.2.post1
attrs==24.2.0
beartype==0.15.0
certifi==2024.8.30
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.4.0
cloudpickle==3.1.0
coverage==7.6.7
cramjam==2.9.0
datasets==3.1.0
diff_cover==9.2.0
dill==0.3.8
diskcache==5.6.3
distlib==0.3.9
distro==1.9.0
exllamav2==0.2.4
fastparquet==2024.11.0
filelock==3.16.1
frozenlist==1.5.0
fsspec==2024.9.0
h11==0.14.0
httpcore==1.0.7
httpx==0.27.2
huggingface-hub==0.26.2
identify==2.6.2
idna==3.10
iniconfig==2.0.0
interegular==0.3.3
Jinja2==3.1.4
jiter==0.7.1
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
lark==1.2.2
llama_cpp_python==0.3.2
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
mlx==0.20.0
mlx-lm==0.19.3
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.11.1.1
nodeenv==1.9.1
numpy==1.26.4
openai==1.55.0
-e git+https://github.com/dottxt-ai/outlines.git@c406da8#egg=outlines
outlines_core==0.1.17
packaging==24.2
pandas==2.2.3
pillow==11.0.0
platformdirs==4.3.6
pluggy==1.5.0
pre_commit==4.0.1
propcache==0.2.0
protobuf==5.28.3
psutil==6.1.0
py-cpuinfo==9.0.0
pyarrow==18.0.0
pycountry==24.6.1
pydantic==2.10.0
pydantic_core==2.27.0
Pygments==2.18.0
pytest==8.3.3
pytest-benchmark==5.1.0
pytest-cov==6.0.0
pytest-mock==3.14.0
python-dateutil==2.9.0.post0
pytz==2024.2
PyYAML==6.0.2
referencing==0.35.1
regex==2024.11.6
requests==2.32.3
responses==0.25.3
rich==13.9.4
rpds-py==0.21.0
safetensors==0.4.5
sentencepiece==0.2.0
six==1.16.0
sniffio==1.3.1
sympy==1.13.1
tokenizers==0.20.3
torch==2.5.1
tqdm==4.67.0
transformers==4.46.3
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
virtualenv==20.27.1
websockets==14.1
xxhash==3.5.0
yarl==1.18.0
```

Context for the issue:

No response

@hoesler hoesler added the bug label Nov 25, 2024
cpfiffer (Contributor) commented

I'm on Linux and cannot replicate; it seems to be a Mac-specific issue related to MPS.


hoesler commented Nov 26, 2024

Thanks for the reply. It is definitely not MPS-related: setting the device to CPU doesn't change the result.

Any help is still appreciated.

I tried to debug it a bit. The problem is somewhere in the logits_processor. Running this code block here with the logits_processor set to None gives me a markdown-formatted (wrapped in three backticks) JSON response.

My gut feeling was that the initial token was not selected properly, after which the FSM reaches an end state. Then I found that there was recently a fairly large change introducing the Index class. So I gave it a try with outlines_core==0.1.14, and yes, here is what I get now. Still an error, but one step further:

pydantic_core._pydantic_core.ValidationError: 1 validation error for ReceiptSummary
  Invalid JSON: trailing characters at line 1 column 1326 [type=json_invalid, input_value='{"store_name": "Trader J...!!!!!!!!!!!!!!!!!!!!!!!', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/json_invalid

This is the full output. It seems the end-state transition isn't behaving as expected now.

```
{"store_name": "Trader Joe's", "store_address": "401 Bay Street, San Francisco, CA 94133", "store_number": null, "items": [{"name": "BANANA EACH", "quantity": 7, "price_per_unit": 0.23, "total_price": 1.61}, {"name": "BAREBELLS CHOCOLATE DOUG", "quantity": 2, "price_per_unit": 2.29, "total_price": 4.58}, {"name": "BAREBELLS CREAMY CRISP", "quantity": 2, "price_per_unit": 2.29, "total_price": 4.58}, {"name": "BAREBELLS CHOCOLATE DOUG", "quantity": 2, "price_per_unit": 2.29, "total_price": 4.58}, {"name": "BAREBELLS CARAMEL CASHEW", "quantity": 2, "price_per_unit": 2.29, "total_price": 4.58}, {"name": "BAREBELLS CREAMY CRISP", "quantity": 2, "price_per_unit": 2.29, "total_price": 4.58}, {"name": "SPINDRIFT ORANGE MANGO 8", "quantity": 8, "price_per_unit": 7.49, "total_price": 6.79}, {"name": "MILK ORGANIC GALLON WHOL", "quantity": 1, "price_per_unit": 6.79, "total_price": 6.79}, {"name": "CLASSIC GREEK SALAD", "quantity": 1, "price_per_unit": 3.49, "total_price": 3.49}, {"name": "COBB SALAD", "quantity": 1, "price_per_unit": 5.99, "total_price": 5.99}, {"name": "PEPPER BELL RED XL EACH", "quantity": 1, "price_per_unit": 1.29, "total_price": 1.29}, {"name": "BAG FEE.", "quantity": 1, "price_per_unit": 0.25, "total_price": 0.25}, {"name": "BAG FEE.", "quantity": 1, "price_per_unit": 0.25, "total_price": 0.25}], "tax": 7.89, "total": 41.98, "date": "2023-04-15", "payment_method": "debit"}!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[… several thousand more "!" characters, with an occasional "&", truncated …]
```
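For debugging outputs like this, the valid JSON prefix can be recovered with the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value and reports where it stopped (a general stdlib technique, not part of outlines or the cookbook):

```python
import json

def split_json_prefix(text: str):
    """Parse the leading JSON value and return (value, trailing_garbage)."""
    value, end = json.JSONDecoder().raw_decode(text)
    return value, text[end:]

# Toy stand-in for the truncated model output above
value, garbage = split_json_prefix('{"store_name": "Trader Joe\'s"}!!!!')
print(value)    # {'store_name': "Trader Joe's"}
print(garbage)  # !!!!
```

This separates the well-formed object from the trailing characters, which helps confirm that only the end-state handling, not the constrained generation itself, is going wrong.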

hoesler linked a pull request Dec 2, 2024 that will close this issue
hoesler commented Dec 2, 2024

@cpfiffer You were right with your suspicion. My mistake was that I missed the device="mps" argument to transformers_vision() and only changed processor_kwargs["device"] in my test. See the linked PR for a suggested fix.
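Based on this description, the corrected model construction from the repro script would presumably look like the following sketch; the exact keyword arguments accepted by transformers_vision may differ between outlines versions, so treat this as illustrative rather than definitive:

```python
import torch
import outlines
from transformers import Qwen2VLForConditionalGeneration

model = outlines.models.transformers_vision(
    "Qwen/Qwen2-VL-2B-Instruct",
    model_class=Qwen2VLForConditionalGeneration,
    device="mps",  # the argument missing in the original repro; use "cpu" without a GPU
    model_kwargs={
        "device_map": "auto",
        "torch_dtype": torch.bfloat16,
    },
    processor_kwargs={
        "device": "mps",  # must match the model's device
    },
)
```

The key point is that the model and the processor end up on the same device; with only processor_kwargs["device"] set, the logits processor operated on mismatched tensors and produced the garbage tokens seen above.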


cpfiffer commented Dec 2, 2024

Ah excellent! Appreciate you digging into it.
