
OVModelForVisionCausalLM #883

Merged · 21 commits · Oct 7, 2024
Conversation

eaidova (Collaborator) commented Aug 29, 2024

What does this PR do?

Enables conversion and inference for multimodal LLMs such as LLaVA, LLaVA-Next, Falcon-VL, Pixtral, and InternVL.
Example of usage:

from PIL import Image
import requests
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor

model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
model = OVModelForVisualCausalLM.from_pretrained(model_id)
image_file = "http://images.cocodataset.org/val2017/000000039769.jpg"
processor = AutoProcessor.from_pretrained(model_id)

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What are these?"},
            {"type": "image"},
        ],
    },
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
raw_image = Image.open(requests.get(image_file, stream=True).raw)
inputs = processor(images=raw_image, text=prompt, return_tensors="pt")

output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(processor.decode(output[0], skip_special_tokens=True))

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@eaidova eaidova changed the title Ea/llava model OVModelForVisionCausalLM Aug 29, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@eaidova eaidova marked this pull request as ready for review September 17, 2024 19:29
eaidova (Collaborator, Author) commented Sep 18, 2024

@echarlaix could you please take a look? Thanks!

P.S. I'm still working on extending model coverage, but I think it makes sense to start reviewing the general API now

@eaidova eaidova force-pushed the ea/llava_model branch 6 times, most recently from c35d1f2 to e0da998 Compare September 26, 2024 12:32
**kwargs,
):
"""
Export a vanilla Transformers model into an ONNX model using `transformers.onnx.export_onnx`.
Collaborator:

Do we really export to ONNX? The docstring should be revised accordingly.

eaidova (Collaborator, Author):

I accidentally copied that from https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/openvino/modeling_base.py#L534
I see the same docstring in other OpenVINO model classes (seq2seq, stable diffusion, etc.). @echarlaix, maybe it is time to revise it in all models too?

echarlaix (Collaborator) left a comment:

Looking good, thanks a lot @eaidova

optimum/exporters/openvino/stateful.py (outdated, resolved)
Comment on lines +808 to +812
if model_type == "internvl-chat" and preprocessors is not None:
    model.config.img_context_token_id = preprocessors[0].convert_tokens_to_ids("<IMG_CONTEXT>")

if hasattr(model, "image_newline"):
    model.config.image_newline = model.image_newline.tolist()
Collaborator:

Why is this needed?

eaidova (Collaborator, Author):

This is a trainable parameter filled with random values during model weight initialization. We cannot capture it during model export because it is not part of any submodel's inference forward pass, so to transfer it from PyTorch to OpenVINO I store it in the config. If you have a better suggestion, I can try to implement it.
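The mechanism discussed here can be sketched in isolation. The values below are hypothetical stand-ins for the real tokenizer vocabulary and the `image_newline` weights; the point is only that values which are not part of any exported submodel's forward pass can survive the JSON round trip through `config.json`:

```python
import json

# Hypothetical stand-ins: a tiny "vocabulary" in place of the real tokenizer,
# and a short list in place of model.image_newline.tolist().
vocab = {"<IMG_CONTEXT>": 92546}
image_newline = [0.12, -0.05, 0.33]

# At export time, values that cannot be captured in the exported graphs
# are stashed in the model config instead.
config = {
    "model_type": "internvl-chat",
    # tokenizer.convert_tokens_to_ids("<IMG_CONTEXT>") in the real code
    "img_context_token_id": vocab["<IMG_CONTEXT>"],
    # model.image_newline.tolist() in the real code
    "image_newline": image_newline,
}

# config.json serialization round trip: both values come back intact and
# can be turned into a tensor again at inference time.
restored = json.loads(json.dumps(config))
assert restored["img_context_token_id"] == 92546
assert restored["image_newline"] == image_newline
```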

optimum/exporters/openvino/convert.py (resolved)
optimum/exporters/openvino/model_patcher.py (resolved)
optimum/intel/utils/modeling_utils.py (outdated, resolved)
optimum/exporters/openvino/model_configs.py (outdated, resolved)
optimum/exporters/openvino/model_configs.py (outdated, resolved)
optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
@eaidova eaidova requested a review from echarlaix October 2, 2024 07:32
echarlaix (Collaborator) left a comment:

Huge work, thanks @eaidova. Left a couple of minor comments.

optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
optimum/intel/openvino/modeling_visual_language.py (outdated, resolved)
@eaidova eaidova force-pushed the ea/llava_model branch 2 times, most recently from 1164b36 to a97f962 Compare October 3, 2024 05:43
echarlaix (Collaborator) commented:

> P.S. I'm still working on extending model coverage, but I think it makes sense to start reviewing the general API now

Very nice, let's merge this PR and we can always extend support in follow-up PRs. Does that work for you, @eaidova?

eaidova (Collaborator, Author) commented Oct 4, 2024:

> Very nice, let's merge this PR and we can always extend support in follow-up PRs. Does that work for you, @eaidova?

@echarlaix, yes, I agree. Could you please merge?

@echarlaix echarlaix merged commit 749a1d6 into huggingface:main Oct 7, 2024
17 checks passed