Introduce outlines.models.transformers_multimodal #33

Open · lapp0 wants to merge 1 commit into main from multimodal-models

Conversation

lapp0 (Owner) commented Jun 19, 2024

Docs: https://github.com/lapp0/outlines/blob/multimodal-models/docs/reference/models/multimodal.md

Done:

  • Core implementation and all components necessary for structured generation with image and video input

Todo:

  • More unit tests for MultiModalSequenceGeneratorAdapter
  • Find a tiny vision model for test_generate.py; the current model is too expensive to be part of the test suite
  • Test models and architectures other than llava-hf/llava-v1.6-mistral-7b-hf
  • Fix batch request handling; a prompt can contain multiple images

Improve docs:

  • Reference transformers.md for capabilities
  • Show direct image loading and local-file image loading (see the usage sketch after this list)
  • More detailed introduction section
  • How to add multiple images
  • Caveat on including the <image> token in the prompt
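
Roughly, the intended usage per the linked docs would look like the sketch below. This is a hedged illustration, not confirmed code from this PR: the constructor arguments, the generate.text call, the prompt format, and the URL are assumptions or placeholders; only the transformers_multimodal name, the llava-hf/llava-v1.6-mistral-7b-hf model, and the (prompts, media) call signature come from this thread.

from io import BytesIO

import requests
from PIL import Image

import outlines

# Assumed constructor: this PR introduces outlines.models.transformers_multimodal
# (proposed rename below: transformers_vision).
model = outlines.models.transformers_multimodal(
    "llava-hf/llava-v1.6-mistral-7b-hf",
)

# Direct image loading from a URL (placeholder); a local file would use Image.open(path).
response = requests.get("https://example.com/astronaut.png")
response.raise_for_status()
image = Image.open(BytesIO(response.content))

# Caveat: the prompt must contain the <image> token expected by the processor.
generator = outlines.generate.text(model)
description = generator("Describe this image: <image>", [image])
print(description)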

def __call__(  # type: ignore
    self,
    prompts: Union[str, List[str]],
    media: Union[str, Any],
lapp0 (Owner Author):
change Any to PIL.Image
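
Applied to the hunk above, that change might look like the following minimal sketch; the enclosing class and the trailing **kwargs are hypothetical stand-ins, since the thread only asks to replace Any with PIL.Image.

from typing import Any, List, Union

from PIL import Image


class _Adapter:  # hypothetical stand-in for the class that owns __call__
    def __call__(  # type: ignore
        self,
        prompts: Union[str, List[str]],
        media: Union[str, Image.Image],  # was: Union[str, Any]
        **kwargs: Any,  # remaining parameters elided
    ) -> Any:
        ...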

lapp0 force-pushed the multimodal-models branch 2 times, most recently from 11c0db1 to a6c229e on June 19, 2024 20:02
rlouf commented Jun 20, 2024

Let’s call it transformers_vision, which I think is a more specific name.

lapp0 force-pushed the fix-mamba-integration branch from f17913b to 48b6f8f on July 15, 2024 09:05
rlouf force-pushed the fix-mamba-integration branch from 48b6f8f to bf3694c on July 15, 2024 13:53
lapp0 force-pushed the fix-mamba-integration branch 8 times, most recently from 75dc370 to acb0759 on July 16, 2024 00:08
lapp0 force-pushed the multimodal-models branch from a6c229e to a40ec2f on July 19, 2024 10:11
lapp0 changed the base branch from fix-mamba-integration to main on July 19, 2024 10:13
lapp0 force-pushed the multimodal-models branch 4 times, most recently from bdf4097 to 653fe26 on July 19, 2024 11:56
        return prompts, media

    @classmethod
    def _load_media(cls, media):

I'm not sure this should be part of the library?

lapp0 (Owner Author):
Probably not, it's a convenience, but unnecessary. Removing.
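
With the helper gone, callers load media themselves before passing it to the model; a minimal sketch of doing that with PIL (the filename and URL are placeholders):

from io import BytesIO

import requests
from PIL import Image

# From a local file (placeholder filename):
local_image = Image.open("photo.jpg")

# From a URL (placeholder URL):
response = requests.get("https://example.com/photo.jpg")
response.raise_for_status()
remote_image = Image.open(BytesIO(response.content))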

from outlines.processors import OutlinesLogitsProcessor


class TransformersMultiModal(Transformers):
Suggested change:
- class TransformersMultiModal(Transformers):
+ class TransformersVision(Transformers):

        yield self._decode_generation(output_group_ids)


def transformers_multimodal(
Suggested change:
- def transformers_multimodal(
+ def transformers_vision(

lapp0 force-pushed the multimodal-models branch 9 times, most recently from 9cc0775 to 56b5918 on July 19, 2024 14:22
lapp0 force-pushed the multimodal-models branch 17 times, most recently from 6adb73b to 9ae6e70 on July 19, 2024 15:50
lapp0 force-pushed the multimodal-models branch from 9ae6e70 to 43424c6 on July 19, 2024 16:08