Add colpali #1

Closed
wants to merge 25 commits into from
e377f97
feat: run `add-new-model-like`
tonywu71 Sep 18, 2024
e6b9d0c
feat: add paligemma code with "copied from"
tonywu71 Sep 19, 2024
75041b5
feat: add ColPaliProcessor
tonywu71 Sep 19, 2024
d942533
feat: add ColPaliModel
tonywu71 Sep 19, 2024
289c808
feat: add ColPaliConfig
tonywu71 Sep 19, 2024
07b5a98
feat: rename `ColPaliForConditionalGeneration` to `ColPaliModel`
tonywu71 Sep 19, 2024
9a06c08
fixup modeling colpali
tonywu71 Sep 19, 2024
d4fd2d3
fix: fix root import shortcuts
tonywu71 Sep 19, 2024
34e443b
fix: fix `modeling_auto` dict
tonywu71 Sep 19, 2024
5e52bb3
feat: comment out ColPali test file
tonywu71 Sep 19, 2024
8ba5649
fix: fix typos from `add-new-model-like`
tonywu71 Sep 19, 2024
3a16d6b
feat: explicit the forward input args
tonywu71 Sep 26, 2024
54d5310
feat: move everything to `modular_colpali.py`
tonywu71 Sep 26, 2024
48f27f2
fix: put back ColPaliProcesor
tonywu71 Sep 26, 2024
710b4e2
feat: add auto-generated files
tonywu71 Sep 26, 2024
dda3312
fix: run `fix-copies`
tonywu71 Sep 26, 2024
3500c63
fix: remove DOCStRING constants to make modular converter work
tonywu71 Sep 26, 2024
fce02cf
fix: fix typo + modular converter
tonywu71 Sep 26, 2024
ad3ea52
fix: add missing imports
tonywu71 Sep 26, 2024
dea3965
feat: no more errors when loading ColPaliModel
tonywu71 Sep 26, 2024
bb0818d
fix: remove unused args in forward + tweak doc
tonywu71 Sep 26, 2024
fd5456b
feat: rename `ColPaliModel` to `ColPaliForRetrieval`
tonywu71 Sep 26, 2024
c36b20c
fix: apply `fix-copies`
tonywu71 Sep 26, 2024
1643705
temp fix for modular converter: drop commit when PR is merged!
tonywu71 Sep 26, 2024
901cc8d
feat: add ColPaliProcessor to `modular_colpali`
tonywu71 Sep 26, 2024
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -812,6 +812,8 @@
title: CLIPSeg
- local: model_doc/clvp
title: CLVP
- local: model_doc/colpali
title: ColPali
- local: model_doc/data2vec
title: Data2Vec
- local: model_doc/deplot
1 change: 1 addition & 0 deletions docs/source/en/index.md
@@ -97,6 +97,7 @@ Flax), PyTorch, and/or TensorFlow.
| [CodeGen](model_doc/codegen) | ✅ | ❌ | ❌ |
| [CodeLlama](model_doc/code_llama) | ✅ | ❌ | ✅ |
| [Cohere](model_doc/cohere) | ✅ | ❌ | ❌ |
| [ColPali](model_doc/colpali) | ❌ | ❌ | ❌ |
| [Conditional DETR](model_doc/conditional_detr) | ✅ | ❌ | ❌ |
| [ConvBERT](model_doc/convbert) | ✅ | ✅ | ❌ |
| [ConvNeXT](model_doc/convnext) | ✅ | ✅ | ❌ |
47 changes: 47 additions & 0 deletions docs/source/en/model_doc/colpali.md
@@ -0,0 +1,47 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# ColPali

## Overview

The ColPali model was proposed in [<INSERT PAPER NAME HERE>](<INSERT PAPER LINK HERE>) by <INSERT AUTHORS HERE>.
<INSERT SHORT SUMMARY HERE>

The abstract from the paper is the following:

*<INSERT PAPER ABSTRACT HERE>*

Tips:

<INSERT TIPS ABOUT MODEL HERE>

This model was contributed by [INSERT YOUR HF USERNAME HERE](https://huggingface.co/<INSERT YOUR HF USERNAME HERE>).
The original code can be found [here](<INSERT LINK TO GITHUB REPO HERE>).


## ColPaliConfig

[[autodoc]] ColPaliConfig

## ColPaliProcessor

[[autodoc]] ColPaliProcessor

## ColPaliForRetrieval

[[autodoc]] ColPaliForRetrieval
    - forward
14 changes: 14 additions & 0 deletions src/transformers/__init__.py
@@ -640,6 +640,7 @@
"OwlViTVisionConfig",
],
"models.paligemma": ["PaliGemmaConfig"],
"models.colpali": ["ColPaliConfig"],
"models.patchtsmixer": ["PatchTSMixerConfig"],
"models.patchtst": ["PatchTSTConfig"],
"models.pegasus": [
@@ -1735,6 +1736,12 @@
]
)
_import_structure["models.cohere"].extend(["CohereForCausalLM", "CohereModel", "CoherePreTrainedModel"])
_import_structure["models.colpali"].extend(
[
"ColPaliForRetrieval",
"ColPaliProcessor",
]
)
_import_structure["models.conditional_detr"].extend(
[
"ConditionalDetrForObjectDetection",
@@ -5091,6 +5098,9 @@
CodeGenTokenizer,
)
from .models.cohere import CohereConfig
from .models.colpali import (
ColPaliConfig,
)
from .models.conditional_detr import (
ConditionalDetrConfig,
)
@@ -6532,6 +6542,10 @@
CohereModel,
CoherePreTrainedModel,
)
from .models.colpali import (
ColPaliForRetrieval,
ColPaliProcessor,
)
from .models.conditional_detr import (
ConditionalDetrForObjectDetection,
ConditionalDetrForSegmentation,
1 change: 1 addition & 0 deletions src/transformers/models/__init__.py
@@ -51,6 +51,7 @@
code_llama,
codegen,
cohere,
colpali,
conditional_detr,
convbert,
convnext,
3 changes: 3 additions & 0 deletions src/transformers/models/auto/configuration_auto.py
@@ -202,6 +202,7 @@
("owlv2", "Owlv2Config"),
("owlvit", "OwlViTConfig"),
("paligemma", "PaliGemmaConfig"),
("colpali", "ColPaliConfig"),
("patchtsmixer", "PatchTSMixerConfig"),
("patchtst", "PatchTSTConfig"),
("pegasus", "PegasusConfig"),
@@ -512,6 +513,8 @@
("owlv2", "OWLv2"),
("owlvit", "OWL-ViT"),
("paligemma", "PaliGemma"),
("colpali", "ColPali"),
("colpali", "ColPali"),
("patchtsmixer", "PatchTSMixer"),
("patchtst", "PatchTST"),
("pegasus", "Pegasus"),
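Review note: the hunk above adds the `("colpali", "ColPali")` pair twice. The sketch below is a minimal illustration, assuming the list of pairs is eventually collapsed into an ordered dict; `MODEL_NAMES` and `find_duplicate_keys` are hypothetical names used only for this example, and the list is a shortened excerpt rather than the real mapping.

```python
from collections import Counter, OrderedDict

# Hypothetical excerpt shaped like the name mapping in the hunk above.
MODEL_NAMES = [
    ("paligemma", "PaliGemma"),
    ("colpali", "ColPali"),
    ("colpali", "ColPali"),
    ("patchtsmixer", "PatchTSMixer"),
]


def find_duplicate_keys(pairs):
    """Return the keys that occur more than once, in first-seen order."""
    counts = Counter(key for key, _ in pairs)
    return [key for key, count in counts.items() if count > 1]


# Building a dict from the pairs keeps one entry per key (the last value wins),
# so a repeated identical pair does not change the resulting mapping.
mapping = OrderedDict(MODEL_NAMES)
duplicates = find_duplicate_keys(MODEL_NAMES)
```

Under that assumption the repeated line looks harmless at runtime, but it is probably an artifact of the `add-new-model-like` run, and a check like `find_duplicate_keys` would flag it.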
1 change: 1 addition & 0 deletions src/transformers/models/auto/modeling_auto.py
@@ -295,6 +295,7 @@
("big_bird", "BigBirdForPreTraining"),
("bloom", "BloomForCausalLM"),
("camembert", "CamembertForMaskedLM"),
("colpali", "ColPaliForRetrieval"),
("ctrl", "CTRLLMHeadModel"),
("data2vec-text", "Data2VecTextForMaskedLM"),
("deberta", "DebertaForMaskedLM"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/processing_auto.py
@@ -57,6 +57,7 @@
("clip", "CLIPProcessor"),
("clipseg", "CLIPSegProcessor"),
("clvp", "ClvpProcessor"),
("colpali", "ColPaliProcessor"),
("flava", "FlavaProcessor"),
("fuyu", "FuyuProcessor"),
("git", "GitProcessor"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/tokenization_auto.py
@@ -146,6 +146,7 @@
),
("codegen", ("CodeGenTokenizer", "CodeGenTokenizerFast" if is_tokenizers_available() else None)),
("cohere", (None, "CohereTokenizerFast" if is_tokenizers_available() else None)),
("colpali", ("PaligemmaTokenizer", "PaligemmaTokenizerFast" if is_tokenizers_available() else None)),
("convbert", ("ConvBertTokenizer", "ConvBertTokenizerFast" if is_tokenizers_available() else None)),
(
"cpm",
51 changes: 51 additions & 0 deletions src/transformers/models/colpali/__init__.py
@@ -0,0 +1,51 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import TYPE_CHECKING

from ...utils import OptionalDependencyNotAvailable, _LazyModule, is_torch_available


_import_structure = {"configuration_colpali": ["ColPaliConfig"]}


try:
    if not is_torch_available():
        raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
    pass
else:
    _import_structure["modeling_colpali"] = [
        "ColPaliForRetrieval",
        "ColPaliPreTrainedModel",
    ]
    _import_structure["processing_colpali"] = ["ColPaliProcessor"]


if TYPE_CHECKING:
    from .configuration_colpali import ColPaliConfig

    try:
        if not is_torch_available():
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        pass
    else:
        from .modeling_colpali import ColPaliForRetrieval
        from .processing_colpali import ColPaliProcessor


else:
    import sys

    sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure)
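Review note: the `_import_structure` dict above maps each submodule to the names it exports, so the package can defer the actual imports until a name is first used. The sketch below shows that idea in isolation. It is a simplified, hypothetical `LazyModule`, not the `_LazyModule` from `...utils`, and two stdlib modules stand in for `configuration_colpali` and `modeling_colpali`.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical, simplified lazy module: imports a submodule on first attribute access."""

    def __init__(self, name, import_structure):
        super().__init__(name)
        # Invert {submodule: [names]} into {name: submodule} for quick lookup.
        self._name_to_module = {
            attr: module for module, attrs in import_structure.items() for attr in attrs
        }
        self.__all__ = list(self._name_to_module)

    def __getattr__(self, attr):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # which here means the name has not been resolved yet.
        if attr not in self._name_to_module:
            raise AttributeError(f"module {self.__name__!r} has no attribute {attr!r}")
        module = importlib.import_module(self._name_to_module[attr])
        value = getattr(module, attr)
        setattr(self, attr, value)  # cache so later lookups bypass __getattr__
        return value


# Assumption for illustration: stdlib modules play the role of the model files.
import_structure = {"json": ["dumps"], "fractions": ["Fraction"]}
lazy = LazyModule("lazy_demo", import_structure)
```

Accessing `lazy.dumps` triggers the import of `json` and caches the function on the module object; a name that is not in `import_structure` raises `AttributeError`, as it would for an ordinary module.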
42 changes: 42 additions & 0 deletions src/transformers/models/colpali/configuration_colpali.py
@@ -0,0 +1,42 @@
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# This file was automatically generated from <path_to_modular_file.py>.
# Do NOT edit this file manually as any edits will be overwritten by the generation of
# the file from the modular. If any change should be done, please apply the change to the
# modular_xxx.py file directly. One of our CI enforces this
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
# coding=utf-8
# Copyright 2024 The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


from ..paligemma import (
    PaliGemmaConfig,
)


class ColPaliConfig(PaliGemmaConfig):
    r"""
    This is the configuration class to store the configuration of a [`ColPaliForRetrieval`]. It is used to instantiate a
    ColPaliForRetrieval according to the specified arguments, defining the model architecture.

    The ColPali config is strictly equivalent to the PaliGemma config, but with a different model type.

    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
    documentation from [`PretrainedConfig`] for more information.
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model_type = "colpali"
        self.is_composition = False
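Review note: the `__init__` above sets `model_type` on the instance rather than on the class. The sketch below shows why that distinction can matter. It uses a hypothetical `BaseConfig` as a stand-in and assumes that serialization reads the class-level `model_type`; whether `PretrainedConfig` behaves this way should be checked against the real base class before acting on this.

```python
class BaseConfig:
    """Hypothetical stand-in for a config base class; not the real PretrainedConfig."""

    model_type = ""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        output = dict(vars(self))
        # Assumption: the serialized model_type comes from the class, not the instance.
        output["model_type"] = type(self).model_type
        return output


class ParentConfig(BaseConfig):
    model_type = "paligemma"


class InstanceOverrideConfig(ParentConfig):
    # Mirrors the diff: model_type is overwritten on the instance in __init__.
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model_type = "colpali"


class ClassOverrideConfig(ParentConfig):
    # Alternative: declare model_type as a class attribute.
    model_type = "colpali"


instance_cfg = InstanceOverrideConfig(hidden_size=128)
class_cfg = ClassOverrideConfig(hidden_size=128)
```

In this sketch `instance_cfg.model_type` reads `"colpali"`, but `instance_cfg.to_dict()["model_type"]` is still `"paligemma"`, while `class_cfg` reports `"colpali"` in both places. If the same holds for the real base class, declaring `model_type = "colpali"` at class level in `modular_colpali.py` would be the safer choice.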