
Add ColPali to 🤗 transformers #33736

Merged: 137 commits merged into huggingface:main on Dec 17, 2024
Conversation

tonywu71
Contributor

@tonywu71 tonywu71 commented Sep 26, 2024

What does this PR do?

Add ColPali support in 🤗 transformers.
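For background on what is being added: ColPali embeds each document page into many patch-level vectors and each query into token-level vectors, then scores query/page pairs with ColBERT-style late interaction (MaxSim). A minimal NumPy sketch of that scoring (the function name and dimensions are illustrative, not taken from this PR):

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT-style late-interaction score for one query/page pair.

    query_emb: (n_query_tokens, dim) multi-vector query embedding
    doc_emb:   (n_doc_tokens, dim)   multi-vector page embedding
    """
    # Pairwise token similarities, shape (n_query_tokens, n_doc_tokens).
    sim = query_emb @ doc_emb.T
    # Each query token keeps its best-matching page token; sum over tokens.
    return float(sim.max(axis=1).sum())

# Toy example with random embeddings (ColPali itself projects each token
# to a 128-dimensional vector).
rng = np.random.default_rng(0)
query = rng.normal(size=(5, 128))
page = rng.normal(size=(30, 128))
score = maxsim_score(query, page)
```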

Who can review?

@yonigozlan 😉
@ArthurZucker

Additional details

  • This PR uses the new Modular 🤗 transformers feature from v4.45.0
  • ColPali is mainly based on the colpali-engine repository, which I maintain with my co-authors. The initial code was taken from colpali-engine==v0.3.0.
  • To prevent adding the colpali-engine dependency, the weights are directly loaded from an exported state_dict stored in vidore/colpali-v1.2-merged-state_dict.
  • The newly converted model weights are stored in vidore/colpali-v1.2-hf.
  • I have contributed a small dataset for integration testing for visual retrievers: vidore/document-visual-retrieval-test. I believe this should be moved to hf-internal-testing/document-visual-retrieval-test.
  • I had a lot of trouble with the weight conversion. It turned out there was a reproducibility issue depending on the torch version, so for reproducibility I've added the frozen dependencies below in the PR.
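The state_dict export described above avoids a hard dependency on colpali-engine: only tensors are shipped, and the architecture is rebuilt on the transformers side. A toy sketch of that pattern, using a small nn.Linear as a stand-in for the real model:

```python
import io

import torch
import torch.nn as nn

# Stand-in for the original model (the real export lives in
# vidore/colpali-v1.2-merged-state_dict).
original = nn.Linear(4, 2)

# Export: serialize only the weights, not the class, so loading does not
# require the package that defined the original model.
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)

# Import: rebuild an equivalent architecture independently, then load.
buffer.seek(0)
rebuilt = nn.Linear(4, 2)
rebuilt.load_state_dict(torch.load(buffer, weights_only=True))

x = torch.randn(3, 4)
assert torch.allclose(original(x), rebuilt(x))
```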

Progress checklist

TODO

  • (Optional) Understood the model’s theoretical aspects
  • Prepared 🤗 Transformers dev environment
  • Set up debugging environment of the original repository
  • Created script that successfully runs the forward() pass using the original repository and checkpoint
  • Successfully added the model skeleton to 🤗 Transformers
  • Successfully converted original checkpoint to 🤗 Transformers checkpoint
  • Successfully ran forward() pass in 🤗 Transformers that gives identical output to original checkpoint
  • Finished model tests in 🤗 Transformers
  • Successfully added tokenizer in 🤗 Transformers
  • Run end-to-end integration tests
  • Finished docs
  • Uploaded model weights to the Hub
  • Submitted the pull request
  • (Optional) Added a demo notebook → can be found in https://github.com/tonywu71/colpali-cookbooks
  • Update tests if we decide to migrate the custom test dataset to hf-internal-testing/document-visual-retrieval-test (done → [url])
  • Wait for Add support for modifying Processor Kwargs in modular #34477 to be merged (fix for the modular script) → Large modular logic refactoring #34487 has been merged and fixes this problem

Member

@yonigozlan yonigozlan left a comment

Thanks so much for working on this! Excited to see ColPali in Transformers 🤗.
My main comment for now is to try to inherit directly from PaliGemmaForConditionalGeneration, both to make full use of modular and to avoid instantiating PaliGemmaForConditionalGeneration inside the model.
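To make the inheritance suggestion concrete, here is a toy contrast between composing the backbone and subclassing it; every class body below is a made-up stub, not actual transformers code:

```python
# Stub standing in for the real PaliGemmaForConditionalGeneration.
class PaliGemmaForConditionalGeneration:
    def forward(self, pixel_values):
        return {"hidden_states": pixel_values}

# Composition (discouraged here): the backbone is an attribute, so checkpoint
# keys gain an extra "model." prefix and modular cannot unfold the parent
# class into the generated modeling file.
class ColPaliComposed:
    def __init__(self):
        self.model = PaliGemmaForConditionalGeneration()

# Inheritance (suggested): modular can expand the parent's code directly into
# the generated modeling file, and only the retrieval-specific pieces are new.
class ColPaliForRetrieval(PaliGemmaForConditionalGeneration):
    def forward(self, pixel_values):
        hidden = super().forward(pixel_values)["hidden_states"]
        # A projection to the embedding dimension would go here.
        return {"embeddings": hidden}
```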

@tonywu71 tonywu71 force-pushed the add-colpali branch 3 times, most recently from e0db84b to 9c742c5 on September 27, 2024
Member

@yonigozlan yonigozlan left a comment

Looks like something is going wrong with the files generated by Modular, but otherwise it looks good!
For the processor and modeling tests, you can take a look at other models' tests to see how they should be implemented.

Member

@yonigozlan yonigozlan left a comment

Thanks for working on this! Apart from some nits and the ProcessingKwargs issue, it looks almost ready to go to me!

@tonywu71
Contributor Author

Deps used for the weight conversion (requirements.txt):

accelerate==1.0.1
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
appnope==0.1.4
asttokens==2.4.1
attrs==24.2.0
certifi==2024.8.30
charset-normalizer==3.4.0
colpali_engine==0.3.3
comm==0.2.2
datasets==3.0.2
debugpy==1.8.7
decorator==5.1.1
dill==0.3.8
executing==2.1.0
filelock==3.16.1
frozenlist==1.5.0
fsspec==2024.9.0
gitdb==4.0.11
GitPython==3.1.18
huggingface-hub==0.26.2
idna==3.10
ipykernel==6.29.5
ipython==8.29.0
isort==5.13.2
jedi==0.19.1
Jinja2==3.1.4
jupyter_client==8.6.3
jupyter_core==5.7.2
libcst==1.5.0
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib-inline==0.1.7
mdurl==0.1.2
mpmath==1.3.0
multidict==6.1.0
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
numpy==2.1.2
packaging==24.1
pandas==2.2.3
parso==0.8.4
pexpect==4.9.0
pillow==11.0.0
platformdirs==4.3.6
prompt_toolkit==3.0.48
propcache==0.2.0
psutil==6.1.0
ptyprocess==0.7.0
pure_eval==0.2.3
pyarrow==18.0.0
Pygments==2.18.0
python-dateutil==2.9.0.post0
pytz==2024.2
PyYAML==6.0.2
pyzmq==26.2.0
regex==2024.9.11
requests==2.32.3
rich==13.9.3
ruff==0.5.1
safetensors==0.4.5
six==1.16.0
smmap==5.0.1
stack-data==0.6.3
sympy==1.13.1
tokenizers==0.20.1
torch==2.5.1
tornado==6.4.1
tqdm==4.66.6
traitlets==5.14.3
-e git+https://github.com/tonywu71/transformers.git@05df69d83bdc5398f590ca0bb623de330cec49dd#egg=transformers
typing_extensions==4.12.2
tzdata==2024.2
urllib3==1.26.20
wcwidth==0.2.13
xxhash==3.5.0
yarl==1.17.1

@tonywu71
Contributor Author

Hi @ArthurZucker, could you review my PR for ColPali please? 😁 @yonigozlan was kind enough to do multiple reviews of the PR already, so it should be almost ready to ship!

However, I noticed a few bugs with modular that will need to be fixed before merging:

  1. There is the bug shown below where the ColPaliProcessorKwargs are not properly updated in the processing file.
     [screenshot: the generated processing file missing the ColPaliProcessorKwargs update, 2024-10-31]
  2. The docstrings seem not to be properly synchronized: see the forward documentation for ColPaliForRetrieval in src/transformers/models/colpali/modular_colpali.py vs src/transformers/models/colpali/modeling_colpali.py.

Any chance you could help with these please? 🙏🏼

@tonywu71
Copy link
Contributor Author

tonywu71 commented Nov 1, 2024

This was fixed thanks to #3448! 👍🏼

@ArthurZucker
Collaborator

Will review today! Sorry for the delay!

Collaborator

@ArthurZucker ArthurZucker left a comment

LGTM! I think we just need to use the PaliGemma config and manually add the embedding dimension, or use something like PretrainedConfig.num_labels.

Comment on lines 1125 to 1134
```python
else:
    # In case no valid text config is found, we might have a model with a vlm backbone
    if hasattr(self, "vlm_config"):
        for text_config_name in possible_text_config_names:
            if hasattr(self.vlm_config, text_config_name):
                text_config = getattr(self.vlm_config, text_config_name, None)
                if text_config is not None:
                    valid_text_config_names += [text_config_name]
        if len(valid_text_config_names) == 1:
            return getattr(self.vlm_config, valid_text_config_names[0])
```
Collaborator

I think it makes more sense to flatten whatever we can rather than introducing this! 🤗 You don't even need a new config class for ColPaliGemma, you can use the PaliGemma one!
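A toy sketch of the direction suggested here: resolve the text config inside the ColPali-specific config instead of in the shared modeling utils. All names and values below are illustrative stubs, not the final implementation:

```python
# Stub for the wrapped PaliGemma config.
class PaliGemmaConfigStub:
    def __init__(self):
        self.text_config = {"hidden_size": 2048}

class ColPaliConfigSketch:
    """Delegates text-config lookup to its nested VLM config."""

    def __init__(self):
        self.vlm_config = PaliGemmaConfigStub()

    def get_text_config(self):
        # Model-specific override; no change needed in generic modeling utils.
        return self.vlm_config.text_config

config = ColPaliConfigSketch()
assert config.get_text_config()["hidden_size"] == 2048
```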

@ArthurZucker
Collaborator

Almost there 🤗 with the config update we should be good to merge

@ArthurZucker
Collaborator

Mmm okay, but in that case let's not have a very specific change: vlm_config is just for PaliGemma. We just need a change in the ColPaliGemmaConfig, not in modeling utils.

This is the last thing to do.

@yonigozlan
Member

@ArthurZucker Fixed the issue with text_config and the slow tests ran successfully 🤗

Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks for pushing through 🔥 Let's merge!

@ArthurZucker ArthurZucker merged commit f33a0ce into huggingface:main Dec 17, 2024
28 checks passed