Image Captioning using Finetuned PaliGemma2 #112
Comments
@jethac @bebechien, you worked on the PaliGemma2 launch. Any comments?
Both of us worked on it. We mainly focused on the fine-tuning notebook for the launch, but I agree with this request. I'll add a notebook with DOCCI soon.
@windmaple @bebechien I'd like to point out that we currently use the Keras CV and Keras NLP packages for the Keras-specific and other examples. Recently, however, Keras CV and Keras NLP were merged into a single package, Keras Hub. The package is officially released on PyPI, and further model releases will land there from now on. Keras's official documentation has also been updated. Do we plan to switch the existing examples to Keras Hub? It is very easy: instead of Keras NLP, we need to install Keras Hub and replace
Correct, but there is no rush, since all existing usage will continue to work and we are in the middle of the transition.
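During a transition like the one discussed above, an example could check which package is installed before importing it. The following is a minimal sketch only: it assumes the import names `keras_hub` and `keras_nlp`, and the helper `pick_keras_package` is hypothetical, not part of either library.

```python
import importlib.util


def pick_keras_package(candidates=("keras_hub", "keras_nlp")):
    """Return the first importable module name from `candidates`, or None.

    `pick_keras_package` is a hypothetical helper for illustration; the
    default candidate names are the assumed import names of Keras Hub
    and Keras NLP.
    """
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    chosen = pick_keras_package()
    if chosen is None:
        print("Neither package is installed.")
    else:
        print(f"Using {chosen}")
```

Preferring `keras_hub` first matches the stated direction that new model releases will land there, while the fallback keeps existing notebooks working.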
@bebechien Hello! In this image captioning context, there is also an ONNX model of PaliGemma2 that can be run with transformers.js. I experimented with it and prepared a Colab notebook that runs the PaliGemma2 model in a Node.js application. Here is the link: https://colab.research.google.com/drive/1Ne6-j905479dmtlCMfyqiHCTD60LCNRN?usp=sharing Please take a look and let me know whether it would be a useful example contribution.
Looks great to me! I think the example demonstrates how to perform inference with Node.js. It's particularly useful for those who want to run the model directly in their browser without needing a server.
Yes, that's the purpose of the notebook. Should I create a PR for it?
Yes! And please follow the guide on contributing before sending a PR. Thanks! Let me close this issue, since we added the image captioning example with PaliGemma2.
Description of the feature request:
PaliGemma2 was released recently, and we already have a dedicated folder for it. Both notebooks in that folder cover fine-tuning, using JAX and Keras. An inference notebook for PaliGemma2 is still missing. My proposed notebook fills that gap by running inference with Keras on the latest released fine-tuned checkpoint for the image captioning task.
What problem are you trying to solve with this feature?
The PaliGemma2 release includes a checkpoint fine-tuned on the DOCCI dataset. This checkpoint is a good fit for demonstrating the image captioning task. We can extend the feature to multilingual use cases as well.
Any other information you'd like to share?
The notebook will run the 3B PaliGemma2 version in bfloat16, so that it can run on the Colab T4 GPU via multi-backend Keras and Keras Hub.
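The setup described above could look roughly like the sketch below. It is not the proposed notebook: the preset name `pali_gemma2_ft_docci_3b_448`, the `caption <lang>` prompt format, the choice of the JAX backend, and the helper names `build_prompt` and `caption_image` are all assumptions made for illustration. Check the Keras Hub preset list before relying on any of them.

```python
import os

# Assumed preset name for the DOCCI fine-tuned 3B checkpoint; verify it
# against the Keras Hub preset list before use.
PRESET = "pali_gemma2_ft_docci_3b_448"


def build_prompt(lang="en"):
    """Build a captioning prompt (assumed format: 'caption <lang>' + newline)."""
    return f"caption {lang}\n"


def caption_image(image, lang="en", preset=PRESET, max_length=128):
    """Caption one image with PaliGemma2 in bfloat16 (hypothetical helper).

    `image` is expected to be an array of shape (height, width, 3); it may
    need resizing to the preset's input resolution beforehand.
    """
    # The backend must be chosen before Keras is imported.
    os.environ.setdefault("KERAS_BACKEND", "jax")
    import keras
    import keras_hub

    # Create weights in bfloat16 so the 3B model fits in less memory.
    keras.config.set_floatx("bfloat16")
    model = keras_hub.models.PaliGemmaCausalLM.from_preset(preset)
    return model.generate(
        {"images": image, "prompts": build_prompt(lang)},
        max_length=max_length,
    )
```

The heavy imports sit inside `caption_image` so that the prompt helper stays usable without Keras installed. Passing a different `lang` is one possible route to the multilingual extension mentioned earlier, assuming the checkpoint supports it.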
cc: @windmaple