Integrating GOT-OCR2.0 in Transformers 🤗 #137

Open
yonigozlan opened this issue Oct 18, 2024 · 7 comments
Labels
good first issue Good for newcomers


@yonigozlan

Hi!
First of all, congrats on such a great model!
I am an MLE at Hugging Face, and given the popularity and performance of your model, we are eager to integrate it into the Transformers 🤗 library. If you are interested in working with us (mostly helping us debug if needed, or clarifying certain aspects of the model), that would be great!
Looking forward to hearing back from you!

Best,
Yoni

@Ucas-HaoranWei
Owner

Hi Yoni,
It's an honor to integrate GOT into Transformers.
If you need any help, please feel free to contact me anytime.
My email is [email protected]

Best,
Haoran Wei

@Ucas-HaoranWei added the good first issue label on Oct 18, 2024
@yonigozlan
Author

yonigozlan commented Nov 14, 2024

Hi again Haoran,
The PR is up here :).
For now, I intend to support only inference in Transformers. In your experience since the model came out, have many users been fine-tuning it? I am guessing at least for the stage-3 post-training described in the paper?
I am trying to gauge how useful it would be to support this last post-training stage in Transformers. In any case, it can always be added later.
Thank you!

@Ucas-HaoranWei
Owner


Hi Yoni,
Great job! I will review this PR as soon as possible.
There are still quite a few users who want to fine-tune the model, but I suggest waiting to see what users need going forward before deciding whether to add training and fine-tuning support later.
Thank you.

@yonigozlan
Author

Hi Haoran!
The PR is almost ready to go! I just wanted to ask for some clarification on the box fine-grained option. I saw that formats such as [x, y] or [x, y, x, y] can be added to the queries, but I was wondering what x and y are precisely in both cases. Are they coordinates in image pixels? In resized-image pixels? And in the first case, I am confused about how you can define a box with just two coordinates. Thank you in advance!

@Ucas-HaoranWei
Owner

Hi Yoni,
Thank you for your work!
[x, y] is a single point and can be ignored: GOT does not use single-point text positioning (it was an initial idea that we did not adopt). [x0, y0, x1, y1] is a box, where (x0, y0) is the top-left corner and (x1, y1) is the bottom-right corner. The coordinates are in image pixels: they are first normalized and then multiplied by 1000, which eliminates decimal points.
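For readers following along, here is a minimal sketch of the coordinate convention described above: pixel-space boxes are normalized by the image size and scaled by 1000, then rounded so the query contains only integers. The helper name is hypothetical.

```python
def to_got_box(x0, y0, x1, y1, width, height):
    """Convert a pixel-space box into GOT's [x0 y0 x1 y1] query format:
    normalize by the image dimensions, scale by 1000, round to integers."""
    return [
        round(1000 * x0 / width),
        round(1000 * y0 / height),
        round(1000 * x1 / width),
        round(1000 * y1 / height),
    ]

# e.g. a 200x100 crop at (50, 30) in a 1000x800 image:
print(to_got_box(50, 30, 250, 130, 1000, 800))  # [50, 38, 250, 162]
```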

@yonigozlan
Author

Thanks for the explanation!
By the way, would you like us to transfer the weights of the Transformers implementation to your Hub organization page? And if so, what would you like us to name them? I was thinking of stepfun-ai/GOT-OCR2.0-hf, as we usually use the suffix "-hf" to signal a Transformers model.

@Ucas-HaoranWei
Owner

Of course, '-hf' is fine.
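With the naming settled, a minimal inference sketch is shown below, assuming the integration follows Transformers' usual image-text-to-text pattern. The classes and the checkpoint id are assumptions based on this thread; the merged PR may differ.

```python
# Hedged sketch only: the checkpoint id follows the "-hf" naming proposed
# above, and the classes follow Transformers' image-text-to-text convention.
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image

checkpoint = "stepfun-ai/GOT-OCR2.0-hf"  # name proposed in this thread
processor = AutoProcessor.from_pretrained(checkpoint)
model = AutoModelForImageTextToText.from_pretrained(checkpoint)

image = Image.open("page.png")  # any document image
inputs = processor(images=image, return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```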
