How many SAM images were used from ShareGPT4v? #44

Open
OpenJarvisAI opened this issue Apr 12, 2024 · 2 comments

Comments

@OpenJarvisAI

I downloaded the data used in the ShareGPT4V fine-tuning part, but I always get "image not found" errors. I am running the finetune stage.

Does the finetune stage use the ShareGPT4V pretraining data?

The ShareGPT4V finetune set uses only a small amount of data from SAM.

Should we download the whole sam_000000 - 0000050 set (500GB of images) for it?

@JulianJuaner
Member

We adopt the 100K ShareGPT (caption) data in the SFT.
I will count the number of SAM image files that are used. If the total is not too large, I can extract them and share a link with you. Please stay tuned.
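In the meantime, a rough way to do this count locally is sketched below. It is only an illustration under stated assumptions: the annotation file is assumed to be a JSON list of records, each with an `image` field holding a path relative to the image root whose first folder names the source (for example `sam/...`). The function name `summarize_images` and the file paths are hypothetical and are not taken from this repository.

```python
import json
import os
from collections import Counter


def summarize_images(records, image_root, source="sam"):
    """Count unique referenced images per top-level folder and
    list the images from `source` that are missing under `image_root`."""
    per_source = Counter()
    missing = []
    seen = set()
    for rec in records:
        rel = rec.get("image")  # assumed field name; text-only records have none
        if not rel or rel in seen:
            continue
        seen.add(rel)
        top = rel.split("/")[0]  # assumed layout: first folder is the source
        per_source[top] += 1
        if top == source and not os.path.exists(os.path.join(image_root, rel)):
            missing.append(rel)
    return per_source, missing


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own annotation file and image root.
    with open("finetune_annotations.json") as f:
        records = json.load(f)
    per_source, missing = summarize_images(records, "data/images")
    print("unique images per source:", dict(per_source))
    print("missing SAM images:", len(missing))
```

The list of missing paths shows which SAM archives actually need to be downloaded, which may be far fewer than the full set.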

@OpenJarvisAI
Author

Thanks, but I downloaded sam_000000, sam_000001, and sam_0000002, and it seems there was no "file not found" error during the whole training process.

BTW, it would be even better if you could share a laion_gpt4_dataset_imags.zip, since many of the URLs have broken since you downloaded them.
