how to link the image id in DATA JSON to the image in IMAGE URLS for WuKong #11

ghost · 2023-07-18T09:15:15Z

No description provided.

ghost · 2023-07-18T09:23:41Z

Hi, I have downloaded the Wukong data from the URL provided in https://github.com/phellonchen/X-LLM/blob/main/README_DATA.md. The order of samples in the CSV files is not consistent with the image id/name in the JSON file, so how can I link the filtered image names back to the original image URLs?
@MingLunHan @phellonchen

rumusan · 2023-07-18T15:27:07Z

Same question for CC3M.

phellonchen · 2023-08-10T03:09:39Z

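For the Wukong dataset, we filtered the first 50 million images using Chinese-CLIP (ViT-B-16 model) and kept only samples with a visual-textual similarity score greater than 0.475. The JSON file is therefore a filtered subset, and its order does not follow the CSV files. To link the two, you will need to match each caption in the JSON file to the same caption in the CSV files, which gives you the corresponding image URL.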
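
A minimal sketch of that caption-based pairing is below. It is not the authors' script, and the field names are assumptions: it supposes the CSV has `url` and `caption` columns and that each JSON record carries `image_id` and `caption` keys, so adjust the keyword arguments to the real files. Captions are not guaranteed to be unique, so each id maps to a list of candidate URLs.

```python
import csv
import io


def normalize(caption):
    # Collapse runs of whitespace so trivial formatting differences do not block a match.
    return " ".join(caption.split())


def link_by_caption(csv_file, json_records, url_col="url", caption_col="caption",
                    id_key="image_id", caption_key="caption"):
    """Map each JSON image id to the URL(s) whose CSV caption matches.

    All column and key names are hypothetical defaults, not taken from the dataset.
    """
    # Index the CSV once: normalized caption -> list of URLs sharing that caption.
    urls_by_caption = {}
    for row in csv.DictReader(csv_file):
        urls_by_caption.setdefault(normalize(row[caption_col]), []).append(row[url_col])

    linked, unmatched = {}, []
    for rec in json_records:
        urls = urls_by_caption.get(normalize(rec[caption_key]))
        if urls:
            linked[rec[id_key]] = urls  # more than one URL means an ambiguous caption
        else:
            unmatched.append(rec[id_key])
    return linked, unmatched


# Toy data standing in for the real CSV and JSON files.
demo_csv = io.StringIO(
    "url,caption\n"
    "http://example.com/a.jpg,a cat\n"
    "http://example.com/b.jpg,red car\n"
    "http://example.com/c.jpg,red  car\n"
)
demo_json = [
    {"image_id": "img_0", "caption": "a cat"},
    {"image_id": "img_1", "caption": " red car "},
    {"image_id": "img_2", "caption": "no such caption"},
]
linked, unmatched = link_by_caption(demo_csv, demo_json)
```

With the real data, you would pass a file opened with `open(path, newline="", encoding="utf-8")` and the list returned by `json.load`. Ids left in `unmatched`, or mapped to several URLs, need a closer look, for example by comparing image content.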

For CC3M, we will try to restore their original correspondence.

ghost changed the title ~~how to link the image id~~ how to link the image id in DATA JSON to the image in IMAGE URLS for WuKong Jul 18, 2023
