Why does the Koniq10k dataloader resize to (224, 224) and then apply a transform with a random crop? #34
Comments
We select a vision transformer as our feature extractor, which means the input images should be resized to a fixed size (224×224).
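For context on why a plain ViT wants a fixed input size: its learned positional embeddings are sized for one specific patch-token count, so changing the resolution changes the token count and breaks the embedding table. A minimal sketch of that arithmetic (the function name is hypothetical, not from MANIQA):

```python
def num_vit_tokens(image_size: int, patch_size: int = 16) -> int:
    """Number of patch tokens a plain ViT produces (excluding the CLS token)."""
    assert image_size % patch_size == 0, "input must be divisible by the patch size"
    n = image_size // patch_size
    return n * n

# A 224x224 input with 16x16 patches yields 14x14 = 196 tokens; a 384x384
# input would yield 24x24 = 576 tokens, so the 196-entry positional embedding
# table could not be used without interpolating it.
```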
Hi @Stephen0808! Also, it means all Koniq10k images, which are initially full resolution, are resized to (224, 224). We lose the information from the full-resolution image quality. (Usually IQA transformers try to avoid resizing and leverage the transformer architecture to accept different input sizes.) What do you think?
As mentioned in your question, we crop several images (224×224) for inference and average the scores to get the final score.
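The multi-crop inference described above can be sketched roughly as follows. This is a hedged sketch, not MANIQA's actual inference code: `predict_multicrop` and its parameters are hypothetical names, and `model` stands in for any callable that maps one crop to a scalar quality score.

```python
import numpy as np

def predict_multicrop(model, image, num_crops=20, crop_size=224, rng=None):
    """Average a quality model's score over random crops of a full-resolution image.

    image: array of shape (H, W, C); model: callable mapping one crop to a scalar.
    (Hypothetical sketch; the real pipeline works on torch tensors.)
    """
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    scores = []
    for _ in range(num_crops):
        # sample a random top-left corner so the crop stays inside the image
        top = rng.integers(0, h - crop_size + 1)
        left = rng.integers(0, w - crop_size + 1)
        crop = image[top:top + crop_size, left:left + crop_size]
        scores.append(float(model(crop)))
    # final score is the mean over all random crops
    return sum(scores) / num_crops
```

Averaging over crops lets the fixed-size ViT see several regions of the full-resolution image, which partly answers the concern about losing information to resizing.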
I mean, the question is: why not use the same process for training and inference?
In both the inference and training phases, we used cropped images.
OK, maybe I misunderstood something.
So I was wondering why, instead of step 1 (resizing), you don't take several crops and then send those crops to the ViT.
Hey!
I see in this line:
MANIQA/train_maniqa.py
Line 248 in b286649
Then you apply a transform function that contains a random crop to size (224, 224).
Unless I'm missing something, why does the original image have to be resized first?
Thanks!
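The resize-then-crop pipeline being questioned above can be sketched like this. This is an illustrative stand-in, not the code from `train_maniqa.py`: the function names are hypothetical, and nearest-neighbor resizing stands in for whatever interpolation the real transform uses.

```python
import numpy as np

def resize_nearest(image, size):
    """Nearest-neighbor resize to (size, size); stand-in for the real resize op."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

def train_transform(image, resize_to=224, crop_size=224, rng=None):
    """Resize the full-resolution image, then take a random crop.

    Note: when resize_to == crop_size, the 'random crop' covers the whole
    resized image, which is exactly what the question above is pointing at.
    """
    rng = rng or np.random.default_rng(0)
    resized = resize_nearest(image, resize_to)
    top = rng.integers(0, resize_to - crop_size + 1)
    left = rng.integers(0, resize_to - crop_size + 1)
    return resized[top:top + crop_size, left:left + crop_size]
```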