
How do you get the position information for images? #1

Open
kywen1119 opened this issue Sep 19, 2019 · 5 comments

Comments

@kywen1119

Hi!
What great work! Could you tell me how you got the pre-trained position information for the images? Thanks a lot!

@HaoYang0123
Owner

The Faster R-CNN model can automatically produce both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the coordinates (x, y) of its top-left corner and the width/height of the region.
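To make the shape of that output concrete, here is a minimal sketch using torchvision's off-the-shelf Faster R-CNN as a stand-in. This is not the detector the authors used (the repo's features were presumably precomputed with a bottom-up-attention-style ResNet-101 model, which is where the 2048-dim RoI-pooled vectors come from); it only illustrates where the (x, y, w, h) values come from.

```python
# Minimal sketch, assuming a torchvision stand-in for the actual detector.
# It shows how per-region boxes fall out of a Faster R-CNN forward pass;
# the 2048-dim visual features in this repo were precomputed separately.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)  # placeholder RGB image, values in [0, 1]
with torch.no_grad():
    (pred,) = model([image])     # one dict per input image

# torchvision returns boxes as (x1, y1, x2, y2); convert each to
# (x, y, w, h), i.e. the top-left corner plus width/height described above.
for x1, y1, x2, y2 in pred["boxes"].tolist():
    x, y, w, h = x1, y1, x2 - x1, y2 - y1
    print(f"region: x={x:.1f} y={y:.1f} w={w:.1f} h={h:.1f}")
```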

@LgQu

LgQu commented Dec 17, 2019

The Faster R-CNN model can automatically produce both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the coordinates (x, y) of its top-left corner and the width/height of the region.

Hi HaoYang,
I wonder how you get the raw region information (x, y, w, h). Did you run Faster R-CNN yourself, or did you get it from another source? How do you align the region information with the precomputed features from SCAN?
Thank you very much.

@weiyunfei

The Faster R-CNN model can automatically produce both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the coordinates (x, y) of its top-left corner and the width/height of the region.

Hello, Hao Yang. Thanks for your excellent work and code. However, I have some questions about the paper and code. First, I wonder how you transformed the coordinates into 15 dimensions, as the comments in model_attention.py describe; as far as I know, the coordinates of a box should be 4-dimensional. Second, in your paper you state that the whole image I is split equally into K×K blocks B, but I could not find this part in the released code.
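For readers following along, the K×K block idea can be made concrete with a small sketch: split the image into a K×K grid and record which blocks a region overlaps. To be clear, this does not reproduce the 15-dim encoding in model_attention.py (how 4 box values become 15 dims is exactly the open question here); the grid size K, the overlap rule, and the index layout below are illustrative assumptions only.

```python
# Minimal sketch, assuming a row-major K x K grid over the image.
# Not the authors' code: K and the overlap rule are illustrative choices.
def region_to_blocks(x, y, w, h, img_w, img_h, K=4):
    """Return row-major indices (0 .. K*K-1) of grid blocks the region overlaps."""
    bw, bh = img_w / K, img_h / K                              # block size
    c0, c1 = int(x // bw), int(min(x + w, img_w - 1) // bw)    # column span
    r0, r1 = int(y // bh), int(min(y + h, img_h - 1) // bh)    # row span
    return [r * K + c for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

# Example: a 200x150 region at (100, 50) in a 640x480 image
# overlaps blocks [0, 1, 4, 5] of the 4x4 grid.
print(region_to_blocks(100, 50, 200, 150, img_w=640, img_h=480))
```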

@Liquor520

The Faster R-CNN model can automatically produce both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the coordinates (x, y) of its top-left corner and the width/height of the region.

Hi HaoYang, I wonder how you get the raw region information (x, y, w, h). Did you run Faster R-CNN yourself, or did you get it from another source? How do you align the region information with the precomputed features from SCAN? Thank you very much.

Hello, I also have these doubts. Have you solved them? If so, could you leave contact information so we can discuss?

Looking forward to your reply!

@Liquor520

The Faster R-CNN model can automatically produce both the visual features (a 2048-dim vector) and the position information. The position information for a region consists of four values: the coordinates (x, y) of its top-left corner and the width/height of the region.

Hello, Hao Yang. Thanks for your excellent work and code. However, I have some questions about the paper and code. First, I wonder how you transformed the coordinates into 15 dimensions, as the comments in model_attention.py describe; as far as I know, the coordinates of a box should be 4-dimensional. Second, in your paper you state that the whole image I is split equally into K×K blocks B, but I could not find this part in the released code.

Hello, I also have these doubts. Have you solved them? If so, could you leave contact information so we can discuss?

Looking forward to your reply!
