
Post processing removes all detections #26

Open
rvrsprdx opened this issue Jul 8, 2021 · 8 comments

@rvrsprdx

rvrsprdx commented Jul 8, 2021

Hi,

thanks for your work.
I was wondering why I didn't get any detections on a video when running demo.py.
I then realized that the bboxes are empty ([]) after post_process() is invoked; in fact, result is empty as well:
result = self.post_process(dets, meta, scale, output) (Line 141 of detector.py)
However, before that line the detections dets contain reasonable bboxes for the frame.
Can you tell me why that is and how to solve it?
I'd also like to save these detections to a text file. I can access dets, but I'm not sure how to access the corresponding tracking IDs.
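For the text-file part, a minimal sketch of dumping per-frame results is below. It assumes each entry in results is a dict with 'bbox', 'score', and 'tracking_id' keys, as in CenterTrack-style trackers; the actual key names in this repo may differ, so check what tracker.py returns before relying on this.

```python
def save_results(results, frame_id, path="detections.txt"):
    """Append one line per detection: frame, track id, bbox (x1,y1,x2,y2), score.
    Assumes CenterTrack-style result dicts; adjust the keys to match this repo."""
    with open(path, "a") as f:
        for det in results:
            x1, y1, x2, y2 = det["bbox"]
            f.write(f"{frame_id},{det['tracking_id']},"
                    f"{x1:.1f},{y1:.1f},{x2:.1f},{y2:.1f},{det['score']:.3f}\n")

# Example with a dummy result for one frame:
save_results([{"bbox": [10.0, 20.0, 50.0, 80.0], "score": 0.9, "tracking_id": 3}],
             frame_id=1)
```

The comma-separated layout loosely follows the MOT Challenge results convention (frame, id, box, confidence), which makes the file easy to feed into standard evaluation tools later.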

Thanks.

@JialianW
Owner

JialianW commented Jul 8, 2021

@rvrsprdx
Author

rvrsprdx commented Jul 9, 2021

Thanks for your quick reply.

While trying to track down the issue, I found that the output of dets["bboxes"] before post-processing changes across identical demo runs.
Sometimes I get negative numbers as bbox coordinates; other times I get very realistic bboxes for all frames; and sometimes I get a mix of realistic and unrealistic bboxes.

CUDA_VISIBLE_DEVICES=0 python demo.py tracking \
    --load_model ../checkpoints/trades_epoch400_std_params.pth \
    --demo ../videos/myvideo.avi \
    --pre_hm --ltrb_amodal --pre_thresh 0.5 --track_thresh 0.1 --inference \
    --clip_len 14 --trades \
    --save_video --resize_video \
    --input_h 544 --input_w 960
Again, I've run this exact command 10 times and get different values for dets["bboxes"] each time.
To come back to my initial problem: even when I get realistic bboxes beforehand, the bboxes are still empty after post-processing.
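One plausible mechanism for detections vanishing during post-processing is a score-threshold filter (the demo takes a --track_thresh option). A minimal, hypothetical sketch of that kind of filtering, not the repo's actual code:

```python
def filter_by_threshold(dets, thresh):
    """Drop detections whose score falls below thresh. If all scores are
    low (e.g. a model trained from scratch on a small dataset), a modest
    threshold can empty the list entirely."""
    return [d for d in dets if d["score"] >= thresh]

dets = [{"bbox": [10, 20, 50, 80], "score": 0.07},
        {"bbox": [30, 40, 90, 120], "score": 0.04}]
print(filter_by_threshold(dets, 0.1))  # [] -- everything below the threshold is gone
print(filter_by_threshold(dets, 0.0))  # both detections survive
```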

Another question, regarding video resolution: input_h and input_w are set to the network size I used to train the model. The video, however, has a resolution of w=724, h=708. I cannot set input_h and input_w to these values; I get the following error:
RuntimeError: The size of tensor a (91) must match the size of tensor b (90) at non-singleton dimension 3

@JialianW
Owner

JialianW commented Jul 9, 2021

Have you tried the provided demo with the provided trained model? Is there any problem with that demo? I haven't encountered this issue before.

For your second question, the resolution values need to be evenly divisible by 32.
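That divisibility constraint explains the tensor-size mismatch above (724 and 708 are not multiples of 32). A small helper, not part of the repo, for picking the nearest valid resolution:

```python
def round_to_multiple(x, base=32):
    """Round x to the nearest multiple of base (at least one base).
    Useful because stride-32 backbones require input sizes divisible by 32."""
    return max(base, int(round(x / base)) * base)

# For the 724x708 video mentioned above:
print(round_to_multiple(724))  # 736
print(round_to_multiple(708))  # 704
```

So --input_w 736 --input_h 704 would be the closest legal sizes to the video's native resolution.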

@rvrsprdx
Author

rvrsprdx commented Jul 9, 2021

Yes, the demo works without any problems with the provided MOT model and video.

I had trained on a custom (vehicle) dataset without using a pretrained model, in particular not the CrowdHuman one. That isn't an issue, is it?
I get many missing-weights warnings when the model is loaded.

I should add that when negative bbox values appear, they're very small (close to zero). Also, I have successfully trained CenterTrack on the same dataset, so I suspect the problem lies with the trained model. Do you have any idea about that?

Thanks.

@JialianW
Owner

JialianW commented Jul 9, 2021

Neither training without a pretrained model nor the warnings is a problem.

If you get reasonable results before post-processing, it does not look like a problem with the trained model. Did you check the usage of 'ltrb_amodal' and keep it consistent between training and testing? It relates to boxes that extend outside the image.
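To illustrate the concept (this is an explanatory sketch, not this repo's code): an amodal box covers the full extent of an object even where it leaves the frame, so its left/top coordinates can be slightly negative, exactly like the values observed above. Without amodal handling, boxes are typically clipped to the image:

```python
def clip_box(box, img_w, img_h):
    """Clip an amodal box (which may extend outside the frame) to image
    bounds -- roughly what happens when amodal boxes are not used."""
    l, t, r, b = box
    return [max(0, l), max(0, t), min(img_w, r), min(img_h, b)]

amodal = [-15.0, 40.0, 120.0, 260.0]  # slightly negative left edge
print(clip_box(amodal, 960, 544))     # [0, 40.0, 120.0, 260.0]
```

If training encodes targets one way and inference decodes them the other way, box coordinates come out shifted or invalid, which is why the flag must match between the two phases.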

@rvrsprdx
Author

Thanks for your reply.

I didn't change the ltrb_amodal value between training and testing, so it should be the same.

Can you explain what ltrb_amodal and ltrb mean?

Thanks.

@rvrsprdx
Author

I have now partially solved this issue by:

  • Making sure to use the correct resolution (the valid size closest to the actual input resolution of the video)
  • Commenting out Lines 177/178 in tracker.py:
    # for r_ in ret:
    #     del r_['embedding']
  • Most importantly: setting --track_thresh 0 when running demo.py

I still don't get any bboxes drawn on the video, but results now contains everything, including the bboxes.

This makes sense, because reasonable detections are being made but the tracking itself seems to fail completely. I trained on custom images, so that is to be expected.
Running inference several times with the same configuration on a video, I get a MOTP ranging between 60 and 62. Is this kind of variance normal? Also, the number of detections seems to be capped at 100 per video, which might explain that variance. I don't think that's intended. Where can I change this?

@HELOBILLY

HELOBILLY commented Oct 29, 2022

Hi @rvrsprdx, I saw your discussion under this issue, and it is very similar to my task: training on static images and testing on image sequences. There are two categories being tracked, so my training script is:

python main.py tracking \
    --exp_id my-exp \
    --load_model ../models/crowdhuman_pretrained.pth \
    --dataset custom \
    --custom_dataset_ann_path my_train.json \
    --custom_dataset_img_path my_images \
    --input_h 1024 --input_w 1024 \
    --num_classes 2 \
    --pre_hm --ltrb_amodal \
    --shift 0.05 --scale 0.05 --same_aug \
    --hm_disturb 0.05 --lost_disturb 0.4 --fp_disturb 0.1 \
    --num_epochs 30 --lr_step 15,25 --save_point 20,25 \
    --gpus 0 --batch_size 4 --num_workers 8

I use the provided crowdhuman_pretrained.pth as pretrained weights. After training, I use the following script for prediction:

python demo.py tracking \
    --load_model ../exp/tracking/my-exp/model_last.pth \
    --demo ../data/test/video-1/img1 \
    --input_h 1600 --input_w 1920 \
    --save_results --num_class 2 \
    --pre_hm --ltrb_amodal --pre_thresh 0.5 --track_thresh 0 \
    --inference --clip_len 2 --trades

I commented out Lines 177/178 in tracker.py:
# for r_ in ret:
#     del r_['embedding']
and set --track_thresh 0 when running demo.py.

However, the results still look bad and the scores are quite low. Could you offer some suggestions? I'd appreciate it!
