
Cannot reproduce your frame rate on a V100 #38

Open
jiajia131 opened this issue Apr 6, 2024 · 6 comments

@jiajia131 commented Apr 6, 2024

Originally posted by @junjiehe96 in #37 (comment)

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf           Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-32GB           On  | 00000000:07:00.0 Off |                    0 |
| N/A   31C    P0              37W / 250W |   5146MiB / 32768MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

[04/06 17:46:17 d2.evaluation.evaluator]: Inference done 370/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1694 s/iter. Total: 0.2281 s/iter. ETA=0:17:36
[04/06 17:46:22 d2.evaluation.evaluator]: Inference done 391/5000. Dataloading: 0.0022 s/iter. Inference: 0.0567 s/iter. Eval: 0.1699 s/iter. Total: 0.2289 s/iter. ETA=0:17:35
[04/06 17:46:27 d2.evaluation.evaluator]: Inference done 412/5000. Dataloading: 0.0022 s/iter. Inference: 0.0568 s/iter. Eval: 0.1704 s/iter. Total: 0.2296 s/iter. ETA=0:17:33
[04/06 17:46:32 d2.evaluation.evaluator]: Inference done 433/5000. Dataloading: 0.0023 s/iter. Inference: 0.0567 s/iter. Eval: 0.1712 s/iter. Total: 0.2303 s/iter. ETA=0:17:31
[04/06 17:46:37 d2.evaluation.evaluator]: Inference done 457/5000. Dataloading: 0.0022 s/iter. Inference: 0.0565 s/iter. Eval: 0.1706 s/iter. Total: 0.2295 s/iter. ETA=0:17:22
[04/06 17:46:42 d2.evaluation.evaluator]: Inference done 480/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16

May I ask why I cannot reproduce your frame rate on a V100? Using fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml with entirely default settings, I get under 20 FPS.
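
(For context: an Inference time of 0.0564 s/iter corresponds to 1 / 0.0564 ≈ 17.7 FPS, so the evaluator log above is indeed reporting under 20 FPS. A minimal conversion sketch, with the value copied from the log:)

    # Convert detectron2's "Inference: X s/iter" to frames per second.
    # The value below is taken from the evaluator log above (batch size 1).
    inference_s_per_iter = 0.0564
    print(f"{1.0 / inference_s_per_iter:.1f} FPS")  # ~17.7 FPS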
jiajia131 changed the title from "您好,是的" ("Hello, yes") to "Cannot reproduce your frame rate on a V100" Apr 6, 2024
@jiajia131 (Author)

[04/07 04:39:41 d2.evaluation.evaluator]: Total inference time: 0:16:15.725855 (0.195341 s / iter per device, on 1 devices)
[04/07 04:39:41 d2.evaluation.evaluator]: Total inference pure compute time: 0:04:38 (0.055843 s / iter per device, on 1 devices)
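
(For reference: 0.195341 s/iter end-to-end is 1 / 0.195341 ≈ 5.1 FPS including data loading and evaluation, while the pure-compute 0.055843 s/iter is 1 / 0.055843 ≈ 17.9 FPS.)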

@junjiehe96 (Owner)

Could you provide your detailed test script and the log file?

@jiajia131 (Author) commented Apr 7, 2024

Could you provide your detailed test script and the log file?

log.txt
python train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ../fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

Thank you. This was run on a Tesla V100-SXM2-16GB.

@junjiehe96 (Owner)

Could you provide your detailed test script and the log file?

log.txt
python train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ../fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

Thank you. This was run on a Tesla V100-SXM2-16GB.

Could you run the test again with the command below?
CUDA_VISIBLE_DEVICES=0 python tools/analyze_model.py --tasks speed --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml

@jiajia131 (Author)

Could you provide your detailed test script and the log file?

log.txt
python train_net.py --eval-only --num-gpus 1 --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ../fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

Thank you. This was run on a Tesla V100-SXM2-16GB.

Could you run the test again with the command below?
CUDA_VISIBLE_DEVICES=0 python tools/analyze_model.py --tasks speed --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml

CUDA_VISIBLE_DEVICES=0 python tools/analyze_model.py --tasks speed --config-file ./configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml MODEL.WEIGHTS ./fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

With this it is even slower:
[04/07 06:20:28 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /hy-tmp/FastInst-main-240320/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth ...
[04/07 06:20:28 fvcore.common.checkpoint]: [Checkpointer] Loading from /hy-tmp/FastInst-main-240320/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth ...
[04/07 06:20:28 d2.evaluation.evaluator]: Start inference on 5000 batches
[04/07 06:20:31 d2.evaluation.evaluator]: Inference done 11/5000. Dataloading: 0.0149 s/iter. Inference: 0.1241 s/iter. Eval: 0.0000 s/iter. Total: 0.1391 s/iter. ETA=0:11:33
[04/07 06:20:36 d2.evaluation.evaluator]: Inference done 47/5000. Dataloading: 0.0247 s/iter. Inference: 0.1146 s/iter. Eval: 0.0000 s/iter. Total: 0.1394 s/iter. ETA=0:11:30
[04/07 06:20:41 d2.evaluation.evaluator]: Inference done 84/5000. Dataloading: 0.0277 s/iter. Inference: 0.1118 s/iter. Eval: 0.0000 s/iter. Total: 0.1396 s/iter. ETA=0:11:26
[04/07 06:20:46 d2.evaluation.evaluator]: Inference done 120/5000. Dataloading: 0.0276 s/iter. Inference: 0.1127 s/iter. Eval: 0.0000 s/iter. Total: 0.1404 s/iter. ETA=0:11:24
[04/07 06:20:51 d2.evaluation.evaluator]: Inference done 155/5000. Dataloading: 0.0293 s/iter. Inference: 0.1122 s/iter. Eval: 0.0000 s/iter. Total: 0.1416 s/iter. ETA=0:11:25
[04/07 06:20:56 d2.evaluation.evaluator]: Inference done 191/5000. Dataloading: 0.0298 s/iter. Inference: 0.1117 s/iter. Eval: 0.0000 s/iter. Total: 0.1416 s/iter. ETA=0:11:21
[04/07 06:21:01 d2.evaluation.evaluator]: Inference done 226/5000. Dataloading: 0.0296 s/iter. Inference: 0.1122 s/iter. Eval: 0.0000 s/iter. Total: 0.1419 s/iter. ETA=0:11:17
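
(One factor worth ruling out when two timing commands disagree like this: GPU latency measurements are sensitive to warmup and to CUDA's asynchronous execution, so a cold-start or unsynchronized loop can report very different numbers. Below is a minimal synchronized-timing sketch, assuming a generic torch.nn.Module and a matching input tensor; this is not the repo's analyze_model.py.)

    import time
    import torch

    @torch.no_grad()
    def measure_latency(model, inputs, warmup=50, iters=200):
        # Warm up first: cuDNN autotuning, the memory allocator and GPU
        # clocks all make the first iterations slower than steady state.
        model.eval().cuda()
        inputs = inputs.cuda()
        for _ in range(warmup):
            model(inputs)
        torch.cuda.synchronize()   # CUDA is async: flush pending kernels
        start = time.perf_counter()
        for _ in range(iters):
            model(inputs)
        torch.cuda.synchronize()   # wait for the last kernel to finish
        s_per_iter = (time.perf_counter() - start) / iters
        return s_per_iter, 1.0 / s_per_iter  # latency, FPS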

@junjiehe96 (Owner)

This does look a bit odd. That said, frame rate depends on the specific hardware and software environment the model runs in, so you could benchmark other methods on the same machine for a fair comparison.
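
(When benchmarking other methods on the same machine, it also helps to record the software stack next to the hardware, since PyTorch, CUDA and cuDNN versions all affect inference speed. A small sketch of such an environment report, using only standard PyTorch APIs:)

    import torch

    # Record the software/hardware context that inference speed depends on.
    print("torch:", torch.__version__)
    print("CUDA (torch build):", torch.version.cuda)
    print("cuDNN:", torch.backends.cudnn.version())
    print("GPU:", torch.cuda.get_device_name(0))
    print("cudnn.benchmark:", torch.backends.cudnn.benchmark)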
