This repository has been archived by the owner on May 17, 2023. It is now read-only.

[Urgent]Special layouts of tiled composition cause GPU Hung on 6305E and i3-1115G4 #2955

Open
Nicki-fu opened this issue Jul 5, 2022 · 2 comments

Nicki-fu commented Jul 5, 2022

gpuhung.zip

System information

  • CPU information (cat /proc/cpuinfo | grep "model name" | uniq): 11th Gen Intel(R) Core(TM) i3-1115G4E @ 3.00GHz and Intel(R) Celeron(R) 6305E @ 1.80GHz
  • GPU information (lspci -nn | grep -E 'VGA|isplay'): 00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:9a78] (rev 01)
  • Display server if rendering to display (X or wayland): drm
  • Ubuntu 20.04.3 LTS with 5.10.65-intel-ese-standard-lts, running the MediaSDK sample sample_multi_transcode. The issue reproduces on i3-1115G4 and 6305E; it does not occur on i7-1185G7.

Issue behavior

Describe the current behavior

The customer uses the MediaSDK sample sample_multi_transcode and wants an 8x8 composition for an NVR use case. Not all 8x8 input streams are enabled at the beginning; they are enabled dynamically during playback, and a newly enabled stream may land in any grid of the layout. The layout described here always causes a GPU hang after 2-3 s of playback. It is defined in n64_38_tile.par (in the attachment), and I have drawn it below for a visual look. The composition output surface is 2560x1440, the tile number is 5, the first input stream starts at (640,540), and every input stream occupies 320x179 on the composition output surface. The first 3 lines and the first 2 grids of the fourth line are placeholders that have no input streams at the beginning.
image
image
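The grid positions quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the 320x180 cell size is inferred from dividing the 2560x1440 output by 8 in each direction, and `cell_rect` is a hypothetical helper, not something taken from the par file or from MediaSDK.

```python
# Sketch only: geometry inferred from the numbers quoted in the report.
OUT_W, OUT_H = 2560, 1440                        # composition output surface
COLS, ROWS = 8, 8                                # 8x8 grid
CELL_W, CELL_H = OUT_W // COLS, OUT_H // ROWS    # 320 x 180 per grid cell
STREAM_W, STREAM_H = 320, 179                    # size each stream is drawn at


def cell_rect(index):
    """Return (x, y, w, h) for the stream in grid cell `index` (row-major, 0-based)."""
    row, col = divmod(index, COLS)
    return (col * CELL_W, row * CELL_H, STREAM_W, STREAM_H)


# Three full lines plus two grids are placeholders, so the first live stream is cell 26.
first_live = 3 * COLS + 2
print(cell_rect(first_live))       # (640, 540, 320, 179)
print(ROWS * COLS - first_live)    # 38 streams live at the beginning
```

The 38 live streams spread over the last five lines of the grid, which is consistent with the tile number of 5 and with the "38" in the par file name.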
Workarounds:
The GPU hang can be avoided by either of the methods below:
• First method: change the tile0 y from 540 to 537 (537 = H:179 * 3).
• Second method: change the tile0 H from 179 to an even number, 178 or 180. Please note that other odd numbers such as 177 and 175 still cause the GPU hang.
In addition, the following variations of this layout do not trigger the GPU hang:
• Only the first grid is absent in the fourth line.
• The first 4 grids are absent in the fourth line.
• No grid is absent in the fourth line (all 8 input streams are present).
• The composition is submitted without user-customized tiles.
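The tile0 observations listed above can be tabulated and compared against a simple parity check. This is a sketch, not a diagnosis: `bottom_edge_is_odd` is a hypothetical predicate that happens to match the six reported data points, and nothing in the report confirms it is the actual trigger. `force_even_height` only encodes the second workaround.

```python
# Observations copied from the report: (tile0 y, tile0 H, hang reported?)
OBSERVED = [
    (540, 179, True),    # original layout
    (537, 179, False),   # first workaround
    (540, 178, False),   # second workaround
    (540, 180, False),   # second workaround
    (540, 177, True),    # other odd heights still hang
    (540, 175, True),
]


def bottom_edge_is_odd(y, h):
    """Hypothetical predicate: True when the tile's bottom edge y + h is odd."""
    return (y + h) % 2 == 1


def force_even_height(h):
    """Second workaround: round an odd height down to the nearest even value."""
    return h - (h % 2)


for y, h, hung in OBSERVED:
    print(y, h, "hang" if hung else "ok", "matches" if bottom_edge_is_odd(y, h) == hung else "differs")
```

Six data points are too few to separate this pattern from others, such as a dependence on whether y is a multiple of H, so it should be treated as a starting point for narrowing down the driver behaviour only.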

Describe the expected behavior

No GPU HANG

Debug information

  • What's libva/libva-utils/gmmlib/media-driver/Media SDK version?
    Media Driver: intel-media-21.1.3 / gmmlib:21.1.1 / libva:2.11.0 / MediaSDK: intel-mediasdk-21.1.3
  • Could you confirm whether GPU hardware exist or not by ls /dev/dri?
    by-path card0 renderD128
  • Could you attach dmesg log if it's GPU hang by ``?
    Check attached
  • Could you provide vainfo log if possible by vainfo -a >vainfo.log 2>&1?
    Check attached
  • Could you provide strace log if possible by strace YOUR_CMD >strace.log 2>&1?
  • Could you provide libva trace log if possible? Run cmd export LIBVA_TRACE=/tmp/libva_trace.log first then execute the case.
    Check attached
  • Media SDK tracer output (https://github.com/Intel-Media-SDK/MediaSDK/blob/master/tools/tracer/README.md)?
  • Do you want to contribute a PR? (yes/no):
    -No
Nicki-fu added the bug label Jul 5, 2022
Nicki-fu changed the title from "Special layouts of tiled composition cause GPU Hung on i3-1115G4" to "[Urgent]Special layouts of tiled composition cause GPU Hung on 6305E and i3-1115G4" Sep 14, 2022
@Nicki-fu
Author

Here is the sequence of RenderPicture commands captured from the VA trace:

| Index | Group | target_surface/render_targets | surface | surface_region | output_region | surface_color_standard | output_background_color | output_color_standard | pipeline_flags | filter_flags |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0x00000000 | 0x00000000 | (640,540,1920,179) | (640,540,1920,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 2 | 1 | | 0x00000002 | (0,0,320,179) | (640,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 3 | 1 | | 0x0000000e | (0,0,320,179) | (690,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 4 | 1 | | 0x0000001a | (0,0,320,179) | (1280,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 5 | 1 | | 0x00000026 | (0,0,320,179) | (1600,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 6 | 1 | | 0x00000032 | (0,0,320,179) | (1920,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 7 | 1 | | 0x0000003e | (0,0,320,179) | (2240,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 8 | 2 | 0x00000000 | 0x00000000 | (0,720,2560,179) | (0,720,2560,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 9 | 2 | | 0x0000004a | (0,0,320,179) | (0,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 10 | 2 | | 0x00000056 | (0,0,320,179) | (320,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 11 | 2 | | 0x00000062 | (0,0,320,179) | (640,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 12 | 2 | | 0x0000006e | (0,0,320,179) | (960,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 13 | 2 | | 0x0000007a | (0,0,320,179) | (1280,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 14 | 2 | | 0x00000086 | (0,0,320,179) | (1600,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 15 | 2 | | 0x00000092 | (0,0,320,179) | (1920,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 16 | 2 | | 0x0000009e | (0,0,320,179) | (2240,720,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 17 | 3 | 0x00000000 | 0x00000000 | (0,900,2560,179) | (0,900,2560,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 18 | 3 | | 0x000000aa | (0,0,320,179) | (0,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 19 | 3 | | 0x000000b6 | (0,0,320,179) | (320,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 20 | 3 | | 0x000000c2 | (0,0,320,179) | (640,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 21 | 3 | | 0x000000ce | (0,0,320,179) | (960,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 22 | 3 | | 0x000000da | (0,0,320,179) | (1280,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 23 | 3 | | 0x000000e6 | (0,0,320,179) | (1600,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 24 | 3 | | 0x000000f2 | (0,0,320,179) | (1920,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 25 | 3 | | 0x000000fe | (0,0,320,179) | (2240,900,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 26 | 4 | 0x00000000 | 0x00000000 | (0,1080,2560,179) | (0,1080,2560,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 27 | 4 | | 0x0000010a | (0,0,320,179) | (0,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 28 | 4 | | 0x00000116 | (0,0,320,179) | (320,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 29 | 4 | | 0x00000122 | (0,0,320,179) | (640,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 30 | 4 | | 0x0000012e | (0,0,320,179) | (960,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 31 | 4 | | 0x00000073 | (0,0,320,179) | (1280,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 32 | 4 | | 0x00000146 | (0,0,320,179) | (1600,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 33 | 4 | | 0x00000152 | (0,0,320,179) | (1920,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 34 | 4 | | 0x0000015e | (0,0,320,179) | (2240,1080,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 35 | 5 | 0x00000000 | 0x00000000 | (0,1260,2560,179) | (0,1260,2560,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 36 | 5 | | 0x0000016a | (0,0,320,179) | (0,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 37 | 5 | | 0x00000176 | (0,0,320,179) | (320,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 38 | 5 | | 0x00000182 | (0,0,320,179) | (640,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 39 | 5 | | 0x0000018e | (0,0,320,179) | (960,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 40 | 5 | | 0x0000019a | (0,0,320,179) | (1280,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 41 | 5 | | 0x000001a6 | (0,0,320,179) | (1600,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 42 | 5 | | 0x000001b2 | (0,0,320,179) | (1920,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 43 | 5 | | 0x000001be | (0,0,320,179) | (2240,1260,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 44 | 1 | 0x00000001 | 0x00000001 | (640,540,1920,179) | (640,540,1920,179) | 0 | 0x00000000 | 0 | 0x00000000 | 0x00000000 |
| 45 | 1 | | 0x00000003 | (0,0,320,179) | (640,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 46 | 1 | | 0x0000000f | (0,0,320,179) | (960,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 47 | 1 | | 0x0000001b | (0,0,320,179) | (1280,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 48 | 1 | | 0x00000027 | (0,0,320,179) | (1600,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 49 | 1 | | 0x00000033 | (0,0,320,179) | (1920,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
| 50 | 1 | | 0x0000003f | (0,0,320,179) | (2240,540,320,179) | 0 | 0xff000000 | 1 | 0x00000001 | 0x00000200 |
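A quick consistency check over the trace rows can be sketched as follows: parse each "(x,y,w,h)" region, then confirm that every stream's output_region lies inside its group's tile region and that no two streams in a group overlap. The helper names are hypothetical, and the sample data is copied from rows 1 to 7 above.

```python
import re
from itertools import combinations


def parse_rect(text):
    """Parse '(x,y,w,h)', with or without surrounding quotes, into a tuple of ints."""
    x, y, w, h = map(int, re.findall(r"-?\d+", text))
    return x, y, w, h


def inside(inner, outer):
    """True when rectangle `inner` lies fully within rectangle `outer`."""
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def overlaps(a, b):
    """True when rectangles `a` and `b` share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


# Group 1 of the first frame: row 1 is the tile, rows 2-7 are the streams.
tile = parse_rect('"(640,540,1920,179)"')
streams = [parse_rect(s) for s in (
    "(640,540,320,179)", "(690,540,320,179)", "(1280,540,320,179)",
    "(1600,540,320,179)", "(1920,540,320,179)", "(2240,540,320,179)",
)]

outside = [s for s in streams if not inside(s, tile)]
overlapping = [(a, b) for a, b in combinations(streams, 2) if overlaps(a, b)]
print("outside tile:", outside)
print("overlapping:", overlapping)
```

On this sample the check reports no stream outside the tile and one overlapping pair, caused by the x = 690 in row 3. Row 46 shows 960 for the same slot in the next frame, so it is unclear from the table alone whether 690 is the value the driver received or a slip made when the trace was transcribed.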

@Nicki-fu
Author

tplink_tile_gpuhang.zip
The zip contains the par file and the VA trace log.
The par file can be passed with the -par parameter to MediaSDK's sample app sample_multi_transcode.
