
Debug and Profiling Tools

Neil Widmaier edited this page Oct 4, 2022 · 1 revision

CPU Profiler

Version History
v1.00 - Initial Documentation

Description
The Cpu Profiler, found under the menu "Profile → Cpu Profiler", provides CPU timing information, including the time spent working on RHI threads and the time spent waiting for the GPU to finish its work. Currently only a fixed set of timings is collected; nothing dynamic (passes, scopes, etc.) is timed or displayed. The values displayed are instantaneous and represent the timings of the frame prior to the one in which they are rendered. Work may need to be done in the future to allow smoothing of the data.

image2020-6-18_9-52-39

Test Cases
Since the data is not smoothed out over frames, some tests may require examining a single screenshot of the panel as opposed to just looking at it live.

Test Case 1 -
Ensure that the Cpu Profiler window can be displayed by choosing the "Cpu Profiler" menu item under "Profile".

Test Case 2 -
Ensure that the Cpu Profiler window displays the same way (other than the specific numbers) for every RHI. Some RHIs (like Vulkan on mobile) may have a different set of items under "Queue Executes".

Test Case 3 -
Ensure that none of the numbers are "wildly" out of bounds.

  • No timing value should be larger than "Frame to Frame Time"
  • The sum of all "Frame Scheduler" values should be less than or equal to "Frame to Frame Time"
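The two bounds above can be checked mechanically once the values have been read off a screenshot of the panel. The following is a minimal sketch, not part of the profiler itself: the function name, the dictionary layout and the timing labels used in the usage note are hypothetical.

```python
def check_cpu_timings(frame_to_frame_ms, timings_ms, frame_scheduler_keys):
    """Return a list of violations; an empty list means both bounds hold.

    frame_to_frame_ms    -- the "Frame to Frame Time" value, in milliseconds
    timings_ms           -- every other timing on the panel, as {label: ms}
    frame_scheduler_keys -- labels in timings_ms that belong to "Frame Scheduler"
    """
    problems = []

    # Bound 1: no single timing may exceed the frame-to-frame time.
    for label, value in timings_ms.items():
        if value > frame_to_frame_ms:
            problems.append(f"{label} ({value} ms) exceeds Frame to Frame Time")

    # Bound 2: the Frame Scheduler values together must fit within one frame.
    scheduler_total = sum(timings_ms[key] for key in frame_scheduler_keys)
    if scheduler_total > frame_to_frame_ms:
        problems.append(
            f"Frame Scheduler sum ({scheduler_total} ms) exceeds Frame to Frame Time"
        )

    return problems
```

For example, calling `check_cpu_timings(16.0, {"Compile": 5.0, "Execute": 6.0}, ["Compile", "Execute"])` returns an empty list, while raising either value above 16.0 reports a violation. The labels "Compile" and "Execute" are placeholders; use the labels that actually appear in the window.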

Platform Support
All platforms should support displaying the profiler.

Culling Debug Window

Description
The Culling Debug Window lets you see what the culling system is doing in real time, as well as toggle some debugging variables.

See Render Culling feature documentation for a more detailed description of the current state of the culling system. It is a work in progress and subject to change.

The Culling Debug window should look something like this:
image2020-6-25_11-4-45

There are stats displayed for each view, showing how many objects are visible over how many total objects are in that view. When culling is enabled, the number of visible objects will go down as non-visible objects are removed from the draw lists by the culling system.

Test Cases
For all of these test cases, unless otherwise noted, open up the Features/ShadowedBistro sample to begin, and then open up the Culling Debug Window. The ShadowedBistro sample is a good scene that has lots of objects and many views.

Since the ShadowedBistro sample is subject to change at any time, none of these tests require specific numbers to pass. We are just verifying that the basic behavior is intact.

Test Case 1: basic window behavior
Ensure that the Culling Debug window opens and closes via the "Culling"→"Culling Debug Window" menu item.

Ensure the window is resizable, and that scrollbars appear and scrolling works when the window is small.

Test Case 2: basic frustum culling
Ensure "Enable Frustum Culling" is checked and "Freeze Frustums" is unchecked.

Maneuver the camera so it is above the scene (use WASD to move, and hold Shift to move faster).

Point the camera at the sky, observe the number of visible objects, and then point the camera at the middle of the scene, and observe the number of visible objects. There should be fewer visible objects when pointing the camera at the sky.

Note: the number of visible objects in the SpotLightShadowViews will not change even if the camera moves. This is normal.

Test Case 3: verify we can enable/disable frustum culling
Toggle "Enable Frustum Culling" off.

The number of visible objects should go to the maximum.

Toggle "Enable Frustum Culling" back on.

The number of visible objects should go back down to the previous number.

Test Case 4: verify we can freeze the frustums for debugging purposes
Toggle "Freeze Frustums" on.

Move the camera away from the current position.

Observe that there is a wireframe frustum drawn on the screen, extruding from where the camera used to be.

Observe that objects outside of that frozen frustum are not visible, and that the number of visible objects stays constant.

Toggle "Freeze Frustums" back off.

Observe that the frozen wireframe frustum disappears, and that culling behavior goes back to normal.

Note: objects that have any part of their bounding box partially intersecting the frustum will still draw in their entirety. This is normal. For now, we just need to verify that objects well outside the frozen frustum are not drawn.
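The behavior in the note above is what a conservative box-versus-frustum test produces: an object is rejected only when its whole bounding box lies behind at least one frustum plane. The sketch below illustrates that general technique and is not Atom's actual culling code. It assumes each plane is given as `(nx, ny, nz, d)` with the normal pointing into the frustum, so a point is inside when `nx*x + ny*y + nz*z + d >= 0`.

```python
def aabb_outside_plane(box_min, box_max, plane):
    """True if the whole axis-aligned box lies behind the plane."""
    nx, ny, nz, d = plane
    # Pick the box corner that lies farthest along the plane normal.
    # If even that corner is behind the plane, the whole box is.
    px = box_max[0] if nx >= 0 else box_min[0]
    py = box_max[1] if ny >= 0 else box_min[1]
    pz = box_max[2] if nz >= 0 else box_min[2]
    return nx * px + ny * py + nz * pz + d < 0


def is_visible(box_min, box_max, frustum_planes):
    """Conservative test: keep the box unless one plane rejects it entirely."""
    return not any(
        aabb_outside_plane(box_min, box_max, plane) for plane in frustum_planes
    )
```

A box that straddles a plane is never fully behind it, so it is kept and drawn in full. This is why only objects well outside the frozen frustum are expected to disappear.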

Note: the frozen frustum debug drawing is currently a little wonky: the solid portions of the frustum clip against the far and near planes, creating odd looking clipping artifacts. Ignore them for now.

Test Case 5: verify debug drawing
Toggle "Debug Draw" on.

Observe that yellow wireframe boxes are drawn around each renderable object.

Toggle "Show Bounding Spheres" on.

Observe that a gray sphere is drawn around each renderable object.

Toggle "Debug Draw" off, but leave "Show Bounding Spheres" on.

Verify that both the yellow boxes and gray spheres no longer render.

Test Case 6: verify correct bounding volumes
Restart BaseViewer. Open the RPI/Mesh sample, and open the Culling Debug Window.

Select materials/defaultpbr.azmaterial for the material, and select objects/shaderball_simple.azmodel for the model.

Toggle on "Debug Draw" and verify that the yellow box is a tight fit around the model: image2020-6-25_17-10-30

If the box is too big, or too small, report it as a test failure.

Next, toggle on "Show Bounding Spheres" and verify that the sphere is a tight fit around the yellow box: image2020-6-25_17-11-51
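As a rough reference for what "tight" means here, the smallest sphere that encloses an axis-aligned box is centered at the box center and has a radius of half the box diagonal, so it touches the box corners. The sketch below computes that sphere. Whether Atom derives its bounding spheres from the box in exactly this way is an assumption.

```python
import math


def bounding_sphere_of_aabb(box_min, box_max):
    """Return (center, radius) of the smallest sphere enclosing the box."""
    center = tuple((lo + hi) / 2 for lo, hi in zip(box_min, box_max))
    # The farthest points from the center are the corners, at half the diagonal.
    radius = math.dist(box_min, box_max) / 2
    return center, radius
```

For a cube with a side length of 2 the radius is the square root of 3 (about 1.73). A sphere that is visibly much larger than the box diagonal, or that cuts through the box corners, would suggest a problem.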

GPU Profiler

Version History
v1.00 - Initial documentation

Description
The GPU Profiler gives insight into the GPU workload. Each pass's workload is expressed as a timestamp and as a visual representation relative to a target frame rate (e.g. 30 or 60 FPS).

In Atom, a scope is a context that represents work executed by the GPU. Scopes are owned by render or compute passes, and a pass can consist of one or more scopes. The GPU Profiler visualizes timestamps at pass granularity, meaning that the timestamps from all of a pass's scopes are accumulated to display the total time it took to execute all the GPU work of that pass. Passes have a hierarchical structure: a pass that has one or more child passes is a parent pass.
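The accumulation of scope timestamps into a per-pass total can be sketched as follows. This is an illustration only: the tuple layout and the pass names in the usage note are hypothetical, and the sketch does not use Atom's actual data structures.

```python
from collections import defaultdict


def pass_durations(scope_timestamps):
    """Sum the GPU time of every scope into a total per owning pass.

    scope_timestamps -- iterable of (pass_name, begin_ns, end_ns), one per scope
    """
    totals = defaultdict(int)
    for pass_name, begin_ns, end_ns in scope_timestamps:
        totals[pass_name] += end_ns - begin_ns
    return dict(totals)
```

With two scopes owned by a pass named "Forward" lasting 100 ns and 50 ns, the profiler would be expected to report a single 150 ns entry for that pass.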

The GPU Profiler can be accessed via Profile → GPU Profiler.

Test Cases
gpu_querysystem

Test Case 1:
Open the GPU Profiler window. A view similar to the screenshot shown above should appear without any warnings or errors.

Test Case 2:
A list of all active parent, render and compute passes should appear under RenderPass Names. The pass entries are initially shown in a hierarchical view, which can be changed by selecting the Flat option. Two buttons should appear above the render pass names and timestamps. These allow the flat view to be sorted alphabetically by name or by time. Pressing the same button twice reverses the sort order (e.g. pressing the RenderPass Names button when the list is already sorted alphabetically sorts it in reverse alphabetical order).
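The expected sort behavior of the flat view can be modeled with a small state machine: pressing a new button sorts by that column, and pressing the same button again reverses the order. The sketch below is a hypothetical model for reasoning about the test, not the profiler's code. It assumes the first press on a button sorts in ascending order, which should be confirmed against the actual window.

```python
class FlatPassList:
    """Models the flat pass list and its two sort buttons."""

    def __init__(self, entries):
        # entries: list of (pass_name, time_ns) tuples
        self.entries = list(entries)
        self.sort_key = None
        self.descending = False

    def press(self, key):
        """Press the "name" or "time" button and return the resulting name order."""
        if key == self.sort_key:
            # Same button pressed again: invert the current order.
            self.descending = not self.descending
        else:
            # Different button: switch column (assumed to start ascending).
            self.sort_key = key
            self.descending = False
        column = 0 if key == "name" else 1
        self.entries.sort(key=lambda entry: entry[column], reverse=self.descending)
        return [name for name, _ in self.entries]
```

Pressing "name" twice and then "time" once should yield an alphabetical list, then the reverse alphabetical list, then a list ordered by time.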

Test Case 3:
Expanding the Advanced options should reveal two options. The timestamp option lets the user switch the time unit between milliseconds and nanoseconds, and the Frame load option changes the workload bars to represent each render pass's workload relative to the selected target frame rate.

Switching between milliseconds and nanoseconds should change the unit, but the displayed time should still represent the same duration.

Going from 30 FPS → 60 FPS should double the workload percentage, and going from 60 FPS → 30 FPS should halve it.
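The doubling and halving follow from the frame budget: at 60 FPS a frame has half as much time available as at 30 FPS, so the same pass time takes up twice the share. The sketch below assumes the workload percentage is the pass time divided by the frame budget of the selected target frame rate. The function names are hypothetical.

```python
def frame_load_percent(pass_time_ns, target_fps):
    """Share of the frame budget used by a pass, as a percentage."""
    frame_budget_ns = 1_000_000_000 / target_fps
    return 100.0 * pass_time_ns / frame_budget_ns


def ns_to_ms(time_ns):
    """Convert nanoseconds to milliseconds; the duration itself is unchanged."""
    return time_ns / 1_000_000
```

A pass that takes 4,000,000 ns (4 ms) uses about 12% of a 30 FPS frame and about 24% of a 60 FPS frame.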

Test Case 4:
The search field, which can be found above the RenderPass Names, should allow the user to filter passes by name. The list will consist of all passes whose names fully or partially match the search text. In the Hierarchical view, the parent passes of each matching pass will also be listed to preserve the hierarchy.
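The expected result of filtering in the Hierarchical view can be sketched as a recursive prune: a pass is kept if its own name matches, or if any of its descendants is kept. This is a model for checking the test by hand, not the profiler's implementation. The nested-dictionary layout is hypothetical, and case-insensitive matching is an assumption.

```python
def filter_pass_tree(node, search_text):
    """Return a pruned copy of the pass tree, or None if nothing matches.

    node -- {"name": str, "children": [node, ...]}
    """
    needle = search_text.lower()
    kept_children = []
    for child in node["children"]:
        kept = filter_pass_tree(child, needle)
        if kept is not None:
            kept_children.append(kept)
    # Keep this pass if it matches itself or if it is an ancestor of a match.
    if needle in node["name"].lower() or kept_children:
        return {"name": node["name"], "children": kept_children}
    return None
```

Searching for "shadow" in a tree where a pass named "ShadowPass" sits under "Forward", which in turn sits under "Root", should keep all three entries and drop unrelated siblings.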

Test Case 5:
When changing samples in the BaseViewer, the GPU timestamps should scale accordingly. Complex and GPU-heavy scenes should show higher timings. For example, the Shadow Example is a good environment to confirm that shadow-related passes show correct results.

Platform Support
Any platform that supports queries

PassTree

Version History
Version 1.0. June 18th, 2020.

Description
PassTree is a debug menu which can be used to view all the passes and their hierarchies.

It also has some functions related to pass attachments:

  • View each pass's attachments with some detailed information.
  • Preview any image attachment of a pass.
  • Save any attachment of a pass to a file. An image attachment will be saved as a dds file, and a buffer attachment will be saved as a file of raw data.

Users can access the PassTree debug menu via Pass→PassTree in BaseViewer or BaseViewerLauncher. The PassTree menu can also be accessed in the LY Editor by entering game mode and then pressing the Home key.

Screenshots
Screenshot 1: open PassTree debug menu
image2020-6-16_13-53-37

Screenshot 2: Enable showing pass attachments
image2020-6-16_13-54-22

Screenshot 3: Preview selected pass image attachment on screen
image2020-6-16_13-55-14

Screenshot 4: after saving the selected image attachment
image2020-6-16_13-55-51

Screenshot 5: preview another selected image attachment on screen
image2020-6-16_13-56-42

Debug Menu
When the PassTree menu is opened, two ImGui views are shown on the screen. The first view contains various options, and the second view displays the pass tree.
image2020-6-18_15-48-5

Show Pass Attachments. Toggling this option on displays the attachments of each pass in the pass tree view. The pass attachment info has the following format:

[SlotType] [SlotName] [AttachmentType] [AttachmentName] [AttachmentSize]

Preview Attachment. Toggling this option on displays the selected pass attachment in the bottom-left corner of the screen. Note: the swapchain pass's image attachments can't be previewed.

Save Attachment. Click this button to save the currently selected attachment to a file. An image attachment will be saved as a dds file. A buffer attachment will be saved as a .buffer file. The path of the last saved file is shown in the text under this button.

Test Cases

Test Case 1
The menu can be opened on all supported platforms. 'Preview Attachment' should work with any image attachment except the attachments of the SwapChainPass. The 'Save Attachment' button should work on PC, Mac and Linux, but not on mobile or console platforms.

Platform Support
This debug menu should work on all supported platforms.

TransientAttachmentProfiler

Description
The transient attachment profiler provides information about transient allocations that happen during a frame. Transient allocations are memory assignments that do not outlive the frame. Because of this property, the same memory space can be reused for multiple allocations.

Transient allocations can be of 3 types: Image, Buffer or Rendertarget. Each resource is part of a heap, which represents the block of memory where the resource is allocated from. Each heap belongs to a pool. The pool can contain as many heaps as it needs in order to satisfy the demand for transient resources.

Atom currently uses one transient attachment pool to allocate all transient resources. Most RHI implementations create 1 heap per resource type, which means that 3 heaps will show in the profiler.
image2020-6-22_0-56-35

Memory Section
Describes global stats of the transient attachment pool.

  • Strategy: The transient pool supports 3 modes: Fixed, Paging and Memory Hint. At the moment there's no way to change the memory strategy without modifying the code. The default mode is "Fixed".
  • Buffer Memory: Transient memory used for buffers.
  • Image Memory: Transient memory used for images.
  • Rendertarget Memory: Transient memory used for render targets.
  • Total Memory: The total transient memory used by the pool.

Heaps Section
Lists all heaps that the pool is using. For each heap, it shows the name and an arrow to access more information.
image2020-6-22_1-3-10

Heap Details Section
For each heap, all resource allocations are shown in the following way:

The first green column shows the total memory of the heap. The full height represents the memory space, with address 0 at the top and the end of the heap's memory at the bottom.

The blue boxes at the top represent the scopes of the framegraph, with the first scope at the far left and the last scope at the far right. The lifetime of each transient attachment spans a range between two scopes. Blue boxes denote graphics scopes.

Transient attachment resources are represented by the boxes between the scopes and the heap memory. The width of a box shows the lifespan of the resource (i.e. from which scope to which scope the resource is used). The height of a box represents the amount of heap memory the resource uses. The top of the box is the beginning of its memory range, and the bottom is the end. The taller the box, the more memory the resource uses. The wider the box, the longer the lifespan of the resource. Aliasing of resources can be inspected visually by looking for resources that use the same memory region.

Since there's not a lot of space to show a lot of details, hovering over the boxes will provide the extra information that the user may want to know.

Heap Memory details:

  • Type: The resource type the heap supports.
  • Size: The amount of total memory allocated by the heap.
  • Watermark: The max amount of memory used by the heap.
  • Waste: Percentage of the total heap memory that is not used.
    image2020-6-22_1-15-54
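The relationship between Size, Watermark and Waste can be sketched as follows. This assumes that the memory counted as "not used" is everything above the watermark. The source does not state the exact formula, so treat this as a plausibility check rather than the profiler's definition.

```python
def heap_waste_percent(size_bytes, watermark_bytes):
    """Percentage of the heap that was never used (assumed: above the watermark)."""
    if size_bytes <= 0:
        raise ValueError("heap size must be positive")
    return 100.0 * (size_bytes - watermark_bytes) / size_bytes
```

Under this assumption, a 1024-byte heap with a 768-byte watermark would report a waste of 25%.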

Scope details:

  • Id: The name of the scope.
  • Hardware Queue Class: The type of scope (graphics, compute or copy).
    image2020-6-22_1-19-0

Resource details:

  • Id: The name of the resource.
  • Heap Begin: The address in the heap where the resource begins.
  • Heap End: The address in the heap where the resource ends.
  • Size: The number of bytes used by the resource.
  • Scope Begin: The scope where the resource starts being used.
  • Scope End: The scope where the resource stops being used.
    image2020-6-22_1-20-40
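The resource details are enough to reason about aliasing: two resources alias each other when their memory ranges in the heap overlap while their scope lifetimes do not. The sketch below is a hypothetical helper for reasoning about the values shown in the tooltips, not part of the profiler. It assumes Heap End is exclusive, and that Scope Begin and Scope End are inclusive scope indices in execution order.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class TransientResource:
    id: str
    heap_begin: int   # first byte used in the heap
    heap_end: int     # one past the last byte (assumed exclusive)
    scope_begin: int  # index of the first scope using the resource
    scope_end: int    # index of the last scope using the resource (inclusive)


def memory_overlaps(a, b):
    return a.heap_begin < b.heap_end and b.heap_begin < a.heap_end


def lifetime_overlaps(a, b):
    return a.scope_begin <= b.scope_end and b.scope_begin <= a.scope_end


def find_aliased_pairs(resources):
    """Pairs that share heap memory at different times (valid aliasing)."""
    return [
        (a.id, b.id)
        for a, b in combinations(resources, 2)
        if memory_overlaps(a, b) and not lifetime_overlaps(a, b)
    ]
```

Two resources that overlap both in memory and in lifetime would not be valid aliasing. In the heap view they would appear as boxes drawn on top of each other.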

Since the number of transient resources is high and the space to show them is limited, the profiler has a zoom function for inspecting smaller allocations. To zoom, focus the details area of a heap and, while holding the CTRL key, use the mouse wheel to zoom in and out. When zoomed in, two scroll bars appear for vertical and horizontal scrolling. You can also scroll up and down using just the mouse wheel.

image2020-6-22_1-24-25

Test Cases

Test Case 1
Ensure that the Transient Attachment Profiler window can be displayed by choosing the "Transient Attachment Profiler" menu item under "Profile".

Test Case 2
Ensure that the global "Memory" section shows values for Buffer, Image and Rendertarget memory.

Test Case 3
Check that one or more heaps appear when opening the "Heaps" section.

Test Case 4
Check that each heap has a name.

Test Case 5
Open the details section of the heap and check that the memory box (green), the scope boxes (blue) and the resource boxes (red or purple) display correctly.

Test Case 6
Hover over the different boxes and check that each of them displays the details described in the previous section and changes color to show that it is being highlighted.