HPCC-32241 Temp size for sort to use disk size and track actual graph temp disk usage #18883

shamser · 2024-07-16T13:18:48Z

SizePeakTempDisk for graphs was previously calculated by summing each
activity's active temp file size. However, as this was done at intervals,
spikes in temp file sizes between intervals were not recorded in the
SizePeakTempDisk stats. This change uses a TempFileSize tracker at the
graph level to track the actual peak/active temp file size for the graph.

Sort activities were using the size of the uncompressed data as the size
of the temp file. This change uses the actual size of the temp file for
the sort activity's peak/active temp file size.
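
To illustrate the difference between interval sampling and event-driven tracking described above, here is a minimal C++ sketch. It is not the code from this PR: `SizeTracker`, `PeakComparison` and `comparePeaks` are hypothetical names, and the real `CFileSizeTracker` may differ in interface and thread-safety. It only assumes that each activity tracker forwards grow/shrink events to a graph-level tracker.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a temp file size tracker (not the real CFileSizeTracker).
class SizeTracker
{
    SizeTracker *parent;   // optional tracker one level up, e.g. the graph's tracker
    uint64_t active = 0;   // bytes of temp file currently on disk
    uint64_t peak = 0;     // highest value 'active' has reached
public:
    explicit SizeTracker(SizeTracker *_parent = nullptr) : parent(_parent) {}
    void grow(uint64_t sz)
    {
        active += sz;
        peak = std::max(peak, active);
        if (parent)
            parent->grow(sz);   // push the event up so the graph sees every change
    }
    void shrink(uint64_t sz)
    {
        assert(sz <= active);
        active -= sz;
        if (parent)
            parent->shrink(sz);
    }
    uint64_t queryActive() const { return active; }
    uint64_t queryPeak() const { return peak; }
};

struct PeakComparison
{
    uint64_t sampledPeak;  // peak seen by summing activity sizes at intervals
    uint64_t trackedPeak;  // peak seen by the graph-level tracker
};

PeakComparison comparePeaks()
{
    SizeTracker graph;
    SizeTracker sortAct(&graph), joinAct(&graph);

    uint64_t sampledPeak = 0;
    // Models the old approach: sum each activity's active size when stats are gathered.
    auto sample = [&]() {
        sampledPeak = std::max(sampledPeak, sortAct.queryActive() + joinAct.queryActive());
    };

    sortAct.grow(100);
    sample();
    joinAct.grow(400);     // short-lived spike between two samples
    joinAct.shrink(400);
    sample();
    sortAct.shrink(100);
    sample();

    return { sampledPeak, graph.queryPeak() };
}
```

In this sketch the sampled peak never sees the 400-byte spike, whereas the graph-level tracker records it because every grow/shrink is pushed up as it happens.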

github-actions · 2024-07-16T13:19:05Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32241

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR

jakesmith · 2024-07-16T14:02:36Z

@shamser - could you update the commit message/PR message to describe what the problem was caused by and how it was fixed? It will aid the review process and future change history.

jakesmith

I think this has essentially changed from recalculating the peak in serializeStats, to introducing a size tracker in the graph and pushing up the grow/shrink stats to the subgraph.
That makes sense afaics (see comment re. child graphs).

Please update the commit message/PR message to contain a description of the bug, and how this addresses it.

jakesmith · 2024-07-16T17:12:38Z

thorlcr/graph/thgraph.hpp

@@ -1175,7 +1192,9 @@ class graph_decl CActivityBase : implements CInterfaceOf<IThorRowInterfaces>, im
    CFileSizeTracker * queryTempFileSizeTracker()
    {
        if (!tempFileSizeTracker)
-            tempFileSizeTracker.setown(new CFileSizeTracker);
+        {
+            tempFileSizeTracker.setown(new CFileSizeTracker(queryGraph().queryTempFileSizeTracker()));


I think this needs to use queryGraph().queryParent(), because in a child query, queryGraph() will be the child graph, not the subgraph that's tracking the usage. queryGraph().queryParent() should always point to the subgraph, no matter what the depth of the nested child graphs.

NB: note to self - CGraphBase::queryParent() and CGraphBase::queryOwner() are counter-intuitive. It would have made more sense if queryOwner() was the top-level graph (the subgraph) and queryParent() was the activity's containing graph (whether that be the subgraph or a nested child graph).
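
As a rough illustration of the concern about nested child graphs, the following C++ sketch uses hypothetical `Tracker` and `Graph` types (not the real CGraphBase or CFileSizeTracker). It assumes each graph holds a pointer to its immediately containing graph, and that only the top-level subgraph's tracker is reported. Chaining an activity's tracker to its containing child graph leaves the subgraph's peak untouched, while chaining to the top-level graph records it.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical tracker that forwards growth to an optional parent tracker.
struct Tracker
{
    Tracker *parent = nullptr;
    uint64_t active = 0;
    uint64_t peak = 0;
    void grow(uint64_t sz)
    {
        active += sz;
        peak = std::max(peak, active);
        if (parent)
            parent->grow(sz);
    }
};

// Hypothetical graph node; 'owner' is the immediately containing graph
// (null for the top-level subgraph). This naming is an assumption for the sketch.
struct Graph
{
    Graph *owner = nullptr;
    Tracker tracker;
    // Walk up through any depth of nesting to the top-level subgraph.
    Graph &topLevel()
    {
        Graph *g = this;
        while (g->owner)
            g = g->owner;
        return *g;
    }
};

// Returns the peak recorded on the top-level subgraph after an activity
// in a doubly nested child graph spills 64 bytes.
uint64_t subgraphPeakAfterSpill(bool chainToTopLevel)
{
    Graph subgraph;
    Graph loopBody;
    loopBody.owner = &subgraph;
    Graph nested;
    nested.owner = &loopBody;

    Tracker activity;
    activity.parent = chainToTopLevel ? &nested.topLevel().tracker : &nested.tracker;
    activity.grow(64);
    return subgraph.tracker.peak;
}
```

Here `subgraphPeakAfterSpill(false)` leaves the subgraph at zero because the spill is only recorded on the nested child graph, which is the situation the comment above is warning about.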

Do you have a spilling child graph test you can test this with?

I've attached loopsort2.ecl to the parent JIRA in case useful. It should spill in each iteration of the loop.
I've had a very brief look at the SizePeakTempDisk numbers:

sg1 (subgraph):
<attr kind='SizePeakTempDisk' value='166813325' formatted='159.084MB' unit='sz' ctype='thor' creator='[email protected]' ts='2024-07-17T08:40:07.780Z'/>

sg5 (child loop graph):
<attr kind='SizePeakTempDisk' value='132142234' formatted='126.020MB' unit='sz' ctype='thor' creator='[email protected]' ts='2024-07-17T08:40:07.780Z'/>

sg5:a7 (sort act in child graph):
<attr kind='SizePeakTempDisk' value='29424207' formatted='28.060MB' unit='sz' ctype='thor' creator='[email protected]' ts='2024-07-17T08:40:07.780Z'/>

I think the numbers should be broadly the same, i.e. it's not clear why sg5's number is significantly higher.
NB: the loop result may also spill, but I don't think it's currently being tracked.

jakesmith

@shamser Looks good.

#18883 (review)

commit/PR message needs updating.

jakesmith

@shamser - looks good. Please squash, keeping the new commit message in the squashed commit and add it to the PR message.

temp disk usage

SizePeakTempDisk for graphs was previously calculated by summing each
activity's active temp file size. However, as this was done at intervals,
spikes in temp file sizes between intervals were not recorded in the
SizePeakTempDisk stats. This change uses a TempFileSize tracker at the
graph level to track the actual peak/active temp file size for the graph.

Sort activities were using the size of the uncompressed data as the size
of the temp file. This change uses the actual size of the temp file for
the sort activity's peak/active temp file size.

Signed-off-by: Shamser Ahmed <[email protected]>

shamser · 2024-07-18T08:48:21Z

SizePeakTempDisk for graphs was previously calculated by summing each
activity's active temp file size. However, as this was done at intervals,
spikes in temp file sizes between intervals were not recorded in the
SizePeakTempDisk stats. This change uses a TempFileSize tracker at the
graph level to track the actual peak/active temp file size for the graph.

Sort activities were using the size of the uncompressed data as the size
of the temp file. This change uses the actual size of the temp file for
the sort activity's peak/active temp file size.

jakesmith · 2024-07-18T08:53:50Z

@shamser - can you also copy the commit message into this PR description, i.e. above the checkbox text (which is done automatically when pushing the 1st commit)?

Done now. @jakesmith

jakesmith

@shamser - tagging you back re. #18883 (comment)

jakesmith

@shamser - looks good

shamser force-pushed the issue32241 branch from 66478c4 to 0b38505 July 16, 2024 13:19

shamser changed the title ~~HPCC-32241 Fix for SizePeakTempDisk sort and graph~~ HPCC-32241 Fix SizePeakTempDisk sort and graphs Jul 16, 2024

shamser requested review from ghalliday and jakesmith July 16, 2024 13:49

shamser marked this pull request as ready for review July 16, 2024 13:49

jakesmith reviewed Jul 16, 2024

shamser requested a review from jakesmith July 17, 2024 10:50

jakesmith reviewed Jul 17, 2024

shamser force-pushed the issue32241 branch from ca25764 to 3a66c7e July 17, 2024 12:26

shamser changed the title ~~HPCC-32241 Fix SizePeakTempDisk sort and graphs~~ HPCC-32241 Temp size for sort to use disk size and track actual graph temp disk usage Jul 17, 2024

shamser requested a review from jakesmith July 17, 2024 12:27

jakesmith reviewed Jul 18, 2024

shamser force-pushed the issue32241 branch from 3a66c7e to 582c0eb July 18, 2024 08:47

shamser requested a review from jakesmith July 18, 2024 08:48

jakesmith reviewed Jul 18, 2024

shamser requested a review from jakesmith July 18, 2024 10:14

jakesmith approved these changes Jul 18, 2024

ghalliday approved these changes Jul 18, 2024

ghalliday merged commit 4a2ee9a into hpcc-systems:candidate-9.6.x Jul 18, 2024
49 checks passed
