-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-31984 Capture additional stats from CSmartRowBuffer temp files #18831
Conversation
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-31984 Jirabot Action Result: |
1a2cce2
to
713bff9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - looks good, but I think some thread safety issues.
@@ -424,6 +424,13 @@ class CSmartRowBuffer: public CSimpleInterface, implements ISmartRowBuffer, impl | |||
{ | |||
return this; | |||
} | |||
virtual unsigned __int64 getStatistic(StatisticKind kind) const override | |||
{ | |||
if (tempFileIO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think theoretically thread unsafe, the other thread could be assigning to the pointer, as this is testing it. I think should really be protected with an atomic. In practice on Intel it may be ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - I don't think this one has been addressed, still thread unsafe?
virtual unsigned __int64 getStatistic(StatisticKind kind) const | ||
{ | ||
unsigned __int64 v = inactiveStats.queryStatistic(kind).get(); | ||
if (currentOutputIFileIO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't look thread safe.
An existing currentOutputIFileIO could be destroyed (in createNextOutputStream()) after this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - please see new comments, and also this old one : https://github.com/hpcc-systems/HPCC-Platform/pull/18831/files#r1759147511
@@ -1433,6 +1433,8 @@ class CDistributorBase : implements IHashDistributor, implements IExceptionHandl | |||
|
|||
virtual void mergeStats(CRuntimeStatisticCollection &stats) const | |||
{ | |||
if (piperd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have missed this one before, but is this thread safe?
Thread 1: stop()->disconnect()->piperd.clear()
Thread 2: [passed reading pointer from Owned], piperd could be cleared by T1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put a criticalblock around this as well.
thorlcr/thorutil/thbuf.cpp
Outdated
if (currentOutputIFileIO) | ||
{ | ||
CriticalBlock b(critCurrentOutputIFileIO); | ||
if (currentOutputIFileIO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think strictly speaking, the 1st check needs to be atomic for it to be thread safe, otherwise it could be reordered.
But it's not worth it. Better to just enter mutex and test once - this is not highly contended, i.e. make it:
virtual unsigned __int64 getStatistic(StatisticKind kind) const
{
unsigned __int64 v = inactiveStats.queryStatistic(kind).get();
CriticalBlock b(critCurrentOutputIFileIO);
if (currentOutputIFileIO)
v += currentOutputIFileIO->getStatistic(kind);
return v;
}
426179c
to
28bf41f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - looks good, but needs rebasing to resolve clash
@@ -146,6 +147,7 @@ class CSmartRowBuffer: public CSimpleInterface, implements ISmartRowBuffer, impl | |||
} | |||
if (!tempFileIO) { | |||
SpinUnblock unblock(lock); | |||
CriticalBlock block(critTmpFileIO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using AtomicShared for tempFileIO would be an alternative here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - looks good please squash.
But we should probably not be targeting 9.6 for this kind of change.
Please target against 9.8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser looks good. Please squash.
One comment on naming.
Also, recent experience has taught us to be very careful about calling set/setown inside a critical section because releasing the old object can be expensive. A better alternative is to use swap. In this case I think the only thread that would be blocked would be the stats reporting - so it is unlikely to be an issue.
@jakesmith are we completely happy with this going in 9.6.x?
@@ -95,6 +95,7 @@ class CDistributorBase : implements IHashDistributor, implements IExceptionHandl | |||
size32_t fixedEstSize; | |||
Owned<IRowWriter> pipewr; | |||
Owned<ISmartRowBuffer> piperd; | |||
mutable CriticalSection critPiperd; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
random general comment on naming:
piperdCrit is a more natural name than critPiperd (and more consistent with other variables e.g. fixedEstSize) - because piperd is an adjective/qualifier for the critical section, and in english qualifiers come before the noun. Not worth changing in this PR.
I suggested previously (#18831 (review)) that it should be retargeted to 9.8 (arguably shouldn't go into 9.8 either, but separate discussion). |
* Capture stats such as StCycleSpillElapsedCycles and StTimeSpillElapsed from temp file * Have CSmartRowBuffer use StSizeDiskWrite from tempFileIO for noteSize (should mean actual disk size used for size tracking) Signed-off-by: Shamser Ahmed <[email protected]>
@jakesmith @ghalliday Squashed & retargeted. |
Jirabot Action Result: |
Type of change:
Checklist:
Smoketest:
Testing: