Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-31689 Add checkfilesize to daliadmin #18582

Merged

Conversation

jakesmith
Copy link
Member

@jakesmith jakesmith commented Apr 25, 2024

Add a command ('checkFileSize'), that given a logical filename or a mask, scans logical files for parts with missing @SiZe attributes, reports them, and optionally fixes them. See HPCC-31662 for background.

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-31689

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR
Assigning user: [email protected]

printf(" dfspart <logicalname> <part> -- get meta information for part num\n");
printf(" checksuperfile <superfilename> [fix=true|false] -- check superfile links consistent and optionally fix\n");
printf(" checksubfile <subfilename> -- check subfile links to parent consistent\n");
printf(" checkfilesize <logicalfilemask> [fix=true|false] -- check file size attributes and optionally fix");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: this is the new command, the other changes here are only alphabetic reordering

@jakesmith jakesmith requested a review from ghalliday April 25, 2024 09:18
@jakesmith jakesmith force-pushed the HPCC-31689-fixfilesize branch from 6e0f471 to 2999667 Compare April 25, 2024 10:36
@@ -148,6 +149,8 @@ int main(int argc, const char* argv[])
StringBuffer tmps;
for (int i=1;i<argc;i++) {
const char *param = argv[i];
if (startsWith(param, "--"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: this is added so --config=/etc/config/.. can be added to the command line, which will be needed on a AKS system, to pickup plane configuration (from global) for path striping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add param to params here ?
params.append(param);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - I see now loadConfiguration() uses argv so we just skip param here.

@jakesmith jakesmith requested a review from mckellyln April 25, 2024 10:45
offset_t partSize;
try
{
partSize = part.getFileSize(true, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont really need to call getFileSize() if fix == false ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, I think I was originally thinking of logging it even when fix=false, but.. I'll move it in the if (fix) block.

Copy link
Contributor

@mckellyln mckellyln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - just one comment that maybe we don't need to call getFileSize() if we are not fixing it or printing it.

@jakesmith jakesmith requested a review from mckellyln April 25, 2024 13:11
@mckellyln
Copy link
Contributor

@jakesmith looks good - squash-worthy

@jakesmith jakesmith force-pushed the HPCC-31689-fixfilesize branch from c756d2b to 19a25f9 Compare April 25, 2024 13:42
@jakesmith
Copy link
Member Author

@mckellyln - squashed.

@mckellyln
Copy link
Contributor

Good to merge

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakesmith a couple of questions.

}
if (!locked)
{
part.lockProperties(30000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be protected by the catch as well, in case there is a weird timeout?? I think very unlikely.
...
Fine as it stands - the outer catch is sufficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be better in the catch I agree. I'll move.

if (!locked)
{
part.lockProperties(30000);
locked = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is using knowledge of the internal implementation to avoid locking multiple times. Should it lock the file instead to make it explicit??
also, there should be a corresponding unlockProperties.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, locking just the file isn't really right either - that would also be using knowledge of internal implementation (i.e. that locking file equates to part attributes being protected too).
I will modify.

also, there should be a corresponding unlockProperties.

there is, via COnScopeExit ensureUnlock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, locking just the file isn't really right either - that would also be using knowledge of internal implementation (i.e. that locking file equates to part attributes being protected too).
I will modify.

it was also wrong (but harmless) in the sense that the underlying implementation relies on knowing if each part is 'dirty', so should lock each part as it goes (I've changed it). It's a NOP though, if the file is already locked (which I've changed it to be).

break;
else if (timer.elapsedCycles() >= queryOneSecCycles()*10) // log every 10 secs
{
PROGLOG("Processed %u files", count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this tracing become excessive (e.g. if run as an esp service)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is never intended to be run as a service - it's very specific to the missing @SiZe issue..
I think if it ever becomes something else, excessive tracing/other issues can be looked at then.

@jakesmith jakesmith requested a review from ghalliday April 26, 2024 10:34
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakesmith please squash

Add a command ('checkFileSize'), that given a logical filename
or a mask, scans logical files for parts with missing @SiZe
attributes, reports them, and optionally fixes them.
See HPCC-31662 for background.

Signed-off-by: Jake Smith <[email protected]>
@jakesmith jakesmith force-pushed the HPCC-31689-fixfilesize branch from 6e19bb4 to 96bbb26 Compare April 26, 2024 12:15
@jakesmith
Copy link
Member Author

@ghalliday - squashed.

@ghalliday ghalliday merged commit 88e39ff into hpcc-systems:candidate-9.4.x Apr 26, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants