Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-31213 Add plane name to MoveExternalFile, etc #18275

Merged
merged 1 commit into from
Feb 22, 2024

Conversation

wangkx
Copy link
Member

@wangkx wangkx commented Feb 2, 2024

The STD.File.MoveExternalFile moves files from one landing zone folder to anothor. The existing method can only specify a landing zone using host name. In this PR, a landing zone can be specified by plane name.
Similar functions are added to STD.File.CreateExternalDirectory and STD.File.DeleteExternalFile.

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

github-actions bot commented Feb 2, 2024

@wangkx wangkx requested a review from jakesmith February 3, 2024 01:05
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - please see comments.

@@ -2726,15 +2726,19 @@ FILESERVICES_API char * FILESERVICES_CALL fsfResolveHostName(const char *hostna
return ret.detach();
}

static void checkExternalFileRights(ICodeContext *ctx, CDfsLogicalFileName &lfn, bool rd,bool wr)
static void checkExternalFileRights(ICodeContext *ctx,CDfsLogicalFileName &lfn,bool checkScopePermissions,bool rd,bool wr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "checkScopePermissions" is confusing. It suggests if false, not to check scope perms at all.

I think it would better and clearer to remove this option here, and in checkExternalFilePath, and instead perform any checks where it is not clear in the caller, and always ensure the lfn is to the scope/directory, and call getScopePermissions always.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

dlfn.setExternal(host,path); //for backward compatibility
dlfn.getExternalFilename(rfn);

if (checkScopePermissionIfDir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see prev. comment. I think this should be done in implementMoveExternalFile, ensuring the lfn is always to the dir.

Owned<IPropertyTree> plane = checkPlaneOrHost(planename,location);
RemoteFilename fromrfn,torfn;
checkExternalFilePath(ctx,plane,location,frompath,true,true,true,fromrfn);
checkExternalFilePath(ctx,plane,location,topath,false,false,true,torfn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comments, but if frompath is a directory, it implies topath should be too right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. And the access to the directory (not a parent) should be checked.

FILESERVICES_API void FILESERVICES_CALL fsDeleteExternalFile(ICodeContext * ctx,const char *location,const char *path)
{
SocketEndpoint ep(location);
if (ep.isNull())
throw MakeStringException(-1,"fsDeleteExternalFile: Cannot resolve location %s",location);
CDfsLogicalFileName lfn;
lfn.setExternal(location,path);
checkExternalFileRights(ctx,lfn,false,true);
checkExternalFileRights(ctx,lfn,false,false,true); //The path could be a directory. So, it may need to be checked as a scope. Will re-visit this when adding plane name here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. CreateExternalDirectory also implies it must be a directory.
I think all cases need to ensure that the lfn passed to checking function is a directory.

@wangkx
Copy link
Member Author

wangkx commented Feb 9, 2024

@jakesmith I fixed one regress test failure in f592bac. That makes me wondering if I should do the similar changes for other exception related code I added. But, the old code also throws some exceptions.
With the 2 new commits, there still exists a regress test failure when running despray.ecl. I think it is because I change the code from the 'rfn.setPath(ep,path)' to 'lfn.getExternalFilename(rfn);' for path '/var/lib/HPCCSystems/mydropzone/../W20240208-185950-persons'. Please advise.

@wangkx
Copy link
Member Author

wangkx commented Feb 9, 2024

Looks like the commit 7834bcb fixes the regress test failures.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - returning to you for now.
I think approach should be changed with regard to validating a path that may be a dir..
And, want to check this is not duplicating existing functionality like validateDZPathScopePermsAndLegacyPhysicalPerms.

RemoteFilename rfn;
lfn.getExternalFilename(rfn);
Owned<IFile> file = createIFile(rfn);
if (!file->exists())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly, I think this is a security leak. It means unauthorized users can discover which files do/don't exist.

It should not leak this information for unauthorized users.
To overcome this, I think you need to assume it's a file and validate the parent scope 1st.
Then check if directory, and only then check the tail if it is.

Owned<IPropertyTree> plane = checkPlaneOrHost(planename,location);
RemoteFilename fromrfn,torfn;
checkExternalFilePath(ctx,plane,location,frompath,true,true,fromrfn);
checkExternalFilePath(ctx,plane,location,topath,false,true,torfn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't thought about too deeply, but feels like dropzone vs paths within dropzones checking + perm. checking functionality should already be covered by existing implemented functionality e.g. validateDZPathScopePermsAndLegacyPhysicalPerms etc.

Plus, see other comment re. if if unknown whether file or dir, assume file, and check user has perms to access parent dir. 1st, before returning checking returning error that file doesn't exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateDZPathScopePermsAndLegacyPhysicalPerms

@wangkx I realize methods like validateDZPathScopePermsAndLegacyPhysicalPerms are currently in SMCLib and use IEspContext's and therefore inaccessible to the fileservices plugin.

But if this is a common functional need, that currently requires very similar functionality to be duplicated, it suggests it should be commoned up somewhere, where it is accessible by both.

It may be that it is best for this PR to leave it a duplicated functionality, but open a JIRA and revisit to refactor to avoid the duplication/future confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened https://track.hpccsystems.com/browse/HPCC-31280 to refactor to avoid the duplication/future confusion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments to that JIRA.

@wangkx wangkx force-pushed the h31213 branch 2 times, most recently from 69bfa64 to 6267042 Compare February 13, 2024 13:23
@@ -2848,19 +2841,11 @@ FILESERVICES_API void FILESERVICES_CALL fsDeleteExternalFile(ICodeContext * ctx
throw makeStringExceptionV(-1,"Cannot resolve location %s",location);
CDfsLogicalFileName lfn;
lfn.setExternal(location,path);
checkExternalFileRights(ctx,lfn.get(),false,true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re. this previous comment : #18275 (comment)

It looks like it now, passes the full filename here (e.g. scope1::scope2::filename), then scope checks it, including "filename" tail.

The previous comment was suggesting, that since you do not know whether the tail is a directory or a filename, you should assume it's a filename and scope check the parent ("scope1::scope2"), only then once you have proved the user has access to that scope, should you check if the tail ("filename" in this case) is a directory, if it is, you need to scope check that too, if not, you are done.

//If the path is a folder, check if it can be accessed.
//If the path is a file, the LDAP will skip the file layer (because it is undefined in the scope access setting)
//and check if its scope can be accessed.
checkExternalFileRights(ctx,dlfn.get(),rd,wr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below (related to previous comment : #18275 (comment))

If this function is to be used in context where you do not know whether the user supplied path is a path to a directory, or a path to a file, you need to validate access to the parent scope e.g. "scope1::scope2" from "scope1::scope2::tail", then check what 'tail' is, and then only if it is a directory, scope check it.

@wangkx wangkx force-pushed the h31213 branch 2 times, most recently from 3c0b3cf to 7b3114f Compare February 13, 2024 17:08
@wangkx
Copy link
Member Author

wangkx commented Feb 13, 2024

@jakesmith I squashed the old commits into 1f7152e. In ad1b09e, I changed the code to pass different scopes to checkExternalFileRights(). Since the code still throws the exception if a file is not found and the user can access the scope, a smote test failure is occurred as expected. The 7b3114f is added to fix the failure. Please review.

@wangkx wangkx requested a review from jakesmith February 13, 2024 21:17
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - please see comments.

if (rd && !f->exists())
{
//The path could be a file or a directory. If the path is a file, the LDAP will skip the file layer
//(because it is undefined in the scope access setting) and check if its scope can be accessed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it is undefined in the scope access setting

it will still look it up to find it's not there. i.e. for every call that reaches here, there will be an unwanted LDAP lookup of the file name.

I think this code (lines 2773 -> 2788) should be:

    Owned<IFile> f = createIFile(rfn);
    if (f->exists() && (f->isDirectory()==fileBool::foundYes))
        checkExternalFileRights(ctx,dlfn.get(),rd,wr);
    else
    {
        StringBuffer scopes;
        checkExternalFileRights(ctx,dlfn.getScopes(scopes),rd,wr);
    }

i.e. if it's a directory, validate the full scope. Otherwise, assume it's a file (whether it exists or not), and validate ignoring the tail.

The standard physical validation (of 'frompath' and 'topath'), will be done by implementMoveExternalFile - it will throw a file not found, or IFile::move error, or pickup physical permission problems etc.

Owned<IPropertyTree> plane = checkPlaneOrHost(planename,location);
RemoteFilename fromrfn,torfn;
checkExternalFilePath(ctx,plane,location,frompath,true,true,fromrfn);
checkExternalFilePath(ctx,plane,location,topath,false,true,torfn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments to that JIRA.

static void implementMoveExternalFile(ICodeContext *ctx,const char *location,
const char *frompath,const char *topath,const char *planename)
{
Owned<IPropertyTree> plane = checkPlaneOrHost(planename,location);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be using findDropZonePlane or similar.
MoveExternalFile should be restricted to moving files in dropzones.

This should be consistent with behaviour elsewhere, e.g. the way we lockdown sprays to valid dropzones only, with the exception in bare-metal when useDropZoneRestriction=false.
This should also be refactored to use common code, but can wait for https://track.hpccsystems.com/browse/HPCC-31280)

}
}

static IPropertyTree *checkPlaneOrHost(const char *planeName,const char *host)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment where used, I think in next refactoring, this new method should go.

ecllibrary/std/File.ecl Show resolved Hide resolved
@wangkx wangkx requested a review from jakesmith February 15, 2024 14:27
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - pl ease see latest comments.

checkExternalFileRights(ctx,lfn.get(),false,true);
s.appendf("fsDeleteExternalFile: File %s not found.')",path);
WUmessage(ctx,SeverityInformation,nullptr,s.str());
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not clear why this is coded differently to MoveExternalFile.
This is check file exists 1st, and then checking the full scope (including the supposed filename tail).

Afaics, it should be coded in the same way that moveExternalFile is, i.e. only check full scope if (f->exists() && (f->isDirectory()==fileBool::foundYes)).

In fact, why not use checkExternalFilePath (with plane=nullptr) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But - this JIRA/PR is titled "Add plane parameter to STD.File.MoveExternalFile"

You should either:

  1. not make changes to DeleteExternalFile in this PR, and make it a separate JIRA/PR, OR
  2. Change title/comments, and add a plane parameter to DeleteExternalFile at same time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakesmith in the existing commits, both DeleteExternalFile and CreateExternalDirectory are changed because the changes in the checkExternalFileRights(). I do not want the changes in the checkExternalFileRights() to break those 2 functions. The original intention is not adding a plane parameter in those functions in this PR. If you want, I can roll back the changes in DeleteExternalFile and CreateExternalDirectory. Please let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakesmith in the existing commits, both DeleteExternalFile and CreateExternalDirectory are changed because the changes in the checkExternalFileRights(). I do not want the changes in the checkExternalFileRights() to break those 2 functions. The original intention is not adding a plane parameter in those functions in this PR. If you want, I can roll back the changes in DeleteExternalFile and CreateExternalDirectory. Please let me know.

@wangkx - Given the knock on impact change to MoveExternalFile has had, and the required changes to DeleteExternalFile and CreateExternalDirectory, and the desire to add plane to those anyway, I would make them all consistent and add a plane parameter to both of these in this PR (and retitle the JIRA and commit.

plugins/fileservices/fileservices.cpp Show resolved Hide resolved
@wangkx wangkx changed the title HPCC-31213 Add plane name to MoveExternalFile HPCC-31213 Add plane name to MoveExternalFile, etc Feb 16, 2024
@wangkx
Copy link
Member Author

wangkx commented Feb 16, 2024

@jakesmith looks like that the smoke test failure is because of the code:

static void checkExternalFilePath(ICodeContext *ctx,IPropertyTree *plane,const char *host,
    const char *path,bool rd,bool wr,RemoteFilename &rfn)
{
    if (containsRelPaths(path)) //Detect a path like: a/../../../f
        throw makeStringExceptionV(-1,"Invalid file path %s",path);

please advise. Should I add a try/catch to switch the exception to a log?

@wangkx
Copy link
Member Author

wangkx commented Feb 17, 2024

The smoke test failure 'Invalid file path' is fixed by 8a70be3. But, the code got another smoke test failure (see below) which may happen if the smoke test runs on a cloud env or a bare metal env with isDropZoneRestrictionEnabled.

Error:
188. Test: despray.ecl
'--- despray.xml
+++ despray.xml
@@ -1,3 +1,4 @@
+4294967295RoxieDropZone Plane not found for host . path /var/lib/HPCCSystems/mydropzona/W20240217-030601-1-persons.

@wangkx wangkx requested a review from jakesmith February 17, 2024 12:41
@jakesmith
Copy link
Member

@jakesmith looks like that the smoke test failure is because of the code:

static void checkExternalFilePath(ICodeContext *ctx,IPropertyTree *plane,const char *host,
    const char *path,bool rd,bool wr,RemoteFilename &rfn)
{
    if (containsRelPaths(path)) //Detect a path like: a/../../../f
        throw makeStringExceptionV(-1,"Invalid file path %s",path);

please advise. Should I add a try/catch to switch the exception to a log?

@wangkx - I see the last commit has moved the relative check from within checkExternalFilePath to callers....
What exactly failed (and why) ?

@jakesmith
Copy link
Member

The smoke test failure 'Invalid file path' is fixed by 8a70be3. But, the code got another smoke test failure (see below) which may happen if the smoke test runs on a cloud env or a bare metal env with isDropZoneRestrictionEnabled.

Error: 188. Test: despray.ecl '--- despray.xml +++ despray.xml @@ -1,3 +1,4 @@ +4294967295RoxieDropZone Plane not found for host . path /var/lib/HPCCSystems/mydropzona/W20240217-030601-1-persons.

I ran with your branch + despray.ecl.
The above error is being fired from DeleteExternalFile.
That's because despray.ecl is trying to delete files for the invalid tests (i.e. those that are specificially outside of a dropzone). See DestFile6.

The test is wrong, these files will never have been created.
The test(s) should be fixed.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - see comments/questions

@wangkx
Copy link
Member Author

wangkx commented Feb 20, 2024

I see the last commit has moved the relative check from within checkExternalFilePath to callers....
What exactly failed (and why) ?

It happened when running the following code in despray.ecl. Because the DestFile4 contains the 'mydropzone/./', the code before that commit throws: 4294967295RoxieInvalid file path /var/lib/HPCCSystems/mydropzone/./W20240216-193252-5-persons

// This should fail based on '/./' used in target path
DestFile4 := dropzonePath + './' + DestFileName;
dst4 := NOFOLD(DATASET([{SourceFile, DestFile4, SrcAddrIp, True, '', ''}], rec));

p4 := PROJECT(NOFOLD(dst4), t(LEFT));

c4 := CATCH(NOFOLD(p4), ONFAIL(TRANSFORM(rec,
SELF.result := 'Fail',
SELF.destFile := DestFile4,
SELF.sourceFile := SourceFile,
SELF.ip := SrcAddrIp,
SELF.allowOverwrite := True,
SELF.msg := FAILMESSAGE
)));
#if (VERBOSE = 1)
o4 := output(c4);
#else
o4 := output(c4, {result});
#end

@wangkx
Copy link
Member Author

wangkx commented Feb 20, 2024

That's because despray.ecl is trying to delete files for the invalid tests (i.e. those that are specificially outside of a dropzone). See DestFile6.The test is wrong, these files will never have been created. The test(s) should be fixed.

I think that those tests were added intentionally (including the tests for DestFile4 and DestFile6). I am not sure which tests should be fixed.

@wangkx
Copy link
Member Author

wangkx commented Feb 20, 2024

I guess I should remove the following code from the despray.ecl.

/// FileServices.DeleteExternalFile('.', DestFile4),
/// FileServices.DeleteExternalFile('.', DestFile5),
/// FileServices.DeleteExternalFile('.', DestFile6),
/// FileServices.DeleteExternalFile('.', DestFile7),
/// FileServices.DeleteExternalFile('.', DestFile8),

@wangkx wangkx force-pushed the h31213 branch 3 times, most recently from 9250073 to b0c6a71 Compare February 20, 2024 14:51
@@ -2750,9 +2750,6 @@ static void checkExternalFileRights(ICodeContext *ctx,const char *scope,bool rd,
static void checkExternalFilePath(ICodeContext *ctx,IPropertyTree *plane,const char *host,
const char *path,bool rd,bool wr,RemoteFilename &rfn)
{
if (containsRelPaths(path)) //Detect a path like: a/../../../f
throw makeStringExceptionV(-1,"Invalid file path %s",path);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - Now the rogue DeleteExternalFile's are removed (commented out), that was the root cause of the regression suite failure, is this movement of this relative check correct?
If not should move back.
If it is, when does it cause failures?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relative check has been moved back to here.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - looks fine. Please squash.

The STD.File.MoveExternalFile moves files from one
landing zone folder to anothor. The existing method
can only specify a landing zone using host name. In
this PR, a landing zone can be specified by plane
name. Also add plane name to DeleteExternalFile and
CreateExternalDirectory.

Revise based on review:
1. Change the checkExternalFileRights() to call getScopePermissions always.
2. Pass correct scopes to the checkExternalFileRights().
3. In the checkExternalFilePath(), if file->exists(), assume it's a
file and validate ignoring the tail.
4. If plane name is not specified, use the findDropZonePlane() and
useDropZoneRestriction.
5. In ecllibrary/std/File.ecl, make the planename optional for the
MoveExternalFile().
6. Add a comment for checkPlaneOrHost()
7. Fix smoke test failures: In the despray.ecl, remove the DeleteExternalFile
calls for malformed filenames and add a comment.

Signed-off-by: wangkx <[email protected]>
@wangkx
Copy link
Member Author

wangkx commented Feb 21, 2024

The commits are squashed.
@ghalliday I think the windows build failure is not related to this PR. Please check and merge this PR.
This is a dependent PR for #18292,

@ghalliday
Copy link
Member

@wangkx please can you document the changes in the jira

@wangkx
Copy link
Member Author

wangkx commented Feb 22, 2024

@ghalliday the changes are documented inside https://track.hpccsystems.com/browse/HPCC-31213. If I miss anything, please let me know.

@ghalliday ghalliday merged commit fb740b6 into hpcc-systems:master Feb 22, 2024
47 of 48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants