Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-30809 Add ESP WsSasha service and implement 3 methods #18065

Merged
merged 1 commit into from
Dec 19, 2023

Conversation

wangkx
Copy link
Member

@wangkx wangkx commented Nov 21, 2023

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

@wangkx wangkx requested a review from jakesmith November 21, 2023 21:13
@wangkx
Copy link
Member Author

wangkx commented Nov 21, 2023

@jakesmith I created this PR based on the comments on #18018.
There are 3 commits here. The 1st commit is a copy of c2ac378. The 2nd one contains the changes related to the SashaAccess settings for bare-metal environment. Those changes should be in the 1st commit. Inside the last commit, the GetVersion is implemented.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - please see feedback.

It's probably worth a 1:1 chat about this before moving forward.

@@ -76,6 +76,7 @@
<AuthenticateFeature description="Access to sign ECL code" path="CodeSignAccess" resource="CodeSignAccess" service="ws_codesign"/>
<AuthenticateFeature description="Access to LogAccess service" path="WsLogAccess" resource="WsLogAccess" service="ws_logaccess"/>
<AuthenticateFeature description="Access to DFS service" path="DfsAccess" resource="DfsAccess" service="ws_dfsservice"/>
<AuthenticateFeature description="Access to HPCC Archive" path="SashaAccess" resource="SashaAccess" service="ws_sasha"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should be adding this new service to bare-metal at all (or the ability to configure it via configmgr).
At least for now, bare-metal should continue to work as it has, and continue to be able to use the existing CLI, direct to sasha.
The main impetus behind this change is that in cloud, the sasha services cannot be directly reached anymore, and access to them must go via an esp service.

For now, let's ensure the bare-metal sasah CLI continues to work directly via a command line provided sasha 'server', but also (for cloud) the CLI should support being given a service=, which talks to the new esp service.

@@ -40,6 +40,7 @@
#define ECLWATCH_DFU_EX_ACCESS_DENIED ECLWATCH_ERROR_START+14
#define ECLWATCH_FILE_SPRAY_ACCESS_DENIED ECLWATCH_ERROR_START+15
#define ECLWATCH_FILE_DESPRAY_ACCESS_DENIED ECLWATCH_ERROR_START+16
#define ECLWATCH_SASHA_ACCESS_DENIED ECLWATCH_ERROR_START+17
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: not only one, but would be better if these were consistently indented and aligned.

context.ensureFeatureAccess(FEATURE_URL, SecAccess_Read, ECLWATCH_SASHA_ACCESS_DENIED, "WSSashaEx.GetVersion: Permission denied.");

StringBuffer results;
Owned<INode> node = createSashaNode(wuArchiverType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

future: this isn't too costly, but it is worth considering setting these up once and not having to walk through the configuration finding the service, and creating the INode each time. But ensuring they are updated on configuraton changes only, i.e. using a ConfigUpdateFunc (e.g. like the one in validateTargetName, in TpContainer.cpp)

Don't change now though.

esp/services/ws_sasha/ws_sashaservice.cpp Show resolved Hide resolved
outBuffer.appendf("Sasha server[%s]: Version %s", host.str(), id.str());
}

extern SARUNCMD_API void runSashaCommand(SashaCommandAction action, INode *node, StringBuffer &outBuffer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this will be wanted, I think it will be better to define the sasha commands and expose instead.
Either as exposed functions, or contained in a class, so that it can be reused with common settings.
e.g.:

class CSashaCmdExecutor
{
    Linked<INode> node;
    unsigned defaultTimeoutMs;
    StringBuffer nodeText;
    const char *getNodeText()
    {
        if (0 == nodeText.length())
            node->endpoint().getEndpointHostText(nodeText);
        return nodeText;
    }

    bool restoreWorkunit(const char *wuid, bool dfu, StringBuffer &serverResponse)
    {
        Owned<ISashaCommand> cmd = createSashaCommand();
        cmd->setAction(SCA_RESTORE);
        if (dfu)
            cmd->setDFU(true);
        cmd->addId(wuid);
        if (!cmd->send(node, defaultTimeoutMs))
            throw makeStringExceptionV(-1, "Could not connect to Sasha server on %s", getNodeText());
        if (cmd->numIds()==0)
        {
            // nothing restored
            return false;
        }
        cmd->getId(0, serverResponse);
        return true;
    }
    bool archiveWorkunit(const char *wuid, bool dfu, bool deleteAfterArchiving, StringBuffer &serverResponse)
    {
        Owned<ISashaCommand> cmd = createSashaCommand();
        cmd->setAction(SCA_ARCHIVE);
        if (dfu)
            cmd->setDFU(true);
        cmd->addId(wuid);
        if (!cmd->send(node, defaultTimeoutMs))
            throw makeStringExceptionV(-1, "Could not connect to Sasha server on %s", getNodeText());
        if (cmd->numIds()==0)
        {
            // nothing restored
            return false;
        }
        cmd->getId(0, serverResponse);
        return true;
    }

public:
    CSashaCmdExecutor(INode *_node, unsigned _defaultTimeoutSecs=60) : node(_node)
    {
        defaultTimeoutMs = _defaultTimeoutSecs * 1000;
    }
    CSashaCmdExecutor(const SocketEndpoint &ep, unsigned _defaultTimeoutSecs=60)
    {
        node.setown(createINode(ep));
        defaultTimeoutMs = _defaultTimeoutSecs * 1000;
    }
    CSashaCmdExecutor(const char *server, unsigned port=0, unsigned _defaultTimeoutSecs=60)
    {
        SocketEndpoint ep(server, port);
        node.setown(createINode(ep));
        defaultTimeoutMs = _defaultTimeoutSecs * 1000;
    }
    StringBuffer &getVersion(StringBuffer &version)
    {
        Owned<ISashaCommand> cmd = createSashaCommand();
        cmd->setAction(SCA_GETVERSION);
        if (!cmd->send(node, defaultTimeoutMs))
            throw makeStringExceptionV(-1, "Could not connect to Sasha server on %s", getNodeText());
        if (!cmd->getId(0, version))
            throw makeStringExceptionV(-1, "Sasha server[%s]: Protocol error", getNodeText());
    }
    bool restoreECLWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return restoreWorkunit(wuid, false, serverResponse);
    }
    bool restoreDFUWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return restoreWorkunit(wuid, true, serverResponse);
    }
    bool archiveECLWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return archiveWorkunit(wuid, false, true, serverResponse);
    }
    bool archiveDFUWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return archiveWorkunit(wuid, true, true, serverResponse);
    }
    bool backupECLWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return archiveWorkunit(wuid, true, false, serverResponse);
    }
    bool backupDFUWorkUnit(const char *wuid, StringBuffer &serverResponse)
    {
        return archiveWorkunit(wuid, true, false, serverResponse);
    }
};

then the service code can use/call these methods directly.

#define SARUNCMD_API DECL_IMPORT
#endif

extern SARUNCMD_API void runSashaCommand(SashaCommandAction action, INode *node, StringBuffer &outBuffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above. I think it would be better to clearly define the commands and their parameters/results, and expose those.

Then the service can use them.
And the ECM can map to them.

@@ -374,23 +375,11 @@ ISashaCommand *createCommand(unsigned argc, char* argv[], SocketEndpoint &server
return cmd.getClear();
}

bool getVersion(INode *node)
void getVersion(INode *node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comments,
It would be better if each command implementation was contained in saruncmd.

The interaction with a directly accessible sasha 'server' should continue to work as it does now for the time being.
But there should also be a way to interact with the new esp service via the CLI, e.g. by specifying "service=" instead of "server="

The implementation of this CLI to the service will necessarily have to look quite different, using the new saruncmd's.
It's a good reason to add just a few command for now, in order to get the structure right.
Ultimately, "createCommand" may need refactoring, since it's doing serveral things:

  1. parsing the command line
  2. creating an ISashaCommand as a repository for the parsed command (which also implements server<>client send/receipt)
  3. interactive behaviour (via 'confirm').

Continuing to use it to parse the command line into a ISashaCommand is probably fine/pragmatic, which can then be used to decide what/how to run selected saruncmd's after that when talking to the esp service.

@wangkx
Copy link
Member Author

wangkx commented Dec 4, 2023

@jakesmith please review

@wangkx wangkx requested a review from jakesmith December 4, 2023 20:57
@wangkx wangkx changed the title HPCC-30809 Add ESP WsSasha service and implement GetVersion HPCC-30809 Add ESP WsSasha service and implement 4 methods Dec 4, 2023
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - looks good in general, please see feedback.

#include "sacmd.hpp"
#include "saruncmd.hpp"

class CSashaCmdExecutor : implements ISashaCmdExecutor, public CInterface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: CInterfaceOf is more efficient/saves a vmt.
and you then don't need the IMPLEMENT_IINTERFACE below either.

dali/sasha/saruncmd.cpp Show resolved Hide resolved

class CSashaCmdExecutor : implements ISashaCmdExecutor, public CInterface
{
Linked<INode> node;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: see other comment re. single ctor

if a single ctor that creates a INode, it would be more logical if this was a Owned.

Linked<INode> node;
unsigned defaultTimeoutMs = 60;
StringBuffer nodeText;
const char *getNodeText()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: as this is a class, it would probably be better to make this a const, and fetch the text in the ctor.

throw makeStringExceptionV(-1, "Sasha server[%s]: Protocol error", getNodeText());
return version;
}
bool restoreECLWorkUnit(const char *wuid, StringBuffer &serverResponse)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth revisiting in future.
Ideally, this command interface, wouldn't see arbitrary response messages coming directly back to the caller.
The server would instead ensure a consistent response of success/failure, a failure would be thrown as an error, and/or if there was auxiliary info to be related, it would be separate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps for now, to clear up the interface, consider storing the response in a member, e.g. lastServerMessage,
removing from the interface, and allowing the caller to query it if it wants it.

Owned<ISashaCmdExecutor> executor = createSashaCommandExecutor(wuType == CWUTypes_ECL ? wuArchiverType : dfuwuArchiverType);

StringBuffer results;
bool sashaCmdReturn = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is sashaCmdSuccess more intuitive?

void setResults(const char* wuid, const char* action, StringBuffer& results);

public:
IMPLEMENT_IINTERFACE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary

ESPmethod [auth_feature("SashaAccess:FULL")] ArchiveWU(ArchiveWURequest, ResultResponse); //archive ECL WUs or DFU WUs
ESPmethod [auth_feature("SashaAccess:FULL")] RestoreWU(RestoreWURequest, ResultResponse); //restore ECL WUs or DFU WUs
//The ArchiveWU deletes the workunit from Dali whereas BackupWU leave it in place.
ESPmethod [auth_feature("SashaAccess:FULL")] BackupWU(BackupWURequest, ResultResponse);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the cli, it makes sense for archive and backup to be separate.
From the service point of view, the only difference is the whether it deletes as well as archiving, I think it would be clearer if this was one service method, with an optional 'deleteOnSuccess' option.

context.ensureFeatureAccess(FEATURE_URL, SecAccess_Read, ECLWATCH_SASHA_ACCESS_DENIED, "WSSashaEx.GetVersion: Permission denied.");

StringBuffer results;
Owned<ISashaCmdExecutor> executor = createSashaCommandExecutor(wuArchiverType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps instead of recreating per command, there should be a dfu + ecl instances of the command executor, as members?

@@ -137,6 +137,10 @@ This is required by its binding with ESP service '<xsl:value-of select="$espServ
<xsl:with-param name="bindingNode" select="$bindingNode"/>
<xsl:with-param name="authNode" select="$authNode"/>
</xsl:apply-templates>
<xsl:apply-templates select="." mode="ws_sasha">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add any bare-metal changes at this point.
This should be primarily for cloud, and test there.
I think you'll need to add WsSasha to eclwatch/applications.yaml - but you may need to check with @afishbeck

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove this code block.
The WsSasha is already in the eclwatch/applications.yaml.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, please test in a k8s setup

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - looks close, please see minor comments.

return version;
}
bool restoreECLWorkUnit(const char *wuid, StringBuffer &serverResponse)
virtual StringBuffer &getLastServerMessage() const override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: more conventional to either return a const char * and for method to be queryXXX,
or provide a StringBuffer reference to copy into and return it (e.g. like getVersion).
And would be good if getLastServerMessage + getVersion followed same pattern.

bool sashaCmdSuccess = false;
if (req.getDeleteOnSuccess())
{
if ((wuType == CWUTypes_ECL))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: unnecessary extra ( ) (also on line 108)

else
sashaCmdSuccess = executor->backupDFUWorkUnit(isEmptyString(wuid) ? "*" : wuid);
}
setResults(sashaCmdSuccess ? executor->getLastServerMessage().str() : nullptr, wuid, "archived", resp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure setResults + the server message semantics are very clear.
Perhaps break it down to setSuccess and setFailure method so it is explicit when/what it is setting in the response and why.

@wangkx wangkx requested a review from jakesmith December 14, 2023 15:36
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - looks good. Please squash.

@wangkx wangkx force-pushed the h30809v01 branch 2 times, most recently from eb27162 to 78e568e Compare December 14, 2023 19:18
@wangkx wangkx changed the title HPCC-30809 Add ESP WsSasha service and implement 4 methods HPCC-30809 Add ESP WsSasha service and implement 3 methods Dec 14, 2023
@wangkx wangkx requested a review from jakesmith December 14, 2023 21:48
@wangkx
Copy link
Member Author

wangkx commented Dec 14, 2023

@jakesmith squashed.

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - looks good.

@ghalliday
Copy link
Member

@jakesmith Just to check, is 9.4 the correct target for this, or would master be better?
For 9.4: It is unlikely to break anything, and ? is needed sooner rather than later
For 9.6: It is new functionality.

@jakesmith
Copy link
Member

@jakesmith Just to check, is 9.4 the correct target for this, or would master be better? For 9.4: It is unlikely to break anything, and ? is needed sooner rather than later For 9.6: It is new functionality.

It is new functionality and unlikely to break anything, but I think master would be more appropriate.
@wangkx - please retarget.

@jakesmith jakesmith self-requested a review December 18, 2023 10:06
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangkx - please retarget to master.

1. Add ws_sasha framework
2. Add GetVersion/Archive/Restore

Revise based on review:
1. Remove 2 constructors.
2. Merge BackupWU into ArchiveWU with an optional 'deleteOnSuccess' option.
3. Store the sasha server response into the lastServerMessage and allow a caller to query it.
4. Store the ISashaCmdExecutors as members to avoid recreating.
5. Roll back the changes in esp_service_WsSMC.xsl.
6. Change the getLastServerMessage() to match with the pattern of the getVersion().
7. Remove extra ().
8. Divide the setResults() to setSuccess and setFailure.

Signed-off-by: wangkx <[email protected]>
@wangkx wangkx changed the base branch from candidate-9.4.x to master December 18, 2023 14:17
@wangkx
Copy link
Member Author

wangkx commented Dec 18, 2023

@ghalliday @jakesmith this PR has been re-targeted to master and tested.

BTW: I noticed that the build and smoke tests is not started. I saw the same thing for my PR #18137. Is it because those PRs have already been approved?

@jakesmith jakesmith closed this Dec 18, 2023
@jakesmith jakesmith reopened this Dec 18, 2023
Copy link

https://track.hpccsystems.com/browse/HPCC-30809
This pull request is already registered

@jakesmith
Copy link
Member

BTW: I noticed that the build and smoke tests is not started. I saw the same thing for my PR #18137. Is it because those PRs have already been approved?

Don't the fact that it's approved is relevant, not sure why they weren't running.
I closed the PR and reopened it - and now they are.

@wangkx
Copy link
Member Author

wangkx commented Dec 18, 2023

Thanks for reopen. I did the same for #18137.

@ghalliday
Copy link
Member

I think the tests often fail to run if you change the target branch.

@ghalliday ghalliday merged commit c1acb93 into hpcc-systems:master Dec 19, 2023
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants