Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-32452 Time security manager auth calls #19007

Conversation

timothyklemm
Copy link
Contributor

@timothyklemm timothyklemm commented Aug 19, 2024

Create an ISecManager decorator that can add internal timing spans for any security manager interface call.

Replace direct manager authorization calls with calls through the decorator. Disregard CLdapSecManager-specific calls for now.

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32452

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR

@timothyklemm timothyklemm marked this pull request as draft August 19, 2024 11:26
@timothyklemm timothyklemm marked this pull request as ready for review August 19, 2024 11:27
@timothyklemm timothyklemm marked this pull request as draft August 19, 2024 11:28
@timothyklemm
Copy link
Contributor Author

The complex use of template parameters is to handle the LDAP dependency, if not now then in the future. There are a couple of auth calls not handled by this PR due to this dependency.

@kenrowland how likely are we to standardize the security manager interface, removing this dependency? Is it more likely we may need and LDAP decorator first?

@rpastrana should I decorate every manager method, so instrumentation is available when we want it, or only decorate the specific calls being handled? The only risk of overdoing decoration now is if we do eliminate the LDAP dependency and can apply a decorator to each manager as it is created, automating instrumentation.

Copy link
Member

@rpastrana rpastrana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timothyklemm good looking code. left you a couple of minor comments and some high level questions.

system/security/shared/secmanagertracedecorator.hpp Outdated Show resolved Hide resolved
system/security/shared/secmanagertracedecorator.hpp Outdated Show resolved Hide resolved
{
return decorated->unsubscribe(events, secureContext);
}
virtual bool authorize(ISecUser & user, ISecResourceList * resources, IEspSecureContext* secureContext = nullptr) override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the pattern for customizing the decoration behavior?
Let's say the esp knows it doesn't want to trigger a span for a specific secmanager call. Would that require a second TSecManagerTraceDecorator with a different method def?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't sound like a decorator question. As long as the ESP is making the decision to instrument, the question is how does the ESP decide which managers do or do not require instrumentation.

If we assume LDAP will always require instrumentation, we could define a new plugin configuration value (likely in the SecurityManager element wrapping the manager's configuration). This value could be passed into the decorator constructor, allowing the decorator to do what it needs to do. We could create a threaded active decorator at manager construction, or use a "null decorator" when instrumentation is not needed.

A second option could be to make the determination a function of the manager itself. I'm not sure there's an existing API to make this decision, but one could be added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me ask it a different way.
How would we suppress the spans added by the decorator if a very simple secmanager is used which doesn't need to trace its calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ESP doesn't currently know if it is a very simple manager. There is an enumerated value that managers may return, which we could use to decide, but there are no values defined for managers outside the platform and could be unreliable. There is a label associated with the enumeration, but there are no naming conventions in place through which we could extract anything like this.

The only way the ESP will know that it is a simple manager is to ask it, or look for a new configuration value. I've intentionally omitted the third option which is dynamically casting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timothyklemm I didn't ask how the ESP can determine if a given call should be traced or not, but I think I can infer that we're stuck w/ the spans declared by the decorator for all sec manager.Thanks.

* - If nothing else changes, create a subclasses of TSecManagerTraceDecorator, with template
* parameters CLdapSecManager and ISecManager, to decorate the LDAP-specific interfaces.
*/
template <typename secmgr_t = ISecManager, typename secmgr_interface_t = ISecManager>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the advantage/disadvantage of this approach vs a base implementation which all sec managers could extend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the template, the base class holds a reference to an ISecManager instance, and any subclass holds a second reference to the same object but typed as the manager implementation class. Very minor, but I'm not a fan of holding the same value twice. The first template parameter allows the base class to hold the type needed by the subclass.

The two template parameters could be consolidated into one, reducing the complexity, if either:

  • The ISecManager and CLdapSecManager interfaces are standardized.
  • A new extension of ISecManager declaring the LDAP extensions is used.

I'm not married to either approach.

@rpastrana
Copy link
Member

hould I decorate every manager method,

@timothyklemm I don't think decorating every method would be helpful.
In fact as we've discussed, for some managers, the 2 methods decorated might result in too many spans...

Copy link
Contributor

@kenrowland kenrowland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks like what we discussed. Just a few comments.

@@ -113,7 +114,7 @@ static bool checkWuScopeSecAccess(const char *wuscope, ISecManager *secmgr, ISec
{
if (!secmgr || !secuser)
return true;
bool ret = secmgr->authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
bool ret = CSecManagerTraceDecorator(*secmgr).authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concern that each time you use CSecManagerTraceDecorator that it is constructed and destructed. Any thought on making the decorated calls static?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be a best practice to use a single security manager configuration (implying a single implementation) within an ESP process. ESPs can be configured to use multiple managers and configurations.

A static interface would require either passing the manager to each decorated call, negating the benefit of extending ISecManager, or a threaded security manager (similar to the threaded active span). The latter option would likely complicate the implementation by requiring manager checks in every decorated method.

I'm not sure which is preferred. Also, one of @rpastrana's comments might indirectly justify a threaded active decorator, so only one decorator would be created per service.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the overhead of creating a new decorator every time a secmanager method is called. What's the overall overhead added and what is the overriding reason to implement this way.

@@ -972,7 +973,7 @@ bool EspHttpBinding::basicAuth(IEspContext* ctx)
return false;
}

bool authenticated = m_secmgr->authorize(*user, rlist, ctx->querySecureContext());
bool authenticated = CSecManagerTraceDecorator(*m_secmgr).authorize(*user, rlist, ctx->querySecureContext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concern here that m_secmgr is an OWNED and you are giving the CSecManagerTraceDecorator instance the raw pointer and it's creating a LINKED pointer. Not sure how those would interact. I see the same in at least one other instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decorator always adds, and releases, its own reference to the object, without affecting ownership of the reference passed to it. I believe it is safe.

As it's being used, linking might not be needed. If the decorator ever has a longer life than one manager call, it will be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not being an expert with Owned and Linked, but is the reference count for the object shared between the original Owned and the Linked pointers?

}
virtual bool authorizeEx(SecResourceType rtype, ISecUser & user, ISecResourceList * resources, IEspSecureContext* secureContext = nullptr) override
{
START_SEC_MANAGER_TRACE_BLOCK("security.authorize_ex");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be "security.authorize_ex_list" to distinguish between a single authorization and a list? It would seem that ia a long list was passed that it could skew results if it's measured the same as a single authorization. There is at least one other case where a method is overloaded to accept both a single instance and a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be. I thought about it because it's already a distinction made in the feature flags.

system/security/shared/secmanagertracedecorator.hpp Outdated Show resolved Hide resolved
* interface. By implementing the same interface, a decorator is interchangeable with the object
* it decoratos.
*
* In less than ideal situations, the decorated object's interface is an extension of a named
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial, should "an extension" be "a subclass"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they're synonyms, but I'll change it.

system/security/shared/secmanagertracedecorator.hpp Outdated Show resolved Hide resolved
@kenrowland
Copy link
Contributor

@timothyklemm There are no plans at this time to make the LDAP specific interface part of ISecManager. If having them the same significantly reduces the decorator solution and the need exists to decorate and trace LDAP security manager calls, I can investigate further. Consolidating them would be a good interim step to eventual replacement.

@timothyklemm timothyklemm force-pushed the hpcc-32452-secmgr-trace-decorator branch 2 times, most recently from 18e7e08 to 3e2f130 Compare September 24, 2024 17:24
@timothyklemm
Copy link
Contributor Author

@rpastrana and @kenrowland, this is still a work in progress. There are two commits, each representing a separate Jira. The first commit sets up the ESP for using trace flags. The second has the decorator using trace flags to control span creation. Do these address prior concerns?

Copy link
Member

@rpastrana rpastrana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timothyklemm a few comments on the traceflag commit

@@ -113,7 +114,7 @@ static bool checkWuScopeSecAccess(const char *wuscope, ISecManager *secmgr, ISec
{
if (!secmgr || !secuser)
return true;
bool ret = secmgr->authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
bool ret = CSecManagerTraceDecorator(*secmgr).authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the overhead of creating a new decorator every time a secmanager method is called. What's the overall overhead added and what is the overriding reason to implement this way.

@@ -436,6 +436,16 @@ void setLDAPSecurityInWSAccess(IPropertyTree *legacyEsp, IPropertyTree *legacyLd
}
}

void copyTraceFlags(IPropertyTree *legacyEsp, IPropertyTree *appEsp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when would this method be useful?
Is it only capitalizing the traceflags name?
Perhaps a method comment header would help

@@ -354,6 +358,74 @@ static void usage()

IPropertyTree *buildApplicationLegacyConfig(const char *application, const char* argv[]);

// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's elaborate on the reasoning for this alternative loadTraceFlags implementation
Including examples of the proposed config format and the pre-existing format

// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.
static TraceFlags loadEspTraceFlags(const IPropertyTree *ptree, const std::initializer_list<TraceOption> &optNames, TraceFlags dft)
{
for (auto &o: optNames)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename o with a self documenting name

{
VStringBuffer attrName("@%s", o.name);
const char* value = nullptr;
if (!(value = ptree->queryProp(attrName)) && !(value = ptree->queryProp(attrName.setf("_%s", o.name))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the significance of the underscore prefix? is this a new convention you're setting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The underscore prefix is an alternate xpath used in loadTraceFlags. I don't know when, or if, it's used but retained it to minimize functional differences between the two functions.

@@ -577,6 +649,7 @@ int init_main(int argc, const char* argv[])
config->bindServer(*server.get(), *server.get());
config->checkESPCache(*server.get());

initializeTrace(config);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the symmetry w/ the initializeMetrics call, but this name can be confused with Tracing.
InitializeTraceLevelSettings ?

Copy link
Contributor

@asselitx asselitx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks like a good solution to getting the trace configuration into the ESP and for timing the secmgr calls.

@@ -354,6 +358,74 @@ static void usage()

IPropertyTree *buildApplicationLegacyConfig(const char *application, const char* argv[]);

// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.
static TraceFlags loadEspTraceFlags(const IPropertyTree *ptree, const std::initializer_list<TraceOption> &optNames, TraceFlags dft)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be better to just use the standard load from jlib? Names are more expressive than an integer, but then we start a habit of ESP diverging from the standard (as in the ESPLOG) that might cause us more frustration in the long term than if we had uniform configuration of common properties across components.

* only the first.
* - decorated_t is the type of the object to be decorated. If the the decorated object conforms
* to an interface, use of the interface is preferred.
* - secorated_interface_t is the interface implemented by the decorated object. If not the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling: "secorated_interface_t" should be "decorated_interface_t"

* @brief Macro used start tracing a block of code in the security manager decorator.
*
* Create a new named internal span and enter a try block. Used with END_SEC_MANAGER_TRACE_BLOCK,
* provides consistent timing and exception handling for the inned code block.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling: should be "handling for the inner code block"

Tim Klemm added 3 commits October 23, 2024 13:56
- Define an ESP process configuration node that supports specification of global
  TraceFlags values for each ESP.
- Reserve traceLevel to request a specific trace level (0: none, 1: standard,
  2: detailed, 3: max). This replaces acting on the last observed occurrence
  of either traceNone, traceStandard, traceDetailed, and traceMax.
- Override default flag settings when not configured and a debug build.

Signed-off-by: Tim Klemm <[email protected]>
…tion calls.

- Define a new, common, trace option for security manager tracing. Both ESP and
  Roxie use security managers, and the flag can be shared by both.
- Refactor CEspHttpServer to create server spans before initial authorization
  requests.
- Create a security manager decorator encapsulating the logic of what to trace
  and when.
- Replace security manager authorization calls with decorated calls. This change
  excludes calls specific to the LDAP security manager.

Signed-off-by: Tim Klemm <[email protected]>
Introduce a persistent decorator instance to the HTTP binding, ESP context, and
security handler to avoid repeated construction and destruction.

Improve the binding's handling of security manager creation failures to avoid
null dereferences when creating the decorator, or resource auth maps.
@timothyklemm timothyklemm force-pushed the hpcc-32452-secmgr-trace-decorator branch from 3e2f130 to f23b4e4 Compare October 29, 2024 14:36
@timothyklemm timothyklemm changed the base branch from candidate-9.8.x to master October 29, 2024 15:14
Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to be the correct approach to me. It is more complicated that I would expect for the problem that I think it is trying to solve.

The jira could do with a discussion of the potential designs, exactly what it would be beneficial to trace and why. Adding the code in this place means that all security managers - including fixed username, or simple table of users would be tracing all authentication calls.

Similarly for ldap security managers it will be creating spans when the value is returned immediately from the cache. We do not want to generate lots of short-lived spans, especially if they do not involve a server.

What is potentially important is tracing the requests to the LDAP server. I imagine that can be best achieved by adding code to create a client span within the ldap code itself. That also allows ldap-specific details to be added to the span (if there are any).

@timothyklemm
Copy link
Contributor Author

@ghalliday the code currently included in the PR is more complex than it should be. If #19292 is accepted, we will be able to decorate managers as they are constructed, and no additional awareness of the decorator is required.

@rpastrana and I discussed whether it would be preferable for the ESP or the manager to be responsible for instrumentation. With the potential for 3rd party managers (not just mine, but an OAuth 2 manager is being developed), we concluded it would be better for the ESP to produce consistent output instead of hoping managers will do it for us.

To reduce the number of unnecessary spans, the feature must be explicitly turned on and only "implemented" methods, as identified by the manager, will be instrumented. Unfortunately, there doesn't seem to be a reliable way, given the current interface, to identify which managers warrant instrumentation. Also, I can't predict whether something will be cached before creating a span.

It could be possible to replace spans with timing attributes. This could be start and end times (similar to current TxSummary content), start time and duration, or just duration. Inclusion of the attributes could be conditional on a minimum duration. Retroactive span creation given a start and duration, if possible, could also be conditional on a minimum duration.

Both options would prevent grouping any spans created by the manager itself. if I was actively working on my plugin, I would probably have it creating child spans for each of its multiple HTTP and MySQL requests. Any authorize request to one of my managers could require at least 3 HTTP requests and 2 MySQL requests to complete.

@rpastrana
Copy link
Member

@timothyklemm as I recall there were several long conversations regarding this feature. I don't recall what might have convinced me to agree to the current proposal, but in general I'd expect the ESP to span potentially lengthy/complex calls. I'd also expect the call to have the ability to declare its own child span to represent some interesting subtask (I think you pointed out why this might be difficult to achieve, but I don't recall). Another important aspect we've emphasized is to avoid affecting existing functionality for the sake of tracing.

@timothyklemm
Copy link
Contributor Author

Instrumentation will be added directly within affected security managers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants