auditing: construct metrics probe during service construction #24108

michael-redpanda · 2024-11-12T21:34:19Z

Move the construction of the audit probe from service start to service construction. Previously, the audit probe was created as a unique pointer when the service was started and destroyed when the service stopped.

Since the audit service has the same lifespan as the application, creating the probe during service construction reduces the risk of dereferencing a null pointer if the probe is accessed before the service has started.

Backports Required

Release Notes

Bug Fixes

Construct audit metrics probe during service initialization to prevent null pointer access.

Move the construction of the audit probe from service start to service construction. Previously, the audit probe was created as a unique pointer when the service was started and destroyed when the service stopped. Since the audit service has the same lifespan as the application, creating the probe during service construction reduces the risk of dereferencing a null pointer if the probe is accessed before the service has started. Signed-off-by: Michael Boquard <[email protected]>

oleiman

Lgtm. Added Ben to the review because I'm never too sure about this stuff.

BenPope

This looks safe, but I have concerns that the service is being used before it is started.

For example, it's not clear without deep inspection of the code that calling drain is safe before the sink has been started.

It's also not clear that client_shard_id has been constructed before events can be received on other shards.

I think it may make sense to set the drain timer in start(), and move stuff from start to the constructor where possible.

Overall, it would be better to fix this by starting the service before it is used..

michael-redpanda · 2024-11-13T15:55:24Z

This looks safe, but I have concerns that the service is being used before it is started.

Yeah I was discussing exactly this with @oleiman yesterday. The problem is that I don't think we can kick off the sink until after the kafka API is available.

For example, it's not clear without deep inspection of the code that calling drain is safe before the sink has been started.

So the drain timer isn't armed until after the sink has been started so I believe this is safe.

It's also not clear that client_shard_id has been constructed before events can be received on other shards.

It should be constructed when the service is constructed, no?

I think it may make sense to set the drain timer in start(), and move stuff from start to the constructor where possible.

It is, in a somewhat round-about way based on the value of audit_enabled

Overall, it would be better to fix this by starting the service before it is used..

So again, I don't think we can start the sink before the kafka service is up. One idea I had was to start the service very early and then enable the sink via another method once the kafka API is up and running.

BenPope · 2024-11-14T10:14:36Z

This looks safe, but I have concerns that the service is being used before it is started.

Yeah I was discussing exactly this with @oleiman yesterday. The problem is that I don't think we can kick off the sink until after the kafka API is available.

Agreed.

For example, it's not clear without deep inspection of the code that calling drain is safe before the sink has been started.

So the drain timer isn't armed until after the sink has been started so I believe this is safe.

OK, I see it's ultimately called from audit_log_manager::start().

It's also not clear that client_shard_id has been constructed before events can be received on other shards.

It should be constructed when the service is constructed, no?

I think I fumbled that comment. I was talking about audit_log_manager::_sink on shard client_id_shard, the concern was that the shards are created without an ordering. But if drain cannot be called before audit_log_manager::start has been called (despite receiving events), I think it's safe.

I think it may make sense to set the drain timer in start(), and move stuff from start to the constructor where possible.

It is, in a somewhat round-about way based on the value of audit_enabled

Yeah, after digging around, I see that.

Overall, it would be better to fix this by starting the service before it is used..

So again, I don't think we can start the sink before the kafka service is up. One idea I had was to start the service very early and then enable the sink via another method once the kafka API is up and running.

I get it now.

vbotbuildovich · 2024-11-14T13:31:35Z

/backport v24.2.x

vbotbuildovich · 2024-11-14T13:31:36Z

/backport v24.1.x

dotnwat · 2024-11-18T21:29:52Z

src/v/security/audit/audit_log_manager.h

+    // Probe is mutable so it can be modified in const methods when they need to
+    // report auditing failures
+    mutable audit_probe _probe;


afaik the contract isn't "so i can modify it in const methods" it's that the public api maintains logical constness. since the probe is exposed (e.g. probe()) it's not clear that this is technically correct.

michael-redpanda requested a review from a team November 12, 2024 21:34

michael-redpanda self-assigned this Nov 12, 2024

michael-redpanda requested review from aanthony-rp and oleiman and removed request for a team November 12, 2024 21:34

github-actions bot added the area/redpanda label Nov 12, 2024

oleiman requested a review from BenPope November 12, 2024 22:43

oleiman approved these changes Nov 12, 2024

View reviewed changes

aanthony-rp approved these changes Nov 13, 2024

View reviewed changes

BenPope reviewed Nov 13, 2024

View reviewed changes

BenPope approved these changes Nov 14, 2024

View reviewed changes

michael-redpanda merged commit c84cb6a into redpanda-data:dev Nov 14, 2024
17 checks passed

This was referenced Nov 14, 2024

[v24.2.x] auditing: construct metrics probe during service construction #24127

Merged

[v24.1.x] auditing: construct metrics probe during service construction #24128

Merged

dotnwat reviewed Nov 18, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

auditing: construct metrics probe during service construction #24108

auditing: construct metrics probe during service construction #24108

michael-redpanda commented Nov 12, 2024

oleiman left a comment

BenPope left a comment •

edited

Loading

michael-redpanda commented Nov 13, 2024

BenPope commented Nov 14, 2024 •

edited

Loading

vbotbuildovich commented Nov 14, 2024

vbotbuildovich commented Nov 14, 2024

dotnwat Nov 18, 2024

auditing: construct metrics probe during service construction #24108

auditing: construct metrics probe during service construction #24108

Conversation

michael-redpanda commented Nov 12, 2024

Backports Required

Release Notes

Bug Fixes

oleiman left a comment

Choose a reason for hiding this comment

BenPope left a comment • edited Loading

Choose a reason for hiding this comment

michael-redpanda commented Nov 13, 2024

BenPope commented Nov 14, 2024 • edited Loading

vbotbuildovich commented Nov 14, 2024

vbotbuildovich commented Nov 14, 2024

dotnwat Nov 18, 2024

Choose a reason for hiding this comment

BenPope left a comment •

edited

Loading

BenPope commented Nov 14, 2024 •

edited

Loading