-
Notifications
You must be signed in to change notification settings - Fork 276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider running without syscall tracepoints #173
Comments
CC: @florianl |
If I understand correctly, it would boil down to something like if !tracer.BtfAvailable() || (kernelVersionCheck && tracer.HasProbeReadBug()) {
if err = tracer.ProbeTracepoint(); err != nil {
return failure("Failed to probe tracepoint: %v", err)
}
}
|
I think we could just remove |
I agree with @fabled 's suggested approach and this is what I proposed for Parca: https://github.com/parca-dev/parca-agent/pull/2990/files (not removing the call, but just printing a warning when it fails) There is one minor change that I would also suggest: making it clear in the error messages when |
This implements the proposal in open-telemetry#173 . Please see there for full details.
This implements the proposal in open-telemetry#173 . Please see there for full details.
This implements the proposal in open-telemetry#173 . Please see there for full details.
This implements the proposal in open-telemetry#173 . Please see there for full details.
Implemented in #175 |
This discussion began in Slack: https://cloud-native.slack.com/archives/C03J794L0BV/p1728340374847259
The issue is that the agent refuses to run when syscall tracepoints are disabled; that is, when the path
/sys/kernel/debug/tracing/events/syscalls
does not exist. This was reported by a user of parca-agent, which is based on a fork of opentelemetry-ebpf-profiler: parca-dev/parca-agent#2976 . However, I don't think the issue is unique to Parca, and exactly the same issue would happen in the stock OTel profiler.The original reporter is running Unraid, but similar issues can happen in various other distributions that might have
CONFIG_FTRACE_SYSCALLS
or other relevant kernel configuration options disabled. For example, evilsocket/opensnitch#774 documents similar brokenness in OpenSnitch when running on Liquorix kernels.If I understand the OTel code correctly, we rely on the ability to attach syscall tracepoints in three places:
ProbeTracepoint
, called here. This is just a check that syscall tracepoints work, so we can exit early in main if they are unavailable. It isn't actually functionally required.loadKernelCode
, called fromloadTPBaseOffset
, called fromloadSystemConfig
. Note that this is only called ifparseBTF
fails. So it is unnecessarily in kernels where BTF-based offset discovery is available.loadKernelCode
, called fromcheckForMaccessPatch
. This is only necessary for the specific versions of the kernel that might have thecopy_from_user_nofault
bug which causes freezes.So, there is a configuration that we could support, but don't now: if syscall tracepoints are unavailable, but BTF-based offset discovery is possible and we're in a kernel version that is known not to be impacted by the
copy_from_user_nofault
bug, we will currently fail to run due to the lack of syscall tracepoints, even though logically the agent should be able to work.Thus I propose that we don't require syscall tracepoints in the case when we're in such a configuration.
The text was updated successfully, but these errors were encountered: