Sidekick Crashes After Triggering the Same Rule Multiple Times in a Short Window with Falco 0.38.2 #1011

cme-incom · 2024-10-01T10:03:23Z

Describe the bug

After running Aqua Security’s kube-bench, the Sidekick service crashes. This happens when the same Falco rule is triggered more than 15 times within a very short time window: instead of handling the load gracefully, the process terminates with a fatal error.

How to reproduce it

Run Aqua Security’s kube-bench to perform security checks.
Ensure that a specific Falco rule is triggered more than 15 times in a very short window.

Expected behaviour

The Sidekick service should handle multiple rule triggers without crashing. It should remain stable and not be terminated.

Screenshots
No screenshots available.

Environment

Falco version: 0.38.2
OS: Talos 1.6.5
Kernel: 6.6.32-talos
Installation method: Helm

Additional context

The rule triggered:

   # Note that runsv is both in protected_shell_spawner and the
   # exclusions by pname. This means that runsv can itself spawn shells
   # (the ./run and ./finish scripts), but the processes runsv can not
   # spawn shells.
   #
   # Also, trivy uses this for vulnerability scanning and kyverno uses it to clean ephemeral reports
   # And we exclude the incom user
   - rule: Incom Run shell untrusted
     desc: > 
       An attempt to spawn a shell below a non-shell application. The non-shell applications that are monitored are 
       defined in the protected_shell_spawner macro, with protected_shell_spawning_binaries being the list you can 
       easily customize. For Java parent processes, please note that Java often has a custom process name. Therefore, 
       rely more on proc.exe to define Java applications. This rule can be noisier, as you can see in the exhaustive 
       existing tuning. However, given it is very behavior-driven and broad, it is universally relevant to catch 
       general Remote Code Execution (RCE). Allocate time to tune this rule for your use cases and reduce noise. 
       Tuning suggestions include looking at the duration of the parent process (proc.ppid.duration) to define your 
       long-running app processes. Checking for newer fields such as proc.vpgid.name and proc.vpgid.exe instead of the 
       direct parent process being a non-shell application could make the rule more robust.
     condition: >
       spawned_process
       and shell_procs
       and proc.pname exists
       and not (k8s.ns.name = trivy)
       and not (k8s.ns.name = kyverno)
       and not serf_script
       and not check_process_status
       and not (container.image.repository in (incom_network_images))
       and not (user.name = incom)
       and not (proc.pexe = /bin/containerd-shim-runc-v2)
     output: Shell spawned by untrusted binary (parent_exe=%proc.pexe parent_exepath=%proc.pexepath pcmdline=%proc.pcmdline gparent=%proc.aname[2] ggparent=%proc.aname[3] aname[4]=%proc.aname[4] aname[5]=%proc.aname[5] aname[6]=%proc.aname[6] aname[7]=%proc.aname[7] evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty exe_flags=%evt.arg.flags %container.info)
     priority: ERROR
     tags: [maturity_stable, host, container, process, shell, mitre_execution, T1059.004]

The error message from the failed pod:

2024/09/23 17:48:45 [INFO]  : Slack - POST OK (200)
2024/09/23 17:48:45 [INFO]  : Pagerduty - Create Incident OK
2024/09/28 09:25:13 [INFO]  : Slack - POST OK (200)
fatal error: concurrent map iteration and map write
goroutine 502012 [running]:
github.com/falcosecurity/falcosidekick/outputs.getSortedStringKeys(0xc00089e1e0?)
   /home/runner/work/falcosidekick/falcosidekick/outputs/utils.go:12 +0x6b
github.com/falcosecurity/falcosidekick/outputs.newSlackPayload({{0xc00005e8a0, 0x24}, {0xc000aaaa00, 0x266}, 0x5, {0xc000114080, 0x19}, {0xb860900, 0xede8d3286, 0x0}, ...}, ...)
   /home/runner/work/falcosidekick/falcosidekick/outputs/slack.go:75 +0x62c
github.com/falcosecurity/falcosidekick/outputs.(*Client).SlackPost(0xc0008e1d00, {{0xc00005e8a0, 0x24}, {0xc000aaaa00, 0x266}, 0x5, {0xc000114080, 0x19}, {0xb860900, 0xede8d3286, ...}, ...})
   /home/runner/work/falcosidekick/falcosidekick/outputs/slack.go:152 +0x78
created by main.forwardEvent in goroutine 502010
   /home/runner/work/falcosidekick/falcosidekick/handlers.go:235 +0x148
goroutine 1 [IO wait]:
internal/poll.runtime_pollWait(0x7fce1861fed0, 0x72)
   $GOROOT/src/runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0x3?, 0x1?, 0x0)
   $GOROOT/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
   $GOROOT/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0009dd100)
   $GOROOT/src/internal/poll/fd_unix.go:611 +0x2ac
net.(*netFD).accept(0xc0009dd100)
   $GOROOT/src/net/fd_unix.go:172 +0x29
net.(*TCPListener).accept(0xc0009c95e0)
   $GOROOT/src/net/tcpsock_posix.go:159 +0x1e
net.(*TCPListener).Accept(0xc0009c95e0)
   $GOROOT/src/net/tcpsock.go:327 +0x30
net/http.(*Server).Serve(0xc000568690, {0x3079fb0, 0xc0009c95e0})
   $GOROOT/src/net/http/server.go:3255 +0x33e
net/http.(*Server).ListenAndServe(0xc000568690)
   $GOROOT/src/net/http/server.go:3184 +0x71
main.main()
   /home/runner/work/falcosidekick/falcosidekick/main.go:934 +0x1287
goroutine 13 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc000143680)
   pkg/mod/[email protected]/stats/view/worker.go:292 +0x9f
created by go.opencensus.io/stats/view.init.0 in goroutine 1
   pkg/mod/[email protected]/stats/view/worker.go:34 +0x8d
goroutine 502011 [runnable]:
net.(*OpError).Timeout(0xc0000cf400?)
   $GOROOT/src/net/net.go:507 +0x133
net/http.(*connReader).backgroundRead(0xc00067d290)
   $GOROOT/src/net/http/server.go:708 +0xa9
created by net/http.(*connReader).startBackgroundRead in goroutine 502010
   $GOROOT/src/net/http/server.go:677 +0xba
goroutine 502013 [runnable]:
bytes.(*Buffer).WriteByte(0xc000ce8980?, 0x7b?)
   $GOROOT/src/bytes/buffer.go:285 +0x9c
encoding/json.mapEncoder.encode({0xc000b16538?}, 0xc000ce8980, {0x2426d60?, 0xc00067d3b0?, 0x2426d60?}, {0x14?, 0x0?})
   $GOROOT/src/encoding/json/encode.go:737 +0x215
encoding/json.(*encodeState).reflectValue(0xc000ce8980, {0x2426d60?, 0xc00067d3b0?, 0x7c9779?}, {0x40?, 0xde?})
   $GOROOT/src/encoding/json/encode.go:321 +0x73
encoding/json.interfaceEncoder(0xc000ce8980, {0x23dde40?, 0xc0008c66f0?, 0x6f8345?}, {0x60?, 0xa6?})
   $GOROOT/src/encoding/json/encode.go:658 +0xba
encoding/json.structEncoder.encode({{{0xc00033e488, 0x8, 0x8}, 0xc000652a80, 0xc000652ab0}}, 0xc000ce8980, {0x273f520?, 0xc0008c6680?, 0xc0000f8f20?}, {0x0, ...})
   $GOROOT/src/encoding/json/encode.go:704 +0x21e
encoding/json.ptrEncoder.encode({0xc0000f8f20?}, 0xc000ce8980, {0x2275700?, 0xc0000f8f20?, 0xc0000f8f20?}, {0xa?, 0x0?})
   $GOROOT/src/encoding/json/encode.go:876 +0x23c
encoding/json.structEncoder.encode({{{0xc00033e008, 0x8, 0x8}, 0xc000652b40, 0xc000652ba0}}, 0xc000ce8980, {0x273f640?, 0xc0000f8ea0?, 0xc000b16950?}, {0x0, ...})
   $GOROOT/src/encoding/json/encode.go:704 +0x21e
encoding/json.(*encodeState).reflectValue(0xc000ce8980, {0x273f640?, 0xc0000f8ea0?, 0x4?}, {0x60?, 0x24?})
   $GOROOT/src/encoding/json/encode.go:321 +0x73
encoding/json.(*encodeState).marshal(0x411ce5?, {0x273f640?, 0xc0000f8ea0?}, {0xc8?, 0xa5?})
   $GOROOT/src/encoding/json/encode.go:297 +0xc5
encoding/json.Marshal({0x273f640, 0xc0000f8ea0})
   $GOROOT/src/encoding/json/encode.go:163 +0xd0
github.com/PagerDuty/go-pagerduty.ManageEventWithContext({0x3089ca0, 0x46aa1a0}, {{0xc000064015, 0x20}, {0x289802d, 0x7}, {0x0, 0x0}, {0x0, 0x0, ...}, ...})
   pkg/mod/github.com/!pager!duty/[email protected]/event_v2.go:175 +0x74
github.com/falcosecurity/falcosidekick/outputs.(*Client).PagerdutyPost(0xc0008e1e00, {{0xc00005e8a0, 0x24}, {0xc000aaaa00, 0x266}, 0x5, {0xc000114080, 0x19}, {0xb860900, 0xede8d3286, ...}, ...})
   /home/runner/work/falcosidekick/falcosidekick/outputs/pagerduty.go:34 +0x1ac
created by main.forwardEvent in goroutine 502010
   /home/runner/work/falcosidekick/falcosidekick/handlers.go:375 +0x2d28
goroutine 502010 [sync.Cond.Wait]:
sync.runtime_notifyListWait(0xc000ce8690, 0x0)
   $GOROOT/src/runtime/sema.go:569 +0x159
sync.(*Cond).Wait(0xc00067d290?)
   $GOROOT/src/sync/cond.go:70 +0x85
net/http.(*connReader).abortPendingRead(0xc00067d290)
   $GOROOT/src/net/http/server.go:729 +0xa6
net/http.(*response).finishRequest(0xc000578b60)
   $GOROOT/src/net/http/server.go:1671 +0x87
net/http.(*conn).serve(0xc000897560, {0x3089e60, 0xc00066de90})
   $GOROOT/src/net/http/server.go:2045 +0x62b
created by net/http.(*Server).Serve in goroutine 1
   $GOROOT/src/net/http/server.go:3285 +0x4b4
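
The fatal error in the trace is raised by the Go runtime when one goroutine iterates over a map while another writes to it. In the dump, the Slack goroutine is iterating a map in `getSortedStringKeys` and the PagerDuty goroutine is JSON-encoding a map, both started from `main.forwardEvent`; the goroutine doing the write is not visible. One common way to avoid this class of crash is to give each goroutine its own copy of the map before it starts. The sketch below illustrates that general pattern only. It is not Falcosidekick's code or its actual fix, and `copyFields` and `sortedKeys` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// copyFields returns a shallow copy of m, so a goroutine can read and
// write its own map without touching the one shared with other goroutines.
func copyFields(m map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// sortedKeys iterates over the map and returns its keys in sorted order.
// It stands in for any helper that ranges over the event fields.
func sortedKeys(m map[string]interface{}) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	fields := map[string]interface{}{"proc.name": "sh", "user.name": "root"}

	var wg sync.WaitGroup
	results := make([][]string, 4)
	for i := 0; i < 4; i++ {
		// Copy in the parent goroutine, before the worker starts, so the
		// shared map is never read and written concurrently.
		local := copyFields(fields)
		wg.Add(1)
		go func(i int, f map[string]interface{}) {
			defer wg.Done()
			f["output.id"] = i // this write would be unsafe on a shared map
			results[i] = sortedKeys(f)
		}(i, local)
	}
	wg.Wait()
	fmt.Println(results[0]) // prints [output.id proc.name user.name]
}
```

Guarding the shared map with a `sync.RWMutex` is an alternative when copying is too expensive. Running a build with `go run -race` or `go test -race` is a standard way to locate the unsynchronized access.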


Issif · 2024-10-07T13:14:37Z

Another issue has been opened about this "bug": falcosecurity/charts#746. I haven't been able to reproduce it so far.

Issif · 2024-10-07T15:15:50Z

Which version of Falcosidekick are you running: 2.29.0 or the latest (== master)?

Issif · 2024-11-22T15:26:11Z

Are you still facing the issue?

cme-incom added the kind/bug Something isn't working label Oct 1, 2024

cme-incom changed the title ~~Sidekick Crashes After Running Aqua Security’s Kube-Bench with Falco 0.38.2~~ Sidekick Crashes After Running Kube-Bench with Falco 0.38.2 Oct 1, 2024

cme-incom changed the title ~~Sidekick Crashes After Running Kube-Bench with Falco 0.38.2~~ Sidekick Crashes After Triggering the Same Rule Multiple Times in a Short Window with Falco 0.38.2 Oct 1, 2024

Issif self-assigned this Oct 7, 2024

Issif added this to Falcosidekick 2.x Oct 7, 2024

github-project-automation bot moved this to To do in Falcosidekick 2.x Oct 7, 2024

Issif added this to the 2.30 milestone Oct 7, 2024

Issif modified the milestones: 2.30, 2.x Nov 25, 2024
