bug: segmentation fault when removing gNB with (previous) user session and then killing UPF process #121

Closed
LaumiH opened this issue Sep 11, 2024 · 4 comments

LaumiH commented Sep 11, 2024

I want to report a segmentation fault in the SMF when a user session is released and afterwards the UPF misses a heartbeat (e.g., because the process was killed).

The error:

2024-09-11T14:58:53.844618988Z [DEBU][SMF][Main] Sending PFCP Heartbeat Request to UPF[127.0.0.8]
2024-09-11T14:59:02.849662832Z [ERRO][SMF][Main] PFCP Heartbeat error: SendPfcpHeartbeatRequest error: Request Transaction [22]: retry-out
2024-09-11T14:59:02.849766065Z [INFO][SMF][Main] Release all resources of UPF [127.0.0.8]
2024-09-11T14:59:02.855741060Z [WARN][SMF][Consumer] N1N2MessageTransfer for RequestAMFToReleasePDUResources failed: 409 Conflict
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xb13171]

goroutine 49 [running]:
github.com/free5gc/smf/internal/sbi/processor.(*Processor).requestAMFToReleasePDUResources(0xc00026eb50, 0xc00059a000)
        /home/lhenning/free5gc/NFs/smf/internal/sbi/processor/association.go:260 +0x4f1
github.com/free5gc/smf/internal/sbi/processor.(*Processor).releaseAllResourcesOfUPF.func1(0xc00059a000)
        /home/lhenning/free5gc/NFs/smf/internal/sbi/processor/association.go:197 +0xb0
github.com/free5gc/smf/internal/sbi/processor.(*Processor).releaseAllResourcesOfUPF.(*UPF).ProcEachSMContext.func2({0xffffffffffffffff?, 0xc0002a3d90?}, {0xccf5c0?, 0xc00059a000?})
        /home/lhenning/free5gc/NFs/smf/internal/context/upf.go:626 +0x4a
sync.(*Map).Range(0x13e2e40, 0xc00060fef8)
        /usr/local/go/src/sync/map.go:476 +0x228
github.com/free5gc/smf/internal/context.(*UPF).ProcEachSMContext(...)
        /home/lhenning/free5gc/NFs/smf/internal/context/upf.go:623
github.com/free5gc/smf/internal/sbi/processor.(*Processor).releaseAllResourcesOfUPF(0xc00026eb50, 0xc00016e800, {0xc0000ba100?, 0xc0000ba100?})
        /home/lhenning/free5gc/NFs/smf/internal/sbi/processor/association.go:192 +0xe7
github.com/free5gc/smf/internal/sbi/processor.(*Processor).ToBeAssociatedWithUPF(0x0?, {0xe557f0, 0xc0000aa140}, 0xc00016e800)
        /home/lhenning/free5gc/NFs/smf/internal/sbi/processor/association.go:42 +0x1ea
created by main.action.InitPFCPFunc.func2 in goroutine 1
        /home/lhenning/free5gc/NFs/smf/pkg/utils/pfcp_util.go:33 +0x114

Steps to reproduce:

  1. Start a standard free5gc core with one UPF. I used a heartbeat interval of 5 s, but the value should not matter.
  2. Create a standard subscriber, e.g., the one from the user guide.
  3. Start a standard UERANSIM gNB.
  4. Start one UERANSIM UE and wait until a user session is created.
  5. OPTIONAL: kill the UE process; this has no influence on the segfault.
  6. Kill the gNB process (AMF removes RAN context).
  7. Kill the UPF process (heartbeat miss, SMF tries to release all resources of the UPF).
  8. -> SIGSEGV: segmentation violation

The error does not occur if no user session was ever established.

I have attached the full logs. However, the segmentation fault itself is not visible in them; it happens right after the line
time="2024-09-11T15:15:03.983901309Z" level="warning" msg="N1N2MessageTransfer for RequestAMFToReleasePDUResources failed: 409 Conflict" CAT="Consumer" NF="SMF".
free5gc.log

LaumiH commented Sep 11, 2024

The segmentation fault comes from this piece of code in association.go:

rspData, statusCode, err := p.Consumer().N1N2MessageTransfer(ctx, smContext.Supi, n1n2Request, smContext.CommunicationClientApiPrefix)
if err != nil {
	logger.ConsumerLog.Warnf("N1N2MessageTransfer for RequestAMFToReleasePDUResources failed: %+v", err)
}

switch *statusCode { // panics here: statusCode is nil in the failing case
...
}

The SMF logs the warning N1N2MessageTransfer for RequestAMFToReleasePDUResources failed: 409 Conflict, so err is non-nil and the statusCode variable is nil. Because the code only logs the error and then continues, the switch statement dereferences the nil statusCode, which causes the segmentation fault.
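The failure mode can be reproduced in isolation. The following is only a minimal sketch: transferFunc, failingTransfer and unguardedRelease are hypothetical stand-ins, not free5gc code, and the stub assumes (as described above) that the call returns a nil status code together with the error. The panic is recovered here so the program can report it instead of crashing.

```go
package main

import (
	"errors"
	"fmt"
)

// transferFunc is a hypothetical stand-in for the N1N2MessageTransfer call.
// Like the call in the report, it yields a status-code pointer and an error.
type transferFunc func() (statusCode *int, err error)

// failingTransfer mimics the logged case: an error and a nil status code.
func failingTransfer() (*int, error) {
	return nil, errors.New("409 Conflict")
}

// unguardedRelease logs the error but keeps going, as in the reported snippet.
// It reports whether dereferencing the status code panicked.
func unguardedRelease(transfer transferFunc) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	statusCode, err := transfer()
	if err != nil {
		fmt.Printf("transfer failed: %v\n", err)
	}
	switch *statusCode { // nil pointer dereference when statusCode is nil
	case 200:
	}
	return false
}

func main() {
	fmt.Println("panicked:", unguardedRelease(failingTransfer))
}
```

Without the deferred recover, the dereference would terminate the process, which matches the trace in the original report.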

One possible solution is to return right after logging the warning, so that the SM context is removed, e.g.:

rspData, statusCode, err := p.Consumer().N1N2MessageTransfer(ctx, smContext.Supi, n1n2Request, smContext.CommunicationClientApiPrefix)
if err != nil {
	logger.ConsumerLog.Warnf("N1N2MessageTransfer for RequestAMFToReleasePDUResources failed: %+v", err)
	return false, true // remove SM context
}
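The early return proposed above can also be sketched as a standalone program. Again, this is an illustration under assumptions rather than the actual fix: transferFunc, okTransfer, failingTransfer and guardedRelease are hypothetical names, and the return values are simplified because the body of the original switch is elided. The extra nil check on statusCode is a defensive addition that is not part of the proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// transferFunc is a hypothetical stand-in for the N1N2MessageTransfer call.
type transferFunc func() (statusCode *int, err error)

// okTransfer mimics a successful call with a non-nil status code.
func okTransfer() (*int, error) {
	code := 200
	return &code, nil
}

// failingTransfer mimics the logged case: an error and a nil status code.
func failingTransfer() (*int, error) {
	return nil, errors.New("409 Conflict")
}

// guardedRelease returns early when the transfer fails, so the status code
// is only dereferenced once it is known to be non-nil. removeContext plays
// the role of the "remove SM context" flag in the proposed fix; how the
// status is handled on success is left to the caller (an assumption).
func guardedRelease(transfer transferFunc) (status int, removeContext bool) {
	statusCode, err := transfer()
	if err != nil || statusCode == nil {
		fmt.Printf("transfer failed: %v\n", err)
		return 0, true
	}
	return *statusCode, false
}

func main() {
	fmt.Println(guardedRelease(failingTransfer))
	fmt.Println(guardedRelease(okTransfer))
}
```

With this guard, the failing call reports that the context should be removed instead of crashing the process.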

LaumiH commented Sep 11, 2024

Another solution that comes to mind is to handle this in the AMF by gracefully releasing all user sessions associated with the removed RAN context. For example, the AMF could inform the SMF about the necessary session releases, or return a useful status code when the SMF tries to release such a previously removed context in the requestAMFToReleasePDUResources method.

@ming-hsien
Contributor

Hi @LaumiH,
Thank you for raising this issue and suggesting a solution. We have already resolved this problem in the latest version. Please perform a git pull to get the latest commit.

@andy89923
Contributor

Thanks for reporting this bug.

Fixed in #125
