Fix the potential race condition of InitializeSubnetService #879

zhengxiexie · 2024-11-11T04:28:49Z

A goroutine may try to send an error to the already closed fatalErrors channel, which results in a panic.
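The failure mode can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: `sendAfterClose` is a hypothetical helper that closes a buffered channel, sends to it, and uses `recover` to report the resulting runtime panic.

```go
package main

import "fmt"

// sendAfterClose is a hypothetical helper (not part of nsx-operator).
// It closes a channel, then sends to it, and reports whether the send
// panicked and with which message.
func sendAfterClose() (panicked bool, msg string) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
			msg = fmt.Sprint(r)
		}
	}()

	fatalErrors := make(chan error, 1)
	close(fatalErrors)

	// Sending on a closed channel always panics, even if the buffer has room.
	fatalErrors <- fmt.Errorf("store init failed")
	return false, ""
}

func main() {
	panicked, msg := sendAfterClose()
	fmt.Println(panicked, msg) // prints "true send on closed channel"
}
```

In the real service the send happens in a separate goroutine, so the panic only appears when the close wins the race against the send.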

codecov-commenter · 2024-11-11T04:35:16Z

Codecov Report

Attention: Patch coverage is 70.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 70.60%. Comparing base (bf1880a) to head (8fba48c).
Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/nsx/services/subnet/subnet.go	70.00%	6 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #879      +/-   ##
==========================================
- Coverage   70.61%   70.60%   -0.01%     
==========================================
  Files          95       95              
  Lines       15094    15110      +16     
==========================================
+ Hits        10658    10669      +11     
- Misses       3710     3715       +5     
  Partials      726      726

Flag	Coverage Δ
unit-tests	`70.60% <70.00%> (-0.01%)`	⬇️

Files with missing lines	Coverage Δ
pkg/nsx/services/subnet/subnet.go	`30.17% <70.00%> (+1.85%)`	⬆️


wenyingd · 2024-11-11T05:28:04Z

.gitignore

+go.work.sum
+.tool-versions


Is this change needed?

wenyingd · 2024-11-11T05:30:52Z

pkg/nsx/services/subnet/subnet.go

@@ -65,20 +65,39 @@ func InitializeSubnetService(service common.Service) (*SubnetService, error) {
 		},
 	}

+	// Use sync.Once to ensure channel is closed only once


I would prefer to change it like this:

func InitializeSubnetService(service common.Service) (*SubnetService, error) {
	wg := sync.WaitGroup{}
	fatalErrors := make(chan error, 1)
	defer close(fatalErrors)

	subnetService := &SubnetService{
		Service: service,
		SubnetStore: &SubnetStore{
			ResourceStore: common.ResourceStore{
				Indexer: cache.NewIndexer(keyFunc, cache.Indexers{
					common.TagScopeSubnetCRUID:    subnetIndexFunc,
					common.TagScopeSubnetSetCRUID: subnetSetIndexFunc,
					common.TagScopeVMNamespace:    subnetIndexVMNamespaceFunc,
					common.TagScopeNamespace:      subnetIndexNamespaceFunc,
				}),
				BindingType: model.VpcSubnetBindingType(),
			},
		},
	}

	wg.Add(1)
	go subnetService.InitializeResourceStore(&wg, fatalErrors, ResourceTypeSubnet, nil, subnetService.SubnetStore)
	wg.Wait()

	if len(fatalErrors) > 0 {
		err := <-fatalErrors
		return subnetService, err
	}
	return subnetService, nil
}

https://github.com/vmware-tanzu/nsx-operator/blob/main/pkg/nsx/services/securitypolicy/firewall.go#L116-L116
The original reason for this pattern was to return as early as possible when there are multiple resources.
You should consider closing the fatalErrors channel in the way shown there.

In my example, fatalErrors is closed by a "defer" right after it is declared.
Even if there are multiple resources, we still need to wait until all sub-tasks are done rather than leave them running out of control; otherwise it may lead to unexpected issues (e.g., a zombie goroutine).

Yes, this version has no zombie routine.

zhengxiexie · 2024-11-11T08:04:15Z

/e2e

zhengxiexie · 2024-11-12T08:22:36Z

/e2e

zhengxiexie · 2024-11-13T03:13:01Z

/e2e

zhengxiexie requested review from wenyingd and yanjunz97 November 11, 2024 04:28

vmwclabot added the cla-not-required label Nov 11, 2024

zhengxiexie force-pushed the topic/zhengxie/main/race_condition branch from 2534c5b to 076af3a Compare November 11, 2024 04:46

Fix the potential race condition of InitializeSubnetService

8fba48c

It is possible that an error occurs when trying to send to the already closed `fatalErrors` channel, resulting in a panic.

zhengxiexie force-pushed the topic/zhengxie/main/race_condition branch from 076af3a to 8fba48c Compare November 11, 2024 05:16

wenyingd reviewed Nov 11, 2024
