Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugs fix and improvements to Node.Start() and Node.Stop() #414

Open
wants to merge 22 commits into
base: main
Choose a base branch
from

Conversation

hackintoshrao
Copy link
Contributor

@hackintoshrao hackintoshrao commented Nov 6, 2022

Context: I love the project and what it stands for. It's very important for an impactful and complex project like this to have great coverage of tests. I started working on writing unit and integration tests for the entire codebase. While I work on tests I'm fixing things as I find them. Here is the summary of the PR

- Create Meaningful error macros
- Make IsRunning flag atomic for safety purpose.
- Fix deso-protocol#413
- Add utility functions for change the status of the node.
Unit tests for the following utility functions to change the status of the node.

- *Node.IsRunning()
- *Node.StatusStartToStop()
- *Node.StatusStopToStart()
- changeRunningStatus
Duplicating two test utility functions to avoid the cyclic issues. Cannot import integration_testing in cmd/ packager, this creates a cycle.
Changing the names of the test utility functions to improve the readability.

1. Changing getDirectory to getTestDirectory.

2. Changing generateConfig to GenerateTestConfig()
Updating test utility function names to match the latest updates.
@hackintoshrao hackintoshrao requested a review from a team as a code owner November 6, 2022 12:51
@diamondhands0
Copy link
Member

Wow great work! We're reviewing and should get back shortly.

Copy link
Contributor

@AeonSw4n AeonSw4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome Karthic, thank you so much for this contribution! I've added some small comments during my review. It would be great if we could incorporate them into the PR before we merge. Let me know what you think 😀

cmd/node_test.go Outdated

// expected running status should be false before the node is started
expectedRunningStatus := false
actualRunningstatus := testNode.IsRunning()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we camel case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm particular about these things too, missed it by mistake. Thanks for pointing out.

cmd/node.go Outdated
// set to false after Stop() is called. Mainly used in testing.
// Use the convinience methods changeRunningStatus/StatusStartToStop/StatusStopToStart to change node status
// Use *Node.IsRunning() to retreive the node running status.
running *atomic.Bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea; however, I believe atomic.Bool was added in 1.19. Our code is currently only compatible with 1.18, so I think sadly we'll have to revert running to a bool. While we're at it, there is a chance that in the future we might want to have more statuses than "running" or "not running", and it looks like the functionality you've written already accounts for that which is great. In light of that, could we actually rename this running variable to status with a custom byte type NodeStatus that will, for now, take values of stopped (NodeStatus is byte(0)), running (NodeStatus is byte(1)). If we feel like it, optionally we could add statuses of new, restarting, etc. feel free to get creative lol. Let's make sure that the status updating functions you wrote also reflect this change. Lastly, since we don't use the atomic package, could we use the runningMutex to nail all the race conditions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review. Is the Go 1.18 Compatibility documented anywhere in the docs? If not I'll make a documentation/README update.

Hmm, Atomic makes code much cleaner, but yes, will wait for a future 1.19 bump! Your recommendation sounds perfect, I'll soon ship the changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AeonSw4n I've incorporated the changes. Let me know what you think.

// It's usually the first step to starting a node.
func generateConfig(t *testing.T, port uint32, dataDir string, maxPeers uint32) *cmd.Config {
func GenerateTestConfig(t *testing.T, port uint32, dataDir string, maxPeers uint32) *cmd.Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any way we can deduplicate this and keep only a single GenerateTestConfig in cmd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. I'll do this change

- Moving GenerateTestConfig to cmd package to avoid issues related to cyclic usage of the test utility inside the cmd package.
- Updating testconfig in blocksync tests. GenerateTestConfig is moved to cmd package.
- changing node status to custom NodeStatus byte type
- Setting valid status codes
- Utility functions to validate and change the status codes
- Using new status locks for safely reading and writing the node status.
Copy link
Contributor

@AeonSw4n AeonSw4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey man, just reviewed your changes. Really solid code overall! I think before we're ready to merge this boyo into our codebase, I have to be confident we won't get screwed by the additional complexity entailed by the status changes and locking/unlocking of the statusMutex. We gotta make sure everything is intuitively forward compatible, and no dev in the future can break things too easily. Left you some suggestions, let me know what you think! I'd say after we address these things, we'll be ready to sync some nodes and make sure everything runs smooth as butter.

cmd/node.go Outdated
@@ -273,7 +358,9 @@ func (node *Node) Start(exitChannels ...*chan struct{}) {
case <-syscallChannel:
}

node.statusMutex.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like you're using statusMutex very similarly to how runningMutex is used. How about we just get rid of runningMutex and replace it with the statusMutex? In this case, we would also get rid of node.statusMutex.Lock()/Unlock() in here, since the mutex will be held inside of node.Stop()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with all feedback from @AeonSw4n . Main input from me is to move all the mutex handling into the helper functions that handle stopping and updating status so the caller doesn't have to worry about it.

Agreed.

cmd/node.go Outdated
// Loads the status of Node.
// Modifying LoadStatus() and the validateStatus() functions and their tests are hard requirements
// to add new status codes for the node.
func (node *Node) LoadStatus() (NodeStatus, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "get/set" more intuitive than "load/change" (though it's just my personal preference). Since this getter isn't holding a mutex, maybe we rename it to GetStatusWithoutLock -- this way other developers will immediately know that this function is supposed to be called with a status mutex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also maybe we make this method private and wrap it with some other GetStatus() that does actually hold the status mutex?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be best if all the mutex logic lives within these functions instead of handled by the callers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

cmd/node.go Outdated

// changes the running status of the node
// Always use this function to change the status of the node.
func (node *Node) changeNodeStatus(newStatus NodeStatus) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the setter for the node status, similarly to my comment about the getter LoadStatus(), maybe we rename it to setNodeStatusWithoutLock()?

cmd/node.go Outdated
switch operation {
case lib.NodeRestart:
// Using Mutex while accessing the Node status to avoid any race conditions
node.statusMutex.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move the node.statusMutex.Lock()/Unlock() and the if !node.IsRunning(){...} condition inside the node.Stop(). Since Stop() should never be called when node is already stopped it feels reasonable to isolate everything inside Stop. Same goes for the NodeErase case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes^ - big piece from my review is that we don't want to have these mutexes flying around all the calling functions, but should instead handle the lock/unlock in the helper functions as much as we can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AeonSw4n @lazynina Makes sense. I'll abstract the lock logic from the user, and introduce it inside the helper functions.

cmd/node.go Outdated
// status is nil when a NewNode is created, it is initialized and set to RUNNING [byte(1)] on node.Start(),
// set to STOPPED [byte(2)] after Stop() is called.

// Use the convinience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: convenience

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Use the convinience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.
// Use the convenience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll use a small check next time. Sorry about the typos @AeonSw4n @lazynina

cmd/node.go Outdated
// Use *Node.IsRunning() to check if the node is running.
// Use *Node.LoadStatus() to retrieve the status of the node.
status *NodeStatus
// Held whenver the status of the node is read or altered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: whenever

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Held whenver the status of the node is read or altered.
// Held whenever the status of the node is read or altered.

cmd/node.go Outdated

node.statusMutex.Lock()
// Load the node status.
// This is identify whether the node is initialized for the first time or it's a restart.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: to identify

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This is identify whether the node is initialized for the first time or it's a restart.
// This is to identify whether the node is initialized for the first time or it's a restart.

Copy link
Member

@lazynina lazynina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with all feedback from @AeonSw4n . Main input from me is to move all the mutex handling into the helper functions that handle stopping and updating status so the caller doesn't have to worry about it.

cmd/node.go Outdated
// ErrAlreadyStopped is returned when somebody tries to stop an already
// stopped service (without resetting it).
ErrAlreadyStopped = errors.New("already stopped")
// ErrNodeNeverStarted is returned when somebody tries to find or changge the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ErrNodeNeverStarted is returned when somebody tries to find or changge the
// ErrNodeNeverStarted is returned when somebody tries to find or change the

cmd/node.go Outdated
// NEVERSTARTED is the default status, cannot set deliberately.
ErrCannotSetToNeverStarted = errors.New("cannot set the node status to neverstarted")
// Error if the atomic compare of swap of the node status fails
ErrFailedtoChangeNodeStatus = errors.New("failed to change the status of the node")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this variable does not appear to be used - if so, can we please remove it? If we do keep it, let's fix the capitalization on the name

Suggested change
ErrFailedtoChangeNodeStatus = errors.New("failed to change the status of the node")
ErrFailedToChangeNodeStatus = errors.New("failed to change the status of the node")

cmd/node.go Outdated
// status is nil when a NewNode is created, it is initialized and set to RUNNING [byte(1)] on node.Start(),
// set to STOPPED [byte(2)] after Stop() is called.

// Use the convinience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Use the convinience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.
// Use the convenience methods changeNodeStatus/UpdateStatusRunning/UpdateStatusStopped to change node status.

cmd/node.go Outdated
// Use *Node.IsRunning() to check if the node is running.
// Use *Node.LoadStatus() to retrieve the status of the node.
status *NodeStatus
// Held whenver the status of the node is read or altered.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Held whenver the status of the node is read or altered.
// Held whenever the status of the node is read or altered.

cmd/node.go Outdated

node.statusMutex.Lock()
// Load the node status.
// This is identify whether the node is initialized for the first time or it's a restart.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// This is identify whether the node is initialized for the first time or it's a restart.
// This is to identify whether the node is initialized for the first time or it's a restart.

cmd/node.go Show resolved Hide resolved
cmd/node_test.go Outdated
Comment on lines 28 to 30
expectedRunningStatus = true
actualRunningStatus = testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expectedRunningStatus = true
actualRunningStatus = testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
require.True(t, testNode.IsRunning(), expectedRunningStatus)

cmd/node_test.go Outdated
Comment on lines 34 to 36
expectedRunningStatus = false
actualRunningStatus = testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expectedRunningStatus = false
actualRunningStatus = testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
require.False(t, testNode.IsRunning())

cmd/node_test.go Outdated
Comment on lines 21 to 23
expectedRunningStatus := false
actualRunningStatus := testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expectedRunningStatus := false
actualRunningStatus := testNode.IsRunning()
require.Equal(t, actualRunningStatus, expectedRunningStatus)
require.False(t, testNode.IsRunning())

cmd/node_test.go Outdated
actualErr = testNode.UpdateStatusRunning()
require.NoError(t, actualErr)
// Once the running flag is changed, the isRunning function should return true
require.Equal(t, true, testNode.IsRunning())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
require.Equal(t, true, testNode.IsRunning())
require.True(t, testNode.IsRunning())

- replacing a panic with glag.Fatalf
- simplifying the logic of node.IsRunning()
- Deduplicating the glog messages, shifting the responsibility of logging to the caller.
- Removing runningMutex and replacing it with statusMutex
- Adding statusMutex inside *Node.Start() and *Node.Stop()
- Changing LoadStatus to GetStatus
- Creating private functions getStatusWithoutLock and updateStatusWithoutLock to change node status without holding statusLocks.
- Creating wrapper functions GetStatus and UpdateStatus to fetch and update node status while holding statusMutex locks.
- Updating function names to match the updates to utility functions in node.go
- Adding test to verify graceful shutdown of the node.
@hackintoshrao
Copy link
Contributor Author

hackintoshrao commented Nov 15, 2022

@AeonSw4n @lazynina : I made the changes as per your suggestions, thank you for the great inputs and the timely reviews. Here's the summary of my changes.

I find "get/set" more intuitive than "load/change" (though it's just my personal preference). Since this getter isn't holding a mutex, maybe we rename it to GetStatusWithoutLock

Changed LoadStatus->GetStatus and ChangeStatus-> SetStatus

also maybe we make this method private and wrap it with some other GetStatus() that does actually hold the status mutex?
would be best if all the mutex logic lives within these functions instead of handled by the callers.

created private functions with suffix withoutLock, and wrapper GetStatus and SetStatus which uses locks inside them.
Moved the locks which were earlier held at multiple places to a few set location at the beginning of function definition.

It feels like you're using statusMutex very similarly to how runningMutex is used. How about we just get rid of runningMutex and replace it with the statusMutex? In this case, we would also get rid of node.statusMutex.Lock()/Unlock() in here, since the mutex will be held inside of node.Stop()

Got rid of runningMutex and replaced it with statusMutex. Now, statusMutex is held inside of node.Stop() and node.Start(), this eliminates the use of locks at multiple places.

let's move the node.statusMutex.Lock()/Unlock() and the if !node.IsRunning(){...} condition inside the node.Stop(). Since Stop() should never be called when node is already stopped it feels reasonable to isolate everything inside Stop. Same goes for the NodeErase case.

Node.Stop() automatically takes care of validating the current status of the node before stopping. Hence, removed the !node.IsRunning check.

This is the setter for the node status, similarly to my comment about the getter LoadStatus(), maybe we rename it to setNodeStatusWithoutLock()?

Renamed changeNodeStatus to changeStatusWithoutLock, and created a wrapper ChangeStatus which holds the lock internally.

big piece from my review is that we don't want to have these mutexes flying around all the calling functions, but should instead handle the lock/unlock in the helper functions as much as we can.

Mutexes don't fly anymore :) 🚀

@AeonSw4n
Copy link
Contributor

AeonSw4n commented Nov 24, 2022

@hackintoshrao thanks for making the updates! Just wanted to make a quick update on my end and let you know reviewing and testing all your code is a high priority for me, and will get to it once I finish being tunnel-visioned on access groups & group chats.

@hackintoshrao
Copy link
Contributor Author

Thanks @AeonSw4n . Just that this PR will enable me to do tons of contributions. I'm sweeping through the codebase writing tests, but I can't move forward till this is merged.

May be I can help with access groups and chats by writing tests and manually testing them.

@AeonSw4n
Copy link
Contributor

Oh, that's so awesome @hackintoshrao I'm so excited to see your other contributions! Okay, I'll look into your code in the evening. Speaking of tests, I've actually been working on a new transaction unit testing framework lately. I'll PR it out of my current code so it's easier to read lol -- figured that might be interesting to you.

So cool to see your readiness to help with access groups! If you feel like checking out the code the latest branch is in this PR #412. Let's wrap this PR, and I'll think of what could be the best use of your time. :)

@hackintoshrao
Copy link
Contributor Author

hackintoshrao commented Nov 24, 2022

Thanks @AeonSw4n . Getting this merged will also help me fix tests for the #412 PR. Eager to check out the transaction test framework.

(On a different note, so cool to see Badger! I used to work at Dgraph, the company behind Badger.)

I like the idea of the transaction test suite. I've been thinking of a framework with parallel tests to benchmark and stress test the core's performance. But, many things need to clean up before getting there. Hence, I'm running across the codebase with tests.

I'm curious to monitor the Go routine count and graceful exits while stress-testing the transactions, I have some early suspicions of Go-routine leaks. PS: I have OCD for an unusually excessive amount of testing😉

Overall this new transaction test framework can be used for testing a lot of things around leaks, performance, and correctness. Keen on checking it out and contributing to it.

Copy link
Contributor

@AeonSw4n AeonSw4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reviewed your latest code, and I've got some minor change requests, but giving you an approve because your code is really awesome! One more comment I have is, do we need these .DS_Store files? We didn't have them before and they feel a bit redundant.

Once again thank you for this contribution and your flexibility when faced with our feedback 😃 @lazynina @diamondhands0 can you guys also take a look so we can merge?

cmd/node.go Outdated Show resolved Hide resolved
cmd/node.go Outdated Show resolved Hide resolved
cmd/run.go Outdated Show resolved Hide resolved
@hackintoshrao
Copy link
Contributor Author

Just reviewed your latest code, and I've got some minor change requests, but giving you an approve because your code is really awesome! One more comment I have is, do we need these .DS_Store files? We didn't have them before and they feel a bit redundant.

Thank you @AeonSw4n :) I'll remove the .DS files, they are redundant.

Once again thank you for this contribution and your flexibility when faced with our feedback 😃 @lazynina @diamondhands0 can you guys also take a look so we can merge?

I learned quite a bit from the reviews :)

- Removing the .DS files
- Adding error wrapper %w
- Removing the redundant lock
@hackintoshrao
Copy link
Contributor Author

hackintoshrao commented Nov 25, 2022

@AeonSw4n Found the transaction test framework inside the block_view_test.go. Working on some refactoring on the suite to improve its usability, for instance, to try parallel transaction scenarios with multiple accounts and verify their sanity.

Is there a better channel to communicate while I work on the codebase than Github comments😅

Copy link
Member

@lazynina lazynina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hackintoshrao - looking very good. I've opened a PR against your fork with some changes that would simplify the diff between your fork and the existing code in core. I think it'll be faster for us to get it through with a smaller diff - we can keep the existing function names and move all the testing logic into the integration_testing package.

cmd/config.go Outdated Show resolved Hide resolved
cmd/node.go Outdated Show resolved Hide resolved
cmd/node.go Show resolved Hide resolved
cmd/config.go Outdated Show resolved Hide resolved
integration_testing/tools.go Outdated Show resolved Hide resolved
Renaming setstatus functions with a suffix of without locks.
@AeonSw4n
Copy link
Contributor

AeonSw4n commented Nov 30, 2022

@AeonSw4n Found the transaction test framework inside the block_view_test.go. Working on some refactoring on the suite to improve its usability, for instance, to try parallel transaction scenarios with multiple accounts and verify their sanity.

Is there a better channel to communicate while I work on the codebase than Github comments😅

Prallelizing tests/transactions could be fun @hackintoshrao :) Though keep in mind they will always be processed sequentially in the mempool. We are planning to build actual prallelization for transaction processing a little further on the horizon -- probably after/concurrently with PoS. I've also refactored the testing framework into another PR #422. Before it's merged I'm planning on adding a state verification for before & after connect-disconnect to Postgres tests (currently this only works for Badger tests) and just documenting everything a little better.

Re communication channel: certainly! I think you were chatting with Dylan over Discord, is that a good channel? I'll reach out to you soon (almost done with access groups 🤞).

@hackintoshrao
Copy link
Contributor Author

@AeonSw4n Discord is fine, I just need to map the Github accounts to the Discord user names!

I'll look into #422 and look into opportunities to contribute.

@hackintoshrao
Copy link
Contributor Author

@lazynina The PR is ready as per the suggestions, do let me know if there's anything more needs to be done for this to be merged :)

@lazynina
Copy link
Member

@lazynina The PR is ready as per the suggestions, do let me know if there's anything more needs to be done for this to be merged :)

@hackintoshrao I have been testing this out by running and stopping a node, but ran into a case where one of the status change validations returns an error and the node will not switch to running. Trying to figure out exactly what's going on and to test out more stop + restart situations to make sure we don't have nodes that get stuck. I have only reproduced it once but it's a little unclear what is happening. As I have more details, I'll share them

@hackintoshrao
Copy link
Contributor Author

Thank you @lazynina . Please do report the edge scenarios when you have them.

@lazynina
Copy link
Member

Thank you @lazynina . Please do report the edge scenarios when you have them.

I can't reproduce it - maybe it happened in a dream! @diamondhands0 for final review.

@hackintoshrao
Copy link
Contributor Author

hackintoshrao commented Dec 16, 2022

Thanks @lazynina . Will wait for the review by @diamondhands0 .

@lazynina
Copy link
Member

@hackintoshrao - can you provide more details on how to reproduce the bug ( #413 ) and what specifically in this PR is fixing that bug? @diamondhands0

@hackintoshrao
Copy link
Contributor Author

The bug occurs when the shouldRestart flag is true during the first run of the server.

In this scenario, nodeMessageChan receives a lib.NodeErase message, but at this point, node.IsRunning is still set to false.

Hence, instead of safely recovering, it panics here.

Here is a quick screen recording explainer https://www.loom.com/share/410232d7a6ab4d1e8610c9259de1cbf6 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic when node.nodeMessageChan <- lib.NodeErase message is sent for the first time
4 participants