retry mine keystone on failure #18
Conversation
Force-pushed from 68f726b to 6bb1289
service/popm/popm.go
Outdated
// Push inserts an L2Keystone, dropping the oldest if full
func (r *L2KeystonePriorityBuffer) Push(val hemi.L2Keystone) {
	r.mtx.Lock()
	defer r.mtx.Unlock()
Do less while locked.
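For context, a minimal sketch of what "do less while locked" means in practice, using hypothetical names (keystoneBuffer, push) rather than the code in this PR, and assuming the sync and hemi imports already present in popm.go: do any expensive preparation before taking the mutex and hold it only for the shared-state mutation.

type keystoneBuffer struct {
	mtx   sync.Mutex
	items map[string]hemi.L2Keystone
}

// push assumes the key was already computed by the caller; the mutex
// protects nothing but the map write itself.
func (b *keystoneBuffer) push(key string, val hemi.L2Keystone) {
	// encoding, hashing, logging, allocation: all of that happens
	// before this point, outside the critical section
	b.mtx.Lock()
	b.items[key] = val
	b.mtx.Unlock()
}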
service/popm/popm.go
Outdated
// temporarily set this to false, so another goroutine doesn't
// try to process
v.requiresProcessing = false
This looks racy. You could have started traversing the map while the variable is not set yet. This is a logic race. Setting and unsetting the variable should really happen in the same lock.
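A rough sketch of keeping the set and the lookup in one critical section, with a hypothetical claimKeystone helper and an assumed l2Keystone field on L2KeystoneProcessingContainer: the entry is selected and its requiresProcessing flag is flipped before the lock is released, so no other goroutine can claim the same keystone in between.

// claimKeystone picks an unprocessed entry and marks it as claimed
// while still holding the mutex.
func (m *Miner) claimKeystone() (hemi.L2Keystone, bool) {
	m.mtx.Lock()
	defer m.mtx.Unlock()

	for k, v := range m.l2Keystones {
		if !v.requiresProcessing {
			continue
		}
		v.requiresProcessing = false // flipped under the same lock as the lookup
		m.l2Keystones[k] = v
		return v.l2Keystone, true
	}
	return hemi.L2Keystone{}, false
}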
service/popm/popm.go
Outdated
key := hex.EncodeToString(mined[:])

err := cb(e)
r.mtx.Lock()
I didn't want to say this, but this is probably where you want to serialize using a channel. I may be missing some subtlety, so don't take this as gospel truth, but once you start seeing locking/unlocking in the same loop it typically means the code should be rethought to prevent that from happening.
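For illustration, a sketch of the channel-based serialization being suggested, with hypothetical names (processLoop, work), m.mineKeystone standing in for whatever performs the actual mining, and log.Printf standing in for the miner's logger: a single goroutine owns the work it receives, so the loop needs no mutex at all.

// processLoop is the only goroutine handling the keystones it receives;
// producers just send onto the channel and never touch shared state.
func (m *Miner) processLoop(ctx context.Context, work <-chan hemi.L2Keystone) {
	for {
		select {
		case <-ctx.Done():
			return
		case ks := <-work:
			if err := m.mineKeystone(ctx, &ks); err != nil {
				// a failed keystone could be re-queued here for a later retry
				log.Printf("mine keystone: %v", err)
			}
		}
	}
}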
service/popm/popm.go
Outdated
}
case <-m.mineNowCh:
	go m.mineKnownKeystones(ctx)
case <-time.After(15 * time.Second):
This is a giveaway that there is a chance a logic race exists somewhere in this code.
@@ -12,6 +12,7 @@ import (
	"fmt"
While I cannot put my finger on it, it seems not unlikely that there is a logic race/error somewhere in this code. I would create a test that has hundreds of goroutines attacking the pipeline and see what go test -race comes up with.
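A sketch of the kind of stress test being suggested, with hypothetical helpers (newTestMiner, AddL2Keystone) standing in for whatever actually feeds keystones into the pipeline; the L2BlockNumber field is used here only to make each keystone distinct.

func TestKeystoneBufferConcurrency(t *testing.T) {
	m := newTestMiner(t)

	var wg sync.WaitGroup
	for i := 0; i < 500; i++ {
		wg.Add(1)
		go func(height uint32) {
			defer wg.Done()
			// hammer the pipeline from many goroutines at once
			m.AddL2Keystone(hemi.L2Keystone{L2BlockNumber: height})
		}(uint32(i))
	}
	wg.Wait()
}

Running it with go test -race ./service/popm/... would exercise the buffer and the requiresProcessing handling under heavy concurrency, as the comment suggests.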
I may not be fully appreciating what is going on, but let's spend a minute talking through this PR.
service/popm/popm.go
Outdated
bfgCmdCh:       make(chan bfgCmd, 10),
holdoffTimeout: 5 * time.Second,
requestTimeout: 5 * time.Second,
mineNowCh:      make(chan struct{}, 1),
l2Keystones:    make(map[string]L2KeystoneProcessingContainer),
These really should have an initial size
yeah I can do that; I can make it the size of the max keystones (10)
done 👍
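For reference, the size hint is just the optional capacity argument to make; assuming 10 is the maximum number of buffered keystones, the field from the constructor above becomes:

l2Keystones: make(map[string]L2KeystoneProcessingContainer, 10),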
service/popm/popm.go
Outdated
func (m *Miner) mineKnownKeystones(ctx context.Context) {
	keystonesFailed := false

	m.ForEachL2Keystone(func(ks hemi.L2Keystone) error {
Why is this not a proper for loop?
I wanted ForEachL2Keystone to handle the locking and copying; I could pull it out and make it a for loop if you want.
changed it to a for loop 👍
service/popm/popm.go
Outdated
func (m *Miner) l2KeystonesForProcessing() []hemi.L2Keystone {
	m.mtx.Lock()

	copies := []hemi.L2Keystone{}
This should be allocated with make outside of the mutex.
sure 👍
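A sketch of the suggested shape, assuming the container exposes the keystone via a hypothetical l2Keystone field and that 10 remains the buffer cap: the slice is allocated with make before the lock is taken, so the allocation never extends the critical section.

func (m *Miner) l2KeystonesForProcessing() []hemi.L2Keystone {
	// allocate outside the critical section
	copies := make([]hemi.L2Keystone, 0, 10)

	m.mtx.Lock()
	defer m.mtx.Unlock()
	for _, v := range m.l2Keystones {
		copies = append(copies, v.l2Keystone)
	}
	return copies
}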
service/popm/popm.go
Outdated
case <-ctx.Done():
	return
case <-time.After(l2KeystoneRetryTimeout):
	go m.mineKnownKeystones(ctx)
You should not recursively spawn go calls. And can we get rid of the channel?
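One way to address the recursion, sketched with a hypothetical keystoneRetryLoop name and assuming mineKnownKeystones is safe to call inline: let a single select loop drive retries from a ticker, so nothing recurses and no extra goroutines are spawned.

func (m *Miner) keystoneRetryLoop(ctx context.Context) {
	ticker := time.NewTicker(l2KeystoneRetryTimeout)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-m.mineNowCh:
			m.mineKnownKeystones(ctx) // inline call, no go recursion
		case <-ticker.C:
			m.mineKnownKeystones(ctx)
		}
	}
}

The mineNowCh case is kept here only to show where an immediate trigger would fit; if the periodic retry alone is enough, that channel could be dropped as well.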
* keep buffer of 10 newest keystones known
* upon receiving a keystone, only insert into buffer if it is newer than the oldest keystone
* if buffer is full, drop oldest keystone upon inserting another
* when a keystone is mined, mark it as "processed" to not repeat mining same keystone
* generate fewer btc blocks in full e2e network test since we have a retry strategy
Force-pushed from c25ed44 to 4927945
* retry keystone on failure
* keep buffer of 10 newest keystones known
* upon receiving a keystone, only insert into buffer if it is newer than the oldest keystone
* if buffer is full, drop oldest keystone upon inserting another
* when a keystone is mined, mark it as "processed" to not repeat mining same keystone
* generate fewer btc blocks in full e2e network test since we have a retry strategy
* use requiresProcessing so multiple goroutines don't process the same keystone at once
* just copy the keystone for processing, not the whole buffer element
* squash
* map attached to miner
* better wording
* don't retry l2 keystone broadcast in tests
* one retry per loop
* do not remove from map; mark as processed or processing
* use one flag
* remove foreach and set size
* update comment
* make and other timer
* ok check in test
* added test case
* fix start height since we generate fewer
* formatted declaration
* removed extra line
Summary
We need a way to retry mining keystones if they fail; this PR provides a solution.
Fixes #7
Changes