object/put: fix concurrent PUT data corruption #3027
943eaaf to a63db8a
@roman-khimov, this is almost the same as in the |
Also we can try node one more time ( |
For now let's just fix the bug without any behavioral changes. This code can and will be improved, but that's a bit different story.
Also, be more aggressive with "close XXX"; we have a number of issues here.
a63db8a to 004fcf0
@roman-khimov, added the issues that seemed relevant to me.
004fcf0 to ea3e03b
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3027      +/-   ##
==========================================
- Coverage   22.85%   22.63%   -0.23%
==========================================
  Files         791      791
  Lines       58734    58493     -241
==========================================
- Hits        13425    13238     -187
+ Misses      44412    44358      -54
  Partials      897      897
e82555a to 06efc8e
Test needs to be fixed.
If the ants pool is busy and cannot take a task, an early `return` without `wg.Wait()` makes `iterateNodesForObject` return, and from that point all the buffers used for binary replication may be reused while they are still in use by other routines inside the pool. Wait for the WG before any `return` is called. Closes #2978, closes #2988, closes #2975, closes #2971. Signed-off-by: Pavel Karpy <[email protected]>
If an object cannot be PUT due to local overload (the i-th routine for an (i-1)-length worker pool), log the error and continue with the other nodes, and even with other placement vectors. `errNotEnoughNodes` will still be returned by the natural replication number handling in the outer `for`. Signed-off-by: Pavel Karpy <[email protected]>
Signed-off-by: Pavel Karpy <[email protected]>
06efc8e to 872baaf
Also, added one more test.
If the ants pool is busy and cannot take a task, an early `return` without `wg.Wait()` makes `iterateNodesForObject` return, and from that point all the buffers used for binary replication may be reused while they are still in use by other routines inside the pool. Wait for the WG before any `return` is called. Closes #2978, #2988, #2975, #2971.