-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Duplicate vote response can make illegal leader without a quorum #27
Comments
Server id is not included in response because it is not necessary. When we create a connection from current node to another node, we are sure the response from that node belongs to the the node we are connecting. If you are talking about the incorrect or duplicated messages, please consider a scenario in what circumstance the node could send duplicated messages. Normally, if you resend something, it means you are waiting for the response and retrying. There is no retrying mechanism from the follower side when sending the RequestVoteResponse, if the message cannot reach the other node, it just is being dropped or ignored. Raft is building on the unreliable and asynchronous-model network to solve the fundamental consensus problem in distributed systems so there is no stable message exchange protocol and Raft is not going to create one. However, if you really want to maintain correctness between two nodes, you can add some check in the connection handler, which should be transparency to Raft side. |
Sorry for my misleading expression. I agree that in my scenario Node 2 cannot response twice for only one request. But for me it is still possible that Node 1 can receive duplicate responses even when Node 2 only sends it once due to OS- or network-level error. Besides, Raft is required to handle with duplicate message. You can find the related statements in Dr. Ongaro's thesis, section 8.1. Also, the given TLA+ specification defines DuplicateMessage behavior which Raft must cover. |
Thank you for pointing out the necessity of handling unintentional duplicated messages. A quick thought is to check if the node has voted for a specific term in the connection handler. e.g. int lastTermVoted = -1;
if(requestVoteResult.term != lastTermVoted) {
lastTermVoted = requestVoteResult.term;
// propagate the message
}
// otherwise just ignore it I'll approve it if you create a pull request for this. |
I dont think this simple fix can help solve the problem, since lastTermVoted cant still be used to distinguish which node the vote is from. I've created a PR with a more complex revision. |
I find another scenario to prove why it is important to distinguish where votes from. |
Thank you. I checked the The only chance that a node will reply more than one response is that a brain-split happens, two candidates ask the same node for voting. Then the node replies to them respectively. In that case, the messages go to different nodes, not the same one. |
Yep. The standard Raft will persist important internal states, i.e., votedFor and term, on disk when they are changed, so that a node could remember his choice even after restart. |
I notice that the message RequestVoteResult does not contains a field for distinguishing which node the result is from.
Besides, when processing RequestVoteResult, the node only checks the count of votes. It means a node can become leader as long as he receives enough granted votes, rather than is supported by enough nodes.
Considering a scenario with 5 nodes, Node 1 becomes Candidate and request vote for Node 2. Node 2 responses two duplicate RequestVoteResult messages (due to network error or some other faults). Then Node 1 can become Leader even when he only receives one node's support!
I think it is a bug.
The text was updated successfully, but these errors were encountered: