Skip to content

0.7 Livelock Scenario

Jae Kwon edited this page Jan 23, 2019 · 4 revisions

This is Zarko's writeup of the 0.7 livelock scenario. If Byzantine validators can cause a select few honest validators to lock on the proposed block at just the right time, then they can ensure that <= 2/3 honest voting power remain unlocked. If also, honest validators never propose a block that is already locked on by other honest validators (thus never getting the requisite > 2/3 prevotes for a POL aka PolC) before the attack, then Byzantine validators can prevent termination.

The problematic scenario is sketched in the attachment (sorry for quality, didn't have time to do it properly).

We consider system of 4 processes, p1 to p4, where p4 is Byzantine process. The round number 1 has p1 as a proposer, and as it happens before GST only p1 locks value v1 in this round. Stating with round number 2, all rounds happen after GST, so communication between correct processes is reliable and timely, i.e., all correct processes receive messages from all correct processes. Note that after GST we don't have the same guarantee on messages sent by Byzantine processes, i.e., after GST, a Byzantine process can send a message only to a subset of correct processes.

In round 2, p2 proposes (v2, -1) as it hasn't locked any value. p1 rejects this proposal as it has v1 locked in round 1. p3 accepts the proposal, but as p4 stay silent, no process lock a value in round 2. So at the end of round 2, we have still only p1 locking v1.

In round 3, p3 proposes (v3, -1) as it hasn't locked any value. p1 rejects this proposal as it has v1 locked in round 1. p2 accepts the proposal, and p4 also accepts the proposal, but send PREPARE message only to p3. So in the round 3 only p3 locks v3. At the end of the round 3, p1 has locked v1 with timestamp 1, and p3 v3 with timestamp 3. p2 hasn't locked any value.

In round 4, p4 is a proposer and as it is Byzantine he stays silent, so nothing change.

In round 5, p1 proposes (v1, 1) . p3 rejects this proposal as it has v3 locked in round 3. p2 accepts the proposal, and p4 also accepts the proposal, but send PREPARE message only to p1. So in the round 5 only p1 locks v1. At the end of the round 5, p1 has locked v1 with timestamp 5, and p3 v3 with timestamp 3. p2 hasn't locked any value.

In round 6, p2 proposes (v2, -1) as it hasn't locked any value. p1 rejects this proposal as it has v1 locked in round 5. p3 rejects the proposal as it has locked v3 in round 3, and p4 stays silent, so no process lock a value in round 2. At the end of the round 6, p1 has locked v1 with timestamp 5, and p3 v3 with timestamp 3. p2 hasn't locked any value.

In round 7, p3 proposes (v3, 3). p1 rejects this proposal as it has v1 locked in round 5. p2 accepts the proposal, and p4 also accepts the proposal, but send PREPARE message only to p3. So in the round 7 only p3 locks v3. At the end of the round 3, p1 has locked v1 with timestamp 5, and p3 v3 with timestamp 7. p2 hasn't locked any value.

This scenario repeats forever, so the algorithm never terminates.

The solution that Tendermint has implemented to mitigate this live-lock scenario is the ValidBlock/ValidRound mechanism, which restricts what an honest validator can propose -- the last valid proposal w/ POL that was discovered via gossip.

From Ethan:

The key is that we have to wait the precommit timeout before going to next round, which ensures we have time to hear about the block that may have gotten a polka, so it can be proposed in the next round.