Bridging and Finality: Ethereum
Mar 03 2023 _ 12 min read
Post-merge, Ethereum’s consensus protocol involves the composition of two heterogeneous protocols, and as such, there is a wide design space for developers to consider when building cross-chain. As we laid out in our first post in this series, the core tradeoff cross-chain applications must design for is the one between reversion risk and latency. For each stage we delineate below, we’ll try to give a meaningful picture of where it lies along this tradeoff by providing some empirical observations, characterizing the attack surface and offering some references for further reading.
Ethereum’s consensus model, as well as the history of its development, is well documented, so for this post we’ll assume working knowledge of the main concepts—if you’re not yet familiar, the work of a thorough explanation is best left to the Gasper paper. Although we’ll give brief recaps of the core components of Ethereum’s consensus, this post is intended to act as a field guide for cross-chain development, giving quick references for the distributions across latency and the reversion rate we can expect at a given stage in the protocol.
At a very high level, Ethereum’s consensus is based on a set of staked validators who produce blocks while attesting to and determining their view of the canonical chain through two separate mechanisms. The first mechanism, a fork choice rule called LMD GHOST, provides a way for full nodes to resolve the ambiguous tree of blocks that arises from asynchronous block production and voting, allowing validators to determine which chain to propose new blocks on top of. At checkpoints in the block production schedule, Casper FFG, a BFT mechanism that acts on a longer time scale, initiates a round of voting among the entire validator set that requires a hard threshold of 2/3rds to pass. Once Casper consensus is achieved, the tree is “pruned” down to a single canonical chain up to the checkpoint, while the tree continues to progress beyond it.
An important note to reiterate is that we’re going to be discussing consensus from the perspective of full nodes here: as we discussed in our previous post, finality and block safety guarantees (more generally) are meaningful only in reference to a given view of the chain. When a qualifier isn’t given, the implicit assumption is usually that we’re taking the viewpoint of a full node, but we want to be explicit about this here, since not all bridge designs conform to this assumption. The “lifecycle” we will describe is specific to the view of full nodes, and this is an important distinction because not all of these stages will be visible to light clients or carry the same security assumptions.
A block proposal is the first time a transaction will be visible (from the perspective of consensus) to the full nodes observing the network and provides the first meaningful indication of eventual inclusion in the finalized chain. Confirmation that is close to immediate is clearly an attractive property for bridge transactions to have; however, this comes with an extremely low security threshold and thus is only appropriate to use for applications where reversions are inconsequential to UX or safety.
Under ideal conditions a proposal will come out every slot, which is every 12 seconds [Note that this is when we expect proposals to be produced—network delay is another factor that will determine when a given node will actually see a proposal]. However, while the slot schedule is constant, this doesn’t guarantee that we’ll actually see a new block, as it’s possible that a proposer for a given slot doesn’t actually release one.
Under adversarial conditions, in order to delay block production for N consecutive slots, an attacker would have to compromise the proposers for each of those slots. If we consider a model where an attacker can only accomplish this by directly owning the proposer for a slot, and proposers are selected in proportion to stake, then an attacker holding a fraction p of the stake owns all N proposers with probability of roughly p^N, which drops off exponentially as N increases. However, other cases to consider are ones in which an attacker tries to bribe or DDoS third-party proposers, which are more difficult to cleanly describe or make assumptions about, so there isn’t a clear model we can use to determine an upper bound on these probabilities.
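As a rough illustration of the direct-ownership model above, the sketch below computes the chance that a single entity owns the proposer for every one of N consecutive slots. It assumes each slot's proposer is drawn independently with probability equal to the attacker's share of stake, which simplifies the real selection process; the function name and the 10% stake figure are hypothetical.

```python
SECONDS_PER_SLOT = 12


def prob_controls_consecutive_slots(stake_fraction: float, n_slots: int) -> float:
    """Probability of owning the proposer in each of n_slots consecutive slots.

    Assumption: proposers are sampled independently per slot, in proportion
    to stake. Bribery and DDoS scenarios are deliberately not modeled.
    """
    if not 0.0 <= stake_fraction <= 1.0:
        raise ValueError("stake_fraction must be in [0, 1]")
    if n_slots < 0:
        raise ValueError("n_slots must be non-negative")
    return stake_fraction ** n_slots


if __name__ == "__main__":
    # Hypothetical attacker with 10% of stake.
    for n in (1, 2, 4, 8):
        p = prob_controls_consecutive_slots(0.10, n)
        print(f"{n} slots ({n * SECONDS_PER_SLOT}s without a block): {p:.2e}")
```

The exponential drop-off only bounds this one strategy. It says nothing about correlated honest failures, which the next paragraph covers.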
Another factor to consider here is proposal failures from honest validators—while causes like individual network failures or poor node management are likely to be one-off cases, correlated strings of block proposal failures can occur if there’s a bug in a client implementation used by a large proportion of validators. This has turned out to be a more relevant consideration so far, as we’ve actually seen such cases happen with the beacon chain as described here, although it should be noted that these occurrences were pre-merge and client diversity has improved since then, decreasing the chances of these kinds of correlated failures.
Here’s a short description of the observed distribution of slot delays between proposed blocks since the Merge occurred:
Since the block proposal step happens before any validator attestations can occur, at this point we can’t even be sure if the most recent block proposed will end up in the chain selected by GHOST (or even start out as the canonical head), much less in the finalized chain, so we face reversion risks under the fork-choice rule, which we’ll discuss in the next section. The risk we face related specifically to this stage is a double proposal from a malicious proposer. The odds that an attacker with X% of stake controls the proposer for a given slot are of course just X% (however, it’s probably better to consider the proportion of validators who will make the locally rational decision to accept a bribe to do so).
In practice, there have been fewer than a dozen proposer equivocations on the beacon chain, and all of these occurred before it actually went live with the merge (here is the last one to occur as of the date of publication). Although this rate is extremely low, it’s important to keep in mind that the cost for an attacker to double propose is of course capped at 32 ETH, the stake of a single validator (and it will be much lower if it’s an isolated event).
Although the observed rate is low, it makes for a very weak security assumption: transactions should only be confirmed upon block proposal if reverting them would cause no harm to a protocol or its users. Use cases that might be able to safely use block proposals to confirm bridged transactions are services that don’t carry significant value either directly or implicitly, such as cross-chain emails. More generally, if the state you attest to with a bridge transaction is guaranteed at the contract level to be immutable (though you should be extremely cautious about confirming this assumption), then you won’t incur any harm if your attestation gets reverted, since the state it’s attesting to will still be valid.
Inclusion in GHOST chain
By inclusion in the chain, we specifically mean inclusion in the fork chosen by applying the LMD GHOST algorithm (this is a pre-requisite to achieving finality under Casper). Note that a block that has just been proposed will not necessarily be the head of the chain under this fork choice rule, so this is a stronger condition than block proposal, and even if it does start out there, it can be reverted later as block production and consensus progress.
Although a fork being selected by GHOST doesn’t guarantee it can’t be reverted (it’s just a rule applied to our local view of the chain), as the subtree rooted in our block of interest accumulates attestation weight, we can increase our confidence that it will eventually be finalized down the line, roughly analogously to how we count the number of block confirmations in Nakamoto protocols, which we explored in our last post.
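To make the idea of subtree attestation weight concrete, here is a minimal toy version of an LMD GHOST style fork choice. It is a sketch under simplifying assumptions, not the rule as implemented in clients: it ignores proposer boost, equivocation filtering, and justified-checkpoint filtering, and it breaks ties by block name. The block names, validator names, and function name are all hypothetical.

```python
from collections import defaultdict


def lmd_ghost_head(parents, latest_votes, balances, root):
    """Pick a head by repeatedly descending into the heaviest child subtree.

    parents:      block -> parent block (the root maps to None)
    latest_votes: validator -> the block named in that validator's latest vote
    balances:     validator -> stake weight
    """
    children = defaultdict(list)
    for block, parent in parents.items():
        if parent is not None:
            children[parent].append(block)

    # A vote for a block counts toward that block and all of its ancestors,
    # so weight[b] is the stake whose latest vote lies in b's subtree.
    weight = defaultdict(int)
    for validator, block in latest_votes.items():
        node = block
        while node is not None:
            weight[node] += balances[validator]
            node = parents[node]

    head = root
    while children[head]:
        # Tie-break by block name (an assumption made for determinism).
        head = max(children[head], key=lambda b: (weight[b], b))
    return head


if __name__ == "__main__":
    # Tree: G -> A -> A1 and G -> B
    parents = {"G": None, "A": "G", "B": "G", "A1": "A"}
    balances = {"v1": 32, "v2": 32, "v3": 32}
    votes = {"v1": "A1", "v2": "A", "v3": "B"}
    print(lmd_ghost_head(parents, votes, balances, "G"))  # A1
```

In this toy tree, the subtree rooted at A holds 64 units of weight against 32 for B, so the head follows A down to A1. If two of the three validators later moved their latest votes to B, the same function would return B, which is the kind of reversion discussed in this section.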
As in our discussion in that post, there isn’t a point in this process where we gain absolute certainty that a block won’t be reverted (in fact, rather than leaving it up to a subjective heuristic, Ethereum instead defers the determination of “finality” to Casper), so we can consider the latency/reversion tradeoff in this stage as falling along a gradient according to a block’s subtree weight.
Once again, since there aren’t any specific points in the progression of the chain under the fork choice that provide qualitatively new guarantees, there aren’t particularly meaningful conditions to measure latency against. If we’re interested in the expected time until a given proportion of attestation weight is reached for a given subtree, it’s best to have a rough understanding of the rate at which it can be accumulated. Validators vote according to the scheduled slot of their committees, which are split evenly among the 32 slots in the epoch, so it will take a minimum of a slot (12 seconds) for a subtree to gain 1/32 of the total vote-weight of the validator set. This will progress at a constant rate (subject to the participation rate of each slot’s committee) if a competing fork doesn’t appear.
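The accumulation rate described above can be turned into a quick estimate. The sketch below assumes each slot's committee holds exactly 1/32 of total stake, that a fixed fraction of each committee attests to our subtree, and that no competing fork appears; the function name and the 95% participation figure are hypothetical.

```python
import math

SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12


def slots_to_reach_weight(target_fraction: float, participation: float = 1.0) -> int:
    """Minimum number of slots for a subtree to gather target_fraction of stake.

    Assumptions: equal-weight committees (1/32 of stake per slot), constant
    participation, and no competing fork drawing votes away.
    """
    if not 0.0 < participation <= 1.0:
        raise ValueError("participation must be in (0, 1]")
    if not 0.0 <= target_fraction <= participation:
        raise ValueError("target cannot exceed the participating stake")
    # Small tolerance so exact multiples of 1/32 are not rounded up by float error.
    return math.ceil(target_fraction * SLOTS_PER_EPOCH / participation - 1e-9)


if __name__ == "__main__":
    for target in (1 / 32, 0.5, 2 / 3):
        slots = slots_to_reach_weight(target, participation=0.95)
        print(f"{target:.3f} of stake: {slots} slots, {slots * SECONDS_PER_SLOT}s")
```

With full participation, half of the total weight takes 16 slots (192 seconds) under these assumptions. Lower participation stretches that figure proportionally.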
Generally, the way an attacker can cause a reversion under LMD GHOST is by controlling a proposer for a slot, but withholding their proposal, then later revealing a fork that reverts some honest blocks and throwing their attestation weight onto it. If we consider just the naive attack, then if a given subtree has already accumulated N attestations, an attacker will have to have that same attestation weight in order to fork the network away from it.
However, there are adversarial strategies that improve on this—the high-level takeaway from formulations like the Avalanche attack is that attackers can achieve deeper reversions with their stakeweight through the timing of their reveals and the structure of the alternate forks they produce. Thus, although the difficulty of reverting a block clearly increases along with its subtree attestation weight, it’s difficult to put an exact price on this.
A number of modifications have been made to Ethereum’s fork choice rule that make these attacks less feasible, such as ignoring validators who equivocate with their attestations and proposer boost, which gives extra weight to blocks proposed in a timely manner, but the exact parameters required for a successful attack remain fairly vague.
Although block reversions can also occur among honest validators due to network asynchrony or client partitions (here is an exploration of a 7-block reorg that occurred pre-merge, due to a partially adopted consensus client update), Ethereum’s PoS consensus is very effective at preventing conditions that produce “accidental” re-orgs, having so far proven to be much more stable than under the proof of work regime.
Our own full nodes have experienced about 90 chain re-orgs since the merge, with none of them reverting deeper than a single block. Once again, it’s important not to take this as a prior thoughtlessly and assume that deeper reversions can’t happen, especially if there are specific payoffs to an attacker for doing so. Even once a subtree holds 51% of the total attestation weight of validators, which would prevent any naive attacks, there still isn’t a formal guarantee that it can’t be reverted. For cross-chain applications, the risk of double spend attacks (which are accomplished by causing reversions) is particularly crucial to consider.
Another important consideration here is that bridge transactions can contain unique forms of MEV and thus could provide an extra incentive for attackers to try to re-org the chain. Hence, cross-chain application designers must consider not only the risk of block reversions under GHOST in the general case, but also the potential that the transactions their protocol produces might present (in and of themselves) an incentive for attackers to target their blocks, even if they can’t profit directly through a double spend.
Although this is somewhat of a second-order effect, it’s important to consider since it means that pre-finality confirmations are dangerous not only for cross-chain transactions that directly transfer value (which are susceptible to double spend attacks), but also potentially for transactions that don’t directly involve tokens being sent/received, but that do carry explicit or implicit information that corresponds to value (for example, components of a cross-chain pool sending updates about their liquidity or oracle updates).
Head of Casper chain (Justification)
A block being justified under Casper is analogous to the completion of the pre-vote phase in Tendermint. Under the assumption that at least 2/3rds of the validator set is honest (doesn’t equivocate), there will only be one canonical block that receives enough votes to be a candidate for finality, although it’s also important to consider what can happen if this assumption doesn’t hold. This stage alone isn’t how the condition of finality is formulated for BFT mechanisms such as Casper (which require an additional round of consensus confirming agreement on these votes), but it already provides a strong assurance against reversion, one that is easier to reason about formally than LMD GHOST.
If consensus is running smoothly, then a checkpoint slot will get justified every epoch, which is 32 slots, or 6.4 minutes. Given a randomly selected block/slot, the time to reach one of these checkpoints will have an average of 16 slots, uniformly distributed between 0 and 32.
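The timing figures above follow directly from the slot and epoch constants. The short sketch below reproduces them under the simplified model used in this section, where we only measure the wait until the next epoch boundary and ignore the time needed to collect votes. The function names are hypothetical.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32


def epoch_minutes() -> float:
    """Length of one epoch in minutes."""
    return SLOTS_PER_EPOCH * SECONDS_PER_SLOT / 60


def slots_until_next_checkpoint(slot: int) -> int:
    """Slots from `slot` to the next epoch boundary (0 if it is a boundary)."""
    return -slot % SLOTS_PER_EPOCH


if __name__ == "__main__":
    waits = [slots_until_next_checkpoint(s) for s in range(SLOTS_PER_EPOCH)]
    mean_wait = sum(waits) / len(waits)
    print(f"epoch length: {epoch_minutes()} minutes")
    print(f"mean wait: {mean_wait} slots, {mean_wait * SECONDS_PER_SLOT} seconds")
```

Averaging over the 32 discrete slot positions gives 15.5 slots, which is close to the 16-slot figure for a uniform distribution over the whole epoch.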
Since the merge, Casper has smoothly and successfully kept up with this schedule; however, we can’t be certain that a full round of Casper will always succeed according to this schedule, as this depends on the liveness of at least 2/3rds of the validator set. Even in the scenario where >1/3rd of validators go offline, either due to a correlated outage or a malicious attempt to halt the chain, Casper is designed to eventually progress again, without the need for a hard-fork, through the inactivity leak. Thus, in the worst case, the time for a block to reach justification could stretch out to days or weeks in the event of a major validator outage. This paper shows a more detailed look at attacks that can delay finality, although many of the attack vectors, such as disrupting Casper by effecting long-range re-orgs under GHOST, are once again addressed by practical modifications to Ethereum consensus as it is implemented.
So far, thankfully there hasn’t been a mass equivocation among validators, so the threat of a conflicting Casper fork hasn't yet materialized in practice. Whereas the head of the GHOST chain can see a reversion fully within the protocol rules, with a potentially small subset of validators behaving maliciously, Casper can only produce ambiguous results if a large portion of validators are willing to double-sign votes—it would require 1/3rd of validators at minimum to equivocate in order to accomplish this.
Although there’s no way Ethereum can guarantee that this won’t happen, it does guarantee that if two conflicting Casper chains successfully reach consensus, then at least a third of the total stake, roughly 5.3 million ETH (a third of the ~16 million ETH staked at the current validator set size of ~500,000), will be slashed on both forks. The cases of high-MEV transactions that don’t involve much “direct” value might be able to safely confirm at this stage, thanks to the high price of reversion; however, for applications that involve direct transfers of value, the safer choice is to wait for another round of consensus to achieve finality.
Finality under Casper is the strongest guarantee that Ethereum’s consensus provides, requiring an additional round of voting on top of justification. High-stakes applications where a reversion or a double spend present existential threats should of course require the highest level of safety possible.
If a block being justified can be compared to the completion of the pre-vote phase of Tendermint, then achieving finality is similar to the completion of the pre-commit phase. However, Casper is interesting in that it takes these two voting stages and in some sense “pipelines” them. When validators sign Casper votes, they vote not for individual checkpoint blocks, but for a transition between two checkpoints. The process that justifies blocks is the same one that finalizes them, just phase-shifted back by an epoch. Therefore, the latency between finalizations follows the same distribution as the latency between justifications, and from the perspective of a particular block of interest, the expected time to finality is just the time to justification plus another epoch, adding on another 6.4 minutes. The worst-case consideration is the same as with justification, requiring a wait until the inactivity leak pushes out offline validators.
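The pipelining can be illustrated with a toy model of checkpoint votes. This is a sketch of the basic Casper FFG rule only (a supermajority link from a justified source justifies the target, and finalizes the source when the target is its direct successor). It is not the full rule set used by Ethereum clients, and the function name, stake numbers, and epoch numbers are hypothetical.

```python
from fractions import Fraction

SUPERMAJORITY = Fraction(2, 3)


def process_links(links, total_stake, genesis_epoch=0):
    """Apply a sequence of (source_epoch, target_epoch, voting_stake) links.

    Returns the sets of justified and finalized checkpoint epochs.
    """
    justified = {genesis_epoch}
    finalized = {genesis_epoch}
    for source, target, stake in links:
        has_supermajority = Fraction(stake, total_stake) >= SUPERMAJORITY
        if has_supermajority and source in justified and target > source:
            # The same vote that justifies the target...
            justified.add(target)
            # ...finalizes the source when the two checkpoints are adjacent.
            if target == source + 1:
                finalized.add(source)
    return justified, finalized


if __name__ == "__main__":
    # Total stake of 90 units; the third link falls short of 2/3.
    links = [(0, 1, 70), (1, 2, 65), (2, 3, 50)]
    justified, finalized = process_links(links, total_stake=90)
    print(sorted(justified))  # [0, 1, 2]
    print(sorted(finalized))  # [0, 1]
```

In the example, the link from epoch 1 to epoch 2 justifies epoch 2 and finalizes epoch 1 in a single step, so finality trails justification by one epoch. The failed third link leaves epoch 2 justified but not finalized.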
While it’s technically possible to revert finality with just a third of the validator stake weight, in order to successfully justify and then also finalize two conflicting forks under Casper, the remaining 2/3rds of the validator set would have to be perfectly partitioned in order to prevent the two sides from hearing about the first equivocation and subsequently slashing the attacker before they could equivocate again to reach finality.
More realistically, in order for an attacker to finalize conflicting blocks without depending on a partition among honest validators, they would have to control fully 2/3rds of the validator set. Ultimately, the security of Ethereum comes down to the economic value staked and whether that produces a high enough threshold (over 20 billion dollars at the current price of ETH) to prohibit an actor from controlling that proportion of the validator set.
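For a sense of scale, the stake behind the one-third and two-thirds thresholds discussed in this section can be computed from the validator count. The sketch assumes every validator holds exactly 32 ETH of effective balance and uses the ~500,000 validator figure quoted earlier; the function name is hypothetical, and no ETH price is assumed.

```python
from fractions import Fraction

ETH_PER_VALIDATOR = 32  # assumption: every validator is at the 32 ETH balance


def stake_at_threshold(num_validators: int, threshold: Fraction) -> float:
    """ETH controlled by a given fraction of an equal-balance validator set."""
    return float(num_validators * ETH_PER_VALIDATOR * threshold)


if __name__ == "__main__":
    validators = 500_000  # approximate figure from the text
    print(f"total staked: {validators * ETH_PER_VALIDATOR:,} ETH")
    for label, threshold in (("1/3", Fraction(1, 3)), ("2/3", Fraction(2, 3))):
        eth = stake_at_threshold(validators, threshold)
        print(f"{label} of stake: {eth:,.0f} ETH")
```

Under these assumptions the total is 16 million ETH, with about 5.3 million ETH at the one-third threshold and about 10.7 million ETH at the two-thirds threshold.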
In this post, we’ve explored the tradeoff between latency and reversion risk at the major stages in Ethereum’s current proof-of-stake consensus protocol. In the next post of our series, we’ll explore the safety guarantees of optimistic rollups and some of the subtleties that arise when considering finality there.