So You Still Want to Use a Price Oracle
- Price oracles must be both accurate and timely, but often those two goals directly conflict.
- In order to achieve both goals, we propose that price oracles align incentives internally, by having oracle operators stake reputation, collateral, and trading capital on the integrity and timeliness of the data.
- This framework can be applied to the separate cases of the LIBOR index and the on-chain Pyth oracle, to understand which elements are designed well and which elements require further work.
In an excellent post, "So you want to use a price oracle," samczsun offers an exhaustive summary of oracle exploits. In his numerous examples, the oracle reports a stale, incorrect, or manipulated price that an attacker subsequently exploits. Since the article was published (in late 2020), the number of high-profile oracle failures has only grown, e.g. Inverse Finance, Venus, Fortress, and more.
But the primary solution that the article offers — introducing delays (e.g. reporting delays, trading delays, time-weighted prices, etc) — is too narrow. As financial applications in crypto grow more sophisticated, they also have grown more latency-sensitive.
In this article, we look at the problem at a foundational level to identify alternate ways forward. At their core, oracles convey information that has either high integrity or low integrity. Delays are one way of improving the integrity of information, by giving users and markets time to correct untruthful information. But they come with high costs, because they make that same information less useful. Fortunately, they are not the only way to preserve integrity.
The other way is skin in the game, i.e. "staking" in a broad sense. If designed correctly, such systems ensure that information has high integrity because those who provide it suffer economic or reputational consequences when it is incorrect. This can be accomplished by staking reputation, staking collateral, or implicitly staking trading capital.
These principles can be applied to understand the (in)famous example of LIBOR. The mechanism to generate LIBOR took only limited steps to align incentives, and this was likely its undoing. A system that deployed skin in the game more extensively could have performed better.
Here at Jump Crypto, we contribute to an oracle protocol known as the Pyth network ("Pyth"). So it should come as no surprise that we’ve helped Pyth operationalize these principles, to distribute data in a low-latency and high-integrity manner. There are many real-world constraints that preclude Pyth from doing this perfectly, and that force it to add some additional defenses. But the importance of skin in the game in Pyth’s implementation and whitepaper alike excites us.
What is a Price Oracle?
A price oracle is a source of price data streamed onto the blockchain. Oracles bridge the off-chain world with the blockchain, by publishing off-chain price information on-chain.
In many cases, this may seem unnecessary, as price information often already exists on-chain. For instance, one can figure out prices by querying decentralized exchanges (DEXes). Integrating the on-chain source requires no extra lift from an infrastructure or coordination perspective, and the risk of a delay or an off-chain failure is minimized compared to an oracle.
However, DEXes often have relatively low liquidity compared to centralized exchanges (CEXes). Thus, DEX prices may not be representative and are often more easily manipulated than CEX prices. Using an oracle resolves these issues, as the oracle brings the high-quality information from CEXes onto the chain.
Oracles, though, introduce their own vulnerabilities. As the many examples of oracle failures illustrate, they can be manipulated and they can make mistakes. This issue is the heart of samczsun’s piece and ours alike.
Delays and New Attack Vectors
Given these risks, samczsun proposes time delays, among other solutions, to address oracle failures. The article, for instance, highlights the Uniswap V2 TWAP (time-weighted average price) oracle as an illustration of an oracle robust to price manipulation. The reference implementation of Uniswap’s oracle averages prices over a twenty-four-hour window, meaning that short-lived shocks to the price are largely ignored and even a large and sustained shock (e.g. 20% for an hour) will move the oracle price by less than 1%.
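The arithmetic behind that last claim can be checked in a few lines. The sketch below assumes a plain arithmetic time-weighted average and a price that sits 20% above its base level for one hour of the window; the function name and its parameters are ours, chosen for illustration, and are not taken from Uniswap's code.

```python
def twap_after_shock(base_price, shock_pct, shock_hours, window_hours=24.0):
    """Arithmetic TWAP over a window in which the price spends `shock_hours`
    at base_price * (1 + shock_pct) and the rest of the time at base_price."""
    shocked_price = base_price * (1 + shock_pct)
    calm_hours = window_hours - shock_hours
    return (base_price * calm_hours + shocked_price * shock_hours) / window_hours


base = 100.0  # hypothetical starting price
twap = twap_after_shock(base, shock_pct=0.20, shock_hours=1.0)
move = twap / base - 1  # relative move of the oracle price
print(f"oracle price moves by {move:.2%}")
```

The move works out to 0.20 / 24, or roughly 0.83%, which is why the oracle barely registers the shock.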
This is a feature, but perhaps also a bug. When the price falls because of legitimate reasons (e.g. increased selling pressure), the oracle price will lag the "true" price of the asset significantly. Any protocol dependent upon the oracle will be using stale prices for its logic. Thus, delays are reasonable when the asset class is stable, and unreasonable when the asset class is volatile.
But crypto, of course, is still a volatile asset class, with frequent intraday spikes and drops. This problem gets worse when certain tokens face sudden squeezes, crashing or skyrocketing in price. Delayed oracles in this context can be disastrous.
The twenty-four hour oracle may be an extreme and admittedly stylized case, but surprisingly this assessment even holds for shorter and seemingly reasonable timeframes. For instance, consider the price of ETH in mid-August 2022. Even using a fifteen-minute TWAP leads to a significant lag of the oracle price behind the price on a centralized exchange, as the graph below shows. In this example window, there is an average difference of $2.34 between the CEX price and the oracle price, i.e. a 0.15% divergence. This average alone is greater than many protocols' fee margins — but the divergence gets particularly magnified during volatile periods. For instance, in the one-hour window from 19:30 to 20:30 GMT below (a sustained sell-off in ETH), the average divergence is $6.35, i.e. 0.40%. Arbitrageurs thus can earn outsized profits, particularly during tumultuous periods, entirely from LPs that finance pools using delayed oracles.
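To make the lag mechanism concrete, here is a minimal sketch on synthetic data. It does not use the Binance series behind the figures above; it assumes a hypothetical sell-off in which the spot price falls by exactly $1 per minute, and it compares spot against a trailing fifteen-sample average. The helper name `trailing_twap` is ours.

```python
from collections import deque


def trailing_twap(prices, window):
    """Yield the mean of the most recent `window` equally spaced prices."""
    recent = deque(maxlen=window)
    total = 0.0
    for price in prices:
        if len(recent) == window:
            total -= recent[0]  # this sample is evicted by the append below
        recent.append(price)
        total += price
        yield total / len(recent)


# Synthetic, illustrative data only: one price per minute, falling $1 each minute.
spot = [1600.0 - minute for minute in range(60)]
twap = list(trailing_twap(spot, window=15))

# Positive gap means the oracle price is stale and too high during the sell-off.
gaps = [oracle - market for market, oracle in zip(spot, twap)]
print(f"gap once the window is full: ${gaps[-1]:.2f}")
```

In this stylized case the oracle sits a constant $7 above spot once the window fills, i.e. half the window length times the rate of decline. A faster sell-off or a longer window widens the gap proportionally.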
Such a significant difference between the true and delayed prices over tiny timeframes indicates the importance of latency. Oracles built around TWAPs or other delays are unusable for real-time financial applications. Regardless of the context, on-chain liquidity providers, traders, and protocols cannot afford to transact at stale prices. DEXes will trade at systematically unfair prices and the protocol may leak profits to arbitrageurs. Liquidations on a lending protocol will not happen quickly enough and the protocol may end up holding onto deteriorating assets.
To be clear, samczsun’s motivation for introducing delays is reasonable, given the plethora of oracle price manipulation exploits in DeFi. But the cure can be worse than the disease here, particularly in markets with high volatility.
Staking and the Integrity of Information
Introducing delays in oracles is one way to filter away low-quality information — although in a crude way that also makes the high-quality information less useful. But there is a more elegant way: holding those who power the oracle responsible for the quality of the data. If done correctly, this generates high-quality data that can be distributed without any delay.
This is known as "skin in the game," which we also refer to as "staking." Crypto users are likely familiar with staking in its canonical form: some party posts funds as collateral that can be seized if they perform maliciously or poorly, to align incentives. Staking is common to how Layer 1 networks manage their validators. But the broader principle of staking — wagering something valuable to commit credibly to good behavior — can extend to oracles in three ways: reputational staking, staking of collateral, and staking by trading.
First, an oracle operator could operationalize the simplest form of staking: staking its reputation. In its simplest form, this happens automatically — a bad oracle will attract negative press and eventually be abandoned. The mechanism is not novel or specific to blockchain, and every business implicitly participates. For instance, if the local sandwich shop uses poor-quality ingredients, we will note that on our Yelp reviews and tell our friends. But in the same way, this is not a particularly strong mechanism. By the time an oracle gets a bad reputation, it may have caused substantial damage.
But an oracle operator that is powered by a collection of data providers can implement this mechanism on its upstream contributors more successfully. In particular, oracle operators can make provider contributions transparent, so that individual contributors that perform poorly will do so in full public view (presumably without affecting the aggregate price substantially).
Second, an oracle operator could operationalize staking in the most literal sense. In particular, an oracle operator could post collateral that can be seized if it reports incorrect prices (known as "slashing"). Alternatively, an oracle operator that is powered by several data providers could force those contributors to post collateral. The setup is powerful for explicitly aligning incentives. Wary of having collateral seized, the contributor or operator is incentivized to take efforts to filter for high-quality information more aggressively. As an added bonus, the funds that are seized in an attack can be used to partially compensate for damages.
This is a powerful mechanism, if it can be operationalized successfully. However, it requires an adjudication mechanism that punishes incorrect contributors and operators automatically. As it is, most Layer 1 networks have not implemented this feature yet for their validators, and oracle networks may face similar development challenges. Such an adjudication mechanism requires some independent, or at least robust, notion of ground truth against which to score oracles.
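As a rough illustration of what such an adjudication rule might look like, consider the sketch below. Everything in it is hypothetical: the `Publisher` type, the 1% tolerance, and the 10% slash fraction are placeholders, and the `reference` price is simply assumed to exist, which is precisely the hard part in practice.

```python
from dataclasses import dataclass


@dataclass
class Publisher:
    name: str
    stake: float  # collateral posted up front


def adjudicate(publisher, reported, reference, tolerance=0.01, slash_fraction=0.10):
    """Slash part of the stake when a report strays too far from the reference.

    Returns the amount seized (0.0 if the report is within tolerance).
    `reference` stands in for a trusted notion of ground truth.
    """
    deviation = abs(reported - reference) / reference
    if deviation <= tolerance:
        return 0.0
    penalty = publisher.stake * slash_fraction
    publisher.stake -= penalty
    return penalty


operator = Publisher(name="example-operator", stake=1_000.0)
small_miss = adjudicate(operator, reported=100.5, reference=100.0)  # within 1%
large_miss = adjudicate(operator, reported=110.0, reference=100.0)  # 10% off
print(small_miss, large_miss, operator.stake)
```

The seized amount could then be routed to users harmed by the bad price, in line with the compensation idea above.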
There is a third and under-appreciated form of skin in the game in the context of oracles: staking by trading. In particular, oracle operators can vouch for the quality of prices because they are trading on those same prices, i.e. "staking" or betting capital on the integrity of the data. Oracle operators are incentivized to filter for high-quality data so that they can trade successfully, and — as a byproduct of that incentive alignment — create high-quality data to publish.
Indeed, samczsun does note this solution in the context of individual protocols or traders: "the best way to know for sure the exchange rate between two assets is to simply swap the assets directly." But this principle can be extended to oracle operators directly. By "staking" capital (under this much broader definition), there is a natural synergy between traders and oracle operators to enhance data quality, without compromising on its speed and usefulness.
As always, the trick is operationalizing this successfully. First, traders do not want to give up their edge, and so they may want to publish that data only after executing their trades. If that lag is long, the oracle risks becoming high latency again. Second, there needs to be some enforcement mechanism to ensure that oracle operators do not misrepresent their contributions as true prices when they are not. Thus, this mechanism works best when coupled with the other ones.
Case Study: LIBOR
Aligned incentives are not merely desirable for a low-latency product; they are necessary. Indeed, misaligned incentives in information reporting can bedevil even the most storied of financial oracles. In particular, the manipulation of the London Interbank Offered Rate (LIBOR), once the most consequential global benchmark for interest rates, offers a cautionary tale.
As a quick refresher, LIBOR represented the (theoretical) rate at which the largest banks could obtain short-term unsecured loans from one another. Every day before 11:30am GMT, approximately sixteen reputable banks with operations in London were polled. They were asked the following question, across multiple currencies and maturities: "At what rate could you borrow funds, were you to do so by asking for and then accepting interbank offers in a reasonable market size just prior to 11am?" The results went through a light layer of post-processing, in which the four highest and four lowest responses would be culled. The average of the remaining submissions was published at 11:30am GMT as that day’s LIBOR benchmark.
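The post-processing step amounts to a trimmed mean, sketched below for a single currency and maturity. The sixteen submissions are made-up numbers in basis points, and the function name is ours.

```python
from statistics import mean


def trimmed_benchmark(submissions, trim=4):
    """Average what remains after culling the `trim` highest and lowest values."""
    if len(submissions) <= 2 * trim:
        raise ValueError("need more submissions than the number being culled")
    ordered = sorted(submissions)
    return mean(ordered[trim:len(ordered) - trim])


# Hypothetical panel of sixteen submissions, in basis points (illustrative only).
panel_bps = [240, 242, 244, 245, 246, 247, 248, 249,
             250, 251, 252, 253, 255, 257, 260, 265]

benchmark = trimmed_benchmark(panel_bps)
print(f"published rate: {benchmark} bps")  # average of the middle eight
```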
The importance of LIBOR cannot be overstated. At its peak, LIBOR served as the benchmark rate for upwards of $300 trillion notional in contracts, ranging from conventional auto loans and household mortgages to more complex instruments like swaps and futures. Many saw the rate as the center of finance, which added to its sense of invulnerability. Inasmuch as there were concerns about individual banks gaming it, the layer of post-processing was considered a suitable defense.
But in 2012, it was alleged — and confirmed later — that several of the reporting banks had been colluding for years to collectively misreport LIBOR submissions, for the benefit of their own proprietary trading positions. This scandal spawned record fines and criminal prosecutions, and doomed the rate’s hegemony. After fifty-two years of existence, LIBOR "officially" was retired for new contracts at the end of 2021.
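A small simulation suggests why culling the extremes was a weaker defense than it looked. The sketch below uses made-up submissions in basis points and hypothetical helper names. It compares a single bank misreporting wildly with five banks each nudging their submissions by a modest amount.

```python
from statistics import mean


def trimmed_benchmark(submissions, trim=4):
    """Average what remains after culling the `trim` highest and lowest values."""
    ordered = sorted(submissions)
    return mean(ordered[trim:len(ordered) - trim])


def misreport(panel, colluders, bump):
    """Copy of `panel` in which the banks at the given indices add `bump`."""
    return [rate + bump if i in colluders else rate for i, rate in enumerate(panel)]


# Hypothetical honest submissions from sixteen banks, in basis points.
HONEST_BPS = [240, 242, 244, 245, 246, 247, 248, 249,
              250, 251, 252, 253, 255, 257, 260, 265]

baseline = trimmed_benchmark(HONEST_BPS)
# One bank overstates by 50 bps; it is culled, but it still displaces a neighbor.
lone_shift = trimmed_benchmark(misreport(HONEST_BPS, {6}, 50)) - baseline
# Five banks each overstate by only 10 bps.
cartel_shift = trimmed_benchmark(misreport(HONEST_BPS, {4, 5, 6, 7, 8}, 10)) - baseline
print(f"lone bank: {lone_shift} bps, five banks: {cartel_shift} bps")
```

With these numbers, the lone bank moves the published rate by under 1 bp despite its extreme submission, while the five coordinated banks move it by more than 5 bps with submissions that look unremarkable individually.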
Viewed through the lens of staking, though, LIBOR’s sudden downfall seems less surprising. In particular, three simple but key design choices underpinning LIBOR’s mechanism blunted the alignment of incentives.
- Participants were not staking collateral on the accuracy of the information, which mitigated the financial consequences of misreporting rates. In fact, many of the contributing banks were staking capital on the outcome of LIBOR instead, which provided strong financial incentives to misreport LIBOR in predictable ways.
- Participants were anonymous (in the aggregate), which diminished their reputational skin in the game. While the LIBOR contributors were publicly known, each contributor’s individual submission was anonymous. This allowed contributors to avoid day-to-day reputational blowback and public scrutiny when manipulating the daily rates.
- Participants were not reporting rates based on their trades, and were explicitly reporting based on hypothetical scenarios. This meant that they were not wagering capital on their submissions, even indirectly, which further distanced reported LIBOR from its true value.
Fortunately, the replacements for LIBOR, e.g. the Secured Overnight Financing Rate (SOFR) in the US, explicitly fix these design choices. In particular, SOFR is based on actual transaction data rather than hypothetical data, which acts as a "staking by trading" mechanism. Moreover, it is computed directly by the Federal Reserve Bank of New York. This agency has strong oversight and regulatory powers, which further disincentivizes banks from manipulating the rate via spurious trading. This is a particularly strong version of the reputational staking mechanism.
Case Study: Pyth
Oracles that have high latency or misaligned incentives are worrisome. But can oracles pull off the balancing act of low latency and high-quality data?
We at Jump Crypto have been longtime contributors to the Pyth network (or "Pyth" for short), an oracle network composed of over sixty-five publishers. These publishers stream prices to Pyth, which combines them into an aggregate and publishes it on-chain. We believe Pyth makes a good effort at operationalizing the staking principle, to achieve these twin goals.
To be clear, we do not intend to shill Pyth in this article. Pyth remains a work-in-progress, and there are many aspects that must be developed further. But we respect that Pyth implements several forms of incentive alignment, and we plan to help Pyth develop these efforts further in the coming months.
Returning to the forms of staking, Pyth actually implements all three forms. Of course, Pyth does so imperfectly, constrained by real-world factors.
- For staking via reputation, Pyth does a moderate job. On the plus side, Pyth is highly public with the aggregate set of its publishers. However, Pyth does not doxx individual addresses, which limits the public attention and press that bad contributors may suffer. It is possible that this mechanism may weaken in the long run, if Pyth transitions to permissionless publishing.
- For staking via collateral, Pyth’s mechanism is limited in the status quo but it has high future potential. In the short run, Pyth has no particular mechanism. However, a core component of Pyth’s whitepaper is to reward publishers that bring novel and correct information for the price, and slash publishers that bring incorrect information. This explicit adjudication mechanism will be novel for oracles, if it launches as described.
- Similarly, for staking via trading, Pyth’s mechanism has high potential but must develop further. On the plus side, many of Pyth’s data providers are high-frequency trading firms who contribute prices as a byproduct of their trading operations. But there are two potential vulnerabilities. First, such firms must be willing to report those prices to Pyth sufficiently quickly. Currently, these firms trade so quickly that they can reasonably execute their trades first and deliver their prices to Pyth second in the same block. (Even the fastest block speeds in crypto, four hundred milliseconds on Solana, are considered very slow by high-frequency trading firms, which can execute trades within a few milliseconds.) But if crypto block speeds improve, this may become an issue. Second, there needs to be a mechanism, via either incentives or penalties, to validate those self-reported prices. Otherwise, Pyth runs the risk that publishers misrepresent publicly collected prices as the ones that they traded on.
While this article focuses on staking and incentive alignment on delivering high-quality data, oracles need to plan for a wider range of conditions via additional lines of defense. As one example, oracles may face periods in which there is simply uncertainty on the true price, even if all publishers are acting with the best of intentions. Pyth does so by using confidence intervals to reflect that uncertainty. For instance, consider the Terra de-peg episode, in which the consolidated price of LUNA was simply unknown at several points in time, given the substantial variation in prices across the major CEXes. Pyth handled that uncertainty and heterogeneity by widening confidence intervals around its reported price.
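To give a feel for how an interval can widen with disagreement, here is a deliberately simplified sketch. It is not Pyth's aggregation algorithm: as a stand-in, we assume the aggregate is the median of publisher prices and the confidence is the median absolute deviation around it. The prices are made up.

```python
from statistics import median


def aggregate_with_confidence(prices):
    """Return (aggregate, confidence) for a list of publisher prices.

    Stand-in rule, assumed for illustration: median price, with the median
    absolute deviation from it as the width of the confidence band.
    """
    if not prices:
        raise ValueError("no publisher prices")
    mid = median(prices)
    spread = median(abs(price - mid) for price in prices)
    return mid, spread


# Hypothetical publisher quotes in a calm market and during a dislocation.
calm = [100.0, 100.1, 99.9, 100.05, 99.95]
stressed = [60.0, 75.0, 90.0, 55.0, 80.0]

calm_price, calm_conf = aggregate_with_confidence(calm)
stress_price, stress_conf = aggregate_with_confidence(stressed)
print(f"calm: {calm_price} +/- {calm_conf:.2f}")
print(f"stressed: {stress_price} +/- {stress_conf:.2f}")
```

A consuming protocol could then treat a wide band as a signal to pause, widen its own margins, or decline to act on the price.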
This is but one example of the complex realities that an oracle network must face. Skin in the game can go a long way to deliver high-quality and low-latency data. But in practice, any network must add several further defenses and failsafes.
Given the large sums in DeFi, a good price oracle needs to be accurate and timely. For too long — both before and after samczsun’s article — oracles have suffered on accuracy. Delays ease that, but the lack of timeliness introduces its own vulnerabilities.
Instead, we believe successful oracles will go back to the fundamentals of crypto, and align incentives correctly. Between staking reputation, collateral, and trading capital, oracles can achieve those twin objectives of accuracy and timeliness in parallel.
Good oracles, of course, won’t stop there. They must think through the aggregation approaches, the robustness to market stress, and the ultimate guardrails in case of failures. Good oracles understand that many of these choices are highly context-dependent (e.g. asset class, frequency of updates, etc). And in turn, protocols that integrate with oracles must think about both these core principles and the additional safeguards. If this is done correctly, the end users of crypto can harness the power of DeFi coupled with the speed and reliability of traditional markets.
Please let us all know what we got wrong or missed, as we would like to understand this subject matter thoroughly and correctly. Thanks to the research team at Jump Crypto and especially to Jayant Krishnamurthy, Michael Setrin, Jeff Bezaire, and Ben Huan for feedback. This note does not constitute financial advice.
There are other solutions mentioned, e.g. M-of-N reporters. These are more promising, although they require some trust assumptions or suitable incentive alignment. Indeed, this article discusses the latter as a fairly powerful principle, which can be utilized directly without necessarily needing "N" reporters. ↩︎
We compute prices from Binance based on one-second candlestick bars, which in turn come from tick-level data. The closing prices of the candlesticks are used to construct this graph, although using the opening or average price yields similar results. ↩︎
There are additional concerns around TWAP-based oracles that price ETH following the transition to proof of stake. In particular, the increased predictability around block assignment will allow adversarial validators to sustain manipulated prices for longer periods of time, weakening the security guarantees of TWAPs. ↩︎
In some ways, participants did stake collateral after the fact, given the large fines that banks paid for misreporting rates. However, this collateral staking was done on an ex-post rather than ex-ante basis. ↩︎
Admittedly, the secrecy did serve some other purpose. In particular, the British Bankers Association kept this information secret so that the public could not speculate on a bank’s solvency risk. ↩︎
Pyth further plans to build an insurance mechanism, which will add further capital to this staking mechanism and better compensate users affected by incorrect prices. However, these plans remain more nascent than the mechanism that binds directly on publishers. ↩︎
This may pair well with Pyth’s anticipated rewards mechanism. With such a mechanism in place, publishers are directly incentivized to publish the prices they actually traded at, to maximize their share of the rewards. This is because their trade prices are the most indicative source of price information at their disposal and would help them boost their rewards. ↩︎