Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Without transactions complexity of user-written cross-chain contracts must increase significantly #97

Open
code423n4 opened this issue Aug 3, 2022 · 4 comments
Labels
bug Something isn't working QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax sponsor acknowledged Technically the issue is correct, but we're not going to resolve it for XYZ reasons

Comments

@code423n4
Copy link
Contributor

Lines of code

https://github.com/code-423n4/2022-07-axelar/blob/9c4c44b94cddbd48b9baae30051a4e13cbe39539/contracts/AxelarGateway.sol#L26

Vulnerability details

Impact

This submission is technically out of scope but, given the overall impact, I think it's worth bringing to the attention of the sponsors.

In this submission I will make the argument that the cross-chain architecture that Axelar is proposing significantly increases the difficulty of writing correct smart contracts.

Axelar does not have transactions. One of the reasons that it is relatively easy for people -- of all skill levels -- to write applications in Solidity is because the EVM has transactions.

Simply put, a transaction allows you to either succeed "all at once", or fail and automatically roll back to the original state. There are no intermediate states the the contracts can be placed in. On the Ethereum block-chain a transaction either succeeds and becomes part of a block, or it reverts and does not become part of a block. Most importantly, in the case of a revert, the state of the blockchain is exactly the same as it was before the revert.

However, with the architecture you have chosen this is not the case. Contract calls can fail half way. When they do, they are stuck in an intermediate state and a roll back is required. State changes must occur to put the system back the way it was before. This is far more complex than it first appears.

In the Proof of Concept section I provide some examples that I hope will convince you.

Proof of Concept

NFTLinker example

Take the NFTLinker example. An invariant that must be true is this: a particular NFT can only exist on one chain at a time.

Yet there is nothing within the code of NFTLinker that enforces this. Nothing stops the minting of an NFT on both chains. Consider this scenario:

  • NFT is minted on chain A
  • NFT is minted on chain B
  • sendNFT is called in Chain A
  • the token is burned on chain A on line 58
  • on chain B the _safeMint fails on line 112 because the NFT already exists

One might argue that this is what what should have happened. The NFT on chain A should never have existed because it existed on chain B. But what logic in NFTLinker prevented that? Further, will the owner of the NFT on chain A accept that their NFT never should have existed. Presumably they paid good money for it.

There are two main options for a roll back:

  1. the NFT is re-minted on chain A but users are told they can never send them to chain B.
  2. the NFT remains burned and user on chain A is compensated in some way

In option 1 we must accept that NFTLinker doesn't work as intended because we cannot send it to chain B, and in option 2 we require remediation outside of the scope of the NFTLinker contract. This is defeats the entire purpose of software i.e. automation.

ERC20CrossChain example

I'm going to assume that the ERC20CrossChain contract is a user written contract, owned by someone other than Axelar. Here's a scenario.

  • ERC20CrossChain is deployed by an owner other than Axelar
  • The cross-chain token becomes part of a DeFi app, DeFiApp, that relies on the correct total supply of the token in order to accurately calculate exchange prices. Since the total supply is spread over two chains it does this via some Axelar cross-chain contract calls
  • Let supply on chain A be supplyA
  • Let supply on chain B be supplyB
  • Total supply T == supplyA + supplyB
  • User on chain A calls transferRemote with amount x
  • Until _execute is called on chain B the total supply on both chains is in an intermediate state where T == supplyA - x + supplyB. The invariant has been broken. Worse, there could be several transfers "in flight" at one time, further breaking the invariant.
  • DeFiApp, attempting to calculate total supply, has its cross-chain contract calls occur before ERC20CrossChain._execute. As the architecture is currently described the order in which cross-chain calls will occur cannot be guaranteed. Sometimes they will occur out of order, unless the architecture accounts for this
  • Because of this, DeFiApp produces incorrect exchange prices in favour of a hacker

The worst thing about this scenario is that your micro-services cannot remediate the problem. The instances of ERC20CrossChain and the DeFiApp contracts are not owned by Axelar. Therefore Axelar has no way to change the state of these contracts as part of a roll back. The users who wrote DeFiApp must take this into account, and getting the logic right is very difficult.

Conclusion

With the current architecture developers on the Axelar platform will have to handle all of the increased complexity that comes from not having transactions.

Cross-chain invariants will be required and contracts will need to enforce them. Also, roll backs will need to be correctly handled. Both of these things will make programming a lot harder, far beyond the skill level of most smart contract developers.

This is because:

  • consistency is hard
  • rollbacks are hard

In the database field these lessons have been learned over and over. Google, after many years of experimenting with things like MapReduce and Big Table eventually proposed Cloud Spanner specifically because it made programming easier (see here). They write:

Do you like complex application logic? We don’t either. One of the things we’ve learned here at Google is that application code is simpler and development schedules are shorter when developers can rely on underlying data stores to handle complex transaction processing and keeping data ordered. To quote the original Spanner paper, “we believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”

A database industry veteran, Michael Stonebraker, has spent much of his later career pointing out how people are making the same mistakes that were made back in the 60s.

In 2009, Mike famously criticized MapReduce to the chagrin of the Big Data community, only to be vindicated five years later when its creators disclosed that Mike’s criticism coincided with their abandonment of MapReduce and Hadoop for yet another round of Big Data management solutions of their own making.” — Michael Brodie, “Making Databases Work” Link to article

Recommended Mitigation Steps

  1. Think very carefully about cross-chain invariants and roll-back code that is necessary for the simple examples you have provided, and modify the code to take them into account.

  2. Alert developers on the Axelar platform to the issues that a lack of transactionality imposes on them. Make it clear that developing correct contracts in this environment will be much harder.

@code423n4 code423n4 added 3 (High Risk) Assets can be stolen/lost/compromised directly bug Something isn't working labels Aug 3, 2022
code423n4 added a commit that referenced this issue Aug 3, 2022
@Foivos
Copy link
Collaborator

Foivos commented Aug 4, 2022

While your points are well thought out and very valid, it is hard to simplify things further. We are doing our best to inform developers of the difficulties and help them along, and we are also learning along the way.
As far as your first scenario is concerned, I don't think you understood how the NFT linker really works. There is no such thing as the same NFT on both chains. The tokenId minted by the token linker is a hash of the original chainName, the address of the operator (original minting contract) and the original tokenId. This is as unique as keccak256 is collisionless, so very.
For your other scenario I don't think on-chain calculations of total supply should be done that way. There is a nice design that arrises from your concern however which is to have a crossChainTotalSupply field present in ERC20CrossChain, which is updated on manual mints on the chain it was minted and a message is sent to other chains to update their crossChainTotalSupply as well. This requires that each contract knows all the other chains it supports, and there will be some lag, but it's not a bad solution.
In conclusion, yes designing cross-chain applications is difficult, and we are here to help with it.

@sseefried
Copy link

sseefried commented Aug 4, 2022

I concede that I may not have fully understood the NFTLinker example. The idea of using hashes to avoid collisions is just the kind of thing developers needs to think about when working in an environment where cross-chain invariants are important.

I'm not expecting this issue to be awarded. I concede that it's probably out of scope. But my general point stands. I truly believe that the last few decades of industry experience and academic research strongly support the position that programming without (database-style) transactions is extremely difficult. It is not for the average developer.

This is why it is up to Platform Developers (such as Axelar) to make things easier for the user. If you don't think about this issue carefully I fear that the number of successful hacks on contracts written for the Axelar platform will be very high. Most developers just won't be sophisticated enough to get the logic correct.

@re1ro
Copy link
Member

re1ro commented Aug 5, 2022

I appreciate the concern raised here. But I think the difficulty is in the nature of cross-chain contact calls. They become asynchronous, rather than synchronous solidity calls within a single chain.

We have rolled yet the base level of cross-chain protocol (think of UDP analogy) and now are building more application specific protocols on top of it (TCP or HTTP).

This can be viewed as an application talking to 2+ databases at the same time (2+ blockchains). If the transactional write fails for 1 DB, you would need to revert writes to other DBs manually. But I totally see the point of us providing the interface that does it automatically. So roundtrip verification and things like that are coming in future. This is critical for cross-chain trading and other DeFi protocols

@GalloDaSballo
Copy link
Collaborator

I find it hard to judge the report without my comment basically boiling down to: "Cross-Chain Tech is still experimental, use at your own risk"

I can comment on the specific examples brought by the Warden, however we will ultimately have to admit that the security of a potential system that will be, and while the in-security of the system in scope can be explored via the contest, we cannot make conclusive statements about what an integrating system will look like.

In that vein the Two Rights Might Make a Wrong article may help illustrate that point.

I think I'll leave the finding as QA because the conversation is worth having, but I don't think we will be able to reach any specific conclusion and would rather leave a recommendation to future devs to audit their code so we can help with that specific case at that specific time.

A couple thoughts on the samples:

  • Per the sponsor comment NFTLinker will use a pseudo-unique keccak combination, odds of clashes are 1/(2^256-1) which are pretty good

  • The discussion about a MultiChain token is more broad and interesting, on one hand we have to recognize that an ERC20 on 2 chains is actually just two separate contracts with an added coordination mechanism
    As such (per the article above as well), we cannot expect "per contract" invariants to be maintained when we add multiple contracts as well as a privileged entity (bridge) to the mix.

For those reasons, while a discussion about how tokens can be bridged is worth having, no conclusion about the security of such as system should be speculated until we have a specific system to review.

With that said I appreciate the convo and wish everyone to build the safest system they can

@GalloDaSballo GalloDaSballo added QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax and removed 3 (High Risk) Assets can be stolen/lost/compromised directly labels Sep 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax sponsor acknowledged Technically the issue is correct, but we're not going to resolve it for XYZ reasons
Projects
None yet
Development

No branches or pull requests

5 participants