Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test vectors for SIGHASH_UTXOS #22

Closed
cculianu opened this issue Jul 2, 2022 · 23 comments
Closed

Add test vectors for SIGHASH_UTXOS #22

cculianu opened this issue Jul 2, 2022 · 23 comments

Comments

@cculianu
Copy link
Contributor

cculianu commented Jul 2, 2022

In reference to: https://github.com/bitjson/cashtokens/blob/master/readme.md#sighash_utxos

Rationale:

  • This is a complex change
  • Breaking it out into a separate CHIP allows for separate testing
    • Why? implementation requires some significant changes to codebase, to how node signs, etc. In general signing doesn't take all the input coins as input to the sign functions.
  • By breaking it out as a separate chip more focus can be placed on it and it can be studied and tested more carefully
  • CashTokens is already a huge change, tacking this on as a "hey by the way" is rather significant and not ideal.
@A60AB5450353F40E
Copy link
Contributor

Fyi in a version of Group I wanted to include prevout locking script in the hash for the ID, and realized it could lead to a huge amount of data being hashed when spending non-standard utxos so had to introduce a limit for that version of token genesis.

@A60AB5450353F40E
Copy link
Contributor

A60AB5450353F40E commented Jul 2, 2022

This:

Someone with access to hashpower could prepare outputs with locking scripts of 10,000 bytes and then spend them all as group genesis inputs where every such input would require hashing 10,260 bytes per input.
Consensus limits transaction size to 1MB, and the smallest input is an anyone-can-spend with OP_1 as the signature and having the genesis declaration, which would be 55 bytes.
Smallest output would be an OP_RETURN output script, which would be 10 bytes.
From that we can calculate the maximum number of inputs in a transaction:
(1000000-10)/55 == 18181.
From there, we could calculate the total amount of hashed bytes: 10260*18181 == 178 MiB, which is close to comparable Script vectors such as OP_1 <520> OP_NUM2BIN OP_DUP OP_HASH256 OP_DROP OP_DUP... pattern.
To avoid any risk, we propose to mitigate this risk by prohibiting declaring a group genesis on inputs that have their len(lockingScript) greater than MAX_SCRIPT_ELEMENT_SIZE which is currently 520 bytes.
This would bring the upper bound on hashed bytes down to: 760*18181 == 13.2 MiB, making group genesis a very inefficient way to load node CPUs.

I think it applies the same to SIGHASH_UTXOS

@cculianu
Copy link
Contributor Author

cculianu commented Jul 2, 2022

This is a great point! And precisely why i think it should be broken out into a separate CHIP{ so that all of this can be worked on and specified ...! :)

@cculianu
Copy link
Contributor Author

cculianu commented Jul 2, 2022

@A60AB5450353F40E Here's another funny thing I discovered in the consensus rules -- nowhere in the rules can I see a limit on the size of an output script (in nonstandard txns). The MAX_SCRIPT_SIZE is only applied when spending an input -- and only at the point where you begin to evaluate the script in question! You can actually produce blocks with huge output scripts that exceed MAX_SCRIPT_SIZE. Not sure how this impacts this feature -- likely it doesn't at all, although maybe you can at least trick a node into hashing even more than 178 MiB somehow using that trick? (Of course the txn would fail to be spent anyway eventually when it gets to its input scripts that exceed MAX_SCRIPT_SIZE).

Just another curious thing that may or may not affect that calculation that leads to 178MiB .. but yeah. In principle -- that's a huge amount of data!

@bitjson
Copy link
Member

bitjson commented Jul 8, 2022

Thanks for opening an issue @cculianu! I'd be ok with breaking out SIGHASH_UTXOS into its own CHIP if that would help. I do want to note: SIGHASH_UTXOS is security-critical for CashTokens, and they must be deployed at the same time. Every wallet I've reviewed (with the possible exception of Electron Cash, see several here) will be vulnerable to attacks where their data source(s) can:

  • Burn tokens, and/or
  • Trick the wallet into performing unintended actions in decentralized applications (e.g. Jedex), often allowing tokens to be directly stolen.

I think it would be a serious mistake to allow CashTokens to activate without SIGHASH_UTXOS, as many transactions involving tokens should enable it (for example, many of Libauth's test vectors for CashTokens include signatures with SIGHASH_UTXOS). I also don't really see how a separate CHIP would improve review (I'd hope that reviewers will apply just as much rigor to components of this spec as they would apply if each component got its own document 😅). So I'm inclined to leave it in the primary CHIP.

Also, having multiple documents makes implementation harder for end-readers (in my experience), and deciding where to chop up the spec isn't obvious. E.g. if SIGHASH_UTXOS belongs in its own document, do the Token Inspection Operations? What about CashAddress Token Support or Fungible Token Supply Definitions? I'd really prefer to keep everything in a single document.


@A60AB5450353F40E on hashing: note that this is the purpose of transaction "standardness" – miners can safely include standard transactions in blocks knowing that standard transactions will not require excessive resource usage during validation. As with other nonstandard transactions, the worst-case validation performance you're describing affects only the miner choosing to include the transaction: if they pack their block full of expensive-to-validate transactions, their block will not be accepted fast enough by the network, and it will become stale if another miner finds a competing block that is accepted more quickly by other miners. This is the same reason that several other attacks are not currently practical.

If we were to relax non-P2SH transaction standardness, we'd need to address this and several other issues, but for now, there's no additional value to adding other consensus limits around bytes hashed by SIGHASH_UTXOS; the existing non-P2SH standardness limit is already far more restrictive.

@cculianu
Copy link
Contributor Author

cculianu commented Jul 8, 2022

@bitjson: @A60AB5450353F40E and I both fixed the accidental burn issue. Are you aware that signing a token input now requires additional data in the signature hash? It is impossible for unupgraded software to burn tokens, or for unupgraded software to be tricked or for tokens to be stolen.

Are you aware of the sighash modification that @A60AB5450353F40E and I came up with? It just requires the token data to be committed in the signature hash, basically.

If you are not familiar with it, please familiarize yourself with it.

if you are aware of it, then please enlighten me as to how one can steal tokens or how unupgraded software can burn tokens inadvertently?

@bitjson
Copy link
Member

bitjson commented Jul 8, 2022

Yes! The spec includes a Signing Serialization of Tokens section now. (Fixed in #11. Thank you again for realizing that we don't need to reserve a SIGHASH bit for tokens in general.)

SIGHASH_UTXOS addresses a different set of wallet vulnerabilities arising when transactions with multiple inputs can require multiple signing attempts (the issue I was talking about here).

This was also addressed by BTC BIP341/Taproot's SIGHASH_DEFAULT, rationale here (and one exploit is described here), but the issue is even more acute for BCH because CashTokens enables significantly more powerful contracts: contract logic (and tokens) can be split across multiple UTXOs to accomplish much more interesting validation (e.g. Jedex).

So not only can wallets be mislead into signing multiple variations of the same transaction to burn fees or tokens, wallets can also be tricked into interacting with unexpected UTXOs, performing unexpected actions in decentralized applications. In practice, that means that the attacker can actually trick the user into giving value to the attacker (the wallet service backend, another entity in a multisig wallet, a hardware wallet's compromised host computer, etc.). E.g. a user's wallet thinks they are buying some tokens on a DEX, but in reality they are providing the backend with all the signatures necessary to reconstruct a transaction that "sells" their whole token balance to the attacker at price near 0 (effectively stealing those tokens).

Attacks can be prevented if the signing wallet/device decodes and inspects the original transaction behind every input to confirm that a UTXO of the expected shape is actually being spent (e.g. the signer performs full SPV or neutrino validation). However, just looking at our existing landscape of BCH wallets, Electron Cash is the only wallet that could implement the additional validation without major upgrades to its internal database – I'm not even aware of any actively-maintained mobile BCH wallets that could implement this validation without major re-architecting.

And that's not to mention hardware wallets: if we want hardware wallets to ever be able to safely interact with the decentralized applications made possible by CashTokens, the hardware wallets would need to either 1) implement that same validation (requiring possibly MBs of transaction data – remember this alone is why BTC added SIGHASH_DEFAULT in the Taproot upgrade), or 2) commit to UTXO contents in the signing serialization (requires no additional data, even among mobile BCH wallets that rely on trusted backends).

TLDR: we can either:

  1. Force all wallets/signing devices to validate more data from previous transactions to safely interact with the decentralized applications made possible by CashTokens (this would typically require architectural changes and significantly more data/memory/storage), or
  2. Allow wallets to commit to the precise contents of UTXOs (SIGHASH_UTXOS).

Option 1 would likely either stifle adoption of CashTokens in existing wallets (since proper support becomes a much more significant upgrade) or encourage existing wallets to cut corners and ship implementations that are vulnerable to subtle, hard-to-understand exploits (and any successful exploits could be reasonably blamed on "the CashTokens CHIP not addressing a security issue").

Option 2 conclusively solves all related vulnerabilities and makes adding CashTokens support to wallets significantly easier (without increasing data/memory/storage requirements).


Thanks for prompting this discussion, and sorry I haven't had a chance to write a more thorough explanation! I'll try to clean this up and get a more detailed rationale section into the CHIP.

@cculianu
Copy link
Contributor Author

cculianu commented Jul 9, 2022

I mean this just proves that SIGHASH_UTXOS needs to be its own chip. It's not essential for the basic use-case of CashTokens, which would be an SLP 2.0 work-alike.

It's perhaps essential for what you had in mind using CashTokens -- which is to enable some really advanced smart contract features.

As such, it really should be its own chip, and it needs to be fleshed out more and thought about separately, I think. We can make it for May 2023 with it, it just should be broken out for clarity.

@cculianu
Copy link
Contributor Author

cculianu commented Jul 9, 2022

I mean look at the very least SIGHASH_UTXOs needs to be specified well -- I merely suggest breaking it out into a separate CHIP for organizational purposes. You can even say both CHIPs are required to go together else we won't get all the sexy features... that's fine. We can promise to implement the two together...

If you don't want to do a separate CHIP .. then can you at least break down SIGHASH_UTXOs more to specify it fully along with test vectors or something like that?

You can make it a sub-CHIP of this big master chip maybe?

@cculianu
Copy link
Contributor Author

cculianu commented Jul 9, 2022

ok @bitjson and @A60AB5450353F40E thanks for explaining to me why SIGHASH_UTXOS is critical. Shall I rename this issue to "specify SIGHASH_UTOXS more completely" or something more abstract? It seems @bitjson doesn't want to beak it out into a separate CHIP.. which is fine. I still think it needs more meat in terms of the spec document...

@cculianu cculianu changed the title SIGHASH_UTXOS should be seperated out into a separate CHIP SIGHASH_UTXOS needs more exact specification Jul 9, 2022
@cculianu cculianu changed the title SIGHASH_UTXOS needs more exact specification SIGHASH_UTXOS needs more specification Jul 9, 2022
@A60AB5450353F40E
Copy link
Contributor

@bitjson just to add something from my chat with Calin to the rationale, is this right:

Say you're being offered a CoinJoin TX that swaps some tokens and are supposed to sign with sighash_all. Without fetching non-local data, you only know: what you're spending (your 1 utxo) and what you're getting (transaction outputs), but you have no idea what's behind other inputs - you only "see" outpoints (txid/index). Now normally with p2pkh it's all fine, those will need signatures of their own.However, for advanced contracts introspection allows us to have outputs that will allow them to delegate signing to other outputs, so by signing just 1 output of yours, you could be allowing other otputs you own to be spent alongside it - without realizing you're also authorizing those (unless you looked up each input's prevout) by signing just 1 of yours. You could be fooled into giving away more than you think you're giving by signing that 1.

yeah it's not critical for cashtokens - but it seems like really important for some applications that will be built using cashtokens/introspection

wallets that don't sign using sighash_utxo wouldn't need to change their architecture, but they also couldn't be fooled into producing a signature that would spend some other coins they own because... they'd be unable to produce a signature with sighash_utxo

if you have sighash_utxo you can write safe contracts - those that require the bit on the sibling input, so even if unaware wallets wouldn't bail, they just couldn't spend the dependent utxos

@imaginaryusername
Copy link
Contributor

I haven't fully formed my opinion around this yet, but it seems like the one identified nonstandard DoS vector is shared with introspection. It may not be immediately urgent, but next upgrade cycle we should aim to align standard and nonstandard output formats, which will eliminate a whole host of possible vectors including very large outputs.

Wallets and indexers can be recommended to keep the implementations loose, and keep upgrade paths open - and given the status quo on other bitcoin-like chains, it's unlikely they'll specifically tighten their implementations for BCH anyway.

@A60AB5450353F40E
Copy link
Contributor

A60AB5450353F40E commented Jul 11, 2022

Yeah I reached the same conclusion after our brief chat - we should specify sane limits using consensus specification, with the ultimate goal of being aligned. SIGHASH_UTXO would not have that 200-ish MiB edge case if consensus was limiting script size to 1650.

@cculianu
Copy link
Contributor Author

Regarding SIGHASH_UTXOS having the same DoS properties as native introspection -- yes, this is the case.

@bitjson
Copy link
Member

bitjson commented Jul 12, 2022

@A60AB5450353F40E:

just to add something from my chat with Calin to the rationale, is this right: Say you're being offered a CoinJoin TX [...]

Yes, that's a good example! And there are lots of similar situations possible with the cross-contract interfaces enabled by CashTokens, even for non-CoinJoin transactions.

if you have sighash_utxo you can write safe contracts - those that require the bit on the sibling input, so even if unaware wallets wouldn't bail, they just couldn't spend the dependent utxos

I'd maybe just add here that most contracts probably wouldn't want to inspect for SIGHASH_UTXOS. Inspecting in the contract would waste bytes in the produced transaction itself – it's maybe more likely that any SIGHASH_UTXOS requirement would just be indicated in the internal "template" used by the wallet to understand the contract. Similar result, but avoids increasing transaction sizes with validation that doesn't need to be performed on chain.

@bitjson
Copy link
Member

bitjson commented Jul 12, 2022

@cculianu thank you for the issue, I'll add a set of test vectors as soon as possible 👍

@bitjson
Copy link
Member

bitjson commented Jul 12, 2022

@imaginaryusername:

I haven't fully formed my opinion around this yet, but it seems like the one identified nonstandard DoS vector is shared with introspection. It may not be immediately urgent, but next upgrade cycle we should aim to align standard and nonstandard output formats, which will eliminate a whole host of possible vectors including very large outputs.

Could you describe the attack you're thinking about? Who is at risk?

@imaginaryusername
Copy link
Contributor

imaginaryusername commented Jul 12, 2022

Could you describe the attack you're thinking about?

A transaction can potentially make a node's evaluation pipeline load and process data that is much bigger than the transaction itself, breaking sanity limits imposed by tx size limit and blocksize limit. Even without introspection a transaction can already "import" outputs that don't count towards txsize/blocksize for itself, but I was worried about devices like OP_UTXOBYTECODE and SIGHASH_UTXOS potentially multiplying memory and CPU consumption exponentially since they can refer to other inputs as well.

We had a discussion about this yesterday, and @cculianu went through some digging in the code about where various limits are enforced. It did turn out that the problem is much less severe than we thought - sane caching and early checks are in place - so at least the problem isn't exponential, we just need to keep similar precautions in mind when implementing SIGHASH_UTXOS.

The problem isn't entirely gone though - with or without introspection, a potential tx stuffed full of spends from nonstandard 10kb outputs will still be way larger at evaluation than the txsize/blocksize limits would otherwise suggest, and there may be vectors at SIGHASH_UTXOS and introspection we haven't considered yet. So a future alignment of consensus output limit to standard formats is still desired for scaling and safety purposes.

@bitjson
Copy link
Member

bitjson commented Jul 14, 2022

The problem isn't entirely gone though - with or without introspection, a potential tx stuffed full of spends from nonstandard 10kb outputs will still be way larger at evaluation than the txsize/blocksize limits would otherwise suggest, and there may be vectors at SIGHASH_UTXOS and introspection we haven't considered yet. So a future alignment of consensus output limit to standard formats is still desired for scaling and safety purposes.

Could you describe the worst case scenario here? Reducing consensus limits to equal standardness limits seems to make future upgrades harder, and I'm not following how it improves network security.

@cculianu
Copy link
Contributor Author

cculianu commented Jul 14, 2022

Could you describe the worst case scenario here? Reducing consensus limits to equal standardness limits seems to make future upgrades harder, and I'm not following how it improves network security.

Yes. So the attack is this: You mine a bunch of txns that have locking scripts that are ~1MB. Consensus now allows this! Of course you can never evaluate those scripts -- consensus fails upon evaluation -- but you can MINE them and they exist and are.. stuffed into the utxodb.

Later on you make a block that has a txn that references, say, 20,000 such outputs that are each 1MB in size. Due to the way BCHN (and Core) are built -- the node reads them all from the DB.. all 20GB of data.. before deciding much later on in the pipeline that the txn is invalid due to it trying to evaluate locking scripts that are >10KB.

The issue is a plumbing issue within the node itself -- namely it's actually saving large UTXOs to the db (ones that are unspendable).. and later on it's too stupid to abort early so it ends up reading in 20GB of data it will never use.

It gets even worse than that because block validation is done in parallel under default config -- so you can load a 32MB block full of 32 such txns -- so you are looking at like 640GB potentially of data reads from the utxodb before the node figures out the block is invalid.

To be fair.. this really an issue of a combination of problems -- namely how the plumbing is laid out in Core-descendant nodes as well as how the consensus rules are subtle and surprising. But it doesn't help that the consensus rules are subtle, surprising, and difficult to understand even for experts.

Regularizing the consensus rules will help prevent such "surprises" in the future.

@bitjson @imaginaryusername @A60AB5450353F40E I bet you didn't even know that you could mine a 1MB locking script as a scriptPubKey didja, until I stumbled across this? Neither did I. The rules are that bizarro and difficult to reason about... which is evidence in and of itself that they should be regularized and made less surprising and difficult to reason about!!

@bitjson
Copy link
Member

bitjson commented Jul 14, 2022

Does this require someone to expend the energy to mine invalid blocks? So the malicious miner loses approx. 6.25 BCH per malicious/invalid block, in addition to the 200.1 BCH in transaction fees+dust to set up the 20GB of unspendable UTXOs? If so, the malicious miner needs to first mine 625*32MB valid blocks with the unspendable UTXOs – at least 4.3 days at 100% network hash power – and sell the 3906.25 BCH they mined after its matured before starting their attack. So at least it's reasonably expensive, risky, and we'll have some advanced notice.

Definitely worth correcting that logic in the C++ implementations though, nice find! 🚀

@cculianu
Copy link
Contributor Author

Idk when people were attacking ABC chain they were spending something like $20,000+ to do the reorg attacks they were doing. 6.25 BCH at this hashrate isn't that much... some people have nothing better to do with their money than grief. Granted, it's a bizzaro attack and not a huge threat but still..

@bitjson bitjson changed the title SIGHASH_UTXOS needs more specification Add test vectors for SIGHASH_UTXOS Jul 15, 2022
@bitjson
Copy link
Member

bitjson commented Sep 29, 2022

Done in #58 🚀

@bitjson bitjson closed this as completed Sep 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants