-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mitigate tob-5 by making a v2 of client_propose_tx endpoint #2670
Conversation
consensus/enclave/api/src/lib.rs
Outdated
@@ -246,6 +247,22 @@ pub struct FormBlockInputs { | |||
pub mint_txs_with_config: Vec<(MintTx, MintConfigTx, MintConfig)>, | |||
} | |||
|
|||
/// The schema expected to be encrypted for client_tx_propose_v2 | |||
#[derive(Clone, Message)] | |||
pub struct ClientProposeTxRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the request object that clients use to propose Tx's in the new client_propose_tx_v2
.
/// metadata, or | ||
/// * A serialization / cryptography error, or another mismatch error | ||
/// e.g. minimum_fee_map_hash or (some-day) chain-id error | ||
fn client_tx_propose_v2(&self, msg: EnclaveMessage<ClientSession>) -> Result<TxContext>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't just fold these changes into client_tx_propose
because we need to make a breaking change to the schema in order to insert another layer where we can add things. Making a new v2 version means that the old API continues to work and support clients when we drop in the new server.
consensus/enclave/impl/src/lib.rs
Outdated
.blockchain_config | ||
.get() | ||
.ok_or(Error::NotInitialized)?.canonical_fee_map_digest()[..] { | ||
return Err(Error::FeeMapDigestMismatch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is where we actually check if the hash matches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, that looks like it's going in the right direction.
However, it appears there are zero tests for this. Would it be possible to test at least the enclave part?
We likely need to incorporate this into mobilecoind, integration tests, etc as well...
The other thing I'm not sure about is, maybe I should try to categorize this as a transaction validation error rather than an enclave error. It may be easier for the client to properly handle it if that is done. |
I think we can do this in a separate PR, that can land after branching for v3 perhaps? If we try to make mobilecoind use this in the v3 release then the mobilecoind for that release can't be in customer hands before the v3 servers are live, which is not usually how we do releases. I guess maybe mobilecoind could switch between calling Depending on how much other stuff comes up before we are scheduled to branch for v3 I am willing to try to do that. |
Here's what I suggest -- we should do a follow-on PR for mobilecoind that uses this, and see that it passes tests, to build confidence in this change. But we should only merge that for |
It's an eventually breaking API change for users, this will require an MCIP. |
More to the point, what's wrong with just renaming |
This would be the same as adding the fee map hash to the This would put additional complexity into the Tx and transaction-core, make the Tx hash dependent on this data, and increase the complexity of transaction validation. That's a much larger change that requires touching more sensitive code. Currently the fee-map is not a transaction-core concept. It will also mean that all clients will be forced to plumb the fee map into the transaction builder in order to ship v3. For example, copper will now need to know the current fee map in order to build a Tx. The approach I took seems more realistic with current timelines. I think if you want to do it that way, we will have to delay the release of v3, or punt on doing this mitigation. |
|
||
// Cache the transaction. This performs the well-formedness checks. | ||
let tx_hash = self.tx_manager.insert(tx_context).map_err(|err| { | ||
if let TxManagerError::TransactionValidation(cause) = &err { | ||
counters::TX_VALIDATION_ERROR_COUNTER.inc(&format!("{:?}", cause)); | ||
let result = ProposeTxResult::from(cause.clone()); | ||
response.set_result(result); | ||
response.set_err_msg(cause.to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines are removed because this is actually dead code.
this was kind of a head scratcher, but I think this is dead code introduced 2 years ago here: 894dd54
when we finally return this error, the response
object will be discarded, so these changes to it have no effect.
@@ -139,6 +139,7 @@ impl From<ConsensusGrpcError> for Result<ProposeTxResponse, RpcStatus> { | |||
match src { | |||
ConsensusGrpcError::TransactionValidation(err) => { | |||
let mut resp = ProposeTxResponse::new(); | |||
resp.set_err_msg(err.to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a line that I failed to add to an earlier PR: #2501
I think because I got confused by the aforementioned dead code
Good point about the sequencing. But we do need to make sure this API works and for that we need clients that support it, so the block version switching logic might be required. We just need to make sure it really works, so maybe it should assert if block version is 0 since that should no longer happen (my concern is places where we might not be properly including the block version - like the issue we found in the full service validator earlier this week). |
The thing is that we cannot do this block version switching in thick client because thick client has no idea what the block version is. Tx's are structures that may conform to one or more sets of block version rules, and the block version they target is not recorded. What do you think about this approach: #2670 (comment) So, no clients would be using this new pathway in the wild until 3.1 release, and no block version switching is required, the 3.1 client release would assume this endpoint exists. I think that's less work overall, since we would not need to write all the block-version branching code. We can develop the PR to make those clients do that separate from this one, and when that PR is passing CD we can be sure this PR is working and merge it confidently for v3. But that follow-on PR would not actually merge until v3.1 |
My plan right now is to add a unit test that exercises this functionality in |
I think that's an acceptable compromise but maybe would be better if we merge that to master and revert it on release/v3? |
TOB-5 is about a "token id oracle" described by TOB during review of confidential tokens design. The idea is that the consensus enclave fee map can be changed by the untrusted side, by restarting the node. If the attacker controls a node and is able to get the user to submit their Tx once with one fee map, and once with another, then the priority value leaks information about what the token id of the Tx is. In principle they could do this without the user knowing -- they would restart the node and it would not be able to attest to the peers after changing the fee map, but it could still accept Tx proposals and see the WellFormedTxContext which conatins the priority number. Then they turn the node back to normal. The user's app retries when their transaction doesn't land, and the same Tx comes back to their node, now with the regular fee map. This requires having root on a consensus node, but the threat model is that attackers with root on the consensus node should not be able to see the user's token id. We view this as a time-of-check-vs-time-of-use (TOCTOU) bug, and we try to fix it a way that those bugs are sometimes fixed. The user, who checked the fee map, also makes an assertion about what the fee-map is when they try to use the system (by submitting a Tx). Before acting on the data (which the attacker has potentially tampered with), the enclave checks if the user's assertion is correct, and rejects the Tx without creating a WellFormedTxContext if the assertion is not correct. This should be straightforward to add this check, but unfortunately, the status quo `client_propose_tx` has a design flaw. It does not have a `Request` protobuf as the RPC argument, the argument it takes is directly an (encrypted) Tx. So, we cannot add any additional data to that protobuf in a non-breaking way without adding things to `Tx`. And that is untenable in this case because it adds complexity in sensitive areas of the code. We really just want to add this check next to the `Tx`. To fix this, we introduce a `ClientProposeTxRequest` object, which contains a `Tx`, and the client's minimum fee map digest. We introduce `client_propose_tx_v2` which encrypts this `Request` object instead of a naked `Tx`. This will come in handy if we want to add anything else like this. Clients will not be able to use `client_propose_tx_v2` until after the V3 service is live, because existing consensus nodes don't have this endpoint. However, in the next release of clients after the v3 release is live, they could all start using `client_propose_tx_v2` and we could deprecate / remove `client_propose_tx`. I made the fog-sample-paykit exercise the `client_propose_tx_v2` route, since the test client uses that and so we can see that it is working in CD without making `mobilecoind` or anything else use it yet.
This makes it easier for clients to handle this error appropriately. This also removes some dead code and bugs in the consensus grpc error handling.
the fee map cache should actually be a block info cache, and this should be integrated with the existing block info caching stuff. then, factor out some duplicated fee determination code in the test client.
1c7f5f0
to
0a20441
Compare
still WIP, the test is failing due to the enclave not having a valid report yet.
/// Test that we can exercise client_tx_propose_v2 and that it passes and fails | ||
/// as expected. | ||
#[test_with_logger] | ||
fn consensus_enclave_client_tx_propose_v2(logger: Logger) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to create a proper test of the consensus enclave client_propose_tx_v2 this way
I'm still reviewing this, but I think adding the fee map digest into the Sticking it in Pros:
Cons:
I don't feel strongly about either path, but figured I will put my thoughts out there for others to consider in case someone thinks one path is significantly superior to the other. It's also quite possible that it is not a good use of time to debate this, and we should just move forward with what was already built here since it works... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting aside the discussion on whether this is the right approach or whether the fee_map_digest should be added to Tx
, I think the code here is fine if we choose to go this route and not the other.
@@ -1980,9 +1980,9 @@ checksum = "b294d6fa9ee409a054354afc4352b0b9ef7ca222c69b8812cbea9e7d2bf3783f" | |||
|
|||
[[package]] | |||
name = "libc" | |||
version = "0.2.135" | |||
version = "0.2.134" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some weird downgrades here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, will fix
comments from first part: > these lines are removed because this is actually dead code. > > this was kind of a head scratcher, but I think this is dead code introduced 2 years ago here: 894dd54 > > when we finally return this error, the response object will be discarded, so these changes to it have no effect. comments from second part: > this is a line that I failed to add to an earlier PR: #2501 > > I think because I got confused by the aforementioned dead code
* add bugfix from #2670 comments from first part: > these lines are removed because this is actually dead code. > > this was kind of a head scratcher, but I think this is dead code introduced 2 years ago here: 894dd54 > > when we finally return this error, the response object will be discarded, so these changes to it have no effect. comments from second part: > this is a line that I failed to add to an earlier PR: #2501 > > I think because I got confused by the aforementioned dead code * add tests of consensus enclave propose_tx API * linting * Update consensus/enclave/tests/enclave_api_tests.rs
* add bugfix from mobilecoinfoundation#2670 comments from first part: > these lines are removed because this is actually dead code. > > this was kind of a head scratcher, but I think this is dead code introduced 2 years ago here: 894dd54 > > when we finally return this error, the response object will be discarded, so these changes to it have no effect. comments from second part: > this is a line that I failed to add to an earlier PR: mobilecoinfoundation#2501 > > I think because I got confused by the aforementioned dead code * add tests of consensus enclave propose_tx API * linting * Update consensus/enclave/tests/enclave_api_tests.rs
This is now moot, Eran fixed this a different way in a series of PRs linked from: #2611 |
TOB-MCCT-5 is about a "token id oracle" described by TOB during review of confidential tokens design. The idea is that the consensus enclave fee map can be changed by the untrusted side, by restarting the node. If the attacker controls a node and is able to get the user to submit their Tx once with one fee map, and once with another, then the priority value leaks information about what the token id of the Tx is.
In principle they could do this without the user knowing -- they would restart the node and it would not be able to attest to the peers after changing the fee map, but it could still accept Tx proposals and see the WellFormedTxContext which conatins the priority number. Then they turn the node back to normal. The user's app retries when their transaction doesn't land, and the same Tx comes back to their node, now with the regular fee map.
This requires having root on a consensus node, but the threat model is that attackers with root on the consensus node should not be able to see the user's token id.
We view this as a time-of-check-vs-time-of-use (TOCTOU) bug, and we try to fix it a way that those bugs are sometimes fixed. The user, who checked the fee map, also makes an assertion about what the fee-map is when they try to use the system (by submitting a Tx). Before acting on the data (which the attacker has potentially tampered with), the enclave checks if the user's assertion is correct, and rejects the Tx without creating a WellFormedTxContext if the assertion is not correct. (This is sort of like dealing with races when manipulating atomic variables by using compare-and-swap.)
It should be straightforward to add this check, but unfortunately, the status quo
client_propose_tx
has a design flaw. It does not have aRequest
protobuf as the RPC argument, the argument it takes is directly an (encrypted) Tx. So, we cannot add any additional data to that protobuf in a non-breaking way without adding things toTx
. And that is untenable in this case because it adds complexity in sensitive areas of the code. We really just want to add this check next to theTx
.To fix this, we introduce a
ClientProposeTxRequest
object, which contains aTx
, and the client's minimum fee map digest. We introduceclient_propose_tx_v2
which encrypts thisRequest
object instead of a nakedTx
. This will come in handy if we want to add anything else like this in the future.Clients will not be able to use
client_propose_tx_v2
until after the V3 service is live, because existing consensus nodes don't have this endpoint. However, in the next release of clients after the v3 release is live, they could all start usingclient_propose_tx_v2
and we could deprecate / removeclient_propose_tx
.I made the fog-sample-paykit exercise the
client_propose_tx_v2
route, since the test client uses that and so we can see that it is working in CD without makingmobilecoind
or anything else use it yet.I am not sure if this change requires an MCIP. It is a protocol / api change, but not a big one, and users are unlikely to notice. If it does require an MCIP we can write one pretty easily. Let me know what you think.
Also, lmk if you think there's a better approach than the route I took here.