
Fix precision of average_output_time #7798

Merged
1 commit merged into monero-project:release-v0.17 on Oct 18, 2021

Conversation

@SChernykh (Contributor) commented Jul 16, 2021

The fix as suggested by jberman on IRC. Before the fix, the code would truncate 1.9 to 1, skewing the output selection.
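For readers less familiar with the bug, here is a minimal, self-contained sketch of the arithmetic (the variable names match the wallet2 snippets quoted later in this thread; the concrete values are hypothetical):

#include <cstdint>
#include <iostream>

int main()
{
    const uint64_t DIFFICULTY_TARGET_V2 = 120;  // seconds per block
    const uint64_t blocks_to_consider   = 100;
    const uint64_t outputs_to_consider  = 6300; // ~63 outputs per block, as discussed below

    // Pre-fix: all-integer division truncates ~1.9 down to 1.
    const uint64_t truncated = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider;

    // Post-fix: promoting one operand to double keeps the fractional part.
    const double corrected = DIFFICULTY_TARGET_V2 * blocks_to_consider
                             / static_cast<double>(outputs_to_consider);

    std::cout << truncated << " vs " << corrected << std::endl; // prints: 1 vs 1.90476
}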

@tobtoht (Contributor) left a comment


On second thought, merging this change into the release branch is a very bad idea. This introduces a transaction uniformity defect that allows transactions created using the old decoy selection algorithm to be distinguished from those created with the new one. Not all users keep their wallet software up to date, and not all wallets will use the new selection algorithm immediately. It seems better to me to include this in a hardfork.

NACK.

@SChernykh (Contributor, Author) commented Jul 16, 2021

This introduces a transaction uniformity defect that allows distinguishing between transactions that were created using the old decoy selection algorithm

We discussed it on IRC. The current code has another serious problem: division by zero as soon as we have more than 120 outputs per block on average (right now it's 63 outputs per block) - average_output_time will be calculated as 0. Apart from that, the current selection algorithm is already somewhat weakened by this bug. If it's updated in the next release, all pools + exchanges will switch, providing enough transactions with the new algorithm. There are pros and cons to having it in release v0.17, but I think the pros outweigh the cons.
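(Spelling out the arithmetic: average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider is evaluated in integer arithmetic, with DIFFICULTY_TARGET_V2 = 120 seconds per block. Once outputs_to_consider exceeds 120 * blocks_to_consider, i.e. more than 120 outputs per block on average, the quotient truncates to 0, and average_output_time is later used as a divisor in the selection code.)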

@selsta (Collaborator) commented Jul 16, 2021

We have automatic updates for Feather + Monero GUI, and monerod also notifies users when a new update is out. Together with Reddit + the mailing list, we should get users updated quite quickly.

@tobtoht (Contributor) commented Jul 16, 2021

providing enough transactions with the new algorithm.

Making matters worse for users unaware of the update. Not everyone checks /r/Monero on the daily or is subscribed to the mailing list. Auto updates only cover a subset of users, can be ignored, and are broken on some operating systems (GUI on Tails, Feather on macOS). Some users may not care enough about their privacy or may think it isn't relevant enough to their threat model to update, which directly harms the privacy of users using the new algorithm by lowering the effective ring size. For this reason I think we should be very conservative with changes that impact transaction uniformity, especially for something as critical as decoy selection. Since this isn't a security vulnerability I also see no incentive for merchants and exchanges to upgrade swiftly (outside of their willingness to aid in ensuring Monero maintains its privacy guarantees). Moreover, there is no ETA for the next hard fork, which means we'll probably have these two anonymity puddles coexist for at least the next 6 months. I would personally prefer a quick hardfork that includes BP+ and this decoy selection change rather than merging this change in the release branch and hoping users and services will upgrade.

If we go ahead with this then at the very least I would like to propose setting a target date for the CLI/GUI release, so other wallets can coordinate their releases accordingly. And adding a banner to getmonero alerting users to update as soon as possible.

@SChernykh (Contributor, Author) commented Jul 16, 2021

@tobtoht You should also consider the attack vector (flooding the network with transactions until the current code breaks with division by zero). The information is public now anyway.

which directly harms the privacy of users using the new algorithm by lowering the effective ring size

The current faulty algorithm also lowers the effective ring size. There's no perfect way of going forward here.

@tobtoht (Contributor) commented Jul 16, 2021

You should also consider the attack vector (flooding network with transactions until the current code breaks with division by zero).
There's no perfect way of going forward here.

You're right. My napkin math suggests that a motivated attacker could pull this off within 17 days, costing only ~130 XMR. That is not enough time to organize a hard fork. This PR has my re-approval, given no other options.

If we go ahead with this then at the very least I would like to propose setting a target date for the CLI/GUI release, so other wallets can coordinate their releases accordingly.

@selsta Can you comment on the expected time-frame for the next release? It would be nice if we could tag whenever ready, then wait a few days to allow other wallets to rebase, test, and have a release ready in time for the CLI/GUI release.

@j-berman (Collaborator) commented Jul 16, 2021

@SChernykh @tobtoht

An ugly, temporary hack to protect against the division-by-0 vector until a hard fork could be to run the correct calculation only when average_output_time is 0:

average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider; // this assumes constant target over the whole rct range
// The integer division above truncates to 0 once there are more than
// DIFFICULTY_TARGET_V2 outputs per block on average; only in that case fall
// back to floating-point division, preserving the truncated behavior otherwise.
if (average_output_time == 0)
    average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / static_cast<double>(outputs_to_consider);

I can continue digging to assess the privacy implications with data. There is a chance that this skew to the selection algorithm is not significant enough to have caused material privacy issues to surface, or to cause material issues in the future. More analysis is needed to be certain of the full extent of its implications, I think.

Edit: provided a slightly improved alternative hack.

@selsta (Collaborator) commented Jul 17, 2021

Can you comment on the expected time-frame for the next release?

2 weeks or so

@j-berman (Collaborator) commented Jul 17, 2021

To add some detail here that I mentioned in #monero-research-lab as well, I simulated get_outs 10k times pre-fix and post-fix to be certain of exactly what the impact of this bug was:

[chart: get_outs, pre-fix vs post-fix]

It seems get_outs pre-fix simply has a fatter tail as a result of its bias to select older outputs. My opinion on this, which is a hunch, is that it wouldn't be feasible to determine the difference between 2 individual tx's, where one is pre-fix and the other uses the fix in this PR. So it would be patchy guesswork to actually differentiate 2 puddles. This fix is therefore just a marginal improvement to the decoy selection algorithm (edit: one that would marginally improve the privacy of some transactions).

I could definitely be wrong. Perhaps the 2 puddles could be guesstimated with the right algorithm. I think it would take a lot of work to know the answer to this question definitively, and at the moment I'm pursuing analysis on the algorithm in general.

To err on the side of extreme caution, maybe it would make sense to go with the hack I proposed in that comment above for the next release, to protect against the divide-by-0 vector? We can continue the privacy analysis separately. I apologize for not recognizing the divide-by-0 attack vector when I initially spotted this.

For reference, here's my code to repro the simulation (uses pre-fix code in the PR).

@j-berman (Collaborator)

Here's how I think it may be possible to try and differentiate 2 puddles:

For pre-fix transactions constructed at the block height I ran the simulation at, you would expect ~1 decoy to be selected from the last 100 blocks, but post-fix you would expect ~1.6 decoys to be selected from the last 100 blocks. Therefore, if you see a ring with 3 outputs from the last 100 blocks, you have a stronger reason to separate that transaction into the post-fix puddle (2 decoys from the last 100 blocks + 1 expected recent real = higher chance it came from post-fix). Or if you see a ring with 0 outputs from the last 100 blocks, you have a stronger reason to separate that transaction into the pre-fix puddle.

A heuristic like that could potentially be used to try and separate transactions into puddles. It couldn't possibly be perfect, though.

Here's the data for the simulations that also shows how I got the numbers above: get_outs (old vs new).ods
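To make the bucketing idea above concrete, here is a rough sketch of such a heuristic (entirely hypothetical: the function and names are made up, and the thresholds come from the simulation numbers above):

#include <cstddef>
#include <cstdint>
#include <vector>

enum class PuddleGuess { LikelyPreFix, LikelyPostFix, Unclear };

// Hypothetical puddle guess based on the simulation numbers above:
// pre-fix rings average ~1 member from the last 100 blocks, post-fix ~1.6.
PuddleGuess guess_puddle(const std::vector<uint64_t> &ring_member_heights, uint64_t chain_height)
{
    std::size_t recent_count = 0;
    for (uint64_t h : ring_member_heights)
        if (chain_height - h <= 100) // ring member from the last 100 blocks
            ++recent_count;

    if (recent_count >= 3) return PuddleGuess::LikelyPostFix; // 2 decoys + 1 recent real
    if (recent_count == 0) return PuddleGuess::LikelyPreFix;
    return PuddleGuess::Unclear; // most rings land here, hence "patchy guesswork"
}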

@selsta (Collaborator) commented Jul 18, 2021

Here's how I think it may be possible to try and differentiate 2 puddles:

Let's say you find out someone's tx ring selection uses the post-fix algorithm. What would an attacker gain from this? I know in general we want as much uniformity as possible, but still, how would privacy be weakened?

I don't see much risk currently in simply fixing it in a release.

@j-berman (Collaborator) commented Jul 19, 2021

Here's my best shot: when an output from a post-fix tx is included in a subsequent ring later on, you have a marginally stronger reason to believe that the subsequent tx should be a post-fix tx as well (since that output may have been the change output, and you can safely assume the user spending the change output has already upgraded). Therefore, when you see that output included in a ring of a subsequent pre-fix tx, perhaps you can eliminate that output from the plausible real outputs in that ring. But even then, there is no guarantee that is the correct thing to do, since there is no way to know if that output was a change output from the original post-fix tx.

Perhaps using a methodology like this, you can build up a tx graph of implausible post-fix to pre-fix tx's, or pre-fix to post-fix tx's, and do some guesswork to reduce effective ring sizes.

But I agree, the risk does seem extremely low, probably nonexistent to me. First, you have to be able to determine if a tx is pre-fix or post-fix, which is already guesswork. Then, you have to do guesswork again, as above, to eliminate some outputs from rings in subsequent tx's.

@j-berman (Collaborator)

A clearer example: let's say you're certain a tx is a post-fix tx, and for simplicity, the tx only has 1 ring as input. 10/11 outputs in the ring come from tx's you're certain are pre-fix tx's, and 1/11 comes from a post-fix tx. You have a marginally stronger reason to believe that the post-fix output in the ring is the real output, and that it is the user spending change, especially if it is a recent output.

With more rings in the tx that have the same makeup, the more accurate the guess can be (e.g. for a post-fix tx with 100 rings where every single ring has only a single output from post-fix tx's in it, it's fairly obvious that all the "post-fix" outputs in every ring of that tx are the real spends).

But again, this relies on being able to differentiate between pre and post-fix tx's, which I don't believe is possible to do with certainty. This is more of a highly unlikely hypothetical to make it clearer what the potential risk is in that scenario.

@j-berman (Collaborator) commented Jul 24, 2021

Visualizing the impact of the fix with 10 newly generated samples of pre-fix and post-fix decoy selections:

[chart: pre-fix vs post-fix decoys]

You can see that, as expected, the post-fix rings tend to be lower (since the correct average_output_time is higher with the fix), but notice how close the data points are to each other for each line. For the lowest post-fix decoy collections, there is 1 pre-fix collection just a tiny bit above, and for the highest pre-fix decoy collections, there is 1 post-fix collection just a bit below.

I can work on applying a more rigorous mathematical framework, but I believe this is fairly solid evidence that it would not be possible to identify pre and post-fix tx's with certainty. You could make a solid guess for maybe 10% of transactions that fall in the extremes, but that seems to be about it.

To demonstrate, here is another sample, try and determine which lines are pre-fix and which are post-fix:

[chart: guess which lines are pre-fix and which are post-fix]

And clarifying: even with the knowledge of a single transaction as pre- or post-fix, I don't see any useful information that can come from that. You need to be certain for other transactions tied to the outputs in each of that transaction's rings as well to gain semi-useful information. And because it seems only 10%, maybe 15%, of transactions can be bucketed as pre- or post-fix, there does not seem to be much risk at all to me (0.15^11 ≈ 0.000000001).

Edit: here's the data + more samples comparing pre- and post-spam as well:
get_outs pre-fix vs post-fix decoy samples.ods

@j-berman (Collaborator)

My opinion: this PR should not be merged as is, and we should instead go with this fix for the next release ASAP:

// Refuse to proceed if there are no outputs to work with at all.
THROW_WALLET_EXCEPTION_IF(num_rct_outputs == 0 || outputs_to_consider == 0, error::wallet_internal_error, "No rct outputs");
average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider; // this assumes constant target over the whole rct range
// Integer division truncates to 0 once there are more than DIFFICULTY_TARGET_V2
// outputs per block on average; fall back to floating-point division in that case.
if (average_output_time == 0) {
  average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / static_cast<double>(outputs_to_consider);
}
// Belt-and-braces: never let a zero divisor reach the selection code.
THROW_WALLET_EXCEPTION_IF(average_output_time == 0, error::wallet_internal_error, "Average seconds per output cannot be 0.");

In the meantime, we can continue working on assessing the privacy implications of altering transaction uniformity with the fix in this PR as is.

My reasoning, which includes new information:

  • I realized that divide by 0 is not a DoS vulnerability; rather, it would result in undefined behavior (see the sketch at the end of this comment). When tested with the clang and g++ compilers (credit to @tobtoht for this), it causes clients to select decoys exclusively from the most recent spendable block, therefore causing clients to construct transactions that would reveal the real output in the ring in the vast majority of cases. Thus, I figure the severity of the divide-by-0 vulnerability has increased, and so divide by 0 should be protected against ASAP.

  • I discovered an additional issue in the decoy selection algorithm (wallet2 decoy selection algorithm ignores very recent outputs #7807), and the fix for it would also cause a change to tx uniformity.

  • Out of caution, it would make sense to spend more time being absolutely certain that the change this PR's fix would bring to the decoy selection algorithm would not alter tx uniformity in a way that is harmful to any users, in combination with assessing the implications of altering tx uniformity for the fix to the new issue mentioned in the bullet above.

  • While the fix in this PR as is would bring privacy benefits to users, since clients would correctly select a larger proportion of more recent outputs as decoys as expected, clients over-selecting older outputs does not appear to be an urgent issue that needs an immediate patch (unlike divide by 0, which is more severe). As in, to my knowledge, there are no transactions today that are definitively linkable as a result of the integer division bug that this PR addresses. Therefore, it does not seem to me that we need to patch this aspect of the bug ASAP, and we can take our time weighing the privacy benefits the fix in this PR brings against the risks of altering transaction uniformity.

Thus, my opinion is that we should protect divide by 0 ASAP, and take more time with more significant changes to the decoy selection algorithm.
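To illustrate the failure mode in the first bullet, here is a simplified sketch (not a verbatim excerpt from wallet2) of how a zero average_output_time degenerates:

#include <cmath>
#include <cstdint>
#include <iostream>

int main()
{
    double average_output_time = 0.0; // the truncated result once outputs/block exceed 120
    double x = 3600.0;                // hypothetical gamma-sampled output age in seconds

    double offset = x / average_output_time; // double division by 0.0 yields +inf, no trap
    std::cout << std::isinf(offset) << std::endl; // prints 1

    // Converting that inf to an integer output index is undefined behavior in
    // C++; per the tests described above (clang and g++), wallets ended up
    // selecting decoys only from the most recent spendable block.
    uint64_t index = static_cast<uint64_t>(offset); // UB
    (void)index;
    return 0;
}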

@j-berman (Collaborator) commented Aug 4, 2021

Updated opinion after analyzing the impact on transaction uniformity

I believe we should go with the fix proposed in my comment above instead, since going with the fix as is has a seemingly material impact on transaction uniformity when combined with the fix from #7821.

While the fix as is does appear to bring marginal benefit to privacy by selecting more recent outputs, doing so is only marginally helpful, and does not outweigh the cost to users who delay in updating in my opinion. I think it is a completely acceptable solution from a privacy perspective to only protect divide by 0 today, and to hold off on the change from this PR as is until a hard fork.

Looking at the numbers

Combining the fix in this PR as is with the fix in #7821, the impact to transaction uniformity appears to be relatively significant. Since both fixes cause clients to select decoys that are more recent, both fixes in combination (rather than just 1 in isolation) cause the deviation from transactions pre-fix and post-fix to be larger.

As noted in this comment from #7821, the tx uniformity risk posed by #7821 is extraordinarily rare. The risk surfaces when there is a small minority of users who do not update. I mentioned in that comment that the risk is posed to users who construct transactions with 20 rings. Reiterating: with that fix, when a user constructs a 20+ ring transaction, there is only a ~1% chance that every single ring would exclude a very recent output. Thus, if a user using an older client constructs a 20+ ring transaction that uses an output from a prior 20+ ring transaction, then it becomes obvious that output is real once the number of people using older clients dwindles. Users constructing linked 20+ ring transactions seems extraordinarily rare, and thus the risk presented by #7821 seems insignificant.

Now, with the fix in this PR as is in combination with #7821's, the impact is more significant: there would be a ~1% chance a user could construct a transaction with 5 rings that excludes every output more recent than 20 blocks, but a ~27% chance for an older client. Thus, if you have 2 linked transactions with 5 rings each and 0 outputs more recent than 20 blocks, they very likely come from an older client (since there is only a ~1% chance of that happening on a newer client). And therefore, if there are very few older clients, it is much clearer which output is the real spend.*
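(For intuition on where numbers like these come from: assume each ring independently includes an output more recent than 20 blocks with some probability q; then all 5 rings exclude such outputs with probability (1 - q)^5. Backing q out of the stated results rather than re-deriving it from the data, q ≈ 0.60 gives (1 - 0.60)^5 ≈ 1%, while q ≈ 0.23 gives (1 - 0.23)^5 ≈ 27%.)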

Taking a step back

The numbers in question with the addition of the fix in this PR clearly have a much more significant impact on tx uniformity. In the former case, the risk is contained to linked 20+ ring transactions that exclude all very recent outputs. In the latter, the risk widens to linked 5+ ring transactions that exclude all outputs from within the past 20 blocks. Therefore, the addition of the fix in this PR as is presents a higher level of risk to older clients that do not update.

Thus, considering the fix from #7821 does a good job on its own of protecting recent outputs, and considering the fix as is in this PR would compound on that by also selecting more recent outputs with greater frequency, the benefits of this PR as is are nothing to write home about, while the risks are clearer.

So my view is that protecting divide by 0 in next release -- but continuing to allow integer truncation until a hard fork as a precautionary measure for older clients that do not update -- is the safest move.

*Data: Impact on tx uniformity.ods

@j-berman (Collaborator)

For the record, I settled on the conclusion once again that for the next release, #7845 is the prudent move. The more I sit with this, the more the angles to dissect and analyze (to be certain that getting rid of integer truncation completely is safe) multiply. In addition to the impact on tx uniformity, I also want to be certain that it would not alter the decoy selection algorithm itself for the worse, which presents its own set of angles to dissect and analyze.

So, final conclusion: I figure the earliest spends issue (#7821) and div by 0 (#7845) are too important not to solve ASAP, and the changes they would bring to the selection algorithm overall are minimal. In contrast, changing integer truncation to double division here has a much more material impact. Thus, for the next release I'm settled on thinking we should go with the sure things: #7821 and #7845.

As mentioned in #7821, beyond the 2 PR's (#7821 and #7845), we could continue with deeper research toward an even stronger solution that takes another crack at the assumptions laid out in Miller et al, and factors in observed chain data since then. Said research will take a fair amount of time to complete, and #7821 + #7845's do-no-harm approaches offer solid patches until that research is finished. And again, shoutout to @Rucknium who has some excellent ideas and is an applied statistician by trade who has offered to contribute in this area :)

@j-berman (Collaborator)

I think patching integer truncation is ok for next release. And I think we can reduce the RECENT_SPEND_WINDOW to something like 10.

Apologies in advance for over-complicating things where I have. This will, hopefully, be the last long-winded post I write on integer truncation :)

On integer truncation

Here were the 2 points I wanted to assess in patching truncation:

  1. It may cause an under-selection of older outputs.
  2. It may cause a significant change in transaction uniformity, leaving non-updated clients more vulnerable than they would otherwise be.

1. It may cause an under-selection of older outputs.

Considering it's already assumed that the corrected decoy selection algorithm is safe, and since it's probably not useful to rely on my flawed analytical framework (shared in this comment and this MRL post) to assess the (likely low) probability that truncation has been beneficially selecting a solid number of older outputs, I have no compelling reason to keep patching truncation on hold on these grounds. So that eliminates point 1.

2. It may cause a significant change in transaction uniformity, leaving non-updated clients more vulnerable than they would otherwise be.

First, MyMonero+Monero-LWS haven't been truncating integers this whole time, so there is already a separate pool of transactions not doing it.

Second, for the majority of transactions, I don't believe you can distinguish 1 integer-truncated ring from 1 non-truncated ring with any meaningful degree of certainty. It also helps that average_output_time is lower now (1.74), so the truncation effect is a bit less significant today, though that could obviously change.
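(To make the size of that effect concrete: the gamma-sampled age in seconds is divided by average_output_time to get an output offset, so truncating 1.74 to 1 stretches those offsets by a factor of roughly 1.74, versus roughly 1.9 when this PR was opened.)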

I think the primary risk to transaction uniformity is: for users who don't update and construct linked tx's with many rings, it may be a bit clearer that their transactions were created by an older wallet. An older wallet that still truncates integers would select recent outputs less frequently, and older outputs more frequently. If a user of an older wallet consistently creates transactions with many inputs, this could potentially enable more informed links between their transactions, as those transactions might be marginally easier to identify as having fewer recent and more older outputs (therefore: an older wallet). However, it is worth noting that constructing many-input linked tx's is probably a rare user pattern. If a user constructs a many-input transaction, I figure you would expect their next transaction to have fewer inputs. It's also worth noting that users of wallets < 0.17.2.3 are already vulnerable to average_output_time falling below 1 anyway, so the rare risk of maybe (?) being able to link many-input transactions is not so significant relative to that.

Summary: I think the tx uniformity risk of patching truncation seems low, and the plausible benefit of a more "correct" algorithm is solid. So the argument for patching truncation now seems stronger than waiting for a hard fork.

On the recent spend window

Repeating the intuition behind it: it's meant to capture the spending patterns of Monero users who spent outputs <10 blocks old back when the lock time wasn't enforced by consensus -- the gamma was apparently fit using that data. The idea is that if the gamma suggests an output that is <10 blocks old today (i.e. an unspendable output), it should re-select an output from a recent window to more closely mirror expected spending patterns when the gamma was fit.
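As a rough sketch of that re-selection logic (a simplification, not the wallet2 code; the gamma parameters and window arithmetic here are placeholders to make the sketch runnable):

#include <cmath>
#include <cstdint>
#include <iostream>
#include <random>

constexpr uint64_t SPENDABLE_AGE       = 10; // outputs younger than this are locked
constexpr uint64_t RECENT_SPEND_WINDOW = 50; // the magic number discussed below

uint64_t pick_decoy_age(std::mt19937_64 &rng)
{
    // Gamma over ln(age in seconds); shape/rate are the commonly cited
    // Miller et al values, included here only for illustration.
    std::gamma_distribution<double> gamma(19.28, 1.0 / 1.61);
    uint64_t age = static_cast<uint64_t>(std::exp(gamma(rng)) / 120.0); // seconds -> blocks

    if (age < SPENDABLE_AGE)
    {
        // The gamma suggested an unspendable (<10 blocks old) output: re-pick
        // uniformly from a recent spendable window instead, to mirror the
        // spend patterns observed before the lock was enforced by consensus.
        std::uniform_int_distribution<uint64_t> recent(SPENDABLE_AGE, SPENDABLE_AGE + RECENT_SPEND_WINDOW - 1);
        age = recent(rng);
    }
    return age;
}

int main()
{
    std::mt19937_64 rng(42);
    std::cout << pick_decoy_age(rng) << std::endl;
}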

The recent spend window's magic number of 50 was chosen assuming integer truncation would not be patched, and based on an analytical framework since shown to be relatively weak by @vtnerd, since we don't know all the factors that might be causing the recently observed output ages on chain. A shorter window has been discussed and thrown around as an option in monero-dev, though I don't exactly know what analytical framework to go with considering @vtnerd's criticisms.

While the following probably isn't very useful beyond being a sort of sanity check, we can see on these charts what the different windows and implementations look like for very recent outputs*:

[chart: recents, pre-July volume spike, trunc effect, avg_output_time]
[chart: recents, pre-July volume spike, trunc effect, v14]
[chart: recents, trunc effect, v12]

10 looks pretty good to me (if you assume the wallet2/monero-lws implementations are the only ones used, it would be ok to be over the blue line, because the gap below the blue line may indicate individual reals uncovered by many decoys).

*I decided to use an end height just before the seemingly anomalous tx volume spike on July 21st. Here are more charts with end heights up to just before 0.17.2.3 was released:

[chart: recents, 0.17.2.3, trunc effect, avg_output_shift]
[chart: recents, 0.17.2.3, trunc effect, v14]

j-berman added a commit to j-berman/monero that referenced this pull request Oct 5, 2021
- combined with patching integer truncation (monero-project#7798), this gets the algorithm marginally closer to mirroring empirically observed output ages
- 50 was originally chosen assuming integer truncation would remain in the client for that client release version. But patching integer truncation causes the client to select more outputs in the 10-100 block range, and therefore the benefit of choosing a larger recent spend window of 50 has less merit
- 15 seems well-suited to cover the somewhat sizable observable gap in the early window of blocks
@luigi1111 luigi1111 merged commit b8ed1cb into monero-project:release-v0.17 Oct 18, 2021