Fix precision of average_output_time #7798
Conversation
On second thought, merging this change into the release branch is a very bad idea. This introduces a transaction uniformity defect that allows distinguishing between transactions that were created using the old decoy selection algorithm. Not all users keep their wallet software up to date, and not all wallets will use the new selection algorithm immediately. It seems better to me to include this in a hardfork.
NACK.
We discussed it on IRC. The current code has another serious problem: division by zero as soon as we have more than 120 outputs per block on average (right now it's 63 outputs per block).
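To illustrate the failure mode, here is a minimal sketch assuming the average output time is computed as seconds-per-block times blocks over total outputs; the variable names and window sizes are hypothetical, not the actual wallet2 code:

```cpp
#include <cstdint>
#include <iostream>

int main()
{
    const uint64_t seconds_per_block = 120;   // DIFFICULTY_TARGET_V2

    // Hypothetical window for illustration: more than 120 outputs per block.
    const uint64_t num_blocks  = 100000;
    const uint64_t num_outputs = 13000000;    // 130 outputs per block on average

    // Truncating integer arithmetic: drops to 0 once outputs/block exceeds 120.
    const uint64_t avg_output_time_truncated = seconds_per_block * num_blocks / num_outputs;

    // Floating-point arithmetic keeps the fractional part (~0.92 seconds here).
    const double avg_output_time_double =
        static_cast<double>(seconds_per_block * num_blocks) / num_outputs;

    std::cout << avg_output_time_truncated << " vs " << avg_output_time_double << "\n";

    // Any later step of the form `offset = sampled_age / average_output_time`
    // then divides by zero in the truncated variant.
}
```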
We have automatic updates for Feather + Monero GUI, and monerod notifies when a new update is out. Together with Reddit + the mailing list we should get users updated quite quickly.
That makes matters worse for users unaware of the update. Not everyone checks /r/Monero daily or is subscribed to the mailing list. Auto updates only cover a subset of users, can be ignored, and are broken on some operating systems (GUI on Tails, Feather on macOS). Some users may not care enough about their privacy or may think it isn't relevant enough to their threat model to update, which directly harms the privacy of users using the new algorithm by lowering the effective ring size. For this reason I think we should be very conservative with changes that impact transaction uniformity, especially for something as critical as decoy selection. Since this isn't a security vulnerability I also see no incentive for merchants and exchanges to upgrade swiftly (outside of their willingness to aid in ensuring Monero maintains its privacy guarantees). Moreover, there is no ETA for the next hard fork, which means we'll probably have these two anonymity puddles coexist for at least the next 6 months. I would personally prefer a quick hardfork that includes BP+ and this decoy selection change rather than merging this change into the release branch and hoping users and services will upgrade. If we go ahead with this then at the very least I would like to propose setting a target date for the CLI/GUI release, so other wallets can coordinate their releases accordingly, and adding a banner to getmonero alerting users to update as soon as possible.
@tobtoht You should also consider the attack vector (flooding the network with transactions until the current code breaks with division by zero). The information is public now anyway.
The current faulty algorithm also lowers the effective ring size. There's no perfect way of going forward here.
You're right. My napkin math suggests that a motivated attacker could pull this off within 17 days, costing only ~130 XMR. That is not enough time to organize a hard fork. This PR has my re-approval, given no other options.
@selsta Can you comment on the expected time-frame for the next release? It would be nice if we could tag whenever ready, then wait a few days to allow other wallets to rebase, test, and have a release ready in time for the CLI/GUI release.
An ugly, temporary hack to protect the division by 0 vector until a hard fork could be to run the correct calculation if
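One hedged sketch of what such a guard could look like (hypothetical names and fallback condition, purely for illustration and not the actual patch that was later proposed): keep the legacy truncated value for compatibility, but fall back to the untruncated division whenever truncation would produce 0 and later cause a division by zero.

```cpp
#include <cstdint>

// Hypothetical guard, for illustration only: preserve the legacy truncated
// average_output_time unless it would be 0, in which case use the precise
// value so the later division cannot blow up.
double compute_average_output_time(uint64_t seconds_per_block,
                                   uint64_t num_blocks,
                                   uint64_t num_outputs)
{
    double avg = static_cast<double>(seconds_per_block * num_blocks / num_outputs); // truncated
    if (avg == 0.0)
        avg = static_cast<double>(seconds_per_block * num_blocks) / num_outputs;    // precise fallback
    return avg;
}
```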
I can continue digging to assess the privacy implications with data. There is a chance that this skew in the selection algorithm is not significant enough to have caused material privacy issues, or to cause material issues in the future. More analysis is needed to be certain of the full extent of its implications, I think. Edit: provided a slightly improved alternative hack.
2 weeks or so
To add some detail to what I mentioned earlier: it seems I could definitely be wrong. Perhaps the 2 puddles could be guesstimated with the right algorithm. I think it would take a lot of work to know the answer to this question definitively, and at the moment I'm pursuing analysis on the algorithm in general. To err on the side of extreme caution, maybe it would make sense to go with the hack I proposed in that comment above for the next release to protect against the divide by 0 vector? We can continue the privacy analysis separately. I apologize for not recognizing the divide by 0 attack vector when I initially spotted this. For reference, here's my code to repro the simulation (uses pre-fix code in the PR).
Here's how I think it may be possible to try to differentiate the 2 puddles: for pre-fix transactions constructed at the block height I ran the simulation at, you would expect ~1 decoy to be selected from the last 100 blocks, but post-fix you would expect ~1.6 decoys to be selected from the last 100 blocks. Therefore, if you see a ring with 3 outputs near that block, you have a stronger reason to place that transaction in the post-fix puddle (2 decoys from the last 100 blocks + 1 expected recent real = higher chance it came from post-fix). Or if you see a ring with 0 outputs near that block, you have a stronger reason to place that transaction in the pre-fix puddle. A heuristic like that could potentially be used to try to separate transactions into puddles. It couldn't possibly be perfect, though. Here's the data for the simulations that also shows how I got the numbers above: get_outs (old vs new).ods
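As a toy illustration of how such a heuristic might weigh the observed count of very recent ring members: the ~1.0 and ~1.6 expected counts come from the comment above, while the Poisson model and every name in the code are assumptions made purely for illustration.

```cpp
#include <cmath>
#include <iostream>

// Toy model (an assumption, not a validated fit): the number of ring members
// drawn from the last 100 blocks is treated as Poisson with mean ~1.0 for a
// pre-fix ring and ~1.6 for a post-fix ring, per the estimates quoted above.
double poisson_pmf(double mean, int k)
{
    return std::exp(-mean) * std::pow(mean, k) / std::tgamma(k + 1.0);
}

int main()
{
    const double mean_pre = 1.0, mean_post = 1.6;
    for (int k = 0; k <= 4; ++k)
    {
        const double ratio = poisson_pmf(mean_post, k) / poisson_pmf(mean_pre, k);
        std::cout << k << " recent members -> post-fix/pre-fix likelihood ratio "
                  << ratio << "\n";
    }
    // Ratios above 1 nudge a guess toward the post-fix puddle, below 1 toward
    // pre-fix; for a single ring neither direction comes close to certainty.
}
```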
Let's say you find out someone's tx ring selection is post-fix algo. What would they gain by this? I know in general we want as much uniformity as possible, but still, how would privacy be weakened? I don't see much risk currently in simply fixing it in a release.
Here's my best shot: when an output from a post-fix tx is included in a subsequent ring later on, you have a marginally stronger reason to believe that subsequent tx should be a post-fix tx as well (since that output may have been the change output, and you can safely assume that the user spending the change output has already upgraded). Therefore when you see that output included in a ring of a subsequent pre-fix tx, perhaps you can eliminate that output from the plausible real outputs in that ring. But even then, there is no guarantee that is the correct thing to do, since there is no way to know if that output was a change output from the original post-fix tx. Perhaps using a methodology like this, you can build up a tx graph of implausible post-fix to pre-fix tx's, or pre-fix to post-fix tx's, and do some guesswork to reduce effective ring sizes. But I agree, the risk does seem extremely low, probably nonexistent to me. First, you have to be able to determine if a tx is pre-fix or post-fix, which is already guesswork. Then, you have to do more guesswork to eliminate some outputs from rings in subsequent tx's.
A clearer example: let's say you're certain a tx is a post-fix tx, and for simplicity, the tx only has 1 ring as input. 10/11 outputs in the ring come from tx's you're certain are pre-fix tx's, and 1/11 outputs come from a post-fix tx. You have marginally stronger reason to believe that the post-fix output in the ring is the real output, and it is the user spending change, especially if it is a recent output. With more rings in the tx that have the same makeup, the more accurate the guess can be (e.g. a post-fix tx with 100 rings where every single ring has only a single output from post-fix tx's in it.. it's fairly obvious that all the "post-fix" outputs in every ring of that tx are the real spends). But again, this relies on being able to differentiate between pre and post-fix tx's, which I don't believe is possible to do with certainty. This is more of a highly unlikely hypothetical to make it clearer what the potential risk is in that scenario.
Visualizing the impact of the fix with 10 newly generated samples of pre-fix and post-fix decoy selections: you can see that, as expected, the post-fix rings tend to be lower (since the correct average_output_time is used). I can work on applying a more rigorous mathematical framework, but I believe this is fairly solid evidence that it would not be possible to identify pre and post-fix tx's with certainty. You could make a solid guess for maybe 10% of transactions that fall in the extremes, but that seems to be about it. To demonstrate, here is another sample; try to determine which lines are pre-fix and which are post-fix:

And clarifying, even with the knowledge of a single transaction as pre- or post-fix, I don't see any useful information that can come from that. You need to be certain for other transactions tied to the outputs in each of that transaction's rings as well to gain semi-useful information. And because it seems only ~10-15% of transactions can be bucketed as pre or post-fix, there does not seem to be much risk at all to me (15% ^ 11 ≈ 0.000000001).

Edit: here's the data + more samples comparing pre- and post-spam as well:
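For reference, a minimal sketch of the kind of simulation being described, assuming the gamma-over-log-age picker (shape 19.28, scale 1/1.61) and an illustrative chain state where the true average output time is 1.9 seconds (truncated to 1 pre-fix); this is a hypothetical reconstruction, not the actual reproduction script linked above:

```cpp
#include <cmath>
#include <iostream>
#include <random>

int main()
{
    std::mt19937_64 rng{42};
    // Gamma over ln(output age in seconds), shape 19.28 and scale 1/1.61,
    // i.e. the fit the wallet's decoy picker is based on.
    std::gamma_distribution<double> gamma(19.28, 1.0 / 1.61);

    // Illustrative chain state: ~1.9 seconds between outputs on average.
    const double avg_output_time_correct   = 1.9;  // post-fix (double division)
    const double avg_output_time_truncated = 1.0;  // pre-fix (integer truncation)

    for (int i = 0; i < 10; ++i)
    {
        const double age_seconds = std::exp(gamma(rng));
        // The sampled age becomes an output-index offset by dividing by the
        // average output time; truncation shrinks the divisor, inflating the
        // offset and skewing selection toward older outputs.
        std::cout << "offset pre-fix: "  << age_seconds / avg_output_time_truncated
                  << "  post-fix: "      << age_seconds / avg_output_time_correct << "\n";
    }
}
```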
My opinion: this PR should not be merged as is, and we should instead go with the divide-by-0 guard described in my comment above for next release ASAP.
In the meantime, we can continue working on assessing the privacy implications of altering transaction uniformity with the fix in this PR as is. My reasoning, which includes new information:
Thus, my opinion is that we should protect against divide by 0 ASAP, and take more time with more significant changes to the decoy selection algorithm.
Updated opinion after analyzing the impact on transaction uniformity

I believe we should go with the fix proposed in my comment above instead, since going with the fix as is has a seemingly material impact on transaction uniformity when combined with the fix from #7821. While the fix as is does appear to bring marginal benefit to privacy by selecting more recent outputs, doing so is only marginally helpful, and does not outweigh the cost to users who delay in updating in my opinion. I think it is a completely acceptable solution from a privacy perspective to only protect against divide by 0 today, and to hold off on the change from this PR as is until a hard fork.

Looking at the numbers

Combining the fix in this PR as is with the fix in #7821, the impact on transaction uniformity appears to be relatively significant. Since both fixes cause clients to select decoys that are more recent, both fixes in combination (rather than just 1 in isolation) cause the deviation between pre-fix and post-fix transactions to be larger. As noted in this comment from #7821, the risk posed to tx uniformity by #7821 is extraordinarily rare. The risk surfaces when there is a small minority of users who do not update. I mentioned in that comment that the risk is posed to users who construct transactions with 20 rings. Reiterating: with that fix, when a user constructs a 20+ ring transaction, there is only a ~1% chance that every single ring would exclude a very early output. Thus, if a user using an older client constructs a 20+ ring transaction that uses an output constructed from a prior 20+ ring transaction, then it becomes obvious that output is real once the number of people using older clients dwindles. Users constructing linked 20+ ring transactions seems extraordinarily rare, and thus the risk presented by #7821 seems insignificant.

Now, with the fix in this PR as is in combination with #7821's, the impact is more significant: there would be a ~1% chance a user could construct a transaction with 5 rings that excludes an output more recent than 20 blocks, but a ~27% chance for an older client. Thus, if you have 2 linked transactions with 5 rings each, and 0 outputs more recent than 20 blocks, it very likely comes from an older client (since there is only a ~1% chance of that happening on a newer client). And therefore if there are very few older clients, it is much clearer that the linking output is the real spend.*

Taking a step back

The numbers in question with the addition of the fix in this PR clearly have a much more significant impact on tx uniformity. In the former, the risk is contained to linked 20+ ring transactions that exclude only very recent outputs. In the latter, the risk is widened to linked 5+ ring transactions that exclude outputs from within the past 20 blocks. Therefore, the addition of the fix in this PR as is presents a higher level of risk to older clients that do not update. Thus, considering the fix from #7821 does a good job on its own of protecting recent outputs, and considering the fix as is in this PR would compound on that by also selecting more recent outputs in greater frequency, the benefits of this PR as is are nothing to write home about, while the risks are clearer. So my view is that protecting against divide by 0 in the next release -- but continuing to allow integer truncation until a hard fork as a precautionary measure for older clients that do not update -- is the safest move.

*Data: Impact on tx uniformity.ods
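As a rough worked example of where numbers like these come from: under an independence assumption across rings, a per-ring probability p of including at least one output from the last 20 blocks gives a (1 - p)^n chance that all n rings exclude such an output. The per-ring values below are back-computed from the quoted ~1% and ~27% figures, not taken from the spreadsheet:

```cpp
#include <cmath>
#include <iostream>

int main()
{
    // Assumed per-ring probabilities of including at least one output from the
    // last 20 blocks (hypothetical values chosen so the 5-ring results land
    // near the quoted figures).
    const double p_new_client = 0.60;  // post-fix client
    const double p_old_client = 0.23;  // pre-fix client
    const int rings = 5;

    // Probability that every ring in a transaction excludes such an output,
    // assuming rings are drawn independently.
    const double all_exclude_new = std::pow(1.0 - p_new_client, rings); // ~1%
    const double all_exclude_old = std::pow(1.0 - p_old_client, rings); // ~27%

    std::cout << "newer client: " << all_exclude_new
              << "  older client: " << all_exclude_old << "\n";
}
```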
For the record, I settled on the conclusion once again that for next release, #7845 is the prudent move. The more I sit with this, the more the number of angles to dissect and analyze to be certain that getting rid of integer truncation completely is safe to do increases. In addition to the impact on tx uniformity, I also want to be certain that it would not alter the decoy selection algorithm itself for the worse, which presents its own set of angles to dissect and analyze. So final conclusion: I figure the earliest spends issue (#7821) and div by 0 (#7845) are too important not to solve ASAP, and the changes they would bring to the selection algorithm overall are minimal. In contrast, changing integer truncation to double division here has a much more material impact. Thus, for next release I'm settled on thinking we should go with the sure things: #7821 and #7845. As mentioned in #7821, beyond the 2 PR's (#7821 and #7845), we could continue with deeper research toward an even stronger solution that takes another crack at the assumptions laid out in Miller et al, and factors in observed chain data since then. Said research will take a fair amount of time to complete, and #7821 + #7845's do-no-harm approaches offer solid patches until that research is finished. And again, shoutout to @Rucknium, who has some excellent ideas, is an applied statistician by trade, and has offered to contribute in this area :)
I think patching integer truncation is ok for next release. And I think we can reduce the recent spend window. Apologies in advance for over-complicating where I've over-complicated. This will, hopefully, be the last long-winded post I write on integer truncation :)

On integer truncation

Here were the 2 points I wanted to assess in patching truncation:
1. It may cause an under-selection of older outputs.

Considering it's already assumed that the corrected decoy selection algorithm is safe, and since it's probably not useful to rely on my flawed analytical framework (shared in this comment and this MRL post) to assess the (likely low) probability of whether truncation has been beneficially selecting a solid number of older outputs, I have no compelling reason to keep patching truncation on hold on these grounds. So that eliminates point 1.

2. It may cause a significant change in transaction uniformity, leaving non-updated clients more vulnerable than they would otherwise be.

First, MyMonero + Monero-LWS haven't been truncating integers this whole time, so there is already a separate pool of transactions not doing it. Second, for the majority of transactions, I don't believe you can distinguish 1 integer-truncated ring from 1 non-truncated ring with any meaningful degree of certainty. It also helps that I think the primary risk to transaction uniformity is: for users who don't update and construct linked tx's with many rings, it may be a bit clearer their transactions were created by an older wallet. An older wallet that still truncates integers would select recent outputs less frequently, and older outputs more frequently. If a user of an older wallet consistently creates transactions with many inputs, this could potentially enable more informed links between their transactions, as those transactions might be marginally easier to identify as having fewer recent and more older outputs (therefore: an older wallet). However, it is worth noting that constructing many-input linked tx's is probably a rare user pattern. If a user constructs a many-input transaction, I figure you would expect that their next transaction would have fewer inputs. It's also worth noting that users of wallets < 0.17.2.3 are already vulnerable to

Summary: I think the tx uniformity risk of patching truncation seems low, and the plausible benefit of a more "correct" algorithm is solid. So the argument for patching truncation seems stronger than the argument to wait for a hard fork.

On the recent spend window

Repeating the intuition behind it: it's meant to capture the spending patterns of Monero users who spent outputs <10 blocks old back when the lock time wasn't enforced by consensus -- the gamma was apparently fit using that data. The idea is that if the gamma suggests an output that is <10 blocks old today (i.e. an unspendable output), it should re-select an output from a recent window to more closely mirror expected spending patterns when the gamma was fit. The recent spend window's magic number of 50 was chosen assuming integer truncation would not be patched, and based on an analytical framework now shown to be relatively weak by @vtnerd, since we don't know all the factors which might be causing recently observed output ages on chain. A shorter window had been discussed and thrown around as an option in monero-dev, though I don't exactly know what analytical framework to go with considering @vtnerd's criticisms. While the following probably isn't very useful beyond being a sort of sanity check, we can see on these charts what the different windows and implementations look like for very recent outputs*: 10 looks pretty good to me (if you assume the wallet2/monero-lws implementations are the only ones used, it would be ok to be over the blue line, because the gap below blue may indicate individual reals uncovered by many decoys).
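For context, here is a minimal sketch of the re-selection idea described above, assuming a 10-block spend lock and a configurable window; the function name, signature, and exact window handling are assumptions, not the wallet2 implementation:

```cpp
#include <cstdint>
#include <random>

// Hypothetical re-selection step: if the gamma picker lands on an output newer
// than the 10-block spend lock (i.e. unspendable today), re-pick uniformly from
// the most recent spendable blocks so recent outputs are still represented.
uint64_t pick_block_offset(std::mt19937_64 &rng,
                           uint64_t gamma_block_offset,
                           uint64_t spend_lock_blocks   = 10,
                           uint64_t recent_spend_window = 15)
{
    if (gamma_block_offset >= spend_lock_blocks)
        return gamma_block_offset;  // spendable, keep the gamma's choice
    std::uniform_int_distribution<uint64_t> recent(
        spend_lock_blocks, spend_lock_blocks + recent_spend_window - 1);
    return recent(rng);
}
```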
*I decided to use an end height just before the seemingly anomalous tx volume spike on July 21st. Here are more charts with end heights up to just before 0.17.2.3 was released:
The fix as suggested by <jberman> on IRC. Before the fix, it would truncate 1.9 to 1, skewing the output selection.
- combined with patching integer truncation (monero-project#7798), this gets the algorithm marginally closer to mirroring empirically observed output ages
- 50 was originally chosen assuming integer truncation would remain in the client for that client release version. But patching integer truncation causes the client to select more outputs in the 10-100 block range, and therefore the benefit of choosing a larger recent spend window of 50 has less merit
- 15 seems well-suited to cover the somewhat sizable observable gap in the early window of blocks