Describe decoy entries as a privacy mitigation #150

Open
npdoty opened this issue Mar 6, 2024 · 8 comments
Assignees
Labels
during-CR This issue needs to be resolved during the Candidate Recommendation phase. editorial pr exists privacy-needs-resolution Issue the Privacy Group has raised and looks for a response on.

Comments

@npdoty

npdoty commented Mar 6, 2024

Could dummy entries in the status list help to protect against an attacker who just regularly scans the list to try to reidentify users or their status?

Spec should document resistance to statistical analyses.

@npdoty npdoty added the privacy-needs-resolution Issue the Privacy Group has raised and looks for a response on. label Mar 6, 2024
@msporny
Member

msporny commented Mar 9, 2024

Yes, and it would be worth saying something about that in the specification. We went into what one can do to avoid statistical attacks against group privacy during the presentation on bitstring status list yesterday:

https://meet.w3c-ccg.org/archives/w3c-ccg-weekly-2024-02-20.mp4

We should speak to decoy values and how one can avoid statistical attacks on the list by not only using decoy values, but flipping those bits randomly (when you flip other bits in the list).
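As a rough illustration of the idea in this comment, the sketch below applies one real status change and, in the same published update, toggles a random number of decoy bits. The function name, the one-integer-per-entry list representation, and the `max_decoy_flips` parameter are assumptions made for the example; none of them come from the specification. As noted later in this thread, decoys that do not behave like real entries may be detectable, so this is not a recommendation.

```python
import secrets


def update_with_decoy_noise(bits, real_index, real_value, decoy_indices,
                            max_decoy_flips=2):
    """Apply one real status change and toggle some decoy bits alongside it.

    `bits` is a plain list of 0/1 values, one per status list entry
    (hypothetical representation, not the compressed bitstring encoding).
    Returns the decoy indices that were toggled.
    """
    bits[real_index] = 1 if real_value else 0

    pool = list(decoy_indices)
    # Pick how many decoys to toggle: between 0 and max_decoy_flips,
    # never more than the number of decoys available.
    count = secrets.randbelow(min(max_decoy_flips, len(pool)) + 1)

    toggled = []
    for _ in range(count):
        pick = pool.pop(secrets.randbelow(len(pool)))
        bits[pick] ^= 1  # flip the decoy bit
        toggled.append(pick)
    return toggled


status_bits = [0] * 16
toggled_decoys = update_with_decoy_noise(status_bits, 3, True, [10, 11, 12])
```

An observer comparing the list before and after this update sees between one and three changed bits rather than exactly one.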

@msporny msporny added ready-for-pr before-CR This issue needs to be resolved before the Candidate Recommendation phase. editorial labels Mar 9, 2024
@msporny msporny self-assigned this Mar 24, 2024
@msporny
Member

msporny commented Mar 24, 2024

PR #155 has been raised to address this issue. This issue will be closed once PR #155 has been merged.

@msporny
Member

msporny commented Apr 6, 2024

PR #155 has been merged, closing.

@msporny msporny closed this as completed Apr 16, 2024
@npdoty
Author

npdoty commented May 9, 2024

@dlongley and others have argued that there are privacy harms and no privacy benefits to decoy entries, and the group has suggested that it is unaware of any threat or use case where a decoy value could provide a privacy benefit.

To be clear, I don't think of this as an "obvious" solution at all and I'm not as familiar with these deployments. But I suspect there are threats where decoy values could provide some protection, especially when status value changes are rare, where the group of people with a particular status is small or where other information is known about people whose status may be on the list.

For example, an issuer very rarely suspends licenses for a particular behavior. A case documenting that behavior is widely published in the press and on the same day, the issuer updates the status list to indicate a suspended status. The press then report apparent confirmation that the suspect's license was suspended, and potentially other information about them (when it was last suspended or un-suspended), even if that information was intended to be kept confidential. If the status information was shared in cases of selective disclosure (the licensee had proved their license status in order to access sensitive content online), then the licensee's identity has also been disclosed, and the site learns the identity of the visitor who accessed that particular content.
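The scanning step in this scenario can be sketched as a comparison of two downloaded snapshots of the same list. The function name and the list-of-integers representation are assumptions for illustration; a real attacker would first decode the published bitstring.

```python
def changed_indices(before, after):
    """Return the positions whose status bit differs between two snapshots."""
    if len(before) != len(after):
        raise ValueError("snapshots must have the same length")
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]


# Hypothetical snapshots taken on consecutive days.
monday = [0, 0, 0, 0, 0, 0, 0, 0]
tuesday = [0, 0, 0, 0, 0, 1, 0, 0]
print(changed_indices(monday, tuesday))  # [5]
```

When status changes are rare, a single changed index on the day of the press report is enough to link that index to the person in the story.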

@KDean-Dolphin raised a separate use case about business intelligence, where the size of the status list might reveal the group behavior, like how many licenses are being issued during a particular timeframe, that the issuer might wish to conceal.

@dlongley
Contributor

@npdoty,

I'm, of course, perfectly happy for us to say something about how decoy values can be helpful (even when already doing random assignment) if we're able to determine how and come to consensus on it.

I think we need to have a longer discussion around what to say about decoy values -- and that we should add an at-risk issue marker to the spec that says the working group will develop text around them (with options to recommend for, against, or stay silent on the concept). We could strike the sentence about discouraging them for now along with adding that risk marker.

Do you think doing this would allow us to proceed to CR and then we can continue the discussion around what to say in more depth at that point?

Regarding that discussion, I have a number of things to say around how using decoys in an effective manner requires that they behave as if they are indistinguishable from real entries, which I suspect will be quite challenging. Naive implementations that implement them in other ways (e.g., pseudo-randomly) would make them detectable as decoys, resulting in only net harm to privacy.

@npdoty
Author

npdoty commented May 10, 2024

Yes, I think noting it as an open question on how to do properly (or whether), with an at-risk marker, would make sense for CR.

@msporny msporny changed the title from "describe dummy or decoy entries as a privacy mitigation" to "Describe decoy entries as a privacy mitigation" May 11, 2024
@msporny msporny added during-CR This issue needs to be resolved during the Candidate Recommendation phase. and removed before-CR This issue needs to be resolved before the Candidate Recommendation phase. labels May 11, 2024
@msporny
Member

msporny commented May 11, 2024

PR #171 has been raised to note that the decoy guidance will be refined during the Candidate Recommendation process and that the group may, or may not, suggest that decoy values are good/bad/a mixed bag.

@npdoty, if that PR is merged, would it address your concerns enough to continue the transition into CR? If not, please suggest concrete changes on the PR such that we can determine the path forward. We'll most likely discuss this issue during our call this week, if you'd like to join us. /cc @brentzundel

@iherman
Member

iherman commented Sep 29, 2024

The issue was discussed in a meeting on 2024-09-27

  • no resolutions were taken

4.7. Describe decoy entries as a privacy mitigation (issue vc-bitstring-status-list#150)

See github issue vc-bitstring-status-list#150.

Manu Sporny: We still need to discuss decoy entries.
… Waiting on nick doty to let us know if we addressed his concerns.

Brent Zundel: We are acting in good faith. If this ends up holding us up, let's close it.

Manu Sporny: The PR that got merged was an issue marker stating that in general decoys reduce the privacy of the status list.
… We haven't talked about it since then.
… We need to decide whether to say decoy values are harmful, say nothing, or recommend decoy values.
… I think decoy values are harmful, because adding decoys shrinks the set size.
… Reducing privacy.
… The argument here is that you should set bits in the list randomly. Don't waste any bits in the list for decoy values.
… We could also say you can use decoy values if you want. Seems counterintuitive though.
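The "set bits in the list randomly" approach can be read as assigning each newly issued credential a uniformly random unused index, so that position in the list reveals nothing about issuance order. A minimal sketch under that reading follows; the `IndexAllocator` class and its in-memory `used` set are hypothetical, and a real issuer would need persistent storage.

```python
import secrets


class IndexAllocator:
    """Hands out unused status list indices in uniformly random order."""

    def __init__(self, list_length):
        self.list_length = list_length
        self.used = set()

    def allocate(self):
        if len(self.used) >= self.list_length:
            raise RuntimeError("status list is full")
        # Rejection sampling: draw until an unused index comes up.
        while True:
            candidate = secrets.randbelow(self.list_length)
            if candidate not in self.used:
                self.used.add(candidate)
                return candidate


allocator = IndexAllocator(list_length=8)  # tiny length, for illustration only
assigned = [allocator.allocate() for _ in range(8)]
```

Rejection sampling slows down as the list fills up; a pre-shuffled index order avoids that cost.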

Joe Andrieu: I am not convinced on this argument.
… The k-anonymity argument is how many people can I be confused with. I think with decoys I can be confused with a decoy. That is a good thing.

Kevin Dean: I agree with JoeAndrieu. It is possible to create a properly randomised list. And one that is expandable over time.
… I can drop a link to this.

Will Abramson: wes-smith: I am confused by what manu said about decoys cutting the group privacy of the set.

Kevin Dean: Is the idea that, over time, verifiers can learn which entries in the list are decoys?

Manu Sporny: Yep that is one attack. The other argument is why don't you just use a much larger set.

Kevin Dean: Fisher-Yates shuffler Python code: https://github.com/KDean-Dolphin/Python-Shuffler. IIW presentation: https://drive.google.com/file/d/1wtT2GUQrl7lKCarHYkWvyPbcRmm1Dvji/view?usp=sharing
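For readers unfamiliar with the algorithm named here, the following is a textbook Fisher-Yates shuffle, written independently of the linked repository and not a reproduction of its code. It uses Python's `secrets` module as the randomness source, which is a choice made for this sketch.

```python
import secrets


def fisher_yates_shuffle(items):
    """Return a uniformly random permutation of `items` (input is not modified)."""
    shuffled = list(items)
    # Walk from the last position down to 1, swapping each element with a
    # randomly chosen element at or before it.
    for i in range(len(shuffled) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


# A shuffled index order for a hypothetical 100-entry list.
index_order = fisher_yates_shuffle(range(100))
```

An issuer could hand out indices in `index_order` as credentials are issued. This sketch does not cover a list that grows over time.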

Manu Sporny: Otherwise you have to perfectly model the rate of issuance to real people when issuing decoys; if you don't, an observer can statistically determine which entries are likely to be decoys.
… e.g. if someone is issuing decoys at 4am (a bad example) then you can easily detect when decoys are being added.
… This is how the reduction of privacy occurs.
… The language says in general it is harmful to add decoys, because it requires expertise to add them. It is a roll-your-own-crypto type of thing.
… Easy to do the wrong thing.
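The 4am example can be made concrete with a small sketch of what an observer might do. It assumes the observer has recorded a timestamp for each change seen in the list and guesses the issuer's business hours; the function name, the 9-to-17 window, and the sample timestamps are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def off_hours_changes(change_times, open_hour=9, close_hour=17):
    """Count observed changes per hour, keeping only hours outside business hours."""
    counts = Counter(t.hour for t in change_times)
    return {hour: n for hour, n in counts.items()
            if not (open_hour <= hour < close_hour)}


# Made-up observations: two daytime changes and two at 04:00.
observed = [
    datetime(2024, 9, 2, 10, 15),
    datetime(2024, 9, 2, 14, 40),
    datetime(2024, 9, 3, 4, 0),
    datetime(2024, 9, 4, 4, 0),
]
print(off_hours_changes(observed))  # {4: 2}
```

Changes that cluster at an hour when no real person is likely being served are candidates for decoys. Once the observer discounts them, the decoys no longer enlarge the set a real holder hides in.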

Denken Chen: +1 to the security best practice.

Brent Zundel: In general, is it possible to start the bitstring status list with a randomised string?
… Then every entry could be a decoy.

Manu Sporny: Yes, but if you know the issuer always issues inactive things, you know the k-size.

Brent Zundel: It sounds like a way to address this issue might be: perfect deployment of decoy values would increase the privacy protections of the set. However, the degradation that occurs through improper use of decoys leads us to not recommend this approach.

Will Abramson: wes-smith: Thanks for clarifying. The language that we started with is not the best language. Decoys do not harm privacy, BUT it is hard to use them correctly.

Gabe Cohen: Thanks this makes sense now. We should add language around updating status lists and the risks around eroding privacy.
… I agree you can use decoys in a safe manner, but not sure it is worth talking about in the spec.

Joe Andrieu: The language around not rolling your own is on point. But there might be tools written by experts that people can use.

Manu Sporny: I am concerned that this does not exist yet.
… People will try to do this, but may not do it well.

Brent Zundel: I think that is all the CR normative issues for bit string status list.
… Any notion of the timeline for this?

Manu Sporny: This won't be ready by January, because of the other specs that are in need of attention.

Ivan Herman: I would prefer to keep the goal of January. We are not that far away.
… The result from the discussion with jeffery today was that there are 3 or 4 editorial changes required to the controller document.
… The only outlier is BBS spec.
… I believe it is still possible to aim for January.

Manu Sporny: Reminding everyone it could go fast if people did a PR.

Ivan Herman: This is true. Many things are independent PRs that can be done in parallel.

Brent Zundel: I think we have a solid idea for all our specs and next steps.
… We had a great conversation with vc-jose-cose. We know what is left.
… At what point do we anticipate a second CR from vc-jose-cose?


4 participants