[FEA] Add implementation of snapping mechanism #46

danrr · 2021-07-12T20:56:54Z

First draft implementation of the snapping mechanism. This addresses a vulnerability
in the Laplace mechanism and its derivatives, stemming from floating-point numbers.
The mechanism was proposed as a solution to this vulnerability by Ilya Mironov.

Paper link: https://www.microsoft.com/en-us/research/wp-content/uploads/2012/10/lsbs.pdf
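For readers unfamiliar with the mechanism, the core idea can be sketched in a few lines: clamp the true value, add Laplace noise generated as a random sign times the scale times the log of a uniform draw, round ("snap") the result to a multiple of a power of two, and clamp again. The sketch below is illustrative only and is not the code in this PR. The function names are hypothetical, it assumes a sensitivity-1 query, and it uses Python's ordinary `random` and `math.log`, which do not meet the sampling and exact-rounding requirements set out in the paper, so it should not be used to protect real data.

```python
import math
import random


def clamp(x, bound):
    """Clamp x to the interval [-bound, bound]."""
    return max(-bound, min(bound, x))


def smallest_power_of_2_at_least(x):
    """Smallest power of two >= x, for a finite float x > 0."""
    # frexp gives x == mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(x)
    return x if mantissa == 0.5 else math.ldexp(1.0, exponent)


def snapping_sketch(value, epsilon, bound, rng=random):
    """Illustrative snapping step for a sensitivity-1 query (NOT hardened)."""
    lam = 1.0 / epsilon                               # Laplace scale
    big_lambda = smallest_power_of_2_at_least(lam)    # snapping granularity
    sign = rng.choice((-1.0, 1.0))                    # uniform random sign
    u = 1.0 - rng.random()                            # uniform in (0, 1], avoids log(0)
    noisy = clamp(value, bound) + sign * lam * math.log(u)
    # Nearest multiple of big_lambda; ties follow Python's round() (to even),
    # which is an assumption of this sketch
    snapped = big_lambda * round(noisy / big_lambda)
    return clamp(snapped, bound)
```

With `epsilon=0.5` the granularity is 2, so every output is an even number in `[-bound, bound]`, which illustrates how coarse the output set becomes as epsilon shrinks.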

naoise-h · 2021-09-23T09:49:19Z

Hi Dan,

I see this is still in draft. Are you planning on working on this further to finish the implementation? I'm not sure if there is great value in the snapping mechanism, owing to its granular output set (especially for small epsilon), but it may be worthwhile having for use as a reference nonetheless.

Note that we have since implemented defences (#47, from this paper) against the floating point vulnerability that the snapping mechanism seeks to resolve.

danrr · 2021-09-28T11:29:52Z

Hi Naoise,

Sorry for the delay, I was dealing with some hardware issues and then I was away on a wee break now that that sort of thing is possible again.

While away, I noticed your paper and wanted to reach out and ask if there is still value in this PR, but you have pre-empted that question. I'm happy to finish this PR, if only as a reference implementation. I can also add a warning pointing out that a better alternative exists.

On a side note: interesting work in your paper. I still need to fully familiarise myself with it, but I was wanting to look at the effect of random floating point numbers on the sampling of the Gaussian distribution as well. Implementing the snapping mechanism was a way to better understand some of the work up to this point. I was only just starting and was planning to delve deeper after I was back, so no great loss for me that you beat me to it but it's cool to know there was definitely something there.


codecov · 2021-10-27T14:34:47Z

Codecov Report

Merging #46 (5da802e) into main (90b319a) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #46      +/-   ##
==========================================
+ Coverage   99.60%   99.61%   +0.01%     
==========================================
  Files          33       34       +1     
  Lines        2515     2599      +84     
==========================================
+ Hits         2505     2589      +84     
  Misses         10       10

Impacted Files	Coverage Δ
diffprivlib/mechanisms/__init__.py	`100.00% <100.00%> (ø)`
diffprivlib/mechanisms/snapping.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

diffprivlib/mechanisms/snapping.py

naoise-h

Hi Dan,

I've made a first run-through of the code for syntax purposes and proposed a few changes. I will run through the code next week to double-check its implementation versus the Mironov paper.

Additionally, can you add some tests to check the behaviour of the special functions within the mechanism (i.e., _get_nearest_power_of_2, _round_to_nearest_power_of_2)?

Let me know if you have any questions. Thanks for all your hard work on this!
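As a rough illustration of the kind of tests requested above, the sketch below defines two stand-in helpers and checks them on values whose results are easy to verify by hand. The helper names and bodies are hypothetical and are not the PR's `_get_nearest_power_of_2` or `_round_to_nearest_power_of_2`. The sketch assumes IEEE 754 doubles and positive, normal inputs, and uses `struct` to read the bit pattern.

```python
import struct


def nearest_power_of_2_at_least(x):
    """Smallest power of two >= x, read off the IEEE 754 double bit pattern.

    Assumes x is a positive, normal Python float (not subnormal, inf or NaN).
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    mantissa = bits & ((1 << 52) - 1)       # low 52 bits: fraction field
    if mantissa == 0:
        return x                            # already a power of two
    exponent_bits = (bits >> 52) & 0x7FF    # 11-bit biased exponent
    next_bits = (exponent_bits + 1) << 52   # bump exponent, zero the fraction
    return struct.unpack(">d", struct.pack(">Q", next_bits))[0]


def round_to_multiple(value, base):
    """Round value to the nearest multiple of base (base is a power of two).

    Ties follow Python's round(), i.e. they go to the even multiple.
    """
    return base * round(value / base)


# Test-style checks on exact powers of two, values just above them, and ties
assert nearest_power_of_2_at_least(1.0) == 1.0
assert nearest_power_of_2_at_least(1.5) == 2.0
assert nearest_power_of_2_at_least(0.75) == 1.0
assert nearest_power_of_2_at_least(1025.0) == 2048.0
assert round_to_multiple(5.1, 4.0) == 4.0
assert round_to_multiple(6.0, 4.0) == 8.0   # tie at 1.5 rounds to the even multiple
```

Dividing and multiplying by a power of two is exact in binary floating point (barring overflow and underflow), which is why snapping to a power-of-two grid is easier to reason about than snapping to an arbitrary step.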

diffprivlib/mechanisms/snapping.py

tests/mechanisms/test_Snapping.py

danrr · 2021-11-14T21:45:10Z

Hi @naoise-h,

Thank you for looking over this. I have made the changes you asked for. I changed the implementation to compute the effective epsilon and use that, so the mechanism is epsilon-DP for the given epsilon. I do have a small issue with the different floating point types supported by numpy. I use the machine epsilon of the basic float type to compute the effective value of epsilon, and I cast to double type in _get_nearest_power_of_2. This works fine for Python's float type but on machines with support for 96 or 128 bit floats, where np.longdouble could be used, this might cause strange behaviour. How should I best address that possibility, if at all?
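To make the floating-point concern concrete, the sketch below inspects the machine epsilon and mantissa size of the standard double and of `np.longdouble`, and shows a cast down to a 64-bit float. It uses only `np.finfo` attributes (`eps`, `nmant`). The helper name `as_double` is hypothetical, and what `np.longdouble` actually is depends on the platform (on some systems it is the same as a double).

```python
import numpy as np

# Properties of the standard 64-bit double, which is what Python's float is
double_info = np.finfo(np.float64)
print(double_info.eps, double_info.nmant)

# Properties of the extended type; the values printed are platform dependent
long_info = np.finfo(np.longdouble)
print(long_info.eps, long_info.nmant)


def as_double(value):
    """Cast a real number to a plain Python float (64-bit double).

    Any precision beyond that of a double is rounded away by the cast.
    """
    return float(np.float64(value))
```

Code that hard-codes a 52-bit mantissa or packs values with `struct`'s `d` format only describes the double case, so an extended-precision input would have to be cast first or rejected.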

Re: checking the implementation against the paper, a few points that might be of interest. Scaling to the sensitivity is not fully defined in the paper, so I scale the inputs and bounds before applying the mechanism to make the rounding step easier to reason about. I think the scaling is consistent and makes sense, but let me know if there is anything that I can improve.

I also align the implementation with that of the LaplaceTruncated mechanism, allowing for arbitrary bounds, which I then use to compute a symmetric bound, which is then used as per the paper. The value is also scaled and offset in the same way as the bounds, and this process is reversed after the mechanism is applied.
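One plausible reading of the scaling and offsetting described above is sketched below: arbitrary `lower` and `upper` bounds are re-centred on zero and divided by the sensitivity, giving a symmetric bound for a unit-sensitivity query, and the transformation is undone on the output. The function names are hypothetical and the PR's actual arithmetic may differ.

```python
def to_unit_scale(value, lower, upper, sensitivity):
    """Map value and [lower, upper] to a zero-centred, unit-sensitivity scale."""
    midpoint = (lower + upper) / 2.0                  # offset that centres the interval
    bound = (upper - lower) / 2.0 / sensitivity       # symmetric bound after scaling
    scaled = (value - midpoint) / sensitivity
    return scaled, bound, midpoint


def from_unit_scale(scaled, midpoint, sensitivity):
    """Undo to_unit_scale on a (noisy) output."""
    return scaled * sensitivity + midpoint


# Example: value 7 in [0, 10] with sensitivity 2
scaled, bound, midpoint = to_unit_scale(7.0, 0.0, 10.0, 2.0)
print(scaled, bound, midpoint)
print(from_unit_scale(scaled, midpoint, 2.0))
```

Because both steps are deterministic pre- and post-processing around the randomised step, they do not change the privacy guarantee of the mechanism applied in between. When the bounds are already symmetric (`lower=-b`, `upper=b`), the midpoint is zero and only the division by the sensitivity remains.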

Tests were failing due to math.nextafter only being introduced in Python 3.9, so it was replaced with np.nextafter.
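A minimal illustration of the replacement: `np.nextafter(x, direction)` returns the next representable float after `x` towards `direction`, and is available on Python versions that predate `math.nextafter`. The variable names are only for illustration.

```python
import numpy as np

above_one = np.nextafter(1.0, np.inf)     # smallest double greater than 1.0
below_one = np.nextafter(1.0, -np.inf)    # largest double less than 1.0

# Spacing of doubles just above and just below 1.0
assert above_one - 1.0 == 2.0 ** -52
assert 1.0 - below_one == 2.0 ** -53
```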

naoise-h

I have added a few more requested changes below, hopefully they all make sense. As for the points you raise:

- To deal with the different floating point types, we can require the input to be a float (as a check in _check_all), and throw an error if the input is a higher-precision float (or attempt to cast it as a float).
- As for the sensitivity problem, scaling to unit sensitivity is the right way to go. As it's just a pre- and post-processing step, it won't affect the DP guarantee, so all good there.
- Would there be value in letting the user specify bound instead of lower and upper? Would that reduce complexity?

One last comment is that there are two warnings being thrown in the tests (link). Can the code causing these be fixed?

diffprivlib/mechanisms/snapping.py

tests/mechanisms/test_Snapping.py

docs/modules/mechanisms.rst

naoise-h · 2021-12-02T10:04:07Z

diffprivlib/mechanisms/snapping.py

+        if not (isinstance(epsilon, float) or isinstance(epsilon, np.float64)):
+            warnings.warn("The snapping mechanism expects epsilon to be a double precision floating-point number for "
+                          "precise rounding; epsilon will be cast to 64-bit float", DiffprivlibCompatibilityWarning)


My apologies, I meant for the float check to be on the input value, not epsilon. This can be checked in _check_all(value). Also, it may be best to do a quick sanity check like float(value) != value, since it should still be possible to input an integer, etc. Something like this:

    def _check_all(self, value):
        super()._check_all(value)
        if float(value) != value:
            warnings.warn()
        return True
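The suggested check can be exercised with a self-contained sketch. The base class below is a hypothetical stand-in for the library's parent mechanism, and the warning text and category are assumptions made for illustration. `Fraction(1, 3)` is used as an example of a real number that a 64-bit float cannot represent exactly, so it works the same way on platforms where `np.longdouble` is no wider than a double.

```python
import numbers
import warnings
from fractions import Fraction


class BaseMechanismSketch:
    """Hypothetical stand-in for the parent mechanism class."""

    def _check_all(self, value):
        if not isinstance(value, numbers.Real):
            raise TypeError("Value to be randomised must be a number")
        return True


class SnappingSketch(BaseMechanismSketch):
    def _check_all(self, value):
        super()._check_all(value)
        # Integers and ordinary floats survive the round trip through float();
        # values that need more precision than a double do not
        if float(value) != value:
            warnings.warn(
                "value is not exactly representable as a 64-bit float "
                "and will be rounded when cast",
                RuntimeWarning,
            )
        return True


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    SnappingSketch()._check_all(3)                # int: no warning
    SnappingSketch()._check_all(0.1)              # float: no warning
    SnappingSketch()._check_all(Fraction(1, 3))   # not exact as a double: warns
print(len(caught))
```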

danrr · 2021-12-02

Yes, this isn't quite ready.

The reason I'm checking epsilon is that the two methods that depend on floating-point implementation details, _get_nearest_power_of_2 and effective_epsilon, operate on epsilon (or values derived from it), not on the input value.

The other thing I was considering is to cast all values to np.longdouble, which is system-dependent, and adapt the code to work with whatever precision that provides. It would complicate _get_nearest_power_of_2 but would potentially lower the impact of the machine epsilon on the mechanism.

danrr · 2021-12-03T15:15:52Z

Hi @naoise-h,

Thanks for all the feedback. I pushed a set of changes that should resolve the latest batch of comments.

> To deal with the different floating point types, we can require the input to be a float (as a check in _check_all), and throw an error if the input is a higher-precision float (or attempt to cast it as a float).

I noticed that _check_epsilon_delta casts epsilon to float, and this would also happen if it were np.longdouble, so I think it would be consistent to just work with the standard double-precision float type and not worry about triple- or quad-precision floats. This also saves me from having to rewrite the bit manipulation code, as struct is not aware of higher-precision floats. I removed the warning about floating-point types as a consequence. The downside is that a user wouldn't be able to get slightly better accuracy by using higher-precision floats, but if someone wants accuracy, they probably shouldn't be using the snapping mechanism in the first place.

I did change the implementation to query the mantissa size of the floating point type the system provides, which should make things more robust and easier to adjust in the future, should there be any need for it.

> Would there be value in letting the user specify bound instead of lower and upper? Would that reduce complexity?

It would reduce complexity slightly, but not by a huge amount as scaling to sensitivity would still need to be performed. I think having it be consistent with LaplaceTruncated is good. Of course, if a single bound is what the user wants, the mechanism can just be instantiated with lower=-bound, upper=bound.

danrr · 2022-01-12T17:18:46Z

Happy new year @naoise-h,

This PR is ready to review, if you have the time.

naoise-h

Thanks for all your changes, and your very valuable contribution to diffprivlib.

danrr · 2022-01-17T16:39:26Z

Thank you @naoise-h, I appreciate all your feedback on this. It's been fun!

danrr mentioned this pull request Jul 12, 2021

Floating-point privacy vulnerabilities #19

Closed

naoise-h linked an issue Jul 15, 2021 that may be closed by this pull request

Floating-point privacy vulnerabilities #19

Closed

stefano81 requested a review from naoise-h September 22, 2021 21:24

danrr force-pushed the implement-snapping-mechanism branch from 71ac493 to 0b0768b on October 27, 2021 14:31

danrr marked this pull request as ready for review October 27, 2021 14:33

danrr added 5 commits October 27, 2021 20:18

Add test for bounds of snapping mechanism

a30b456

Add Snapping mechanism to docs

366ad14

Fix typo in comment

697df51

Add test for effective epsilon

76d0683

Subclass LaplaceTruncated

dc8d191

danrr force-pushed the implement-snapping-mechanism branch from 6fa6b0e to dc8d191 on October 28, 2021 09:15

danrr commented Oct 28, 2021

diffprivlib/mechanisms/snapping.py

danrr added 2 commits October 28, 2021 10:27

Set lambda once in Snapping init

4082ebb

Add tests for rounding to power of two

4da72e8

naoise-h requested changes Nov 11, 2021

Use effective epsilon in Snapping mechanism

eb368dd

danrr force-pushed the implement-snapping-mechanism branch from 87de49e to 4332ca6 on November 14, 2021 22:10

Fix tests that were failing on Python 3.8

f40e5cd

Tests were failing due to math.nextafter only being introduced in Python 3.9, so it was replaced with np.nextafter.

danrr force-pushed the implement-snapping-mechanism branch from 4332ca6 to f40e5cd on November 16, 2021 09:29

danrr added 2 commits November 16, 2021 09:29

Use exact equal for bit operations on floats

7d0fb8c

Add tests for rounding step in snapping mechanism

7970f78

naoise-h requested changes Dec 1, 2021

naoise-h reviewed Dec 2, 2021

Add warning for floating-point precision

5da802e

danrr force-pushed the implement-snapping-mechanism branch from 0cc7c62 to 5da802e on December 3, 2021 14:14

naoise-h approved these changes Jan 17, 2022

naoise-h changed the title ~~Add implementation of snapping mechanism~~ [FEA] Add implementation of snapping mechanism Jan 17, 2022

naoise-h merged commit e7990d2 into IBM:main Jan 17, 2022

danrr deleted the implement-snapping-mechanism branch January 17, 2022 16:28
