This is an implementation of "Robust Watermarking of Neural Network with Exponential Weighting" by Ryota Namba and Jun Sakuma in TensorFlow.
Exponential weighting is a method proposed in the paper to make watermarks more robust against watermark removal attacks such as pruning and fine-tuning. It works by applying a transformation to each layer's weight matrix before the matrix is used in the forward pass. The basic concept is:
- Train the model on the training dataset until it converges
- Enable exponential weighting in the layers of the model, so that each layer first applies a transformation to its weight matrix before using it in the forward pass
- Train the model on the union of the key set and the training set in order to embed the watermark
- Disable exponential weighting in the layers of the model
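The transformation itself scales each weight by an exponential of its own magnitude, so large weights dominate the forward pass and the watermark ends up encoded in weights that pruning is unlikely to touch. The sketch below paraphrases the transform; the parameter name t (the intensity of the weighting) is an assumption, not taken from this repository's code.

```python
import numpy as np

def exponential_weighting(w, t=2.0):
    """Sketch of the exponential weighting transform.

    Each weight is scaled by exp(t * |w|) and normalized by the largest
    such factor, so small-magnitude weights are suppressed relative to
    large ones. `t` is an assumed name for the intensity parameter.
    """
    scale = np.exp(t * np.abs(w))
    return w * scale / scale.max()

w = np.array([0.1, -0.5, 1.0])
w_ew = exponential_weighting(w)
# The largest-magnitude weight keeps its value; smaller weights shrink.
```

Note that the transform preserves signs and the position of the largest weight, which is why the model can be trained through it and then have it disabled again without destroying the embedded watermark.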
The key set can be any set of inputs. If the model's accuracy on the key set exceeds a predefined (but otherwise arbitrary) threshold, we can verify that the model belongs to us.
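Verification then reduces to an accuracy check. A minimal sketch, assuming a Keras-style classifier with a predict() method returning class probabilities (the function name and threshold value here are illustrative, not from this repository):

```python
import numpy as np

def verify_watermark(model, key_inputs, key_labels, threshold=0.9):
    """Claim ownership if key-set accuracy meets a preset threshold.

    `threshold` is arbitrary but must be fixed before verification;
    `model` is any classifier exposing a Keras-style predict().
    """
    preds = np.argmax(model.predict(key_inputs), axis=1)
    accuracy = float(np.mean(preds == key_labels))
    return accuracy >= threshold
```

An honest owner's model scores near 100% on its own key set, while an independently trained model should score near chance level, so the exact threshold is not critical.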
You can create your own exponentially weighted layers by inheriting from EWBase, which itself inherits from keras.layers.Layer. If exponential weighting is enabled, simply call EWBase.ew() on the weight matrix before using it in the forward pass of your layer.
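As an illustration, here is a sketch of what such a subclass might look like. The EWBase stand-in below is reconstructed from the description above (the ew_enabled flag and the t parameter are assumed names; see the repository's source for the real class):

```python
import tensorflow as tf
from tensorflow import keras

class EWBase(keras.layers.Layer):
    """Minimal stand-in for the project's EWBase.

    Holds an on/off flag and the ew() transform helper; attribute
    names here are assumptions based on the README description.
    """
    def __init__(self, t=2.0, **kwargs):
        super().__init__(**kwargs)
        self.ew_enabled = False  # toggled on only while embedding
        self.t = t

    def ew(self, w):
        if not self.ew_enabled:
            return w
        scale = tf.exp(self.t * tf.abs(w))
        return w * scale / tf.reduce_max(scale)

class EWDense(EWBase):
    """Dense layer that transforms its kernel before the matmul."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=(int(input_shape[-1]), self.units))
        self.bias = self.add_weight(
            name="bias", shape=(self.units,), initializer="zeros")

    def call(self, inputs):
        # Apply exponential weighting to the kernel (a no-op when
        # the flag is off), then do the usual dense forward pass.
        return tf.matmul(inputs, self.ew(self.kernel)) + self.bias
```

Because ew() is the identity when the flag is off, the same layer can be used unchanged for normal training, watermark embedding, and deployment.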
A simple example can be found in example.ipynb or example.py.
Show your support by giving the project a ⭐. Pull requests are always welcome.
This project is licensed under the MIT License - see the LICENSE.md file for details.