-
Notifications
You must be signed in to change notification settings - Fork 0
/
semantics-mdp.tex
191 lines (181 loc) · 9.48 KB
/
semantics-mdp.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
The logic $\dpCTLstar$ can be interpreted over MDPs.
%Semantics over Markov chains generalizes to MDPs as well. Note that
%two states with different enabled actions
%are trivially distinguishable; no privacy can be preserved.
Let $M = (S, \Act, \wp, L)$ be an MDP and $\neighbor{S}$ a
neighborhood relation on $S$. Define the satisfaction
relation $M, \neighbor{S}, s \models \Phi$ for $\PJ{J} \phi$ and
$\dpriv{\epsilon}{\delta} \phi$ as follows (others are straightforward).
\begin{eqnarray*}
% M, \neighbor{S}, s \models \top\\
% M, \neighbor{S}, s \models p & \textmd{ if } & p \in L(s)\\
% M, \neighbor{S}, s \models \neg \Phi & \textmd{ if } & M, \neighbor{S}, s \not\models \Phi\\
% M, \neighbor{S}, s \models \Phi_0 \wedge \Phi_1 & \textmd{ if } & M, \neighbor{S}, s \models \Phi_0 \textmd{ and }
% M, \neighbor{S}, s \models \Phi_1\\
M, \neighbor{S}, s \models \PJ{J} \phi
& \textmd{ if } &
\myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi} \in J
\textmd{ for every scheduler } \scheduler{S}\\
M, \neighbor{S}, s \models \dpriv{\epsilon}{\delta} \phi
& \textmd{ if } &
\textmd{for all } t \textmd{ with } s \neighbor{S} t \textmd{ and
query scheduler } \scheduler{Q},
\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} \leq
e^{\epsilon} \cdot
\\
& &
\myPr{t}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} + \delta
\textmd{ and }
\myPr{t}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} \leq e^{\epsilon} \cdot
\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} + \delta
\end{eqnarray*}
Recall that $M_{\scheduler{S}}$ is but a Markov chain. The semantics
of $M_{\scheduler{S}}, \neighbor{S}, \pi \models \phi$ and hence the
probability $\myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}$ are
defined as in Markov chains.
The semantics of $\dpCTLstar$ on MDPs
is again standard except the differentially private operator
$\dpriv{\epsilon}{\delta}$. For any path formula $\phi$,
$\dpriv{\epsilon}{\delta} \phi$ specifies states whose probability
of having paths satisfying $\phi$ are $(\epsilon, \delta)$-close to
those of all its neighbors for query schedulers. That is, no
query scheduler can force any of neighbors to distinguish the
specified path behavior probabilistically.
\noindent
\paragraph{Justification of query schedulers.}
We use query schedulers in the semantics for the differentially
private operator. A definition with history-dependent schedulers
might be
\begin{eqnarray*}
M, \neighbor{S}, s \models \dpriv{\epsilon}{\delta}^{\mathit{bad}} \phi
& \textmd{ if } &
\textmd{for all } t \textmd{ with } s \neighbor{S} t \textmd{ and
scheduler } \scheduler{S},
\myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi} \leq
e^{\epsilon} \cdot\\
&& \myPr{t}{M_{\scheduler{S}}}{\neighbor{S}}{\phi} + \delta \textmd{ and }
\myPr{t}{M_{\scheduler{S}}}{\neighbor{S}}{\phi} \leq
e^{\epsilon} \cdot \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
+ \delta.
\end{eqnarray*}
A state satisfies $\dpriv{\epsilon}{\delta}^{\mathit{bad}} \phi$ if
no history-dependent scheduler can differentiate the probabilities
of having paths satisfying
$\phi$ from neighbors. Recall that a history-dependent scheduler chooses
actions according to previous states. Such a definition would allow
schedulers to take different actions from different states.
Two neighbors could hence be differentiated by different action sequences.
The specification might be too strong for our purposes.
A query scheduler $\scheduler{Q} : S^+ \rightarrow \Act$,
on the other hand, corresponds to a query sequence. A state satisfies
$\dpriv{\epsilon}{\delta} \phi$ if no query sequence can differentiate
the probabilities of having paths satisfying $\phi$ from neighbors.
Recall query schedulers only depend on lengths of histories. Two
neighbors cannot be distinguished by the same action sequence of any
length if they satisfy a differentially private subformula.
Our semantics agrees with the informal interpretation of differential
privacy for such systems. We therefore consider only query schedulers
in our definition.
\subsection{Model Checking}
Given an MDP $M =
(S, \Act, \wp, L)$, a neighborhood relation $\neighbor{S}$, $s \in
S$, and a path formula $\phi$, consider the problem of checking $M,
\neighbor{S}, s \models
\dpriv{\epsilon}{\delta} \phi$ . Recall the semantics of $\dpriv{\epsilon}{\delta}
\phi$. Given $s, t$ with $s \neighbor{S} t$ and a path formula
$\phi$, we need to decide whether
$\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} \leq
e^{\epsilon} \myPr{t}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} + \delta$
for every query scheduler $\scheduler{Q}$.
When $\phi$ is $\gX B$ with $B\subseteq S$, only the first action in
the query sequence needs to be considered. This can also be easily generalized to nested next operators: one needs only to enumerate
all actions query sequences of a fixed length.
% obtained by the number of nested $\X$ operators.
The problem however is undecidable in general.
\begin{theorem}\label{theorem:mdp-model-checking}
The $\dpCTLstar$ model checking problem for MDPs is undecidable.
\end{theorem}
The proof is in Appendix. We discuss some decidable special cases. Consider the formula $\phi:=F B$ with $B\subseteq S$ and assume that states in $B$ with only self-loops. For the case $\epsilon=0$, the condition reduces to
$\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{F B} -
\myPr{t}{M_{\scheduler{Q}}}{\neighbor{S}}{F B}\le \delta$. If $\delta=0$ it is the classical language equivalence problem for probabilistic automata~\cite{Rabin63}, which can be solved in polynomial time. However, if $\delta>0$, the problem
becomes an approximate version of the language equivalence problem. To
the best of our knowledge, its decidability is still open except for the special case where all states are connected~\cite{Tzeng92}.
\hide{
Since the scheduler attaining the maximally probable
behavior from a state may be different from the scheduler attaining
the maximally probable behavior from its neighbors, two
neighbors may still be distinguished by a scheduler. The
weaker definition does not preserve differential privacy. We hence
prefer the original definition.
}
Despite of the negative result in Theorem~\ref{theorem:mdp-model-checking},
a sufficient condition for $M, \neighbor{S}, s \models
\dpriv{\epsilon}{\delta} \phi$ is available.
To see this, observe that for $s \in S$ and
query scheduler $\scheduler{Q}$, we have
\[
\min_{\scheduler{S}} \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
\leq
\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi}
\leq
\max_{\scheduler{S}} \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
\]
where the minimum and maximum are taken over all schedulers
$\scheduler{S}$. Hence,
\begin{eqnarray*}
\myPr{s}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi} -
e^\epsilon \cdot \myPr{t}{M_{\scheduler{Q}}}{\neighbor{S}}{\phi}
\leq
\max_{\scheduler{S}} \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
- e^\epsilon \cdot
\min_{\scheduler{S}} \myPr{t}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
\end{eqnarray*}
for any $s, t \in S$ and query scheduler $\scheduler{Q}$. We have the
following proposition:
\begin{proposition}\label{proposition:sufficient-condition-mdp-model-checking}
Let $M = (S, \Act, \wp, L)$ be an MDP, $\neighbor{S}$ a
neighborhood relation on $S$.
$M, \neighbor{S}, s \models \dpriv{\epsilon}{\delta} \phi$ if
$\max\limits_{\scheduler{S}} \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
- e^\epsilon \cdot
\min\limits_{\scheduler{S}} \myPr{t}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
\leq \delta$ and
$\max\limits_{\scheduler{S}} \myPr{t}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
- e^\epsilon \cdot
\min\limits_{\scheduler{S}} \myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}
\leq \delta$ for any $s, t \in S$ with $s \neighbor{S} t$.
\end{proposition}
For $s \in S$, recall that $\max\limits_{\scheduler{S}}
\myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}$ and
$\min\limits_{\scheduler{S}}
\myPr{s}{M_{\scheduler{S}}}{\neighbor{S}}{\phi}$ can be efficiently
computed~\cite{BK:08:PMC}.
By
Proposition~\ref{proposition:sufficient-condition-mdp-model-checking},
$M, \neighbor{S}, s \models \dpriv{\epsilon}{\delta} \phi$ can be
checked soundly and efficiently.
We model the above threshold algorithm (Algorithm~\ref{algorithm:online-model})
and apply
Proposition~\ref{proposition:sufficient-condition-mdp-model-checking}
to check whether the mechanism is differentially private using the
classical $\PCTL$ model checking algorithm for MDPs.
Since concrete values of the parameters $\epsilon$ and $\delta$ are
computed, tighter bounds for specific neighbors can be obtained.
For instance, for the state $t_3r_5$ and its neighbor
$\underline{t}_3\underline{r}_4$, we verify the property $\bigwedge_{k
\in \bbfZ^{\geq 0}} \dpriv{0}{0.17} ((\X^k\bot) \top)$ is
satisfied. Note the reachability probability goes to $0$ as $k$ goes
to infinity. By repeating the computation, we verify that
the property $\bigwedge_{k \in \bbfZ^{\geq 0}} \dpriv{1}{0.74}
((\X^k\bot) \top)$ is satisfied for all neighbors.
Subsequently, the above threshold mechanism in
Algorithm~\ref{algorithm:online-model} is $(1, 0.74)$-differentially
private. Compared to the parameters for the neighbors $t_3r_5$ and
$\underline{t}_3\underline{r}_4$, the parameter $\delta$ appears to be
significantly large. It means that there are two neighbors with
drastically different output distributions from our mechanism.
Moreover, recall that
Proposition~\ref{proposition:sufficient-condition-mdp-model-checking}
is a sufficient condition. It only gives an upper bound of privacy
parameters. Tighter bounds may be computed by more sophisticated
sufficient conditions.