preliminaries: Added figures for the mechanisms

Manos Katsomallos 2022-01-07 05:56:14 +01:00
parent bcaa417911
commit e66c59fa7a
5 changed files with 36 additions and 11 deletions

Binary file not shown.


@@ -312,42 +312,67 @@ When there is independence between all the records in the original data set, the
\label{subsec:prv-mech}
A typical example of a differential privacy mechanism is the \emph{Laplace mechanism}~\cite{dwork2014algorithmic}.
It randomly draws a value from the probability distribution $\textrm{Laplace}(\mu, b)$, where $\mu$ stands for the location parameter and $b > 0$ is the scale parameter (Figure~\ref{fig:laplace}).
In our case, $\mu$ is equal to the original output value of a query function, and $b$ is the sensitivity of the query function divided by $\varepsilon$.
It randomly draws a value from the probability distribution $\textrm{Laplace}(\mu, b)$, where $\mu$ stands for the location parameter and $b > 0$ is the scale parameter (Figure~\ref{fig:mech-lap}).
In our case, $\mu$ is equal to the original output value of a query function, and $b$ is the sensitivity of the query function divided by the privacy budget $\varepsilon$.
The Laplace mechanism works for any function whose range is the set of real numbers.
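As a minimal sketch of the parameterization above, the density of the Laplace distribution can be written as follows; the symbols $f$ (query function), $x$ (its input), $\Delta f$ (its sensitivity), and $y$ (the drawn value) are notation assumed here only for illustration:
\[
	\Pr[y \mid \mu, b] = \frac{1}{2b} \exp\left(-\frac{|y - \mu|}{b}\right), \quad \mu = f(x), \quad b = \frac{\Delta f}{\varepsilon}.
\]
For instance, a counting query with $\Delta f = 1$ and $\varepsilon = 0.5$ yields $b = 2$; halving $\varepsilon$ doubles the scale, and thus the expected magnitude of the added noise.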
\begin{figure}[htp]
\centering
\includegraphics[width=.5\linewidth]{preliminaries/laplace}
\caption{A Laplace distribution for location $\mu = 2$ and scale $b = 1$.}
\label{fig:laplace}
\includegraphics[width=.5\linewidth]{preliminaries/mech-lap}
\caption{A Laplace distribution for location $\mu = 0$ and different scale values $b$.}
\label{fig:mech-lap}
\end{figure}
A specialization of this mechanism for location data is the \emph{Planar Laplace mechanism}~\cite{andres2013geo,chatzikokolakis2015geo},
% which is based on a multivariate Laplace distribution.
% \emph{Geo-indistinguishability} is
an adaptation of differential privacy for location data in snapshot publishing.
It is based on $l$-privacy, which offers individuals within an area of radius $r$ a privacy level of $l$.
an adaptation of differential privacy for location data in snapshot publishing (\emph{Geo-indistinguishability}).
It is based on $l$-privacy, which offers individuals within an area of radius $r$ a privacy level of $l$ (Figure~\ref{fig:mech-planar-lap}).
More specifically, $l$ is equal to $\varepsilon r$ if any two locations within distance $r$ provide data with similar distributions.
This similarity depends on $r$ because the closer two locations are, the more likely they are to share the same features.
Intuitively, the definition implies that if an adversary learns the published location for an individual, the adversary cannot infer the individual's true location, out of all the points in a radius $r$, with a certainty higher than a factor depending on $l$.
The technique adds random noise drawn from a multivariate Laplace distribution to individuals' locations, while taking into account spatial boundaries and features.
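One way to make this guarantee concrete, following the formulation in~\cite{andres2013geo}, is sketched below; the symbols $M$ (mechanism), $x$ and $x'$ (true locations), $z$ (reported location), $Z$ (a set of reported locations), and $d$ (Euclidean distance) are notation assumed here for illustration:
\[
	\Pr[M(x) \in Z] \leq e^{\varepsilon d(x, x')} \Pr[M(x') \in Z].
\]
For any $x'$ with $d(x, x') \leq r$, the factor is at most $e^{\varepsilon r} = e^{l}$.
In its basic form, i.e.,~before accounting for spatial boundaries, the planar Laplace density of reporting $z$ when the true location is $x$ is
\[
	\frac{\varepsilon^2}{2\pi} e^{-\varepsilon d(x, z)}.
\]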
For query functions that do not return a real number, e.g.,~`What is the most visited country this year?', or in cases where perturbing the value of the output would completely destroy its utility, e.g.,~`What is the optimal price for this auction?', most works in the literature use the \emph{Exponential mechanism}~\cite{mcsherry2007mechanism}.
This mechanism utilizes a utility function $u$ that maps (input data set $D$, output value $r$) pairs to utility scores, and selects an output value $r$ from the input pairs with probability proportional to $\exp(\frac{\varepsilon u(D, r)}{2\Delta u})$.
$\Delta u$ is the sensitivity of the utility
\begin{figure}[htp]
\centering
\includegraphics[width=.5\linewidth]{preliminaries/mech-planar-lap}
\caption{Geo-indistinguishability: privacy level $l$ varying with the protection radius $r$.}
\label{fig:mech-planar-lap}
\end{figure}
For query functions that do not return a real number, e.g.,~`What is the most visited country this year?', or in cases where perturbing the value of the output would completely destroy its utility, e.g.,~`How many patients in the ICU?', most works in the literature use the \emph{Exponential mechanism}~\cite{mcsherry2007mechanism}.
Initially, a utility function $u$, with sensitivity $\Delta u$, maps pairs of the input value $x$ and output value $r$ to utility scores.
Thereafter, the mechanism $M$ selects an output value $r$ from a set of possible outputs $R$ with probability proportional to $\exp(\frac{\varepsilon u(x, r)}{2\Delta u})$ (Figure~\ref{fig:mech-exp}).
% $\Delta u$is the sensitivity of the utility
% \kat{what is the utility function?}
% \mk{Already explained}
function.
% function.
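As a hedged sketch, and assuming that the set of possible outputs $R$ is finite, normalizing the weights above gives the selection probability explicitly:
\[
	\Pr[M(x) = r] = \frac{\exp\left(\frac{\varepsilon u(x, r)}{2\Delta u}\right)}{\sum_{r' \in R} \exp\left(\frac{\varepsilon u(x, r')}{2\Delta u}\right)}.
\]
For example, if the utility scores of two candidate outputs differ by exactly $\Delta u$, the higher-scoring one is selected $e^{\varepsilon / 2}$ times more often than the other.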
\begin{figure}[htp]
\centering
\includegraphics[width=.5\linewidth]{preliminaries/mech-exp}
\caption{The internal mechanics of the exponential mechanism.}
\label{fig:mech-exp}
\end{figure}
Another technique for differential privacy mechanisms is the \emph{randomized response}~\cite{warner1965randomized}.
It is a privacy-preserving survey method that introduces probabilistic noise to the statistics of a study by randomly instructing respondents to answer either truthfully or `Yes' to a sensitive, binary question.
The technique achieves this randomization by including a random event, e.g.,~the flip of a fair coin.
The respondents reveal to the interviewers only their answer to the question, and keep the result of the random event secret (i.e.,~whether the coin landed tails or heads).
Thereafter, the interviewers can calculate the probability distribution of the random event, e.g.,~$\frac{1}{2}$ heads and $\frac{1}{2}$ tails, and thus they can roughly eliminate the false responses and estimate the final result of the study.
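The estimation step can be sketched as follows, under the assumption that one side of the fair coin instructs a truthful answer and the other a forced `Yes'; the symbols $\pi$ (true proportion of `Yes' answers) and $\lambda$ (observed proportion of `Yes' answers) are notation introduced here only for illustration:
\[
	\lambda = \frac{1}{2} \pi + \frac{1}{2} \quad \Longrightarrow \quad \hat{\pi} = 2 \hat{\lambda} - 1.
\]
For instance, if $60\%$ of the respondents answer `Yes', the estimated true proportion is $2 \cdot 0.6 - 1 = 0.2$.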
Based on this methodology, the \emph{Random response} mechanism~\cite{wang2010privacy} returns either the true or the flipped answer value $x$, with a probability $p$ that depends on the privacy budget $\varepsilon$ (Figure~\ref{fig:mech-rnd-resp}).
% $\frac{e^\varepsilon}{1 + e^\varepsilon}$
% \kat{is the following two paragraphs still part of the examples of privacy mechanisms? I am little confused here.. if the section is not only for examples, then you should introduce it somehow (and not start directly by saying 'A typical example...')}
\begin{figure}[htp]
\centering
\includegraphics[width=.3\linewidth]{preliminaries/mech-rnd-resp}
\caption{The internal mechanics of the random response mechanism.}
\label{fig:mech-rnd-resp}
\end{figure}
A special category of differential privacy-preserving
% algorithms
% \kat{algorithms? why not mechanisms ?}