privacy: Reviewed subsec:prv-levels

Manos Katsomallos 2021-09-03 15:02:57 +03:00
parent 64eb5ef4a8
commit 744bed7ac1


@@ -51,8 +51,11 @@ In order to better protect the privacy of Donald in case of attacks, the data sh
 \subsection{Levels of privacy protection}
 \label{subsec:prv-levels}
-The information disclosure that a data release may entail is linked to the protection level that indicates \emph{what} a privacy-preserving algorithm is trying to achieve.\kat{I don't understand this first sentence}
-More specifically, in continuous data publishing we consider the privacy protection level with respect to not only the users, but also to the \emph{events} occurring in the data.
+% The information disclosure that a data release may entail is linked to the protection level that indicates \emph{what} a privacy-preserving algorithm is trying to achieve.
+% \kat{I don't understand this first sentence}
+% \mk{Same here...}
+% More specifically, i
+In continuous data publishing we consider the privacy protection level with respect to not only the users, but also to the \emph{events} occurring in the data.
 An event is a pair of an identifying attribute of an individual and the sensitive data (including contextual information) and we can see it as a correspondence to a record in a database, where each individual may participate once.
 Data publishers typically release events in the form of sequences of data items, usually indexed in time order (time series) and geotagged, e.g.,~(`Dewey', `at home at Montmartre at $t_1$'), \dots, (`Quackmore', `dining at Opera at $t_1$').
 We use the term `users' to refer to the \emph{individuals}, also known as \emph{participants}, who are the source of the processed and published data.
@@ -61,9 +64,13 @@ Users are subject to privacy attacks, and thus are the main point of interest of
 In more detail, the privacy protection levels are:
 \begin{enumerate}[(a)]
-\item \emph{Event}~\cite{dwork2010differential, dwork2010pan}---limits the privacy protection to \emph{any single event} in a time series, providing maximum \kat{maximum? better say high} data utility.
+\item \emph{Event}~\cite{dwork2010differential, dwork2010pan}---limits the privacy protection to \emph{any single event} in a time series, providing high
+% \kat{maximum? better say high}
+data utility.
 \item \emph{$w$-event}~\cite{kellaris2014differentially}---provides privacy protection to \emph{any sequence of $w$ events} in a time series.
-\item \emph{User}~\cite{dwork2010differential, dwork2010pan}---protects \emph{all the events} in a time series, providing maximum\kat{maximum? better say high} privacy protection.
+\item \emph{User}~\cite{dwork2010differential, dwork2010pan}---protects \emph{all the events} in a time series, providing high
+% \kat{maximum? better say high}
+privacy protection.
 \end{enumerate}
 Figure~\ref{fig:prv-levels} demonstrates the application of the possible protection levels on the statistical data of Example~\ref{ex:continuous}.
@@ -71,6 +78,7 @@ For instance, in event-level (Figure~\ref{fig:level-event}) it is hard to determ
 Moreover, in user-level (Figure~\ref{fig:level-user}) it is hard to determine whether Quackmore was ever included in the released series of events at all.
 Finally, in $2$-event-level (Figure~\ref{fig:level-w-event}) it is hard to determine whether Quackmore was ever included in the released series of events between the timestamps $t_1$ and $t_2$, $t_2$ and $t_3$, etc. (i.e.,~for a window $w = 2$).
 \kat{Already, by looking at the original counts, for the reader it is hard to see if Quackmore was in the event/database. So, we don't really get the difference among the different levels here.}
+\mk{It is without background knowledge.}
 \begin{figure}[htp]
 \centering
@@ -83,7 +91,10 @@ Finally, in $2$-event-level (Figure~\ref{fig:level-w-event}) it is hard to deter
 \subcaptionbox{$2$-event-level\label{fig:level-w-event}}{%
 \includegraphics[width=.32\linewidth]{level-w-event}%
 }\hspace{\fill}
-\caption{Protecting the data of Table~\ref{tab:continuous-statistical} on (a)~event-, (b)~user-, and (c)~$2$-event-level. A suitable distortion method can be applied accordingly. \kat{Why don't you distort the results already in this table?}}
+\caption{Protecting the data of Table~\ref{tab:continuous-statistical} on (a)~event-, (b)~user-, and (c)~$2$-event-level. A suitable distortion method can be applied accordingly.
+% \kat{Why don't you distort the results already in this table?}
+% \mk{Because we've not discussed yet about these operations.}
+}
 \label{fig:prv-levels}
 \end{figure}
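
Aside, not part of the commit: the three protection levels in subsec:prv-levels could be made concrete through the neighboring-series relation commonly used in the differential privacy literature that the enumerate cites. The fragment below is only a sketch under that assumption. The symbols $S$, $S'$, $S_t$, and $T$ are hypothetical notation introduced here for illustration and do not appear in the reviewed text.

```latex
% Sketch only. Assumed notation (not from the commit):
% S = (S_1, \dots, S_T) and S' = (S'_1, \dots, S'_T) are two time series
% over the same T timestamps; S_t is the data released at timestamp t.
Two time series $S$ and $S'$ can be called neighboring at:
\begin{itemize}
  \item \emph{event-level}, if they differ in at most one event,
        i.e.,~at a single timestamp $t$;
  \item \emph{$w$-event-level}, if all their differences fall within one
        window of $w$ consecutive timestamps, i.e.,~whenever
        $S_{t_1} \neq S'_{t_1}$ and $S_{t_2} \neq S'_{t_2}$ with
        $t_1 \leq t_2$, it holds that $t_2 - t_1 + 1 \leq w$;
  \item \emph{user-level}, if they differ in any number of events of a
        single user, at any of the $T$ timestamps.
\end{itemize}
```

Under this reading, $w = 1$ corresponds to the event level, and for a finite series a window spanning all $T$ timestamps roughly approaches the user level, which matches the ordering from high data utility to high privacy protection in the enumerate.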