privacy: Reviewed subsec:prv-attacks

Manos Katsomallos 2021-08-04 00:24:24 +03:00
parent 061935b10c
commit bc313505a3


@@ -29,15 +29,14 @@ Information disclosure is typically achieved by combining supplementary (backgro
In its general form, this is known as \emph{adversarial} or \emph{linkage} attack.
Even though many works directly refer to the general category of linkage attacks, we distinguish also the following sub-categories, addressed in the literature:
-\paragraph{Sensitive attribute domain} knowledge.
-Here we can identify \emph{homogeneity and skewness} attacks~\cite{machanavajjhala2006diversity,li2007t}, when statistics of the sensitive attribute values are available, and \emph{similarity attack}, when semantics of the sensitive attribute values are available.
-\paragraph{Complementary release} attacks~\cite{sweeney2002k} with regard to previous releases of different versions of the same and/or related data sets.
-In this category, we also identify the \emph{unsorted matching} attack~\cite{sweeney2002k}, which is achieved when two privacy-protected versions of an original data set are published in the same tuple ordering.
-Other instances include: (i)~the \emph{join} attack~\cite{wang2006anonymizing}, when tuples can be identified by joining (on the (quasi-)identifiers) several releases, (ii)~the \emph{tuple correspondence} attack~\cite{fung2008anonymity}, when in case of incremental data certain tuples correspond to certain tuples in other releases, in an injective way, (iii)~the \emph{tuple equivalence} attack~\cite{he2011preventing}, when tuples among different releases are found to be equivalent with respect to the sensitive attribute, and (iv)~the \emph{unknown releases} attack~\cite{shmueli2015privacy}, when the privacy preservation is performed without knowing the previously privacy-protected data sets.
-\paragraph{Data dependence} either within one data set or among one data set and previous data releases, and/or other external sources~\cite{kifer2011no, chen2014correlated, liu2016dependence, zhao2017dependent}.
-We will look into this category in more detail later in Section~\ref{sec:correlation}.
+\begin{itemize}
+\item \emph{Sensitive attribute domain knowledge} can result in \emph{homogeneity and skewness} attacks~\cite{machanavajjhala2006diversity,li2007t}, when statistics of the sensitive attribute values are available, and \emph{similarity attack}, when semantics of the sensitive attribute values are available.
+\item \emph{Complementary release attacks}~\cite{sweeney2002k} with regard to previous releases of different versions of the same and/or related data sets.
+In this category, we also identify the \emph{unsorted matching} attack~\cite{sweeney2002k}, which is achieved when two privacy-protected versions of an original data set are published in the same tuple ordering.
+Other instances include: (i)~the \emph{join} attack~\cite{wang2006anonymizing}, when tuples can be identified by joining (on the (quasi-)identifiers) several releases, (ii)~the \emph{tuple correspondence} attack~\cite{fung2008anonymity}, when in case of incremental data certain tuples correspond to certain tuples in other releases, in an injective way, (iii)~the \emph{tuple equivalence} attack~\cite{he2011preventing}, when tuples among different releases are found to be equivalent with respect to the sensitive attribute, and (iv)~the \emph{unknown releases} attack~\cite{shmueli2015privacy}, when the privacy preservation is performed without knowing the previously privacy-protected data sets.
+\item \emph{Data dependence} either within one data set or among one data set and previous data releases, and/or other external sources~\cite{kifer2011no, chen2014correlated, liu2016dependence, zhao2017dependent}.
+We will look into this category in more detail later in Section~\ref{sec:correlation}.
+\end{itemize}
The first sub-category of attacks has been mainly addressed in works on snapshot microdata publishing, and is still present in continuous publishing; however, algorithms for continuous publishing typically accept the proposed solutions for the snapshot publishing scheme (see discussion over $k$-anonymity and $l$-diversity in Section~\ref{subsec:prv-seminal}).
This kind of attacks is tightly coupled with publishing the (privacy-protected) sensitive attribute value.
@@ -49,7 +48,6 @@ By the data dependence attack, the status of Donald could be more certainly infe
In order to better protect the privacy of Donald in case of attacks, the data should be privacy-protected in a more adequate way (than without the attacks).
\subsection{Levels of privacy protection}
\label{subsec:prv-levels}