diff --git a/text/preliminaries/privacy.tex b/text/preliminaries/privacy.tex index 56e3b25..5975e09 100644 --- a/text/preliminaries/privacy.tex +++ b/text/preliminaries/privacy.tex @@ -131,7 +131,12 @@ For completeness, in this section we present the seminal works for privacy-prese Sweeney coined \emph{$k$-anonymity}~\cite{sweeney2002k}, one of the first established works on data privacy. A released data set features $k$-anonymity protection when the sequence of values for a set of identifying attributes, called the \emph{quasi-identifiers}, is the same for at least $k$ records in the data set. Computing the quasi-identifiers in a set of attributes is still a hard problem on its own~\cite{motwani2007efficient}. -$k$-anonymity is syntactic\kat{meaning?}, it constitutes an individual indistinguishable from at least $k-1$ other individuals in the same data set.\kat{you just said this in another way,two sentences before} +% $k$-anonymity +% is syntactic, +% \kat{meaning?} +% it +% constitutes an individual indistinguishable from at least $k-1$ other individuals in the same data set. +% \kat{you just said this in another way,two sentences before} In a follow-up work~\cite{sweeney2002achieving}, the author describes a way to achieve $k$-anonymity for a data set by the suppression or generalization of certain values of the quasi-identifiers. Several works identified and addressed privacy concerns on $k$-anonymity. Machanavajjhala et al.~\cite{machanavajjhala2006diversity} pointed out that $k$-anonymity is vulnerable to homogeneity and background knowledge attacks. @@ -146,7 +151,8 @@ A data set features $\theta$-closeness when all of its groups satisfy $\theta$- The main drawback of $k$-anonymity (and its derivatives) is that it is not tolerant to external attacks of re-identification on the released data set. The problems identified in~\cite{sweeney2002k} appear when attempting to apply $k$-anonymity on continuous data publishing (as we will also see next in Section~\ref{sec:micro}). These attacks include multiple $k$-anonymous data set releases with the same record order, subsequent releases of a data set without taking into account previous $k$-anonymous releases, and tuple updates. -Proposed solutions include rearranging the attributes, setting the whole attribute set of previously released data sets as quasi-identifiers or releasing data based on previous $k$-anonymous releases.\kat{and the citations of these solutions?} +Proposed solutions include rearranging the attributes, setting the whole attribute set of previously released data sets as quasi-identifiers or releasing data based on previous $k$-anonymous releases~\cite{simi2017extensive}. +% \kat{and the citations of these solutions?} \subsubsection{Statistical data}