diff --git a/text/preliminaries/privacy.tex b/text/preliminaries/privacy.tex
index c7a15d2..e8403f7 100644
--- a/text/preliminaries/privacy.tex
+++ b/text/preliminaries/privacy.tex
@@ -91,19 +91,19 @@ Contrary to event-level, which provides privacy guarantees for a single event, u
 Event- and $w$-event-level handle better scenarios of infinite data observation, whereas user-level is more appropriate when the span of data observation is finite.
 $w$-event- is narrower than user-level protection due to its sliding window processing methodology.
 In the extreme cases where $w$ is equal either to $1$ or to the length of the time series, $w$-event- matches event- or user-level protection, respectively.
-Although the described levels have been coined in the context of \emph{differential privacy}~\cite{dwork2006calibrating}, a seminal privacy method that we will discuss in more detail in Section~\ref{subsec:prv-statistical}, they are also used for other privacy protection techniques as well.
+Although the described levels have been coined in the context of \emph{differential privacy}~\cite{dwork2006calibrating}, a seminal privacy method that we will discuss in more detail in Section~\ref{subsec:prv-statistical}, they are used for other privacy protection techniques as well.
 \subsection{Privacy-preserving operations}
 \label{subsec:prv-operations}
-Protecting private information, which is known by many names (obfuscation, cloaking, anonymization, etc.), is achieved by using a specific basic privacy protection operation.
-Depending on the intervention that we choose to perform on the original data, we identify the following operations:
+Protecting private information, which is known by many names (obfuscation, cloaking, anonymization, etc.\kat{the techniques are not equivalent, so it is correct to say that they are different names for the same thing}), is achieved by using a specific basic \kat{but later you mention several ones.. so what is the specific basic one ?}privacy protection operation.
+Depending on the intervention\kat{?, technique, algorithm, method, operation, intervention.. we are a little lost with the terminology and the difference among all these } that we choose to perform on the original data, we identify the following operations:\kat{you can mention that the different operations have different granularity}
 \begin{itemize}
- \item \emph{Aggregation}---group together multiple rows of a data set to form a single value.
- \item \emph{Generalization}---replace an attribute value with a parent value in the attribute taxonomy.
- Notice that a step of generalization, may be followed by a step of \emph{specialization}, to improve the quality of the resulting data set.
+ \item \emph{Aggregation}---group\kat{or combine? also maybe mention that the single value will replace the values of a specific attribute of these rows} together multiple rows of a data set to form a single value.
+ \item \emph{Generalization}---replace an attribute value with a parent value in the attribute taxonomy (when applicable).
+ Notice that a step of generalization, may be followed by a step of \emph{specialization}, to improve the quality of the resulting data set.\kat{This technical detail is not totally clear at this point. Either elaborate or remove.}
  \item \emph{Suppression}---delete completely certain sensitive values or entire records.
  \item \emph{Perturbation}---disturb the initial attribute value in a deterministic or probabilistic way.
  The probabilistic data distortion is referred to as \emph{randomization}.
@@ -114,9 +114,9 @@ If we want to protect the \emph{Age} of the user by aggregation, we may replace
 It is worth mentioning that there is a series of algorithms (e.g.,~\cite{benaloh2009patient, kamara2010cryptographic, cao2014privacy}) based on the \emph{cryptography} operation.
 However, the majority of these methods, among other assumptions that they make, have minimum or even no trust to the entities that handle the personal information.
-Furthermore, the amount and the way of data processing of these techniques usually burden the overall procedure, deteriorate the utility of the resulting data sets, and restrict their applicability.
+Furthermore, the amount and the way of data processing of these techniques usually burden the overall procedure, deteriorate the utility of the resulting data sets, and restrict their applicability.\kat{All these points apply also to the non-cryptography techniques. So you should mostly point out that they do not only deteriorate the utility but make them non-usable at all.}
 Our focus is limited to techniques that achieve a satisfying balance between both participants' privacy and data utility.
-For these reasons, there will be no further discussion around this family of techniques in this article.
+For these reasons, there will be no further discussion around this family of techniques in this article.\kat{sentence that fitted in the survey but not in the thesis so replace with a more pertinent comment}
 \subsection{Basic notions for privacy protection}