diff --git a/text/preliminaries/privacy.tex b/text/preliminaries/privacy.tex
index 9a4300e..95922f8 100644
--- a/text/preliminaries/privacy.tex
+++ b/text/preliminaries/privacy.tex
@@ -383,7 +383,7 @@ In the special case that we query disjoint data sets, we can take advantage of t
 When $m \in \mathbb{Z}^+$ independent privacy mechanisms, satisfying $\varepsilon_1$-, $\varepsilon_2$-,\dots, $\varepsilon_m$-differential privacy respectively, are applied over disjoint independent subsets of a data set, they provide a privacy guarantee equal to $\max_{i \in [1, m]} \varepsilon_i$.
 \end{theorem}
-When the users consider recent data releases more privacy sensitive than distant ones, we estimate the overall privacy loss in a time fading manner according to a temporal discounting function, e.g.,~exponential, hyperbolic,~\cite{farokhi2020temporally}.
+When the users consider recent data releases more privacy-sensitive than distant ones, we estimate the overall privacy loss in a time fading manner according to a temporal discounting function, e.g.,~exponential, hyperbolic,~\cite{farokhi2020temporally}.
 \begin{theorem}
 [Sequential composition with temporal discounting]
diff --git a/text/problem/main.tex b/text/problem/main.tex
index df35ccd..0d114a6 100644
--- a/text/problem/main.tex
+++ b/text/problem/main.tex
@@ -15,7 +15,7 @@ Example~\ref{ex:scenario} shows the result of user--LBS interaction while retrie
 Consider a finite sequence of spatiotemporal data generated by Bob during an interval of $8$ timestamps, as shown in Figure~\ref{fig:scenario}.
 Events in a shade correspond to privacy-sensitive
- \kat{You should not say that only significant events are privacy sensitive, because then why put noise to the normal timestamps? Maybe say directly significant for the shaded events?} events that Bob has defined beforehand. For instance, $p_1$ and $p_8$ are significant because he was at his home, which is around {\'E}lys{\'e}e, at $p_3$ he was at his workplace around the Louvre, and at $p_5$ he was at his hangout around Canal Saint-Martin.
+ \kat{You should not say that only significant events are privacy-sensitive, because then why put noise to the normal timestamps? Maybe say directly significant for the shaded events?} events that Bob has defined beforehand. For instance, $p_1$ and $p_8$ are significant because he was at his home, which is around {\'E}lys{\'e}e, at $p_3$ he was at his workplace around the Louvre, and at $p_5$ he was at his hangout around Canal Saint-Martin.
 \begin{figure}[htp]
 \centering
diff --git a/text/problem/theotherthing/main.tex b/text/problem/theotherthing/main.tex
index 4dcab94..df656d2 100644
--- a/text/problem/theotherthing/main.tex
+++ b/text/problem/theotherthing/main.tex
@@ -4,7 +4,7 @@
 In Section~\ref{sec:thething}, we introduced the notion of {\thething} events in privacy-preserving time series publishing.
 The differentiation among regular and {\thething} events stipulates a privacy budget allocation that deviates from the application of existing differential privacy protection levels.
 Based on this novel event categorization, we designed three models (Section~\ref{subsec:lmdk-mechs}) that achieve {\thething} privacy.
-For this, we assumed that the timestamps in the {\thething} set $L$ are not privacy sensitive, and therefore we used them in our models as they were.
+For this, we assumed that the timestamps in the {\thething} set $L$ are not privacy-sensitive, and therefore we used them in our models as they were.
 This may pose a direct or indirect privacy threat to the data generators (users).
 For the former, we consider the case where we desire to publish $L$ as complimentary information to the release of the event values.
diff --git a/text/problem/theotherthing/solution.tex b/text/problem/theotherthing/solution.tex
index 13e0138..5c84494 100644
--- a/text/problem/theotherthing/solution.tex
+++ b/text/problem/theotherthing/solution.tex
@@ -198,8 +198,8 @@ In the next step of the process, we randomly select a set by utilizing the expon
 For this procedure, we allocate a small fraction of the available privacy budget, i.e.,~$1$\% or even less (see Section~\ref{subsec:sel-eps} for more details).
-\paragraph{Score function}
-Prior to selecting a set, the exponential mechanism evaluates each set using a score function.
+\paragraph{Utility score function}
+Prior to selecting a set, the exponential mechanism evaluates each set using a utility score function.
 One way evaluate each set is by taking into account the temporal position the events in the sequence.
 % Nearby events
diff --git a/text/problem/thething/solution.tex b/text/problem/thething/solution.tex
index 9f7bf15..cbe7259 100644
--- a/text/problem/thething/solution.tex
+++ b/text/problem/thething/solution.tex
@@ -60,7 +60,7 @@ to the next timestamps.
 \subsubsection{{\Thething} privacy under temporal correlation}
-\label{subsec:lmdk-cor}
+\label{subsec:lmdk-tpl}
 From the discussion so far, it is evident that for the budget distribution it is not the positions but rather the number of the {\thethings} that matters.
 However, this is not the case under the presence of temporal correlation, which is inherent in continuously generated data.
diff --git a/text/related/statistical.tex b/text/related/statistical.tex
index cfe2856..864474f 100644
--- a/text/related/statistical.tex
+++ b/text/related/statistical.tex
@@ -238,7 +238,7 @@ However, the framework does not take into account privacy leakage stemming from
 % - perturbation (Laplace)
 \hypertarget{bolot2013private}{Bolot et al.}~\cite{bolot2013private} introduce the notion of \emph{decayed privacy} in continual observation of aggregates (sums).
 The authors recognize the fact that monitoring applications focus more on recent events, and data, therefore, the value of previous data releases exponentially fades.
-This leads to a schema of privacy with expiration, according to which, recent events, and data are more privacy sensitive than those preceding.
+This leads to a schema of privacy with expiration, according to which, recent events, and data are more privacy-sensitive than those preceding.
 Based on this, they apply decayed sum functions for answering sliding window queries of fixed window size $w$ on data streams.
 Namely, window sum compute the difference of two running sums, and exponentially decayed and polynomial decayed sums estimate the sum of decayed data.
 For every consecutive $w$ data points the algorithm generates binary trees where each node is perturbed with Laplace noise with scale proportional to $w$.
@@ -409,7 +409,7 @@ RPTR adapts the rate with which it samples data according to the accuracy with w
 Before releasing data statistics, the mechanism perturbs the original values with Laplacian noise the impact of which is mitigated by using Ensemble Kalman filtering.
 The combination of adaptive sampling and filtering can improve the accuracy when predicting the values of non-sampled data points, and thus saving more privacy budget (i.e.,~higher data utility) for data points that the mechanism decides to release.
 The mechanism detects highly frequented map regions and, using a quad-tree, it calculate the each region's privacy weight.
-In their implementation, the authors assume that highly frequented regions tend to be more privacy sensitive, and thus more noise (i.e.,~less privacy budget to invest) needs to be introduced before publicly releasing the users' data falling into these regions.
+In their implementation, the authors assume that highly frequented regions tend to be more privacy-sensitive, and thus more noise (i.e.,~less privacy budget to invest) needs to be introduced before publicly releasing the users' data falling into these regions.
 The efficiency (both in terms of user privacy and data utility) of the mechanism depends on the number of regions that it divides the map, and therefore the challenge of its optimal division is an interesting future research topic.
 % Temporally Discounted Differential Privacy for Evolving Datasets on an Infinite Horizon