diff --git a/text/problem/theotherthing/solution.tex b/text/problem/theotherthing/solution.tex
index 842ac87..7a7bfa3 100644
--- a/text/problem/theotherthing/solution.tex
+++ b/text/problem/theotherthing/solution.tex
@@ -1,25 +1,22 @@
 \subsection{Protecting {\thethings}}
 \label{subsec:lmdk-sel-sol}
 
+The main idea of the privacy-preserving {\thething} selection component is to privately select extra {\thething} event timestamps, i.e.,~dummy {\thethings}, from the set of timestamps $T \setminus L$ of the time series $S_T$ and add them to the original {\thething} set $L$.
+Thus, we create a new set $L'$ such that $L \subset L' \subseteq T$.
+We generate a set of dummy {\thething} set options by adding regular event timestamps from $T \setminus L$ to $L$ (Section~\ref{subsec:lmdk-set-opts}).
+Then (Section~\ref{subsec:lmdk-opt-sel}), we utilize the exponential mechanism, with a utility function that calculates an indicator for each of the options in the set based on how much it differs from the original {\thething} set $L$, and randomly select one of the options that we created earlier.
+This process provides an extra layer of privacy protection to {\thethings}, and thus allows the release, and thereafter processing, of {\thething} timestamps.
+
+% We utilize the exponential mechanism with a utility function that calculates an indicator for each of the options in the set that we selected in the previous step.
+% The utility depends on the positioning of the {\thething} timestamps of an option in the series, e.g.,~the distance from the previous/next {\thething}, the distance from the start/end of the series, etc.
+
+
 \subsubsection{{\Thething} set options}
 \label{subsec:lmdk-set-opts}
 
 This step aims to select a set of candidate {\thething} timestamps options either by randomizing the actual timestamps (Section~\ref{subsec:lmdk-rnd}), or by inserting dummy timestamps (Section~\ref{subsec:lmdk-dum-gen}) to the actual {\thething} timestamps.
 
 
-\paragraph{{\Thething} randomization}
-\label{subsec:lmdk-rnd}
-
-A simple way to select a set of timestamps without disclosing the actual {\thethings} is by \emph{randomly} selecting an equally sized set of timestamps.
-The randomization of the process, as we will discuss in more detail in Section~\ref{subsec:priv-opt-sel}, will depend on the positioning of the {\thethings} in the series of events.
-In more detail, given a set of {\thething} timestamps $\{l_k\} \subseteq \{t_n\}$, where $\{t_n\}$ is an event sequence, we need to select all possible sets of size $k$ from $\{t_n\}$.
-
-However, the introduction of randomization could impact arbitrarily the effectiveness of non-uniform privacy-protection methods.
-This applies mainly in cases where we try to achieve optimal privacy-protection of {\thething} events while maximizing the utility of the data that corresponds to the rest of the series of events.
-As a consequence, it is possible to end up providing lower levels of protection to {\thething} data than the one necessary, i.e.,~worse than the users' privacy-protection expectations.
-The methodology that we present next (Section~\ref{subsec:lmdk-dum-gen}) attempts to tackle the aforementioned shortcoming.
-
-
 \paragraph{Dummy {\thething} generation}
 \label{subsec:lmdk-dum-gen}
 
@@ -138,7 +135,7 @@ Note that the reverse heuristic approach, i.e.,~starting with $\{t_n\}$ {\thethi
 
 
 \subsubsection{Privacy-preserving option selection}
-\label{subsec:priv-opt-sel}
+\label{subsec:lmdk-opt-sel}
 
 % Nearby events
 Events that occur at recent timestamps are more likely to reveal sensitive information regarding the users involved~\cite{kellaris2014differentially}.