diff --git a/graphics/problem/lmdk-scenario.pdf b/graphics/problem/lmdk-scenario.pdf
index 804b4a1..1d1975d 100644
Binary files a/graphics/problem/lmdk-scenario.pdf and b/graphics/problem/lmdk-scenario.pdf differ
diff --git a/text/bibliography.bib b/text/bibliography.bib
index 363dd8d..50db20d 100644
--- a/text/bibliography.bib
+++ b/text/bibliography.bib
@@ -719,6 +719,17 @@
   publisher = {ACM}
 }
 
+@article{gaskell2000telescoping,
+  title = {Telescoping of landmark events: implications for survey research},
+  author = {Gaskell, George D and Wright, Daniel B and O'Muircheartaigh, Colm A},
+  journal = {The Public Opinion Quarterly},
+  volume = {64},
+  number = {1},
+  pages = {77--89},
+  year = {2000},
+  publisher = {JSTOR}
+}
+
 @article{gedik2008protecting,
   title = {Protecting location privacy with personalized k-anonymity: Architecture and algorithms},
   author = {Gedik, Bugra and Liu, Ling},
diff --git a/text/problem/main.tex b/text/problem/main.tex
index fa4c12b..caf9134 100644
--- a/text/problem/main.tex
+++ b/text/problem/main.tex
@@ -5,43 +5,47 @@
 % Crowdsensing applications
 The plethora of sensors currently embedded in personal devices and other infrastructures have paved the way for the development of numerous \emph{crowdsensing services} (e.g.,~Ring~\cite{ring}, TousAntiCovid~\cite{tousanticovid}, Waze~\cite{waze}, etc.) based on the collected personal, and usually geotagged and timestamped data.
 % Continuously user-generated data
-User--service interactions gather personal event-like data that are data items comprised by pairs of an identifying attribute of an individual and the---possibly sensitive---information at a timestamp (including contextual information), e.g.,~(\emph{`Bob', `dining', `Canal Saint-Martin', $17{:}00$}).
-%For a reminder, when the interactions are performed in a continuous manner, we obtain time series of events.
-% Observation/interaction duration
-%Depending on the duration, we distinguish the interaction/observation into finite, when taking place during a predefined time interval, and infinite, when taking place in an uninterrupted fashion.
-Example~\ref{ex:scenario} shows the result of user--LBS interaction while retrieving location-based information or reporting user-state at various locations.
-
+User--service interactions gather personal event-like data, which are tuples of an identifying attribute of an individual and the---possibly sensitive---information with a timestamp
+%(including contextual information),
+e.g.,~(\emph{`Bob', `dining', `Canal Saint-Martin', $17{:}00$}).
+When the interactions are performed in a continuous manner, we obtain ~\emph{time series} of events.
+Example~\ref{ex:scenario} is an example of a user--service interaction that results in retrieving location-based information or reporting user-state at various locations.
+
 \begin{example}
   \label{ex:scenario}
-
-  Consider a finite sequence of spatiotemporal data generated by Bob during an interval of $8$ timestamps, as shown in Figure~\ref{fig:scenario}.
-  Events in a shade correspond to privacy-sensitive
-  \kat{You should not say that only significant events are privacy-sensitive, because then why put noise to the normal timestamps? Maybe say directly significant for the shaded events?} events that Bob has defined beforehand. For instance, $p_1$ and $p_8$ are significant because he was at his home, which is around {\'E}lys{\'e}e, at $p_3$ he was at his workplace around the Louvre, and at $p_5$ he was at his hangout around Canal Saint-Martin.
-
+  Figure~\ref{fig:lmdk-scenario} shows a finite sequence of spatiotemporal data, generated by Bob, during an interval of $8$ timestamps.
+  Events in gray correspond to
+  % privacy-sensitive
+  % \kat{You should not say that only significant events are privacy-sensitive, because then why put noise to the normal timestamps? Maybe say directly significant for the shaded events?}
+  significant
+  events that Bob has defined beforehand, because they are related to his home (around {\'E}lys{\'e}e), his workplace (around the Louvre), and his hangout (around Canal Saint-Martin).
   \begin{figure}[htp]
     \centering
     \includegraphics[width=\linewidth]{problem/lmdk-scenario}
-    \caption{A time series with {\thethings} (highlighted in gray).
-    }
-    \label{fig:scenario}
+    \caption{A time series with {\thethings} (highlighted in gray).}
+    \label{fig:lmdk-scenario}
   \end{figure}
-
 \end{example}
 
 % Privacy-preserving data processing
-The services collect and further process the time series in order to give useful feedback to the involved users or to provide valuable insight to various internal/external analytical services.
+% Services collect and further process the time series in order to give useful feedback to the involved users or to provide valuable insight to various internal/external analytical services.
 The regulation regarding the processing of user-generated data sets~\cite{tankard2016gdpr} requires the provision of privacy guarantees to the users.
-At the same time, it is essential to provide utility metrics to the final consumers of the privacy-preserving process output.
-To accomplish this, various privacy techniques perturb the original data or the processing output at the expense of the overall utility of the final output.
-A widely recognized tool that introduces probabilistic randomness to the original data, while quantifying with a parameter $\varepsilon$ (`privacy budget'~\cite{mcsherry2009privacy}) the privacy/utility ratio is \emph{$\varepsilon$-differential privacy}~\cite{dwork2006calibrating}.
+To accomplish this, various privacy techniques perturb the original data or their statistical output at the expense of the overall utility of the final output.
+Meanwhile, it is essential to provide data of high utility
+%\kat{why metrics and not say to provide high utility? we do not define in this work new metrics..}
+to the final consumers of the privacy-preserving process.
+A widely recognized method that introduces probabilistic randomness to the original data, while quantifying with a parameter $\varepsilon$ (`privacy budget'~\cite{mcsherry2009privacy}) the privacy/utility ratio, is \emph{$\varepsilon$-differential privacy}~\cite{dwork2006calibrating}.
 Due to its \emph{composition} property, i.e.,~the combination of differentially private outputs satisfies differential privacy as well, differential privacy is suitable for privacy-preserving time series publishing.
-\emph{Event}, \emph{user}~\cite{dwork2010differential, dwork2010pan}, and \emph{$w$-event}~\cite{kellaris2014differentially} comprise the possible levels of privacy protection.
-Event-level limits the privacy protection to \emph{any single event}, user-level protects \emph{all the events} of any user, and $w$-event provides privacy protection to \emph{any sequence of $w$ events}.
-\kat{Please write another introduction for your chapter, that is in connection to your thesis, not the paper.. all this information in this paragraph must be said in the introduction of the thesis, not of the chapter.. }
+\emph{Event}, \emph{user}~\cite{dwork2010differential}, and \emph{$w$-event}~\cite{kellaris2014differentially} comprise the possible levels of privacy protection.
+Event-level limits the protection to \emph{any single event}, user-level protects \emph{all the events} of any user, and $w$-event provides protection to \emph{any sequence of $w$ events}.
+In every case, privacy protection boils down to allocating to events an overall privacy budget that does not exceed $\varepsilon$.
+% \kat{Please write another introduction for your chapter, that is in connection to your thesis, not the paper.. all this information in this paragraph must be said in the introduction of the thesis, not of the chapter.. }
+% \mk{I cannot think of something better rn}
 
 In this chapter, we propose a novel configurable privacy scheme, \emph{\thething} privacy (Section~\ref{sec:thething}), which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly.
 We propose three privacy schemes that guarantee {\thething} privacy.
-To further enhance our privacy methodology, and protect the {\thethings} position in the time series, we propose techniques to perturb the initial {\thethings} set (Section~\ref{sec:theotherthing}).\kat{this is the content that you must enrich and motivate more in the intro of this chapter}
+To further enhance our privacy methodology, and protect the {\thethings} position in the time series, we propose techniques to perturb the initial {\thethings} set (Section~\ref{sec:theotherthing}).
+% \kat{this is the content that you must enrich and motivate more in the intro of this chapter}
 
 \input{problem/thething/main}
 \input{problem/theotherthing/main}