\chapter{{\Thething} privacy}
\label{ch:lmdk-prv}
\nnfootnote{This chapter will appear in the proceedings of the $12$th ACM conference on Data and Application Security and Privacy~\cite{katsomallos2022landmark}.}
% Crowdsensing applications
The plethora of sensors currently embedded in personal devices and other infrastructures has paved the way for the development of numerous \emph{crowdsensing services} (e.g.,~Ring~\cite{ring}, TousAntiCovid~\cite{tousanticovid}, and Waze~\cite{waze}) based on the collected personal data, which are usually geotagged and timestamped.
% Continuously user-generated data
User--service interactions gather personal event-like data, which are tuples of an identifying attribute of an individual and the---possibly sensitive---information with a timestamp
%(including contextual information),
e.g.,~(\emph{`Quackmore', `dining', `Canal Saint-Martin', $17{:}00$}).
When the interactions are performed in a continuous manner, we obtain \emph{time series} of events.
Example~\ref{ex:scenario} illustrates a user--service interaction that results in retrieving location-based information or reporting the user's state at various locations.
\begin{example}
\label{ex:scenario}
Figure~\ref{fig:lmdk-scenario} shows a finite sequence of spatiotemporal data, generated by Quackmore, during an interval of $8$ timestamps.
Events in gray correspond to
% privacy-sensitive
% \kat{You should not say that only significant events are privacy-sensitive, because then why put noise to the normal timestamps? Maybe say directly significant for the shaded events?}
significant
events that Quackmore has defined beforehand, because they are related to his home (around {\'E}lys{\'e}e), his workplace (around the Louvre), and his hangout (around Canal Saint-Martin).
\begin{figure}[htp]
\centering
\includegraphics[width=\linewidth]{problem/lmdk-scenario}
\caption{A time series with {\thethings} (highlighted in gray).}
\label{fig:lmdk-scenario}
\end{figure}
\end{example}
% Privacy-preserving data processing
% Services collect and further process the time series in order to give useful feedback to the involved users or to provide valuable insight to various internal/external analytical services.
The regulation regarding the processing of user-generated data sets~\cite{tankard2016gdpr} requires the provision of privacy guarantees to the users.
To accomplish this, various privacy techniques perturb the original data or their statistical output at the expense of the overall utility of the final output.
Meanwhile, it is essential to provide data of high utility
%\kat{why metrics and not say to provide high utility? we do not define in this work new metrics..}
to the final consumers of the privacy-preserving process.
A widely recognized method that introduces probabilistic randomness to the original data, while quantifying the privacy--utility trade-off with a parameter $\varepsilon$ (`privacy budget'~\cite{mcsherry2009privacy}), is \emph{$\varepsilon$-differential privacy}~\cite{dwork2006calibrating}.
Due to its \emph{composition} property, i.e.,~the combination of differentially private outputs satisfies differential privacy as well, differential privacy is suitable for privacy-preserving time series publishing.
\emph{Event}, \emph{user}~\cite{dwork2010differential}, and \emph{$w$-event}~\cite{kellaris2014differentially} comprise the possible levels of privacy protection.
Event-level privacy limits the protection to \emph{any single event}, user-level privacy protects \emph{all the events} of any user, and $w$-event privacy protects \emph{any sequence of $w$ events}.
In every case, privacy protection boils down to allocating to events an overall privacy budget that does not exceed $\varepsilon$.
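As a sketch of this budget allocation, assume a finite time series of $T$ timestamps in which the mechanism releasing the event at timestamp $t$ spends a budget $\varepsilon_t \geq 0$ (the symbols $T$ and $\varepsilon_t$ are introduced here only for illustration).
Under sequential composition, the three levels then correspond to the following constraints:
\[
  \mbox{event-level:} \quad \varepsilon_t \leq \varepsilon \quad \mbox{for every } t \in \{1, \dots, T\},
\]
\[
  \mbox{user-level:} \quad \sum_{t = 1}^{T} \varepsilon_t \leq \varepsilon,
\]
\[
  \mbox{$w$-event:} \quad \sum_{t = i}^{i + w - 1} \varepsilon_t \leq \varepsilon \quad \mbox{for every } i \in \{1, \dots, T - w + 1\}.
\]
For instance, with $T = 8$ as in Example~\ref{ex:scenario} and a uniform allocation, user-level protection leaves $\varepsilon_t = \varepsilon / 8$ per timestamp, whereas $w$-event protection with $w = 4$ (a hypothetical window chosen for illustration) allows $\varepsilon_t = \varepsilon / 4$.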
% \kat{Please write another introduction for your chapter, that is in connection to your thesis, not the paper.. all this information in this paragraph must be said in the introduction of the thesis, not of the chapter.. }
% \mk{I cannot think of something better rn}
In this chapter, we propose a novel configurable privacy scheme, \emph{\thething} privacy (Section~\ref{sec:thething}), which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly.
We propose three privacy schemes that guarantee {\thething} privacy.
To further enhance our privacy methodology and protect the position of the {\thethings} in the time series, we propose techniques to perturb the initial set of {\thethings} (Section~\ref{sec:theotherthing}).
% \kat{this is the content that you must enrich and motivate more in the intro of this chapter}
\input{problem/thething/main}
\input{problem/theotherthing/main}
\input{problem/summary}