\section{Thesis summary}
\label{sec:sum-thesis}
This thesis addresses quality and privacy in user-generated Big Data, focusing on the problems of privacy-preserving continuous data publishing that we summarize below.
\paragraph{Survey on continuous data publishing}
We reviewed the literature of the past two decades on methods for privacy-preserving continuous data publishing, while elaborating on data correlation. Our contributions are:
\begin{itemize}
\item We categorized the reviewed works by their input data into either \emph{microdata} or \emph{statistical data}, and further divided each data category by its observation span into \emph{finite} and \emph{infinite}.
\item We identified the privacy protection algorithms and techniques that each work uses, focusing on features such as the privacy method, operation, attack, and protection level.
\item We organized the reviewed literature in tabular form, using a number of relevant features, to allow for more efficient indexing of the related works.
\end{itemize}
% \kat{mention here again that the work appears in the article... in the journal...}
This work appeared in the special feature on Geospatial Privacy and Security of the $19$th issue of the Journal of Spatial Information Science~\cite{katsomallos2019privacy}.
\paragraph{Configurable privacy protection for time series}
We presented ($\varepsilon$, $L$)-\emph{{\thething} privacy}, a novel privacy notion based on differential privacy that allows for better data utility in the presence of significant events. Our contributions are:
\begin{itemize}
\item We introduced the notion of \emph{{\thething} events} in privacy-preserving data publishing and distinguished between regular events and events that a user might consider more privacy-sensitive (\emph{\thethings}).
% \item We proposed and formally defined a novel privacy notion, ($\varepsilon$, $L$)-\emph{{\thething} privacy}.
\item We designed and implemented three {\thething} privacy schemes for {\thethings} spanning a finite time series.
\item We studied {\thething} privacy under temporal correlation, which is inherent in time series, and observed the effect of {\thethings} on the temporal privacy loss propagation.
\item We designed an additional differential privacy mechanism, based on the exponential mechanism, that protects
% additional
the temporal position of the {\thethings}
% \kat{what is the name of the mechanism? how do you quantify 'additional' ?}
by generating dummy {\thething} set options.
\item We experimentally evaluated our proposal on real and synthetic data sets and compared the {\thething} privacy schemes with event- and user-level privacy protection for different {\thething} percentages.
% \kat{what are the conclusions that show the quality/benefits of the proposed solution?}
We showed that our methodology can provide adequate differential privacy guarantees while achieving better data utility than the user-level scheme.
\end{itemize}
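
For reference, the standard exponential mechanism on which the additional mechanism above builds can be sketched as follows. The symbols $x$, $\mathcal{R}$, $u$, and $\Delta u$ are generic notation assumed here for illustration and do not necessarily match the notation used elsewhere in this thesis. Given an input $x$, a set of candidate outputs $\mathcal{R}$, and a utility function $u$ with sensitivity $\Delta u$, the mechanism selects an output $r \in \mathcal{R}$ with probability
\[
  \Pr[r] \propto \exp\left(\frac{\varepsilon \, u(x, r)}{2 \Delta u}\right),
\]
which satisfies $\varepsilon$-differential privacy. In our setting, the candidates would correspond to the dummy {\thething} set options; the concrete utility function is not restated in this summary.
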
% \kat{mention here again that the work appears in the article... submitted at...}
This work will appear in the proceedings of the $12$th ACM Conference on Data and Application Security and Privacy~\cite{katsomallos2022landmark}.