related: Intro and shapes

Manos Katsomallos 2021-10-22 17:10:34 +02:00
parent ebe481d089
commit eb51e54d4c
8 changed files with 21 additions and 37 deletions

Binary file not shown.


@ -1,47 +1,24 @@
\chapter{Related work} \chapter{Related work}
\label{ch:rel} \label{ch:rel}
% \kat{Change the way you introduce the related work chapter; do not list a series of surveys. You should speak about the several directions for privacy-preserving methods (and then citing the surveys if you want). Then, you should focus on the particular configuration that you are interested in (continual observation). Summarize what we will see in the next sections by giving also the general structure of the chapter.}
\kat{Change the way you introduce the related work chapter; do not list a series of surveys. You should speak about the several directions for privacy-preserving methods (and then citing the surveys if you want). Then, you should focus on the particular configuration that you are interested in (continual observation). Summarize what we will see in the next sections by giving also the general structure of the chapter.} % \mk{Moved to summary}
In this chapter, we survey works that deal with privacy under continuous data publishing covering diverse use cases.
Since the domain of data privacy is vast, several surveys have already been published with different scopes. We present $48$ published articles spanning $16$ years of research from $2006$ to $2021$, with $2015$ being the median, based on two levels of categorization (Figure~\ref{fig:rel-yrs}).
A group of surveys focuses on specific families of privacy-preserving algorithms and techniques. % \kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. Put a small (or big) paragraph in the end of each of the two sections (microdata and statistical data) and name the similarities/differences }
For instance, Simi et al.~\cite{simi2017extensive} provide an extensive study of works on $k$-anonymity and Dwork~\cite{dwork2008differential} focuses on differential privacy. % \mk{OK}
Another group of surveys focuses on techniques that allow the execution of data mining or machine learning tasks with some privacy guarantees, e.g.,~Wang et al.~\cite{wang2009survey}, and Ji et al.~\cite{ji2014differential}.
In a more general scope, Wang et al.~\cite{wang2010privacy} analyze the challenges of privacy-preserving data publishing, and offer a summary and evaluation of relevant techniques.
Additional surveys look into issues around Big Data and user privacy.
Indicatively, Jain et al.~\cite{jain2016big}, and Soria-Comas and Domingo-Ferrer~\cite{soria2016big} examine how Big Data conflict with pre-existing concepts of privacy-preserving data management, and how efficiently $k$-anonymity and $\varepsilon$-differential privacy deal with the characteristics of Big Data.
Others narrow down their research to location privacy issues.
To name a few, Chow and Mokbel~\cite{chow2011trajectory} investigate privacy protection in continuous LBSs and trajectory data publishing, Chatzikokolakis et al.~\cite{chatzikokolakis2017methods} review privacy issues around the usage of LBSs and relevant protection mechanisms and metrics, Primault et al.~\cite{primault2018long} summarize location privacy threats and privacy-preserving mechanisms, and Fiore et al.~\cite{fiore2019privacy} focus only on privacy-preserving publishing of trajectory microdata.
Finally, there are some surveys on application-specific privacy challenges.
For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, and Christin et al.~\cite{christin2011survey} give an outline of how privacy aspects are addressed in crowdsensing applications.
In this chapter, we document works that deal with privacy under continuous data publishing covering diverse use cases.
We present the works in the literature based on two levels of categorization.
First, we group works with respect to whether they deal with microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions) as input.
Then, we further group them into two subcategories, according to whether they are designed for the finite or the infinite observation setting (see Section~\ref{subsec:data-publishing}). \kat{continue.. say also in which category you place your work}
%Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field.
\kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. Put a small (or big) paragraph in the end of each of the two sections (microdata and statistical data) and name the similarities/differences }
\begin{figure}[htp] \begin{figure}[htp]
\centering \centering
\includegraphics[width=1.\linewidth]{related/rel-yrs}% \includegraphics[width=.75\linewidth]{related/rel-yrs}%
\caption{.} \caption{Number of reviewed published articles on continuous data publishing of microdata and statistical data per year.}
\label{fig:rel-yrs} \label{fig:rel-yrs}
\end{figure} \end{figure}
\mk{WIP} First, we group works with respect to whether they deal with microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions) as input.
The works are equally divided between the two data categories, while $55$\% of them propose location-specific techniques.
$48$ articles in total Then, we further group them into two subcategories, according to whether they are designed for the finite or the infinite observation setting (see Section~\ref{subsec:data-publishing}).
spanning $15$ years of research from $2006$ to $2021$ $59$\% of the reviewed literature deals with finite data observation, $57$\% implements the streaming publishing mode, while $77$\% applies the global publishing scheme.
median year $2015$ Finally, we identify the privacy-related aspects of each work in terms of the method and protection level that they apply, as well as the privacy attacks that they consider, with emphasis on the underlying data dependence (see Figure~\ref{fig:rel-stats} for the detailed cumulative statistics).
$50$\% microdata
$55$\% geo-tagged data
$59$\% finite data observation
$57$\% streaming publishing mode
$77$\% global publishing scheme
\begin{figure}[htp] \begin{figure}[htp]
\centering \centering
@ -52,16 +29,23 @@ $77$\% global publishing scheme
\includegraphics[width=.5\linewidth]{related/rel-prot}% \includegraphics[width=.5\linewidth]{related/rel-prot}%
}% }%
\hfill \hfill
\\ \bigskip
\subcaptionbox{Privacy attack\label{fig:rel-atk}}{% \subcaptionbox{Privacy attack\label{fig:rel-atk}}{%
\includegraphics[width=.5\linewidth]{related/rel-atk}% \includegraphics[width=.5\linewidth]{related/rel-atk}%
}% }%
\subcaptionbox{Data dependence\label{fig:rel-dep}}{% \subcaptionbox{Data dependence\label{fig:rel-dep}}{%
\includegraphics[width=.5\linewidth]{related/rel-dep}% \includegraphics[width=.5\linewidth]{related/rel-dep}%
}% }%
\caption{.} \caption{The privacy-related aspects of the reviewed literature in terms of (a)~the privacy method utilized, (b)~the protection level provided, (c)~the privacy attack considered, and (d)~data dependence therein.}
\label{fig:rel-stats} \label{fig:rel-stats}
\end{figure} \end{figure}
% \kat{continue.. say also in which category you place your work}
Our work, which we present in Chapter~\ref{ch:lmdk-prv}, focuses primarily on microdata for its use case.
However, it can also handle statistical data in specific scenarios.
For simplicity, we limit the discussion to microdata and plan to investigate more diverse settings in future work.
\input{related/micro} \input{related/micro}
\input{related/statistical} \input{related/statistical}
\input{related/summary} \input{related/summary}