Merge branch 'master'

Manos Katsomallos 2021-10-08 20:13:29 +02:00
commit 731ae65e7f
13 changed files with 71 additions and 24 deletions

@ -98,7 +98,8 @@
} }
\caption{Summary table of reviewed privacy-preserving algorithms for continuous microdata publishing. \caption{Summary table of reviewed privacy-preserving algorithms for continuous microdata publishing.
Location-specific techniques are listed in bold.} Location-specific techniques are listed in bold.\kat{
do you still need to have in bold the location specific techniques? if yes mention why in the text..}}
\label{tab:micro} \label{tab:micro}

@ -99,7 +99,7 @@
} }
\caption{Summary table of reviewed privacy-preserving algorithms for continuous statistical data publishing. \caption{Summary table of reviewed privacy-preserving algorithms for continuous statistical data publishing.
Location-specific techniques are listed in bold.} Location-specific techniques are listed in bold.\kat{same remarks as before for the location}}
\label{tab:statistical} \label{tab:statistical}

@ -1,5 +1,9 @@
<<<<<<< HEAD
\chapter{Landmark privacy} \chapter{Landmark privacy}
\label{ch:thething-prv} \label{ch:thething-prv}
=======
\chapter{Landmark Privacy}
>>>>>>> b334e056b320357ce4f4eaa89a1be7f3576350cf
\input{problem/thething/main} \input{problem/thething/main}
\input{problem/theotherthing/main}

@ -1,2 +1,2 @@
\section{Selection of events} \subsection{Selection of events}
\label{sec:theotherthing} \label{subsec:theotherthing}

@ -1,5 +1,5 @@
\subsection{Contribution} \section{Contribution}
\label{subsec:lmdk-contrib} \label{sec:lmdk-contrib}
In this chapter, we formally define a novel privacy notion that we call \emph{{\thething} privacy}. In this chapter, we formally define a novel privacy notion that we call \emph{{\thething} privacy}.
We apply this privacy notion to time series consisting of \emph{{\thethings}} and regular events, and we design and implement three {\thething} privacy mechanisms. We apply this privacy notion to time series consisting of \emph{{\thethings}} and regular events, and we design and implement three {\thething} privacy mechanisms.

@ -1,6 +1,7 @@
\section{Significant events} %\section{Significant events}
\label{sec:thething} %\label{sec:thething}
<<<<<<< HEAD
In this chapter, we propose a novel configurable privacy scheme, \emph{{\thething} privacy}, which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly. In this chapter, we propose a novel configurable privacy scheme, \emph{{\thething} privacy}, which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly.
We propose three privacy models that guarantee {\thething} privacy and validate our proposal on real and synthetic data sets. We propose three privacy models that guarantee {\thething} privacy and validate our proposal on real and synthetic data sets.
\kat{Now, you have space so you need to be more detailed in the discussions, the motivation, the examples etc.} \kat{Now, you have space so you need to be more detailed in the discussions, the motivation, the examples etc.}
@ -8,4 +9,16 @@ We propose three privacy models that guarantee {\thething} privacy and validate
\input{problem/thething/contribution} \input{problem/thething/contribution}
\input{problem/thething/problem} \input{problem/thething/problem}
\input{problem/thething/solution} \input{problem/thething/solution}
=======
In this chapter, we propose a novel configurable privacy scheme, \emph{\thething} privacy, which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly.
We propose two privacy models that guarantee {\thething} privacy.
To further enhance our privacy method and protect the position of the landmarks in the time series, we propose techniques to perturb the initial set of landmarks (Section~\ref{sec:theotherthing}).
% and validate our proposal on real and synthetic data sets. \kat{this will go in the experiments section}
\input{problem/thething/motivation}
\input{problem/thething/contribution}
\input{problem/thething/problem}
\input{problem/theotherthing/main}
>>>>>>> b334e056b320357ce4f4eaa89a1be7f3576350cf
\input{problem/thething/summary} \input{problem/thething/summary}

@ -1,5 +1,5 @@
\subsection{Motivation} \section{Motivation}
\label{subsec:lmdk-motiv} \label{sec:lmdk-motiv}
% Crowdsensing applications % Crowdsensing applications
The plethora of sensors currently embedded in personal devices and other infrastructures have paved the way for the development of numerous \emph{crowdsensing services} (e.g.,~Ring~\cite{ring}, TousAntiCovid~\cite{tousanticovid}, Waze~\cite{waze}, etc.) based on the collected personal, and usually geotagged and timestamped data. The plethora of sensors currently embedded in personal devices and other infrastructures have paved the way for the development of numerous \emph{crowdsensing services} (e.g.,~Ring~\cite{ring}, TousAntiCovid~\cite{tousanticovid}, Waze~\cite{waze}, etc.) based on the collected personal, and usually geotagged and timestamped data.

@ -1,5 +1,10 @@
<<<<<<< HEAD
\subsection{Problem description and definition} \subsection{Problem description and definition}
\label{subsec:lmdk-prob} \label{subsec:lmdk-prob}
=======
\section{{\Thething} privacy}
\label{sec:lmdk-prob}
>>>>>>> b334e056b320357ce4f4eaa89a1be7f3576350cf
Our problem setting consists of three entities: (i) data generators (users), (ii) data publishers (trusted non-adversarial entities), and (iii) data consumers (possibly adversarial entities). Our problem setting consists of three entities: (i) data generators (users), (ii) data publishers (trusted non-adversarial entities), and (iii) data consumers (possibly adversarial entities).
Users generate sensitive data, which are processed in a secure and private way by a trusted curator and are later published in order to be consumed by potentially adversarial data analysts. Users generate sensitive data, which are processed in a secure and private way by a trusted curator and are later published in order to be consumed by potentially adversarial data analysts.

@ -1,6 +1,7 @@
\subsection{Summary and future work} \section{Summary}
\label{subsec:lmdk-sum} \label{sec:lmdk-sum}
In this chapter, we presented \emph{{\thething} privacy} for privacy-preserving time series publishing, which allows for the protection of significant events, while improving the utility of the final result w.r.t. the traditional user-level differential privacy. In this chapter, we presented \emph{{\thething} privacy} for privacy-preserving time series publishing, which allows for the protection of significant events, while improving the utility of the final result w.r.t. the traditional user-level differential privacy.
We also proposed three models for {\thething} privacy, and quantified the privacy loss under temporal correlation. We also proposed three models for {\thething} privacy, and quantified the privacy loss under temporal correlation.
Our experiments on real and synthetic data sets validate our proposal. %Our experiments on real and synthetic data sets validate our proposal.
In the future, we aim to investigate privacy-preserving {\thething} selection and propose a mechanism based on user-preferences and semantics. %In the future, we aim to investigate privacy-preserving {\thething} selection and propose a mechanism based on user-preferences and semantics.
\kat{Advertise your work! Say what is cool about the work and how it differs from the others! Mention also the summary for selection of events. The discussion for the experiments and future work you postpone for the respective sections, you may though make reference to specific experiments to support your claims. }

@ -1,7 +1,7 @@
\chapter{Related work} \chapter{Related work}
\label{ch:rel} \label{ch:rel}
\kat{Be sure to update this chapter with more recent works, after the survey was published.. Moreover, the introduction here must be updated, we are not talking about a survey anymore but for your thesis. This means that possibly you need to add a section about some general privacy techniques, which go beyond the continuous publication scenario.} \kat{Change the way you introduce the related work chapter; do not list a series of surveys. You should speak about the several directions for privacy preserving methods (and then citing the surveys if you want). Then, you should focus on the particular configuration that you are interested in (continual observation). Summarize what we will see in the next sections by giving also the general structure of the chapter.}
Since the domain of data privacy is vast, several surveys have already been published with different scopes. Since the domain of data privacy is vast, several surveys have already been published with different scopes.
A group of surveys focuses on specific different families of privacy-preserving algorithms and techniques. A group of surveys focuses on specific different families of privacy-preserving algorithms and techniques.
@ -15,8 +15,14 @@ To name a few, Chow and Mokbel~\cite{chow2011trajectory} investigate privacy pro
Finally, there are some surveys on application-specific privacy challenges. Finally, there are some surveys on application-specific privacy challenges.
For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, and Christin et al.~\cite{christin2011survey} give an outline of how privacy aspects are addressed in crowdsensing applications. For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, and Christin et al.~\cite{christin2011survey} give an outline of how privacy aspects are addressed in crowdsensing applications.
In the following sections, we document works that deal with privacy under continuous data publishing covering diverse use cases. In this chapter, we document works that deal with privacy under continuous data publishing covering diverse use cases.
Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field. We present the works in the literature based on two levels of categorisation.
First, we group works w.r.t. whether they receive microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions) as input.
Then, we further group them into two subcategories, depending on whether they are designed for the finite or the infinite observation setting (see Section~\ref{subsec:data-publishing}). \kat{continue.. say also in which category you place your work}
%Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field.
\kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. Put a small (or big) paragraph in the end of each of the two sections (microdata and statistical data) and name the similarities/differences }
\input{related/micro} \input{related/micro}
\input{related/statistical} \input{related/statistical}

@ -1,14 +1,21 @@
\section{Microdata} \section{Microdata}
\label{sec:micro} \label{sec:micro}
As observed in Table~\ref{tab:micro}, privacy-preserving algorithms for microdata rely mostly on $k$-anonymity or derivatives of it.
Table~\ref{tab:micro} summarizes the literature for the Microdata category.
Each reviewed work is abstractly described in this table, by its category (finite or infinite), its publishing mode (batch or streaming) and scheme (global or local), the level of privacy achieved (user, event, w-event), the attacks addressed, the privacy operation applied, and the base method it is built upon.
We observe that privacy-preserving algorithms for microdata rely mostly on $k$-anonymity or derivatives of it.
Ganta et al.~\cite{ganta2008composition} revealed that $k$-anonymity methods are vulnerable to complementary release attacks (or \emph{composition attacks} in the original publication). Ganta et al.~\cite{ganta2008composition} revealed that $k$-anonymity methods are vulnerable to complementary release attacks (or \emph{composition attacks} in the original publication).
Consequently, the research community proposed solutions based on $k$-anonymity, focusing on different threats linked to continuous publication, as we review later on. Consequently, the research community proposed solutions based on $k$-anonymity, focusing on different threats linked to continuous publication, as we review later on.
However, notice that only a couple~\cite{li2016hybrid,shmueli2015privacy} However, notice that only a couple~\cite{li2016hybrid,shmueli2015privacy}
of the following works assume that data sets are privacy-protected \emph{independently} of one another, meaning that the publisher is oblivious of the rest of the publications. of the following works assume that data sets are privacy-protected \emph{independently} of one another, meaning that the publisher is oblivious of the rest of the publications.
On the other side, algorithms that are based on differential privacy are not concerned with so specific attacks as, by definition, differential privacy considers that the adversary may possess any kind of background knowledge. On the other side, algorithms based on differential privacy are not concerned with so specific attacks as, by definition, differential privacy considers that the adversary may possess any kind of background knowledge.
Later on, data dependencies were also considered for differential privacy algorithms, to account for the extra privacy loss entailed by them. Moreover, more recent works also consider data dependencies
%are considered for differential privacy algorithms,
to account for the extra privacy loss entailed by them.
\bigskip
We begin the discussion with the works designed for microdata as finite observations (Section~\ref{subsec:micro-finite}), to continue with the infinite observations setting (Section~\ref{subsec:micro-infinite}).
\includetable{micro} \includetable{micro}
@ -451,3 +458,6 @@ The goal is to minimize the information throughput and always answer users' requ
They model the dependence between requests using a Markov chain, which is publicly known, where each state represents an available service. They model the dependence between requests using a Markov chain, which is publicly known, where each state represents an available service.
Setting privacy to ON, the user obfuscates their original query by randomly sending requests to (and receiving answers from) a subset of all of the available services. Setting privacy to ON, the user obfuscates their original query by randomly sending requests to (and receiving answers from) a subset of all of the available services.
Although this randomization step makes the original query indistinguishable while making sure that the users always get the information that they need, there is no clear quantification of the privacy guarantee that the scheme offers over time. Although this randomization step makes the original query indistinguishable while making sure that the users always get the information that they need, there is no clear quantification of the privacy guarantee that the scheme offers over time.
\bigskip
\kat{Add here the comparison/contrast paragraph of microdata techniques shown previously, and your work}

@ -1,10 +1,14 @@
\section{Statistical data} \section{Statistical data}
\label{sec:statistical} \label{sec:statistical}
When continuously publishing statistical data, usually in the form of counts, the most widely used privacy method is differential privacy, or derivatives of it, as witnessed in Table~\ref{tab:statistical}. As in Section~\ref{sec:micro}, we summarize the literature for the Statistical Data category in Table~\ref{tab:statistical}, which we structure identically to Table~\ref{tab:micro}.
In theory differential privacy makes no assumptions about the background knowledge available to the adversary. As a reminder, each reviewed work is abstractly described in this table, by its category (finite or infinite), its publishing mode (batch or streaming) and scheme (global or local), the level of privacy achieved (user, event, w-event), the attacks addressed, the privacy operation applied, and the base method it is built upon.
In practice, as we observe in Table~\ref{tab:statistical}, data dependencies (e.g.,~correlations) arising in the continuous publication setting are frequently (but without it being the rule) considered as attacks in the proposed algorithms.
As witnessed in Table~\ref{tab:statistical}, when continuously publishing statistical data, usually in the form of counts, the most widely used privacy method is differential privacy, or derivatives of it.
In theory differential privacy makes no assumptions about the background knowledge available to the adversary.
In practice, data dependencies (e.g.,~correlations) arising in the continuous publication setting are frequently (but without it being the rule) considered as attacks in the proposed algorithms.
We begin the discussion with the works designed for statistical data as finite observations (Section~\ref{subsec:statistical-finite}), to continue with the infinite observations setting (Section~\ref{subsec:statistical-infinite}).
\includetable{statistical} \includetable{statistical}
@ -424,3 +428,5 @@ Increasing the discount factor offers stronger privacy protection, equivalent to
Whereas, increasing the discount coefficient resembles the behavior of event-level differential privacy. Whereas, increasing the discount coefficient resembles the behavior of event-level differential privacy.
Selecting a suitable value for the privacy budget and the discount parameter allows for bounding the overall privacy loss in an infinite observation scenario. Selecting a suitable value for the privacy budget and the discount parameter allows for bounding the overall privacy loss in an infinite observation scenario.
However, the assumption that all users discount previous data releases limits the applicability of the current scheme in real-world scenarios for statistical data. However, the assumption that all users discount previous data releases limits the applicability of the current scheme in real-world scenarios for statistical data.
\kat{Add here a paragraph that contrasts/compares your work with the works presented for statistical data.}

@ -2,3 +2,4 @@
\label{sec:sum-rel} \label{sec:sum-rel}
This is the summary of this chapter. This is the summary of this chapter.
\kat{? Don't forget to mention here the publication that you have.}