From 900982e1f6cb389f441e9550e8d935957c8019c6 Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 14:57:19 +0200 Subject: [PATCH 1/6] text intro related work --- text/related/main.tex | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/text/related/main.tex b/text/related/main.tex index 68a37b6..fd94456 100644 --- a/text/related/main.tex +++ b/text/related/main.tex @@ -1,7 +1,7 @@ \chapter{Related work} \label{ch:rel} -\kat{Be sure to update this chapter with more recent works, after the survey was published.. Moreover, the introduction here must be updated, we are not talking about a survey anymore but for your thesis. This means that possibly you need to add a section about some general privacy techniques, which go beyond the continuous publication scenario.} +\kat{Change the way you introduce the related work chapter; do not list a series of surveys. You should speak about the several directions for privacy preserving methods (and then citing the surveys if you want). Then, you should focus on the particular configuration that you are interested in (continual observation). Summarize what we will see in the next sections by giving also the general structure of the chapter.} Since the domain of data privacy is vast, several surveys have already been published with different scopes. A group of surveys focuses on specific different families of privacy-preserving algorithms and techniques. @@ -15,8 +15,12 @@ To name a few, Chow and Mokbel~\cite{chow2011trajectory} investigate privacy pro Finally, there are some surveys on application-specific privacy challenges. For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, and Christin et al.~\cite{christin2011survey} give an outline of how privacy aspects are addressed in crowdsensing applications. -In the following sections, we document works that deal with privacy under continuous data publishing covering diverse use cases. 
-Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field. +In this chapter, we document works that deal with privacy under continuous data publishing covering diverse use cases. +We present the works in the literature based on two levels of categorisation. First we group works w.r.t. whether they receive as input microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions). Then, we further group them into two subcategories, whether they are designed for the finite or infinite (see Section.~\ref{subsec:data-publishing}) observation setting. \kat{continue} + +%Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field. + +\kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. } \input{related/micro} \input{related/statistical} From 2952f0f4fae6131a1b99928b2cd4eba47309c20a Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 15:20:04 +0200 Subject: [PATCH 2/6] introduction for 3.1.microdata --- text/related/main.tex | 4 +++- text/related/micro.tex | 13 ++++++++++--- 2 files changed, 13 insertions(+), 4 deletions(-) diff --git a/text/related/main.tex b/text/related/main.tex index fd94456..a017ef0 100644 --- a/text/related/main.tex +++ b/text/related/main.tex @@ -16,7 +16,9 @@ Finally, there are some surveys on application-specific privacy challenges. 
For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, and Christin et al.~\cite{christin2011survey} give an outline of how privacy aspects are addressed in crowdsensing applications. In this chapter, we document works that deal with privacy under continuous data publishing covering diverse use cases. -We present the works in the literature based on two levels of categorisation. First we group works w.r.t. whether they receive as input microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions). Then, we further group them into two subcategories, whether they are designed for the finite or infinite (see Section.~\ref{subsec:data-publishing}) observation setting. \kat{continue} +We present the works in the literature based on two levels of categorisation. +First, we group works w.r.t. whether they receive microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions) as input. +Then, we further group them into two subcategories, whether they are designed for the finite or infinite (see Section.~\ref{subsec:data-publishing}) observation setting. \kat{continue} %Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field. diff --git a/text/related/micro.tex b/text/related/micro.tex index ffe1d4e..1d068c1 100644 --- a/text/related/micro.tex +++ b/text/related/micro.tex @@ -1,14 +1,21 @@ \section{Microdata} \label{sec:micro} -As observed in Table~\ref{tab:micro}, privacy-preserving algorithms for microdata rely mostly on $k$-anonymity or derivatives of it. +\kat{Table 1 must be properly introduced in the text, and also commented on (derive all the conclusions from it, instead of reporting one conclusion randomly here).} +Table~\ref{tab:micro} summarizes the literature for the Microdata category. 
+Each reviewed work is abstractly described in this table, by its category (finite or infintite), its publishing mode (batch or streaming) and scheme(global or local), the level of privacy achieved (user, event, w-event), the attacks addressed, the privacy operation applied, and the base method it is built upon. +We observe that privacy-preserving algorithms for microdata rely mostly on $k$-anonymity or derivatives of it. Ganta et al.~\cite{ganta2008composition} revealed that $k$-anonymity methods are vulnerable to complementary release attacks (or \emph{composition attacks} in the original publication). Consequently, the research community proposed solutions based on $k$-anonymity, focusing on different threats linked to continuous publication, as we review later on. However, notice that only a couple~\cite{li2016hybrid,shmueli2015privacy} of the following works assume that data sets are privacy-protected \emph{independently} of one another, meaning that the publisher is oblivious of the rest of the publications. -On the other side, algorithms that are based on differential privacy are not concerned with so specific attacks as, by definition, differential privacy considers that the adversary may possess any kind of background knowledge. -Later on, data dependencies were also considered for differential privacy algorithms, to account for the extra privacy loss entailed by them. +On the other side, algorithms based on differential privacy are not concerned with so specific attacks as, by definition, differential privacy considers that the adversary may possess any kind of background knowledge. +Moreover, more recent works consider also data dependencies +%are considered for differential privacy algorithms, +to account for the extra privacy loss entailed by them. +\bigskip +Next, we begin the discussion with the works designed for microdata as finite observations. 
\includetable{micro} From 891b6ab9cb2265f70c47038e795ee4d41b3f6d6a Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 15:33:35 +0200 Subject: [PATCH 3/6] 3.1. paragraph comment --- tables/micro.tex | 3 ++- text/related/main.tex | 4 ++-- text/related/micro.tex | 3 +++ 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/tables/micro.tex b/tables/micro.tex index b8c5794..b2c1500 100644 --- a/tables/micro.tex +++ b/tables/micro.tex @@ -98,7 +98,8 @@ } \caption{Summary table of reviewed privacy-preserving algorithms for continuous microdata publishing. - Location-specific techniques are listed in bold.} + Location-specific techniques are listed in bold.\kat{ +do you still need to have in bold the location specific techniques? if yes mention why in the text..}} \label{tab:micro} diff --git a/text/related/main.tex b/text/related/main.tex index a017ef0..496ec76 100644 --- a/text/related/main.tex +++ b/text/related/main.tex @@ -18,11 +18,11 @@ For example, Zhou et al.~\cite{zhou2008brief} have a focus on social networks, a In this chapter, we document works that deal with privacy under continuous data publishing covering diverse use cases. We present the works in the literature based on two levels of categorisation. First, we group works w.r.t. whether they receive microdata or statistical data (see Section~\ref{subsec:data-categories} for the definitions) as input. -Then, we further group them into two subcategories, whether they are designed for the finite or infinite (see Section.~\ref{subsec:data-publishing}) observation setting. \kat{continue} +Then, we further group them into two subcategories, whether they are designed for the finite or infinite (see Section.~\ref{subsec:data-publishing}) observation setting. \kat{continue.. 
say also in which category you place your work} %Such a documentation becomes very useful nowadays, due to the abundance of continuously user-generated data sets that could be analyzed and/or published in a privacy-preserving way, and the quick progress made in this research field. -\kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. } +\kat{The related work section of your thesis, should make a connection/comparison to your work. This means that you should position the works presented wrt your problem and your solution if the problems are the same. Put a small (or big) paragraph in the end of each of the two sections (microdata and statistical data) and name the similarities/differences } \input{related/micro} \input{related/statistical} diff --git a/text/related/micro.tex b/text/related/micro.tex index 1d068c1..494bad5 100644 --- a/text/related/micro.tex +++ b/text/related/micro.tex @@ -458,3 +458,6 @@ The goal is to minimize the information throughput and always answer users' requ They model the dependence between requests using a Markov chain, which is publicly known, where each state represents an available service. Setting privacy to ON, the user obfuscates their original query by randomly sending requests to (and receiving answers from) a subset of all of the available services. Although this randomization step makes the original query indistinguishable while making sure that the users always get the information that they need, there is no clear quantification of the privacy guarantee that the scheme offers over time. 
+\bigskip + +\kat{Add here the comparison/contrast paragraph of microdata techniques shown previously, and your work} \ No newline at end of file From 637c26ffff3d284ce9bebc70588597dcf04c398a Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 15:42:43 +0200 Subject: [PATCH 4/6] chapter 3. done --- text/related/micro.tex | 4 ++-- text/related/statistical.tex | 12 +++++++++--- text/related/summary.tex | 1 + 3 files changed, 12 insertions(+), 5 deletions(-) diff --git a/text/related/micro.tex b/text/related/micro.tex index 494bad5..7ffe666 100644 --- a/text/related/micro.tex +++ b/text/related/micro.tex @@ -1,7 +1,7 @@ \section{Microdata} \label{sec:micro} -\kat{Table 1 must be properly introduced in the text, and also commented on (derive all the conclusions from it, instead of reporting one conclusion randomly here).} + Table~\ref{tab:micro} summarizes the literature for the Microdata category. Each reviewed work is abstractly described in this table, by its category (finite or infintite), its publishing mode (batch or streaming) and scheme(global or local), the level of privacy achieved (user, event, w-event), the attacks addressed, the privacy operation applied, and the base method it is built upon. We observe that privacy-preserving algorithms for microdata rely mostly on $k$-anonymity or derivatives of it. @@ -15,7 +15,7 @@ Moreover, more recent works consider also data dependencies to account for the extra privacy loss entailed by them. \bigskip -Next, we begin the discussion with the works designed for microdata as finite observations. +We begin the discussion with the works designed for microdata as finite observations (Section~\ref{subsec:micro-finite}), to continue with the infinite observations setting (Section~\ref{subsec:micro-infinite}). 
\includetable{micro} diff --git a/text/related/statistical.tex b/text/related/statistical.tex index ad72dd1..4fd13bb 100644 --- a/text/related/statistical.tex +++ b/text/related/statistical.tex @@ -1,10 +1,14 @@ \section{Statistical data} \label{sec:statistical} -When continuously publishing statistical data, usually in the form of counts, the most widely used privacy method is differential privacy, or derivatives of it, as witnessed in Table~\ref{tab:statistical}. -In theory differential privacy makes no assumptions about the background knowledge available to the adversary. -In practice, as we observe in Table~\ref{tab:statistical}, data dependencies (e.g.,~correlations) arising in the continuous publication setting are frequently (but without it being the rule) considered as attacks in the proposed algorithms. +As in Section~\ref{sec:micro}, we summarize the literature for the Statistical Data category in Table~\ref{tab:statistical}, which we structure identically to Table~\ref{tab:micro}. +As a reminder, each reviewed work is abstractly described in this table by its category (finite or infinite), its publishing mode (batch or streaming) and scheme (global or local), the level of privacy achieved (user, event, w-event), the attacks addressed, the privacy operation applied, and the base method it is built upon. +As witnessed in Table~\ref{tab:statistical}, when continuously publishing statistical data, usually in the form of counts, the most widely used privacy method is differential privacy, or derivatives of it. +In theory, differential privacy makes no assumptions about the background knowledge available to the adversary. +In practice, data dependencies (e.g.,~correlations) arising in the continuous publication setting are frequently (but without it being the rule) considered as attacks in the proposed algorithms. 
+ +We begin the discussion with the works designed for statistical data as finite observations (Section~\ref{subsec:statistical-finite}), to continue with the infinite observations setting (Section~\ref{subsec:statistical-infinite}). \includetable{statistical} @@ -424,3 +428,5 @@ Increasing the discount factor offers stronger privacy protection, equivalent to Whereas, increasing the discount coefficient resembles the behavior of event-level differential privacy. Selecting a suitable value for the privacy budget and the discount parameter allows for bounding the overall privacy loss in an infinite observation scenario. However, the assumption that all users discount previous data releases limits the applicability of the the current scheme in real-world scenarios for statistical data. + +\kat{Add here a paragraph that contrasts/compares your work with the works presented for statistical data.} diff --git a/text/related/summary.tex b/text/related/summary.tex index 574bdf3..9f98a42 100644 --- a/text/related/summary.tex +++ b/text/related/summary.tex @@ -2,3 +2,4 @@ \label{sec:sum-rel} This is the summary of this chapter. +\kat{? Don't forget to mention here the publication that you have.} \ No newline at end of file From 58952a88b4efede484f520a0df135b758d3339a0 Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 15:44:35 +0200 Subject: [PATCH 5/6] table 3.2 --- tables/statistical.tex | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tables/statistical.tex b/tables/statistical.tex index e691885..3370754 100644 --- a/tables/statistical.tex +++ b/tables/statistical.tex @@ -99,7 +99,7 @@ } \caption{Summary table of reviewed privacy-preserving algorithms for continuous statistical data publishing. 
- Location-specific techniques are listed in bold.} + Location-specific techniques are listed in bold.\kat{same remarks as before for the location}} \label{tab:statistical} From b334e056b320357ce4f4eaa89a1be7f3576350cf Mon Sep 17 00:00:00 2001 From: katerinatzo Date: Fri, 8 Oct 2021 16:05:52 +0200 Subject: [PATCH 6/6] general comments and restructure of section 4 --- text/problem/main.tex | 4 ++-- text/problem/theotherthing/main.tex | 4 ++-- text/problem/thething/contribution.tex | 4 ++-- text/problem/thething/main.tex | 12 ++++++++---- text/problem/thething/motivation.tex | 4 ++-- text/problem/thething/problem.tex | 4 ++-- text/problem/thething/summary.tex | 9 +++++---- 7 files changed, 23 insertions(+), 18 deletions(-) diff --git a/text/problem/main.tex b/text/problem/main.tex index c97cbd4..5cc12cc 100644 --- a/text/problem/main.tex +++ b/text/problem/main.tex @@ -1,4 +1,4 @@ -\chapter{The problem} +\chapter{Landmark Privacy} \input{problem/thething/main} -\input{problem/theotherthing/main} + diff --git a/text/problem/theotherthing/main.tex b/text/problem/theotherthing/main.tex index 4c72eda..83ed973 100644 --- a/text/problem/theotherthing/main.tex +++ b/text/problem/theotherthing/main.tex @@ -1,2 +1,2 @@ -\section{Selection of events} -\label{sec:theotherthing} +\subsection{Selection of events} +\label{subsec:theotherthing} diff --git a/text/problem/thething/contribution.tex b/text/problem/thething/contribution.tex index 64f16fd..266ff1c 100644 --- a/text/problem/thething/contribution.tex +++ b/text/problem/thething/contribution.tex @@ -1,5 +1,5 @@ -\subsection{Contribution} -\label{subsec:lmdk-contrib} +\section{Contribution} +\label{sec:lmdk-contrib} In this chapter, we formally define a novel privacy notion that we call \emph{{\thething} privacy}. We apply this privacy notion to time series consisting of \emph{{\thethings}} and regular events, and we design and implement three {\thething} privacy mechanisms. 
diff --git a/text/problem/thething/main.tex b/text/problem/thething/main.tex index b19fcf5..d4d6095 100644 --- a/text/problem/thething/main.tex +++ b/text/problem/thething/main.tex @@ -1,10 +1,14 @@ -\section{Significant events} -\label{sec:thething} +%\section{Significant events} +%\label{sec:thething} In this chapter, we propose a novel configurable privacy scheme, \emph{\thething} privacy, which takes into account significant events (\emph{\thethings}) in the time series and allocates the available privacy budget accordingly. -We propose two privacy models that guarantee {\thething} privacy and validate our proposal on real and synthetic data sets. -\kat{Now, you have space so you need to be more detailed in the discussions, the motivation, the examples etc.} +We propose two privacy models that guarantee {\thething} privacy. +To further enhance our privacy method, and protect the landmarks' position in the time series, we propose techniques to perturb the initial set of landmarks (Section~\ref{subsec:theotherthing}). + +% and validate our proposal on real and synthetic data sets. \kat{this will go in the experiments section} + \input{problem/thething/motivation} \input{problem/thething/contribution} \input{problem/thething/problem} +\input{problem/theotherthing/main} \input{problem/thething/summary} diff --git a/text/problem/thething/motivation.tex b/text/problem/thething/motivation.tex index 3899f63..7a0d588 100644 --- a/text/problem/thething/motivation.tex +++ b/text/problem/thething/motivation.tex @@ -1,5 +1,5 @@ -\subsection{Motivation} -\label{subsec:lmdk-motiv} +\section{Motivation} +\label{sec:lmdk-motiv} The plethora of sensors currently embedded in or paired with personal devices and other infrastructures have paved the way for the development of numerous \emph{crowdsensing services} (e.g.,~Google Maps~\cite{gmaps}, Waze~\cite{waze}, etc.) based on the collected personal, and usually geotagged and timestamped data. 
diff --git a/text/problem/thething/problem.tex b/text/problem/thething/problem.tex index ecd167c..b193354 100644 --- a/text/problem/thething/problem.tex +++ b/text/problem/thething/problem.tex @@ -1,5 +1,5 @@ -\subsection{{\Thething} privacy} -\label{subsec:lmdk-prob} +\section{{\Thething} privacy} +\label{sec:lmdk-prob} {\Thething} privacy is based on differential privacy. For this reason, we revisit the definition and important properties of differential privacy before moving on to the main ideas of this paper. diff --git a/text/problem/thething/summary.tex b/text/problem/thething/summary.tex index f1e9468..225c371 100644 --- a/text/problem/thething/summary.tex +++ b/text/problem/thething/summary.tex @@ -1,6 +1,7 @@ -\subsection{Summary and future work} -\label{subsec:lmdk-sum} +\section{Summary} +\label{sec:lmdk-sum} In this chapter, we presented \emph{{\thething} privacy} for privacy-preserving time series publishing, which allows for the protection of significant events, while improving the utility of the final result w.r.t. the traditional user-level differential privacy. We also proposed three models for {\thething} privacy, and quantified the privacy loss under temporal correlation. -Our experiments on real and synthetic data sets validate our proposal. -In the future, we aim to investigate privacy-preserving {\thething} selection and propose a mechanism based on user-preferences and semantics. +%Our experiments on real and synthetic data sets validate our proposal. +%In the future, we aim to investigate privacy-preserving {\thething} selection and propose a mechanism based on user-preferences and semantics. +\kat{Advertise your work! Say what is cool about the work and how it differs from the others! Mention also the summary for selection of events. The discussion for the experiments and future work you postpone for the respective sections, you may though make reference to specific experiments to support your claims. } \ No newline at end of file