data: Moved ex:snapshot here

Manos Katsomallos 2021-09-03 12:56:16 +03:00
parent 58fee32c99
commit 084b2faf2d
2 changed files with 13 additions and 19 deletions


@ -68,18 +68,5 @@ Typically, in such cases, we have a collection of data referring to the same ind
Additionally, in many cases, the privacy-preserving processes should take into account implicit correlations and restrictions that exist, e.g.,~space-imposed collocation or movement restrictions.
Since these data are related to most of the important applications and services that enjoy high utilization rates, privacy-preserving continuous data publishing becomes one of the emblematic problems of our time.
To accompany and facilitate the descriptions in this chapter, we provide the following running example.
\begin{example}
\label{ex:snapshot}
Users interact with an LBS by issuing queries to retrieve useful location-based information or simply by reporting their state at various locations.
This user--LBS interaction generates user-related data, organized in a schema with the following attributes: \emph{Name} (the unique identifier of the table), \emph{Age}, \emph{Location}, and \emph{Status} (Table~\ref{tab:snapshot-micro}).
The `Status' attribute includes information that characterizes the user's state or the query itself, and its value varies according to the service functionality.
Subsequently, the generated data are aggregated (by issuing count queries over them) in order to derive useful information about the popularity of the venues during the day (Table~\ref{tab:snapshot-statistical}).
\includetable{snapshot}
\end{example}
\input{introduction/contribution}
\input{introduction/structure}


@ -1,13 +1,20 @@
\chapter{Preliminaries}
\label{ch:prel}
\kat{mention also the different ways data are organized, e.g., as tuples in tables, KVs, graphs, etc and in what formats you consider them in this work.}
In this chapter, we introduce some relevant terminology and information around the problem of continuous publishing of privacy-sensitive data sets \kat{the title of the thesis is '..in user generated big data' not in 'continuous publishing'. Consider rephrase here, and if needed position the user generated big data w.r.t. the continuous publishing so that you continue later on discussing for the continuous publishing setting. }.
First, in Section~\ref{sec:data}, we categorize user-generated data sets and review data processing in the context of continuous data publishing.
% \kat{mention also the different ways data are organized, e.g., as tuples in tables, KVs, graphs, etc and in what formats you consider them in this work.}
In this chapter, we introduce relevant terminology and background on the problem of
quality and privacy in user-generated Big Data, with a special focus on continuous data publishing.
% continuous publishing of privacy-sensitive data sets
% \kat{the title of the thesis is '..in user generated big data' not in 'continuous publishing'. Consider rephrase here, and if needed position the user generated big data w.r.t. the continuous publishing so that you continue later on discussing for the continuous publishing setting. }
First, in Section~\ref{sec:data}, we categorize user-generated data sets, which we consider in tabular form, and review data processing in the context of continuous data publishing.
Second, in Section~\ref{sec:privacy}, we define information disclosure in data privacy. Thereafter, we list the categories of privacy attacks, %identified in the literature,
the possible privacy protection levels, the fundamental privacy operations that are applied to achieve data privacy, and finally we provide a brief overview of the \kat{also here reconsider the term seminal, so as it does not read like we are in the related work section} seminal works on privacy-preserving data publishing.
\kat{The correlations are not intuitively connected to privacy, so put here a linking sentence to data privacy.}
Third, in Section~\ref{sec:correlation}, we discuss the different types of correlation, we document ways to extract data dependence from continuous data, and we investigate the privacy risks that data correlation entails with special focus on the privacy loss under temporal correlation.
the possible privacy protection levels, the fundamental privacy operations that are applied to achieve data privacy, and finally we provide a brief overview of the
% \kat{also here reconsider the term seminal, so as it does not read like we are in the related work section}
% seminal works on privacy-preserving data publishing.
basic notions for data privacy protection.
% \kat{The correlations are not intuitively connected to privacy, so put here a linking sentence to data privacy.}
Third, in Section~\ref{sec:correlation}, we focus on the impact of correlation on data privacy.
More specifically, we discuss the different types of correlation, document ways to extract data correlation from continuous data, and investigate the privacy risks that data correlation entails, with a special focus on the privacy loss under temporal correlation.
\input{preliminaries/data}
\input{preliminaries/privacy}