diff --git a/text/preliminaries/correlation.tex b/text/preliminaries/correlation.tex
index cf01257..f3c29b9 100644
--- a/text/preliminaries/correlation.tex
+++ b/text/preliminaries/correlation.tex
@@ -1,10 +1,12 @@
 \section{Data correlation}
 \label{sec:correlation}
+\kat{Please add some introduction to each section, presenting what you will discuss afterwards, and link it somehow to what was already discussed.}
+
 \subsection{Types of correlation}
 \label{subsec:cor-types}
-The most prominent types of correlation might be:
+The most prominent types of correlation are:
 \begin{itemize}
   \item \emph{Temporal}~\cite{wei2006time}---appearing in observations (i.e.,~values) of the same object over time.
@@ -15,7 +17,7 @@ The most prominent types of correlation might be:
 Contrary to one-dimensional correlation, spatial correlation is multi-dimensional and multi-directional, and can be measured by indicators (e.g.,~\emph{Moran's I}~\cite{moran1950notes}) that reflect the \emph{spatial association} of the concerned data.
 Spatial autocorrelation has its foundations in the \emph{First Law of Geography} stating that ``everything is related to everything else, but near things are more related than distant things''~\cite{tobler1970computer}.
 A positive spatial autocorrelation indicates that similar data are \emph{clustered}, a negative that data are dispersed and are close to dissimilar ones, and when close to zero, that data are \emph{randomly arranged} in space.
-
+\kat{I still do not like this focus on spatial correlation.. maybe remove it totally? we only consider temporal correlation in the main work in any case.}
 \subsection{Extraction of correlation}
 \label{subsec:cor-ext}
@@ -30,7 +32,7 @@ Some common stochastic processes modeling techniques include:
 \begin{itemize}
   \item \emph{Conditional probabilities}~\cite{allan2013probability}---probabilities of events in the presence of other events.
   \item \emph{Conditional Random Fields} (CRFs)~\cite{lafferty2001conditional}---undirected graphs encoding conditional probability distributions.
-  \item \emph{Markov processes}~\cite{rogers2000diffusions}---stochastic processes for which the conditional probability of their future states depends only on the present state and it is independent of its previous states (\emph{Markov assumption}).
+  \item \emph{Markov processes}~\cite{rogers2000diffusions}---stochastic processes for which the conditional probability of their future states depends only on the present state and it is independent of its previous states (\emph{Markov assumption}). We highlight the following two sub-categories:
   \begin{itemize}
     \item \emph{Markov chains}~\cite{gagniuc2017markov}---sequences of possible events whose probability depends on the state attained in the previous event.
     \item \emph{Hidden Markov Models} (HMMs)~\cite{baum1966statistical}---statistical Markov models of Markov processes with unobserved states.
@@ -45,7 +47,7 @@ Correlation appears in dependent data:
 \begin{itemize}
   \item within one data set, and
-  \item within one data set and among one data set and previous data releases, and/or other external sources~\cite{kifer2011no, chen2014correlated, liu2016dependence, zhao2017dependent}.
+  \item among one data set and previous data releases, and/or other external sources~\cite{kifer2011no, chen2014correlated, liu2016dependence, zhao2017dependent}.
 \end{itemize}
 In the former case, data tuples and data values within a data set may be correlated, or linked in such a way that information about one person can be inferred even if the person is absent from the database.