lmdk-expt: New results and discussion

Manos Katsomallos 2021-10-09 12:57:31 +02:00
parent 377ac4cb29
commit edb98f736d
4 changed files with 22 additions and 9 deletions

BIN
graphics/copenhagen.pdf Normal file

Binary file not shown.

Binary file not shown.


@@ -14,28 +14,41 @@ Whereas, when each timestamp corresponds to a {\thething} we consider and protec
\subsection{Experiments} \subsection{Experiments}
\label{sec:lmdk-expt}
\paragraph{Budget allocation schemes} \subsubsection{Budget allocation schemes}
Figure~\ref{fig:real} exhibits the performance of the three mechanisms: Skip, Uniform, and Adaptive. Figure~\ref{fig:real} exhibits the performance of the three mechanisms: Skip, Uniform, and Adaptive.
\begin{figure}[htp] \begin{figure}[htp]
\centering \centering
\subcaptionbox{Geolife\label{fig:geolife}}{% \subcaptionbox{Copenhagen\label{fig:copenhagen}}{%
\includegraphics[width=.5\linewidth]{geolife}% \includegraphics[width=.5\linewidth]{copenhagen}%
}%
\hspace{\fill}
\subcaptionbox{HUE\label{fig:hue}}{%
\includegraphics[width=.5\linewidth]{hue}%
}% }%
\subcaptionbox{T-drive\label{fig:t-drive}}{% \subcaptionbox{T-drive\label{fig:t-drive}}{%
\includegraphics[width=.5\linewidth]{t-drive}% \includegraphics[width=.5\linewidth]{t-drive}%
}% }%
\caption{The mean absolute error (in meters) of the released data for different {\thethings} percentages.} \caption{The mean absolute error (a)~as a percentage, (b)~in kWh, and (c)~in meters of the released data for different {\thethings} percentages.}
\label{fig:real} \label{fig:real}
\end{figure} \end{figure}
For the Geolife data set (Figure~\ref{fig:geolife}), Skip has the best performance (measured in Mean Absolute Error, in meters) because it invests the most budget overall at every regular event, by approximating the {\thething} data based on previous releases. % For the Geolife data set (Figure~\ref{fig:geolife}), Skip has the best performance (measured in Mean Absolute Error, in meters) because it invests the most budget overall at every regular event, by approximating the {\thething} data based on previous releases.
Due to the data set's high density (every $1$--$5$ seconds or every $5$--$10$ meters per point) approximating constantly has a low impact on the data utility. % Due to the data set's high density (every $1$--$5$ seconds or every $5$--$10$ meters per point) approximating constantly has a low impact on the data utility.
On the contrary, the lower density of the T-drive data set (Figure~\ref{fig:t-drive}) has a negative impact on the performance of Skip. % On the contrary, the lower density of the T-drive data set (Figure~\ref{fig:t-drive}) has a negative impact on the performance of Skip.
In the T-drive data set, the Adaptive mechanism outperforms the Uniform one by $10$\%--$20$\% for all {\thethings} percentages greater than $0$ and by more than $20$\% the Skip one. For the Copenhagen data set (Figure~\ref{fig:copenhagen}), Adaptive has a constant overall performance and performs best for $0$, $60$, and $80$\% {\thethings}.
In general, we can claim that the Adaptive is the best performing mechanism, if we take into consideration the drawbacks of the Skip mechanism mentioned in Section~\ref{subsec:lmdk-mechs}. Moreover, designing a data-dependent sampling scheme would possibly result in better results for Adaptive. The Skip model excels, compared to the others, at cases where it needs to approximate a lot ($100$\%).
The combination of the low range in HUE ($[0.28, 4.45]$\,kWh with an average of $0.88$\,kWh) and the large scale in the Laplace mechanism results in a low mean absolute error for Skip (Figure~\ref{fig:hue}).
In general, a scheme that favors approximation over noise injection would achieve a better performance in this case.
However, the Adaptive model performs far better than Uniform and strikes a balance between event- and user-level protection for all {\thethings} percentages.
In the T-drive data set (Figure~\ref{fig:t-drive}), the Adaptive mechanism outperforms the Uniform one by $10$\%--$20$\% for all {\thethings} percentages greater than $40$\%, and the Skip one by more than $20$\%.
The low density (average distance of $623$ meters) of the T-drive data set has a negative impact on the performance of Skip.
In general, we can claim that Adaptive is the most reliable and best-performing mechanism with minimal tuning, if we take into consideration the drawbacks of the Skip mechanism mentioned in Section~\ref{subsec:lmdk-mechs}.
Moreover, designing a data-dependent sampling scheme would possibly yield better results for Adaptive.
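
As a rough sanity check of the HUE observation above, and assuming the standard Laplace mechanism, the expected absolute error of a single noisy release equals the noise scale.
The symbols $\Delta$ (sensitivity), $\varepsilon$ (budget spent on the release), and $b$ (noise scale) are notation assumed here for illustration and are not taken from this section.
For $X \sim \mathrm{Lap}(0, b)$ with $b = \Delta / \varepsilon$:
\[
  \mathrm{E}\,|X|
  = \int_{-\infty}^{\infty} |x| \, \frac{1}{2b} \, e^{-|x|/b} \, dx
  = \frac{1}{b} \int_{0}^{\infty} x \, e^{-x/b} \, dx
  = b
  = \frac{\Delta}{\varepsilon} .
\]
Under this sketch, whenever $b$ exceeds the typical difference between consecutive readings, which is plausible for a narrow-range data set such as HUE, reusing a previous release would incur a smaller error than injecting fresh noise.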
\paragraph{Temporal distance and correlation} \paragraph{Temporal distance and correlation}