evaluation: Minor corrections
This commit is contained in:
parent
34404faeed
commit
d5c39c4e42
@ -43,7 +43,7 @@ We take into account only the temporal order of the points and the position of r
|
||||
\subsection{Configurations}
|
||||
\label{subsec:eval-conf}
|
||||
|
||||
\subsubsection{{\Thethings}' percentage}
|
||||
\subsubsection{{\Thething} percentage}
|
||||
|
||||
For the Copenhagen data set, we achieve
|
||||
$0\%$ {\thethings} by considering an empty list of contact devices,
|
||||
@ -53,16 +53,16 @@ $60\%$ with $[181$, $182$, $192$, $195$, $196$, $201$, $203$, $207$, $221$, $230
|
||||
$80\%$ with $[260$, $282$, $287$, $289$, $290$, $291$, $308$, $311$, $318$, $323$, $324$, $330$, $334$, $335$, $344$, $350$, $353$, $355$, $357$, $358$, $361$, $363]$, and
|
||||
$100\%$ by including all of the possible contacts.
|
||||
|
||||
In HUE, we get $0$, $20$ $40$, $60$, $80$, and $100$ {\thethings} percentages by setting the energy consumption threshold below $0.28$, $1.12$, $0.88$, $0.68$, $0.54$, $4.45$kWh respectively.
|
||||
In HUE, we get $0$\%, $20$\% $40$\%, $60$\%, $80$\%, and $100$\% {\thethings} by setting the energy consumption threshold below $0.28$kWh, $1.12$kWh, $0.88$kWh, $0.68$kWh, $0.54$kWh, $4.45$kWh respectively.
|
||||
|
||||
In T-drive, we achieved the desired {\thethings} percentages by utilizing the method of Li et al.~\cite{li2008mining} for detecting stay points in trajectory data.
|
||||
In T-drive, we achieved the desired {\thething} percentages by utilizing the method of Li et al.~\cite{li2008mining} for detecting stay points in trajectory data.
|
||||
In more detail, the algorithm checks for each data item if each subsequent item is within a given distance threshold $\Delta l$ and measures the time period $\Delta t$ between the present point and the last subsequent point.
|
||||
We achieve $0$, $20$ $40$, $60$, $80$, and $100$ {\thethings} percentages by setting the ($\Delta l$ in meters, $\Delta t$ in minutes) pairs input to the stay point discovery method as [($0$, $1000$), ($2095$, $30$), ($2790$, $30$), ($3590$, $30$), ($4825$, $30$), ($10350$, $30$)].
|
||||
We achieve $0$\%, $20$\% $40$\%, $60$\%, $80$\%, and $100$\% {\thethings} by setting the ($\Delta l$ in meters, $\Delta t$ in minutes) pairs input to the stay point discovery method as [($0$, $1000$), ($2095$, $30$), ($2790$, $30$), ($3590$, $30$), ($4825$, $30$), ($10350$, $30$)].
|
||||
|
||||
We generated synthetic data with \emph{skewed} (the {\thethings} are distributed towards the beginning/end of the series), \emph{symmetric} (in the middle), \emph{bimodal} (both end and beginning), and \emph{uniform} (all over the time series) {\thething} distributions.
|
||||
In order to get {\thethings} with the above distribution features, we generate probability distributions with appropriate characteristics and sample from them, without replacement, the desired number of points.
|
||||
%The generated distributions are representative of the cases that we wish to examine during the experiments.
|
||||
For example, for a left-skewed {\thethings} distribution we would utilize a truncated distribution resulting from the restriction of the domain of a distribution to the beginning and end of the time series with its location shifted to the center of the right half of the series.
|
||||
For example, for a left-skewed {\thething} distribution we would utilize a truncated distribution resulting from the restriction of the domain of a distribution to the beginning and end of the time series with its location shifted to the center of the right half of the series.
|
||||
For consistency, we calculate the scale parameter depending on the length of the series by setting it equal to the series' length over a constant.
|
||||
|
||||
Notice that in our experiments, in the cases when we have $0\%$ and $100\%$ of the events being {\thethings}, we get the same behavior as in event- and user-level privacy respectively.
|
||||
|
@ -25,20 +25,21 @@ Figure~\ref{fig:real} exhibits the performance of the three mechanisms: Skip, Un
|
||||
\subcaptionbox{T-drive\label{fig:t-drive}}{%
|
||||
\includegraphics[width=.5\linewidth]{evaluation/t-drive}%
|
||||
}%
|
||||
\caption{The mean absolute error (a)~as a percentage, (b)~in kWh, and (c)~in meters of the released data for different {\thethings} percentages.}
|
||||
\caption{The mean absolute error (a)~as a percentage, (b)~in kWh, and (c)~in meters of the released data for different {\thething} percentages.}
|
||||
\label{fig:real}
|
||||
\end{figure}
|
||||
|
||||
% For the Geolife data set (Figure~\ref{fig:geolife}), Skip has the best performance (measured in Mean Absolute Error, in meters) because it invests the most budget overall at every regular event, by approximating the {\thething} data based on previous releases.
|
||||
% Due to the data set's high density (every $1$--$5$ seconds or every $5$--$10$ meters per point) approximating constantly has a low impact on the data utility.
|
||||
% On the contrary, the lower density of the T-drive data set (Figure~\ref{fig:t-drive}) has a negative impact on the performance of Skip.
|
||||
For the Copenhagen data set (Figure~\ref{fig:copenhagen}), Adaptive has a constant overall performance and performs best for $0$, $60$, and $80$\% {\thethings}.
|
||||
The Skip model excels, compared to the others, at cases where it needs to approximate a lot ($100$\%).
|
||||
The combination of the low range in HUE ($[0.28$, $4.45]$ with an average of $0.88$kWh) and the large scale in the Laplace mechanism results in a low mean absolute error for Skip(Figure~\ref{fig:hue}).
|
||||
For the Copenhagen data set (Figure~\ref{fig:copenhagen}), Adaptive has a constant overall performance and performs best for $0$\%, $60$\%, and $80$\% {\thethings}.
|
||||
We notice that for $0$\% {\thethings}, it achieves better utility than the event-level protection.
|
||||
The Skip model excels, compared to the others, at cases where it needs to approximate $20$\%--$40$\% or $100$\% of the times.
|
||||
The combination of the low range in HUE ($[0.28$, $4.45]$ with an average of $0.88$kWh) and the large scale in the Laplace mechanism, results in a low mean absolute error for Skip (Figure~\ref{fig:hue}).
|
||||
In general, a scheme that favors approximation over noise injection would achieve a better performance in this case.
|
||||
However, the Adaptive model performs by far better than Uniform and strikes a nice balance between event- and user-level protection for all {\thethings} percentages.
|
||||
In the T-drive data set (Figure~\ref{fig:t-drive}), the Adaptive mechanism outperforms Uniform by $10$\%--$20$\% for all {\thethings} percentages greater than $40$ and Skip by more than $20$\%.
|
||||
The lower density (average distance of $623$ meters) of the T-drive data set has a negative impact on the performance of Skip.
|
||||
However, the Adaptive model performs by far better than Uniform and strikes a nice balance between event- and user-level protection for all {\thething} percentages.
|
||||
In the T-drive data set (Figure~\ref{fig:t-drive}), the Adaptive mechanism outperforms Uniform by $10$\%--$20$\% for all {\thething} percentages greater than $40$\% and Skip by more than $20$\%.
|
||||
The lower density (average distance of $623$m) of the T-drive data set has a negative impact on the performance of Skip.
|
||||
|
||||
In general, we can claim that the Adaptive is the most reliable and best performing mechanism with minimal tuning, if we take into consideration the drawbacks of the Skip mechanism mentioned in Section~\ref{subsec:lmdk-mechs}.
|
||||
Moreover, designing a data-dependent sampling scheme would possibly result in better results for Adaptive.
|
||||
|
Loading…
Reference in New Issue
Block a user