chapters/04/_04-magnitude-tabular-data-02-Disclosure-Control-Concepts-for-Magnitude-Tabular-Data.qmd
2 additions & 2 deletions
@@ -133,7 +133,7 @@ When we replace a $(2,k)$‑dominance rule, by a $p\%$‑rule, the natural choic
Thus, a $(2,80)$‑dominance rule would be replaced by a $p\%$‑rule with $p = 25$, a $(2,95)$‑dominance rule by a $p\%$‑rule with $p = 5.26$ .
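The two values quoted above can be checked with a few lines of Python. This is only a sketch: the formula itself sits outside the lines shown in this hunk, so the relation $p = 100 \cdot (100 - k) / k$ is an assumption here (it does reproduce both quoted values), and the function name `p_from_k` is hypothetical.

```python
def p_from_k(k: float) -> float:
    """Return p for a p%-rule replacing a (2,k)-dominance rule.

    Assumes the relation p = 100 * (100 - k) / k, which is not shown
    in the surrounding lines but matches the two values quoted there.
    """
    if not 0 < k < 100:
        raise ValueError("k must lie strictly between 0 and 100")
    return 100.0 * (100.0 - k) / k


print(round(p_from_k(80), 2))  # 25.0
print(round(p_from_k(95), 2))  # 5.26
```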
-
If we also derive $p$ from this formula, when replacing a $(1,k)$‑dominance rule, we will obtain a much larger number of sensitive cells. In addition to the cells which are unsafe according to the $(1,k)$-dominance rule which will then also be unsafe according to the $p\%$‑rule, there will be cells which were safe according to the $(1,k)$‑dominance rule, but are not safe according to the $p\%$‑rule, because the rule correctly considers the insider knowledge of a large second largest contributor. We could then put up with this increase in the number of sensitive cells. Alternatively, we could consider the number of sensitive cells that we used to assign (with the $(1,k)$-dominance rule) as a kind of a maximum-prize we are prepared to 'pay' for data protection. In that case we will reduce the parameter $p$. The effect will be that some of the cells we used to consider as sensitive according to the $(1,k)$-dominance rule will now not be sensitive. But this would be justified because those cells are less sensitive as the cells which are unsafe according to the $p\%$-rule, but are not according to the former $(1,k)$-dominance rule, as illustrated above by Example 1.
+
If we also derive $p$ from this formula, when replacing a $(1,k)$‑dominance rule, we will obtain a much larger number of sensitive cells. In addition to the cells which are unsafe according to the $(1,k)$-dominance rule which will then also be unsafe according to the $p\%$‑rule, there will be cells which were safe according to the $(1,k)$‑dominance rule, but are not safe according to the $p\%$‑rule, because the rule correctly considers the insider knowledge of a large second largest contributor. We could then put up with this increase in the number of sensitive cells. Alternatively, we could consider the number of sensitive cells that we used to assign (with the $(1,k)$-dominance rule) as a kind of a maximum-price we are prepared to 'pay' for data protection. In that case we will reduce the parameter $p$. The effect will be that some of the cells we used to consider as sensitive according to the $(1,k)$-dominance rule will now not be sensitive. But this would be justified because those cells are less sensitive as the cells which are unsafe according to the $p\%$-rule, but are not according to the former $(1,k)$-dominance rule, as illustrated above by Example 1.
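The trade-off described in this paragraph can be made concrete with a small sketch. It assumes the usual definitions of the two rules, which are not restated in this hunk: a cell is unsafe under $(1,k)$-dominance when $x_1 > \frac{k}{100} X$, and unsafe under the $p\%$-rule when $X - x_1 - x_2 < \frac{p}{100} x_1$. The function names and the two example cells are hypothetical and chosen only for illustration.

```python
def unsafe_dominance_1k(contributions, k):
    """(1,k)-dominance (assumed definition): unsafe when the largest
    contribution exceeds k% of the cell total."""
    total = sum(contributions)
    return max(contributions) > k / 100.0 * total


def unsafe_p_percent(contributions, p):
    """p%-rule (assumed definition): unsafe when the total minus the two
    largest contributions is less than p% of the largest contribution."""
    xs = sorted(contributions, reverse=True)
    x1 = xs[0]
    x2 = xs[1] if len(xs) > 1 else 0.0
    remainder = sum(xs) - x1 - x2
    return remainder < p / 100.0 * x1


cell_a = [55, 40, 5]     # large second contributor
cell_b = [85, 5, 5, 5]   # one dominant contributor

for name, cell in (("A", cell_a), ("B", cell_b)):
    print(
        name,
        unsafe_dominance_1k(cell, k=80),  # (1,80)-dominance
        unsafe_p_percent(cell, p=25),     # p derived as for (2,80)
        unsafe_p_percent(cell, p=10),     # reduced p
    )
```

With these illustrative numbers, cell A is safe under $(1,80)$-dominance but unsafe under the $p\%$-rule for both values of $p$, because the second largest contributor can estimate $x_1$ closely. Cell B is unsafe under $(1,80)$-dominance and under $p = 25$, but becomes safe once $p$ is reduced to 10.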
:::{.callout-note appearance="simple"}
**Example 2**
@@ -178,7 +178,7 @@ $$
\
***The $(p,q)$-rule***\
-
A well known extension of the $p\%$-rule is the so called prior‑posterior $(p,q)$‑rule. With the extended rule, one can formally account for general knowledge about individual contributions assumed to be around *prior* to the publication, in particular that the second largest contributor can estimate the smaller contributions $X_{R} = \sum_{ i > 2 } x_{1}$ to within $q\%$. An aggregate is then considered unsafe when the second largest respondent could estimate the largest contribution $x_{1}$ to within $p$ percent of $x_{1}$ , by subtracting her own contribution and this estimate ${\hat{X}}_{R}$ from the cell total, *i.e.* when $|\left( X - x_{2} \right) - x_{1} - {\hat{X}}_{R}| < \frac{p}{100} x_{1}$. Because $\left( X - x_{2} \right) - x_{1} = X_{R}$, the left hand side is assumed to be less than $\frac{q}{100} X_{R}$. So the aggregate is considered to be sensitive, if $X_{R} < \frac{p}{q} x_{1}$. Evidently, it is actually the ratio $\frac{p}{q}$ which determines which cells are considered safe, or unsafe. Therefore, any $(p,q)$‑rule with $q < 100$ can also be expressed as $( p^*, q^*)$‑rule, with $q^* = 100$ and
+
A well known extension of the $p\%$-rule is the so called prior‑posterior $(p,q)$‑rule. With the extended rule, one can formally account for general knowledge about individual contributions assumed to be around *prior* to the publication, in particular that the second largest contributor can estimate the smaller contributions $X_{R} = \sum_{ i > 2 } x_{i}$ to within $q\%$. An aggregate is then considered unsafe when the second largest respondent could estimate the largest contribution $x_{1}$ to within $p$ percent of $x_{1}$ , by subtracting her own contribution and this estimate ${\hat{X}}_{R}$ from the cell total, *i.e.* when $|\left( X - x_{2} \right) - x_{1} - {\hat{X}}_{R}| < \frac{p}{100} x_{1}$. Because $\left( X - x_{2} \right) - x_{1} = X_{R}$, the left hand side is assumed to be less than $\frac{q}{100} X_{R}$. So the aggregate is considered to be sensitive, if $X_{R} < \frac{p}{q} x_{1}$. Evidently, it is actually the ratio $\frac{p}{q}$ which determines which cells are considered safe, or unsafe. Therefore, any $(p,q)$‑rule with $q < 100$ can also be expressed as $( p^*, q^*)$‑rule, with $q^* = 100$ and