
Commit 15f4d44

Refactor explanation of dominance rules and examples
1 parent c8fd3af commit 15f4d44

1 file changed

Lines changed: 2 additions & 2 deletions

File tree

chapters/04/_04-magnitude-tabular-data-02-Disclosure-Control-Concepts-for-Magnitude-Tabular-Data.qmd

@@ -133,7 +133,7 @@ When we replace a $(2,k)$‑dominance rule, by a $p\%$‑rule, the natural choic
 
 Thus, a $(2,80)$‑dominance rule would be replaced by a $p\%$‑rule with $p = 25$, a $(2,95)$‑dominance rule by a $p\%$‑rule with $p = 5.26$.
 
-If we also derive $p$ from this formula, when replacing a $(1,k)$‑dominance rule, we will obtain a much larger number of sensitive cells. In addition to the cells which are unsafe according to the $(1,k)$-dominance rule which will then also be unsafe according to the $p\%$‑rule, there will be cells which were safe according to the $(1,k)$‑dominance rule, but are not safe according to the $p\%$‑rule, because the rule correctly considers the insider knowledge of a large second largest contributor. We could then put up with this increase in the number of sensitive cells. Alternatively, we could consider the number of sensitive cells that we used to assign (with the $(1,k)$-dominance rule) as a kind of a maximum-prize we are prepared to 'pay' for data protection. In that case we will reduce the parameter $p$. The effect will be that some of the cells we used to consider as sensitive according to the $(1,k)$-dominance rule will now not be sensitive. But this would be justified because those cells are less sensitive as the cells which are unsafe according to the $p\%$-rule, but are not according to the former $(1,k)$-dominance rule, as illustrated above by Example 1.
+If we also derive $p$ from this formula, when replacing a $(1,k)$‑dominance rule, we will obtain a much larger number of sensitive cells. In addition to the cells which are unsafe according to the $(1,k)$-dominance rule which will then also be unsafe according to the $p\%$‑rule, there will be cells which were safe according to the $(1,k)$‑dominance rule, but are not safe according to the $p\%$‑rule, because the rule correctly considers the insider knowledge of a large second largest contributor. We could then put up with this increase in the number of sensitive cells. Alternatively, we could consider the number of sensitive cells that we used to assign (with the $(1,k)$-dominance rule) as a kind of a maximum-price we are prepared to 'pay' for data protection. In that case we will reduce the parameter $p$. The effect will be that some of the cells we used to consider as sensitive according to the $(1,k)$-dominance rule will now not be sensitive. But this would be justified because those cells are less sensitive as the cells which are unsafe according to the $p\%$-rule, but are not according to the former $(1,k)$-dominance rule, as illustrated above by Example 1.
 
 :::{.callout-note appearance="simple"}
 **Example 2**
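As an aside, the replacement values quoted in this hunk ($(2,80) \rightarrow p = 25$, $(2,95) \rightarrow p = 5.26$) are consistent with deriving $p$ as $100\,(100 - k)/k$. A minimal Python sketch of that conversion (the function name is hypothetical, not from the chapter):

```python
def p_from_dominance_k(k: float) -> float:
    """Derive the p%-rule parameter corresponding to a (2,k)-dominance
    rule via p = 100 * (100 - k) / k.  Hypothetical helper, matching
    the worked values quoted in the text."""
    return 100.0 * (100.0 - k) / k

# The two replacements from the text: (2,80) -> p = 25, (2,95) -> p = 5.26
print(p_from_dominance_k(80))            # 25.0
print(round(p_from_dominance_k(95), 2))  # 5.26
```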
@@ -178,7 +178,7 @@ $$
 
 \
 ***The $(p,q)$-rule***\
-A well known extension of the $p\%$-rule is the so called prior‑posterior $(p,q)$‑rule. With the extended rule, one can formally account for general knowledge about individual contributions assumed to be around *prior* to the publication, in particular that the second largest contributor can estimate the smaller contributions $X_{R} = \sum_{ i > 2 } x_{1}$ to within $q\%$. An aggregate is then considered unsafe when the second largest respondent could estimate the largest contribution $x_{1}$ to within $p$ percent of $x_{1}$ , by subtracting her own contribution and this estimate ${\hat{X}}_{R}$ from the cell total, *i.e.* when $|\left( X - x_{2} \right) - x_{1} - {\hat{X}}_{R}| < \frac{p}{100} x_{1}$. Because $\left( X - x_{2} \right) - x_{1} = X_{R}$, the left hand side is assumed to be less than $\frac{q}{100} X_{R}$. So the aggregate is considered to be sensitive, if $X_{R} < \frac{p}{q} x_{1}$. Evidently, it is actually the ratio $\frac{p}{q}$ which determines which cells are considered safe, or unsafe. Therefore, any $(p,q)$‑rule with $q < 100$ can also be expressed as $( p^*, q^*)$‑rule, with $q^* = 100$ and
+A well known extension of the $p\%$-rule is the so called prior‑posterior $(p,q)$‑rule. With the extended rule, one can formally account for general knowledge about individual contributions assumed to be around *prior* to the publication, in particular that the second largest contributor can estimate the smaller contributions $X_{R} = \sum_{ i > 2 } x_{i}$ to within $q\%$. An aggregate is then considered unsafe when the second largest respondent could estimate the largest contribution $x_{1}$ to within $p$ percent of $x_{1}$ , by subtracting her own contribution and this estimate ${\hat{X}}_{R}$ from the cell total, *i.e.* when $|\left( X - x_{2} \right) - x_{1} - {\hat{X}}_{R}| < \frac{p}{100} x_{1}$. Because $\left( X - x_{2} \right) - x_{1} = X_{R}$, the left hand side is assumed to be less than $\frac{q}{100} X_{R}$. So the aggregate is considered to be sensitive, if $X_{R} < \frac{p}{q} x_{1}$. Evidently, it is actually the ratio $\frac{p}{q}$ which determines which cells are considered safe, or unsafe. Therefore, any $(p,q)$‑rule with $q < 100$ can also be expressed as $( p^*, q^*)$‑rule, with $q^* = 100$ and
 $$
 p^* := 100 \frac{p}{q}
 $${#eq-p-star}
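The $(p,q)$‑rule paragraph changed here reduces to the test $X_{R} < \frac{p}{q} x_{1}$, with the equivalent $p\%$‑rule parameter $p^* = 100\,p/q$. A minimal sketch under that reading (function names and the sample cell are hypothetical, for illustration only):

```python
def pq_rule_sensitive(contributions, p, q):
    """Prior-posterior (p,q)-rule: with contributions sorted in
    descending order, the cell is sensitive when the remainder
    X_R (everything beyond the two largest contributions) satisfies
    X_R < (p / q) * x_1.  Hypothetical helper following the text."""
    x = sorted(contributions, reverse=True)
    x_r = sum(x[2:])                 # X_R: all but x_1 and x_2
    return x_r < (p / q) * x[0]

def p_star(p, q):
    """Equivalent p%-rule parameter p* = 100 * p / q (eq-p-star)."""
    return 100.0 * p / q

# Hypothetical cell: x_1 = 100, remainder 10 + 5 = 15 < (10/50)*100 = 20
print(pq_rule_sensitive([100, 50, 10, 5], p=10, q=50))  # True
print(p_star(10, 50))                                   # 20.0
```

Note that setting $q = 100$ (no prior knowledge beyond the published total) makes `p_star(p, 100) == p`, recovering the plain $p\%$‑rule as the special case described in the text.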
