Skip to content

Integrated the random allocation directory#414

Open
moshenfeld wants to merge 1 commit into
google:mainfrom
moshenfeld:random-alocation
Open

Integrated the random allocation directory#414
moshenfeld wants to merge 1 commit into
google:mainfrom
moshenfeld:random-alocation

Conversation

@moshenfeld
Copy link
Copy Markdown

Added a new pld/random_allocation package which implements the accounting algorithm proposed in "Efficient privacy loss accounting for subsampling and random allocation".
It contains public builders for Gaussian and explicit-realization random allocation PLDs in random_allocation_api.py.
It also wires the feature into dp_accounting as RandomAllocationDpEvent in dp_event.py, and makes PLDAccountant handle Gaussian inner events in pld_privacy_accountant.py.

@google-cla
Copy link
Copy Markdown

google-cla Bot commented Apr 30, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Copy Markdown
Contributor

@arung54 arung54 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Moshe,

No comments on the logic / implementation. Most of these comments are about style (and stuff that we will probably have to address anyway to pass the style checks) which are hopefully easy to fix. Some of these could have been prevented if I shared our style guide in advance, apologies for only pointing you to it during the review! If you haven't already, I'd also suggest fixing any issues raised by:

$ flake8 random_allocation/*.py
$ pyink random_allocation/*.py
$ pydocstyle --convention=google --add-ignore=D101,D102,D103,D105,D202,D402 random_allocation/*.py
$ pylint random_allocation/*.py
$ pytype random_allocation/*.py -k

Some top-level comments:

  • Since this library is already in a folder named 'random_allocation', consider renaming some of the libraries to be shorter (e.g. from 'random_allocation_convolution' to 'convolution') for brevity.
  • See https://google.github.io/styleguide/pyguide.html#22-imports - a lot of these libraries are importing methods, whereas the style guide suggests importing modules and then calling the method from the package name. Renaming to shorter module names might help with doing this while maintaining brevity.
  • For the test files, can you match the style of the other test files in dp_accounting here?:
    • Use absltest instead of pytest
    • Prefer absltest.TestCase or numpy.testing assertions over assert ..., since these will give more verbose failures.
    • Put "Test" at the end of test class names rather than beginning

"""Represents the random-allocation mechanism.
In this event, each element in the dataset is independently allocated to k
uniformly sampled calls to the mechanism out of t total calls.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference to the PLD accounting paper could be useful here? (Will also get you more visibility on the paper :) )

)
self._pld = self._pld.compose(truncated_subsampled_gaussian_pld)
return None
elif isinstance(event, dp_event.RandomAllocationDpEvent):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a simple test for this event? Similar to e.g "test_gaussian_basic" in pld_privacy_accountant_test.

self._pld = self._pld.compose(truncated_subsampled_gaussian_pld)
return None
elif isinstance(event, dp_event.RandomAllocationDpEvent):
ra_event: dp_event.RandomAllocationDpEvent = event
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the re-naming needed here? The isinstance() should let the interpreter know there is a subevent to access?

If it's just for clarity (e.g. to avoid calling it 'event.event') maybe better to rename the field in RandomAllocationDpEvent.

assert_not_contains_named_tuples(reconstructed)
self.assertEqual(event, reconstructed)

def test_random_allocation_event_has_no_discretization_override(self):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a comment on why this test is necessary would be useful?


import numpy as np
from dp_accounting.pld.common import compute_self_convolve_bounds
from numba import njit
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to keep numba an optional dependency in the overall library, while still allowing users who have it installed to benefit from the njit. Can you change to something like this?

try:
    from numba import njit
    _HAS_NUMBA = True
except ImportError:
    _HAS_NUMBA = False

def optional_njit():
    if _HAS_NUMBA:
        return njit(cache=True)
    else:
        return lambda x: x

if bound_type == BoundType.DOMINATES:
# Avoid inflating the grid when the target is finer than the original one.
effective_disc = max(realization.step, loss_discretization)
coarsened_base = rediscretize_dist(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to another comment, here and elsewhere it seems like our type checker will guarantee a DenseDiscreteDist, in which case the additional type check might not be necessary (but maybe you want to check spacing anyway)

) -> DenseDiscreteDist:
"""Prepare add-direction factors from a loss-space realization.

Algorithm 2 (`rand-alloc-add`) in Appendix C.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add link to paper here

) -> DenseDiscreteDist:
"""Exponentiation by squaring-based self-convolution using a provided convolve function.

Algorithm 3 (`self-conv`) in Appendix C.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add link to paper here



# =============================================================================
# Internal Helper Functions
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these to above their first usage, including reordering them (e.g. _expand_to_grid above _align_distributions_to_union_grid)



def calc_pld_dual(realization: PLDRealization) -> PLDRealization:
"""Compute the paper PLD dual ``D(L)`` (Definition 3.1).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to paper here

@ryan112358
Copy link
Copy Markdown

A few minor comments

@onlinedigitalservicexplacebank
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants