Skip to content

Commit 8d71495

Browse files
dkooclaudeCopilot
authored
feat(access-control): add group subscription identifier to metadata (#4697)
* fix(access-control): filter out nonexistent and non-group subs from user results * chore: update deprecated `filter_enabled_fields` method * feat(access-control): add Content_Access_Group ESP metadata field Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(access-control): include owned group sub names in Content_Access_Group Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(access-control): add `group` GA4 param for content access groups Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix(access-control): address Copilot review on PR #4697 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(access-control): address review blockers + extract shared name helpers - Restore Institution::user_matches_institution() IP-cache tightening lost in trunk merge: only treat the request as uncached when the visitor is the user being evaluated (or the cache-bypass cookie is set). Prevents the requestor's IP from leaking to non-current users during background metadata sync. - Guard get_group_labels() institution branch against non-array $value to avoid a PHP 8 TypeError when an institution rule has empty/scalar value. - Decouple Group_Subscription::get_group_subscriptions_for_user() from is_group_subscription()'s My-Account/Memberships side effect. Data-layer callers must always see the canonical group-enabled state. - Extract Group_Subscription::get_group_names_for_user() and Institution::get_matching_names_for_user() helpers, each with per-request static memoization. Use html_entity_decode( ENT_QUOTES | ENT_HTML5 ) so names with non-basic entities (&mdash;, &#8217;, etc.) decode cleanly for ESP/GA4 transport. Use WooCommerce_Connection::get_active_subscriptions_for_user() for the owned path so status + gifting filters live in one place. - Update Content_Gate metadata get_group_labels() and GoogleSiteKit::get_user_group_names() to delegate to the shared helpers. - GA4 helper now early-returns when ! Reader_Activation::is_user_reader( $user ), matching the framing of the surrounding is_reader / is_subscriber params. - Replace brittle one-shot filter ordering in test_subscription_filter_source with an unconditional filter + unregistered product so the strict per-product lookup naturally falls through. - Add regression tests: malformed institution rule values (data provider) and cross-user IP leak prevention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(access-control): address PR #4697 review round 2 T10 — Restore Memberships parity at the My-Account UI layer - Gate Group_Subscription_MyAccount::inject_member_group_subscriptions() on ! Memberships::is_active() so the data-layer decoupling done in 7426ab3 doesn't re-expose member group subs in the WC subscriptions list on Memberships sites. Mirrors the suppression that previously lived inside is_group_subscription(), but at the UI layer where it belongs. T7 — $strict filter contract caveat - Document on Access_Rules::has_active_subscription() that $strict only constrains the built-in ownership / group-membership checks. The newspack_access_rules_has_active_subscription filter is always applied; filter authors must opt into the 4th $strict arg and respect it, or callers relying on $strict (e.g. Content_Gate source labels) may misclassify filter-granted access as local ownership. T8 / T9 — Helper cache lifecycle + ID accessors - Move Group_Subscription and Institution per-request name caches to class-level statics so they can be reset. - Add public reset_cache() / reset_matching_cache() methods. - Hook reset on subscription status changes and group-member user-meta add/update/delete events so long-running CLI workers don't serve stale data across jobs. - Refactor each helper to a shared get_settings_map_for_user() / get_matching_map_for_user() that returns [id => decoded_name]; the existing get_*_names_for_user() functions and the new get_*_ids_for_user() accessors project from it. Settings are read once per candidate sub. - Institution cache key now includes get_current_user_id() so the IP-rule branch's current-user dependency doesn't produce stale hits. - Docblock note on the gifting-filter asymmetry between owned and member subscription paths. T11 — Anonymize GA4 group param - GoogleSiteKit::get_user_group_labels() now emits "Group {sub_id}" / "Institution {inst_id}" identifiers instead of display names. The unnamed-group fallback synthesizes a name from the owner's billing full name, so sending names to GA4 would leak PII for every member of an unnamed group. ESP path keeps human-readable names — only GA4 is anonymized. T14 — reader_data source label - get_source_labels() returns [ 'reader_data' ] for the reader_data rule slug. Previously a reader_data-only access rule yielded Content_Access=Yes but empty Content_Access_Source. T12 — Use get_post_field() for institution title transport - Replace get_the_title( $inst_id ) so the value going to GA4/ESP isn't run through the_title filters (texturization or third-party hooks that can introduce entities or markup). T13 — Document cookie-branch IP cross-attribution caveat - One-line caveat on IP_Access_Rule::is_cookie_set() disjunct: it still treats any non-current $user_id as uncached when the current visitor carries the cache-bypass cookie. Narrow pre-existing edge case. T18 — Direct $resolver(...) invocation - Drop call_user_func() in Content_Gate::collect_labels(); PHP 7+ invokes array callables directly. (Reverts the Copilot autofix in d350ee4.) T5 / T6 — Test isolation - Track Institution::create() post IDs in test classes that use them and delete them in tear_down(). Institution::create() inserts real posts not tracked by $this->factory, so they were leaking into later tests. - Reset Group_Subscription / Institution caches in set_up and tear_down. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: improve docblock for get_user_group_labels memoization --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
1 parent df92456 commit 8d71495

11 files changed

Lines changed: 1477 additions & 30 deletions

File tree

includes/content-gate/class-access-rules.php

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -355,11 +355,22 @@ public static function get_subscription_products_options() {
355355
* Whether the user has an active subscription for one of the given products.
356356
* Also checks if the user is a member of a group subscription with the required products.
357357
*
358+
* Note: `$strict` only constrains the built-in ownership / group-membership checks.
359+
* The `newspack_access_rules_has_active_subscription` filter is always applied and
360+
* its return value is the final result, so a third-party filter callback can grant
361+
* access even when `$strict` is true. Filter authors should opt in to the 4th `$strict`
362+
* arg (`accepted_args` >= 4) and respect it — e.g., short-circuit and return
363+
* `$has_subscription` unchanged when `$strict` is true and the access claim isn't
364+
* strictly an owned subscription. Otherwise callers using `$strict` to distinguish
365+
* owner-vs-member access (e.g., `Content_Gate` source labels) may misclassify
366+
* filter-granted access as local ownership.
367+
*
358368
* @param int $user_id User ID.
359369
* @param array $product_ids Required product IDs.
370+
* @param bool $strict If true, only consider active subscriptions owned by $user_id (ignore group subscription memberships).
360371
* @return bool
361372
*/
362-
public static function has_active_subscription( $user_id, $product_ids ) {
373+
public static function has_active_subscription( $user_id, $product_ids, $strict = false ) {
363374
$has_subscription = false;
364375

365376
// Check user's own subscriptions.
@@ -368,7 +379,7 @@ public static function has_active_subscription( $user_id, $product_ids ) {
368379
}
369380

370381
// Check group subscriptions the user is a member of.
371-
if ( ! $has_subscription && function_exists( 'wcs_get_subscription' ) ) {
382+
if ( ! $strict && ! $has_subscription && function_exists( 'wcs_get_subscription' ) ) {
372383
$group_subscriptions = Group_Subscription::get_group_subscriptions_for_user( $user_id );
373384
foreach ( $group_subscriptions as $subscription ) {
374385
if ( ! $subscription || ! $subscription->has_status( WooCommerce_Connection::ACTIVE_SUBSCRIPTION_STATUSES ) ) {
@@ -395,8 +406,9 @@ public static function has_active_subscription( $user_id, $product_ids ) {
395406
* @param bool $has_subscription Whether the user has an active subscription.
396407
* @param int $user_id User ID.
397408
* @param array $product_ids Required product IDs.
409+
* @param bool $strict If true, only consider active subscriptions owned by $user_id (ignore group subscription memberships).
398410
*/
399-
return apply_filters( 'newspack_access_rules_has_active_subscription', $has_subscription, $user_id, $product_ids );
411+
return apply_filters( 'newspack_access_rules_has_active_subscription', $has_subscription, $user_id, $product_ids, $strict );
400412
}
401413

402414
/**

includes/content-gate/class-institution.php

Lines changed: 123 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -235,6 +235,113 @@ public static function evaluate( $user_id, $institution_ids ) {
235235
return false;
236236
}
237237

238+
/**
239+
* Per-request cache of [inst_id => decoded_name] maps.
240+
*
241+
* @var array<string,array<int,string>>
242+
*/
243+
private static $names_cache = [];
244+
245+
/**
246+
* Reset the per-request matching-names cache.
247+
*
248+
* Tests, CLI workers, and invalidation hooks call this to bust the memoization in
249+
* `get_matching_names_for_user()` / `get_matching_ids_for_user()`. Distinct from
250+
* `invalidate_cache()`, which clears the underlying institutions transient.
251+
*/
252+
public static function reset_matching_cache() {
253+
self::$names_cache = [];
254+
}
255+
256+
/**
257+
* Get the sorted, deduplicated names of institutions whose rules a user matches.
258+
*
259+
* Memoized per request via {@see self::get_matching_map_for_user()} — see that helper
260+
* for cache scope, the IP/cookie context dependency, and invalidation.
261+
*
262+
* @param int $user_id User ID.
263+
* @param array|null $institution_filter Optional list of institution post IDs. If non-empty, only
264+
* institutions whose ID is in the list are considered.
265+
* Pass null or an empty array to scan every cached institution.
266+
*
267+
* @return string[] Sorted, deduplicated institution names.
268+
*/
269+
public static function get_matching_names_for_user( $user_id, $institution_filter = null ) {
270+
$map = self::get_matching_map_for_user( $user_id, $institution_filter );
271+
$names = array_values( array_unique( array_values( $map ) ) );
272+
sort( $names, SORT_NATURAL | SORT_FLAG_CASE );
273+
return $names;
274+
}
275+
276+
/**
277+
* Get the IDs of institutions whose rules a user matches.
278+
*
279+
* Returns institution post IDs. Shares the per-request cache with
280+
* {@see self::get_matching_names_for_user()}. Suitable for callers that want a stable,
281+
* non-PII identifier (e.g., GA4 anonymized labels) and don't need the display name.
282+
*
283+
* @param int $user_id User ID.
284+
* @param array|null $institution_filter Optional list of institution post IDs (same semantics
285+
* as {@see self::get_matching_names_for_user()}).
286+
*
287+
* @return int[] Sorted institution post IDs.
288+
*/
289+
public static function get_matching_ids_for_user( $user_id, $institution_filter = null ) {
290+
$ids = array_keys( self::get_matching_map_for_user( $user_id, $institution_filter ) );
291+
sort( $ids, SORT_NUMERIC );
292+
return $ids;
293+
}
294+
295+
/**
296+
* Build the [inst_id => decoded_name] map for the user, memoized per request.
297+
*
298+
* Cache key includes `get_current_user_id()` because `user_matches_institution()`'s IP
299+
* branch is context-dependent on the current visitor (see the comment in that method).
300+
* Without that the same `$user_id` could legitimately resolve to different sets across
301+
* requests in a long-running worker that swaps the current user.
302+
*
303+
* IDs are read with `get_post_field( 'post_title', ... )` rather than `get_the_title( ... )`
304+
* so the value going to GA4/ESP is a raw post title, not one that's been through
305+
* `the_title` filters (texturization, third-party plugins) that can introduce entities
306+
* or markup.
307+
*
308+
* @param int $user_id User ID.
309+
* @param array|null $institution_filter Optional list of institution post IDs.
310+
*
311+
* @return array<int,string> Map of institution post ID to decoded post title.
312+
*/
313+
private static function get_matching_map_for_user( $user_id, $institution_filter = null ) {
314+
$user_id = (int) $user_id;
315+
316+
// Normalize the filter so [], null, and unsorted/duplicate inputs share a cache key.
317+
$normalized_filter = is_array( $institution_filter ) && ! empty( $institution_filter )
318+
? array_values( array_unique( array_map( 'absint', $institution_filter ) ) )
319+
: null;
320+
if ( null !== $normalized_filter ) {
321+
sort( $normalized_filter, SORT_NUMERIC );
322+
}
323+
// Include the current user in the cache key because the IP-rule branch of
324+
// user_matches_institution() short-circuits when $user_id !== get_current_user_id().
325+
$cache_key = $user_id . '|' . get_current_user_id() . '|' . ( null === $normalized_filter ? '' : implode( ',', $normalized_filter ) );
326+
if ( isset( self::$names_cache[ $cache_key ] ) ) {
327+
return self::$names_cache[ $cache_key ];
328+
}
329+
330+
$map = [];
331+
foreach ( self::get_cached_institutions() as $inst_id => $rules ) {
332+
$inst_id = (int) $inst_id;
333+
if ( null !== $normalized_filter && ! in_array( $inst_id, $normalized_filter, true ) ) {
334+
continue;
335+
}
336+
if ( self::user_matches_institution( $user_id, $rules ) ) {
337+
$map[ $inst_id ] = html_entity_decode( (string) \get_post_field( 'post_title', $inst_id ), ENT_QUOTES | ENT_HTML5, 'UTF-8' );
338+
}
339+
}
340+
341+
self::$names_cache[ $cache_key ] = $map;
342+
return $map;
343+
}
344+
238345
/**
239346
* Check if a user matches an institution's rules (OR logic).
240347
*
@@ -254,12 +361,22 @@ public static function user_matches_institution( $user_id, $rules, $uncached = f
254361

255362
if ( ! empty( $rules['ip_range'] ) ) {
256363
// IP evaluation is page-cache-safe only when the response would be uncached anyway:
257-
// the caller flagged it as such, the visitor is logged in, or the visitor carries
258-
// the IP-access bypass cookie. A first-time anonymous on-campus visitor landing
259-
// directly on a gated post matches none of these and will see the gate — they must
260-
// first complete the IP check at /institutional-access (or ?institutional-access=1)
261-
// to set the cookie before subsequent gated requests can evaluate their IP.
262-
$is_uncached = $uncached || ! empty( $user_id ) || IP_Access_Rule::is_cookie_set();
364+
// the caller flagged it as such, the *current visitor* is logged in (so the IP we
365+
// would read is theirs), or the visitor carries the IP-access bypass cookie. We
366+
// require $user_id === get_current_user_id() to avoid attributing the requestor's
367+
// IP to a different user during background metadata sync (admin/cron/webhook). A
368+
// first-time anonymous on-campus visitor landing directly on a gated post matches
369+
// none of these and will see the gate — they must first complete the IP check at
370+
// /institutional-access (or ?institutional-access=1) to set the cookie before
371+
// subsequent gated requests can evaluate their IP.
372+
//
373+
// Caveat: the IP_Access_Rule::is_cookie_set() disjunct accepts any non-current
374+
// $user_id when the *current visitor* carries the cache-bypass cookie, so a
375+
// cookie-bearing on-campus visitor can still cause the current request's IP to
376+
// be matched against another user's institution range. This is a narrow,
377+
// pre-existing edge case (the cookie is only set after the visitor has already
378+
// passed an institutional-access check on this site).
379+
$is_uncached = $uncached || ( ! empty( $user_id ) && (int) $user_id === get_current_user_id() ) || IP_Access_Rule::is_cookie_set();
263380
if ( $is_uncached && IP_Access_Rule::ip_matches_ranges( IP_Access_Rule::get_visitor_ip(), $rules['ip_range'] ) ) {
264381
return true;
265382
}

includes/plugins/class-teams-for-memberships.php

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@
1313
use Newspack\Reader_Activation\Sync\Woocommerce as Sync_WooCommerce;
1414
use Newspack\Reader_Activation\Sync\Metadata as Sync_Metadata;
1515
use Newspack\Reader_Activation\Contact_Sync;
16+
use Newspack\Reader_Activation\Integrations;
1617

1718
/**
1819
* Main class.
@@ -187,7 +188,11 @@ public static function handle_esp_sync_contact( $contact ) {
187188
return $contact;
188189
}
189190

190-
$filtered_enabled_fields = Sync_Metadata::filter_enabled_fields( [ 'woo_team' ] );
191+
$esp = Integrations::get_integration( 'esp' );
192+
if ( ! $esp ) {
193+
return $contact;
194+
}
195+
$filtered_enabled_fields = $esp->filter_enabled_outgoing_fields( [ 'woo_team' ] );
191196

192197
if ( empty( $contact['email'] ) ) {
193198
return $contact;

includes/plugins/google-site-kit/class-googlesitekit.php

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -260,6 +260,14 @@ function( $category ) {
260260
// If reader has any currently active non-donation subscriptions.
261261
$params['is_subscriber'] = empty( $reader_data['active_subscriptions'] ) ? 'no' : 'yes';
262262

263+
// Content access groups: anonymized identifiers for the user's active group
264+
// subscriptions and matching institutions. See get_user_group_labels() for
265+
// why we send IDs to GA4 rather than the human-readable names.
266+
if ( Content_Gate::is_newspack_feature_enabled() ) {
267+
$group_labels = self::get_user_group_labels( $current_user );
268+
$params['group'] = empty( $group_labels ) ? 'none' : implode( ', ', $group_labels );
269+
}
270+
263271
/**
264272
* Filters the custom parameters passed to GA4.
265273
*
@@ -268,6 +276,48 @@ function( $category ) {
268276
return apply_filters( 'newspack_ga4_custom_parameters', $params );
269277
}
270278

279+
/**
280+
* Build the GA4 `group` parameter value for a user.
281+
*
282+
* The value is a sorted, comma-delimited list of anonymized identifiers for
283+
* active group subscriptions the user owns or is a member of, plus institutions
284+
* whose rules the user matches via any means.
285+
*
286+
* We emit anonymized IDs (`Group {sub_id}`, `Institution {inst_id}`) rather than
287+
* publisher-facing display names because the unnamed-group fallback in
288+
* `Group_Subscription_Settings` synthesizes a name from the owner's billing full
289+
* name — sending that to GA4 would leak PII for every member of an unnamed group.
290+
* The ESP path keeps the human-readable names; only the GA4 surface is anonymized.
291+
*
292+
* Both lookups are delegated to per-request-memoized helpers in `Group_Subscription`
293+
* and `Institution`, so repeat calls within the same request are cheap. Memoization
294+
* is deliberately request-scoped because the institution branch is per-visitor.
295+
*
296+
* @param \WP_User $user The user to inspect.
297+
* @return string[] Sorted, deduplicated anonymized labels.
298+
*/
299+
private static function get_user_group_labels( $user ) {
300+
if ( ! $user || ! $user->ID ) {
301+
return [];
302+
}
303+
// Match the framing of the surrounding params (`is_reader`, `is_subscriber`):
304+
// only attribute groups to actual readers, not admins/editors.
305+
if ( ! Reader_Activation::is_user_reader( $user ) ) {
306+
return [];
307+
}
308+
$user_id = (int) $user->ID;
309+
310+
$labels = [];
311+
foreach ( Group_Subscription::get_group_ids_for_user( $user_id ) as $sub_id ) {
312+
$labels[] = 'Group ' . $sub_id;
313+
}
314+
foreach ( Institution::get_matching_ids_for_user( $user_id ) as $inst_id ) {
315+
$labels[] = 'Institution ' . $inst_id;
316+
}
317+
sort( $labels, SORT_NATURAL | SORT_FLAG_CASE );
318+
return $labels;
319+
}
320+
271321
/**
272322
* Filter the GA config to add custom parameters.
273323
*

includes/plugins/woocommerce-subscriptions/group-subscription/class-group-subscription-myaccount.php

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -284,6 +284,13 @@ public static function inject_member_group_subscriptions( $subscriptions, $user_
284284
if ( ! function_exists( 'is_account_page' ) || ! \is_account_page() ) {
285285
return $subscriptions;
286286
}
287+
// Don't add Group Subscription features to My Account when Woo Memberships
288+
// is active. TODO: Remove this once Access Control is fully released.
289+
// Mirrors the suppression that used to live in Group_Subscription::is_group_subscription(),
290+
// preserved here at the UI layer now that data-layer callers always see the canonical state.
291+
if ( Memberships::is_active() ) {
292+
return $subscriptions;
293+
}
287294
$existing_ids = array_keys( $subscriptions );
288295
$group_subscriptions = Group_Subscription::get_group_subscriptions_for_user( $user_id );
289296
foreach ( $group_subscriptions as $group_subscription ) {

0 commit comments

Comments
 (0)