Federation Discovery and Entity Collection

This library provides a high-performance, specification-compliant toolkit for discovering entities within an OpenID Federation and interacting with Entity Collection Endpoints.

The functionality is split into two main operational modes:

  1. Federation Discovery — A top-down, recursive traversal of a federation hierarchy starting from a Trust Anchor.
  2. Entity Collection — A specialized protocol for optimized bulk-fetching of entities, featuring support for server-side filtering, sorting, and cursor-based pagination.

All components are integrated and accessible through the \SimpleSAML\OpenID\Federation facade.


Setup and Configuration

To enable federation discovery, initialize the Federation facade with a cache and (optionally) a logger.

<?php

declare(strict_types=1);

use SimpleSAML\OpenID\Federation;

$federationTools = new Federation(
    cache: $cache,              // \Psr\SimpleCache\CacheInterface (Highly Recommended)
    logger: $logger,            // \Psr\Log\LoggerInterface
    maxDiscoveryDepth: 10,      // Recursion limit for top-down traversal
);

Custom Entity Collection Store

By default, the library persists discovered entity payloads using the configured PSR-16 cache (CacheEntityCollectionStore). If no cache is provided, it falls back to an InMemoryEntityCollectionStore (ephemeral).

For production environments requiring persistent storage (e.g., Database, Redis), you should implement the EntityCollectionStoreInterface:

use SimpleSAML\OpenID\Federation\EntityCollection\EntityCollectionStoreInterface;

class MyDatabaseStore implements EntityCollectionStoreInterface 
{
    public function store(string $trustAnchorId, array $entities, int $ttl): void { /* ... */ }
    public function get(string $trustAnchorId): ?array { /* ... */ }
    public function clear(string $trustAnchorId): void { /* ... */ }
    public function storeLastUpdated(string $trustAnchorId, int $timestamp, int $ttl): void { /* ... */ }
    public function getLastUpdated(string $trustAnchorId): ?int { /* ... */ }
    public function clearLastUpdated(string $trustAnchorId): void { /* ... */ }
}

$federationTools = new Federation(
    entityCollectionStore: new MyDatabaseStore(),
);

Note

The store caches the JWT payload arrays of discovered entities. The JWS signatures and original JWT strings are managed by the EntityStatementFetcher, which handles its own caching and validation.
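
As a rough illustration of what a custom store has to do, here is a minimal, self-contained sketch of an array-backed store that expires entries on read. It is not part of the library. EntityCollectionStoreSketchInterface is a local stand-in that copies the method signatures listed above so the example runs on its own, ExpiringArrayStore and its injectable clock are hypothetical names, and treating $ttl as a number of seconds is an assumption. It needs PHP 8.0 or later.

```php
<?php

declare(strict_types=1);

// Local stand-in mirroring the signatures shown above (assumption: in a real
// project you would implement the library's EntityCollectionStoreInterface).
interface EntityCollectionStoreSketchInterface
{
    public function store(string $trustAnchorId, array $entities, int $ttl): void;
    public function get(string $trustAnchorId): ?array;
    public function clear(string $trustAnchorId): void;
    public function storeLastUpdated(string $trustAnchorId, int $timestamp, int $ttl): void;
    public function getLastUpdated(string $trustAnchorId): ?int;
    public function clearLastUpdated(string $trustAnchorId): void;
}

// Hypothetical array-backed store. $ttl is assumed to be in seconds.
final class ExpiringArrayStore implements EntityCollectionStoreSketchInterface
{
    /** @var array<string, array{value: mixed, expiresAt: int}> */
    private array $rows = [];

    /** @param \Closure|null $clock Returns the current Unix time; injectable for tests. */
    public function __construct(private ?\Closure $clock = null)
    {
        $this->clock ??= static fn (): int => time();
    }

    public function store(string $trustAnchorId, array $entities, int $ttl): void
    {
        $this->write('entities:' . $trustAnchorId, $entities, $ttl);
    }

    public function get(string $trustAnchorId): ?array
    {
        return $this->read('entities:' . $trustAnchorId);
    }

    public function clear(string $trustAnchorId): void
    {
        unset($this->rows['entities:' . $trustAnchorId]);
    }

    public function storeLastUpdated(string $trustAnchorId, int $timestamp, int $ttl): void
    {
        $this->write('updated:' . $trustAnchorId, $timestamp, $ttl);
    }

    public function getLastUpdated(string $trustAnchorId): ?int
    {
        return $this->read('updated:' . $trustAnchorId);
    }

    public function clearLastUpdated(string $trustAnchorId): void
    {
        unset($this->rows['updated:' . $trustAnchorId]);
    }

    private function write(string $key, mixed $value, int $ttl): void
    {
        $this->rows[$key] = ['value' => $value, 'expiresAt' => ($this->clock)() + $ttl];
    }

    private function read(string $key): mixed
    {
        $row = $this->rows[$key] ?? null;
        if ($row === null || $row['expiresAt'] <= ($this->clock)()) {
            unset($this->rows[$key]); // Missing or expired: behave like a cache miss.
            return null;
        }

        return $row['value'];
    }
}
```

A database-backed store would follow the same shape, with write() and read() replaced by queries against a table that has an expiry column.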


Federation Discovery (Top-Down)

Federation Discovery performs a recursive traversal of the hierarchy. It starts at the Trust Anchor and follows federation_list_endpoint links to discover all subordinates.

Discovering Entities

/** @var \SimpleSAML\OpenID\Federation $federationTools */

$trustAnchorId = 'https://trust-anchor.example.org/';

try {
    // Traverse the federation and return an EntityCollection object.
    $collection = $federationTools->federationDiscovery()->discover($trustAnchorId);

    // Get the raw map of Entity ID => Payload
    $entities = $collection->getEntities();

    // Convenience: Get just the discovered entity IDs
    $ids = $federationTools->federationDiscovery()->discoverEntityIds($trustAnchorId);
} catch (\Throwable $exception) {
    $logger->error('Federation discovery failed: ' . $exception->getMessage());
}

Discovery Logic & Loop Protection

  1. Trust Anchor Config: Fetches and validates the TA's Entity Configuration.
  2. Subordinate Listing: Fetches the federation_list_endpoint. If filters are provided, they are passed as query parameters to this endpoint.
  3. Recursion: For each discovered subordinate, it fetches its configuration and repeats the process.
  4. Loop Protection: The algorithm tracks visited IDs to prevent infinite loops and is limited by maxDiscoveryDepth.
  5. Deduplication: Entities appearing in multiple branches are only stored once.
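
The steps above can be sketched over an in-memory graph. This is only an illustration of loop protection, depth limiting, and deduplication, not the library's code: discoverIdsSketch() and the $listSubordinates callable (standing in for a federation_list_endpoint request) are hypothetical, the sketch uses an explicit queue instead of recursion, and it assumes the Trust Anchor itself is included in the result.

```php
<?php

declare(strict_types=1);

/**
 * Depth-limited, loop-safe traversal of a subordinate listing.
 *
 * @param callable(string): array<int, string> $listSubordinates Hypothetical stand-in
 *        for fetching one entity's federation_list_endpoint.
 * @return array<int, string> Every reachable entity ID exactly once, in visit order.
 */
function discoverIdsSketch(string $trustAnchorId, callable $listSubordinates, int $maxDepth): array
{
    // The visited map provides both loop protection and deduplication.
    $visited = [$trustAnchorId => true];
    $queue = [[$trustAnchorId, 0]];

    while ($queue !== []) {
        [$entityId, $depth] = array_shift($queue);

        if ($depth >= $maxDepth) {
            continue; // Depth limit reached: keep the entity but do not expand it.
        }

        foreach ($listSubordinates($entityId) as $subordinateId) {
            if (isset($visited[$subordinateId])) {
                continue; // Already seen through another branch or a cycle.
            }
            $visited[$subordinateId] = true;
            $queue[] = [$subordinateId, $depth + 1];
        }
    }

    return array_keys($visited);
}

// Usage: a tiny federation with a shared leaf and a link back to the Trust Anchor.
$graph = [
    'ta'     => ['int-a', 'int-b'],
    'int-a'  => ['leaf-1', 'leaf-2'],
    'int-b'  => ['leaf-2', 'ta'],
    'leaf-1' => [],
    'leaf-2' => [],
];
$ids = discoverIdsSketch('ta', fn (string $id): array => $graph[$id] ?? [], 10);
```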

Applying Filters During Discovery

You can pass filters (like entity_type) directly to the discovery process. These are passed to the remote federation_list_endpoint to optimize the traversal:

$collection = $federationTools->federationDiscovery()
    ->discover($trustAnchorId, filters: ['entity_type' => 'openid_provider']);

Performance: Scheduled Refresh

Discovery is an expensive, network-heavy operation. Run it in a background process (for example, a cron job) with forceRefresh: true to populate the cache, and let web requests read from that cache:

// In a background job:
$federationTools->federationDiscovery()
    ->discover($trustAnchorId, forceRefresh: true);

// In your web application (uses cache):
$collection = $federationTools->federationDiscovery()->discover($trustAnchorId);

Entity Collection Client

The Entity Collection Client fetches pre-filtered lists of entities from a remote federation_collection_endpoint. When the remote side supports it, this avoids the per-entity requests of a full traversal.

Bulk Fetching with Filters

The client supports all standard OpenID Federation query parameters:

$endpoint = 'https://federation.example.org/collection';

$collection = $federationTools->federationDiscovery()->fetchFromCollectionEndpoint(
    $endpoint,
    [
        'entity_type'     => ['openid_provider'],
        'trust_mark_type' => ['https://example.org/marks/certified'],
        'trust_anchor'    => 'https://trust-anchor.example.org/',
        'query'           => 'university',
        'limit'           => 50,
        'entity_claims'   => ['entity_types', 'ui_infos'], // Request specific claims
    ]
);

foreach ($collection->getEntities() as $id => $payload) {
    // Process entity...
}

Client-Side Caching

fetchFromCollectionEndpoint() automatically caches the remote response body. If you need fresh data, pass forceRefresh: true.

Pagination Handling

The EntityCollection object exposes the cursor for the next page through getNextPageToken(); a null token means there are no more pages:

$results = [];
$cursor = null;

do {
    $page = $federationTools->federationDiscovery()->fetchFromCollectionEndpoint(
        $endpoint,
        ['limit' => 100, 'from' => $cursor]
    );
    
    $results = array_merge($results, $page->getEntities());
    $cursor = $page->getNextPageToken();
} while ($cursor !== null);
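
For large federations it can help to stream pages instead of merging everything into one array, and to guard against a remote endpoint that keeps returning the same cursor. The following is a minimal sketch of that idea, independent of the library: iterateAllPages() and the $fetchPage callable (standing in for one collection-endpoint request) are hypothetical, as is the array shape it returns.

```php
<?php

declare(strict_types=1);

/**
 * Lazily walk every page of a cursor-paginated source.
 *
 * @param callable(?string): array{entities: array<string, array>, next: ?string} $fetchPage
 *        Hypothetical stand-in for a single paginated request.
 * @return \Generator<string, array> Entity ID => payload.
 */
function iterateAllPages(callable $fetchPage, int $maxPages = 1000): \Generator
{
    $cursor = null;
    $seenCursors = [];

    for ($page = 0; $page < $maxPages; $page++) {
        $result = $fetchPage($cursor);
        yield from $result['entities']; // Keys (entity IDs) are preserved.

        $cursor = $result['next'];
        if ($cursor === null) {
            return; // Last page reached.
        }
        if (isset($seenCursors[$cursor])) {
            throw new \RuntimeException('Pagination cursor repeated: ' . $cursor);
        }
        $seenCursors[$cursor] = true;
    }

    throw new \RuntimeException('Gave up after ' . $maxPages . ' pages');
}

// Usage with a fake three-page source.
$pages = [
    ''   => ['entities' => ['a' => ['sub' => 'a'], 'b' => ['sub' => 'b']], 'next' => 'c1'],
    'c1' => ['entities' => ['c' => ['sub' => 'c'], 'd' => ['sub' => 'd']], 'next' => 'c2'],
    'c2' => ['entities' => ['e' => ['sub' => 'e']], 'next' => null],
];
$all = iterator_to_array(iterateAllPages(fn (?string $cursor): array => $pages[$cursor ?? '']));
```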

Server-Side Implementation

If you are implementing your own federation_collection_endpoint, the library provides high-level building blocks to handle filtering, sorting, and pagination.

The Pipeline Pattern

The recommended implementation follows this pipeline: Discover → Filter → Sort → Paginate → Serialize.

public function __invoke(ServerRequestInterface $request): ResponseInterface
{
    $params = $request->getQueryParams();
    
    // 1. Load entities from the Federation traversal cache
    $collection = $this->federationTools->federationDiscovery()->discover($this->trustAnchorId);
    
    // 2. Filter (Standard OpenID Federation criteria)
    // Supports 'entity_type' (OR), 'trust_mark_type' (AND), and 'query' (Search)
    $collection->filter($params);
    
    // 3. Sort (By nested metadata claims)
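    // Query parameters are untrusted: sort_by can arrive as an array (?sort_by[]=x).
    if (is_string($params['sort_by'] ?? null) && $params['sort_by'] !== '') {
        $path = explode('.', $params['sort_by']); // e.g. "metadata.federation_entity.display_name"
        $direction = is_string($params['sort_dir'] ?? null) ? $params['sort_dir'] : 'asc';
        $collection->sort([$path], $direction);
    }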
    
    // 4. Paginate (Using opaque cursors)
    $collection->paginate(
        limit: (int) ($params['limit'] ?? 100),
        from: $params['from'] ?? null
    );
    
    // 5. Serialize to spec-compliant array
    return new JsonResponse($collection->toCollectionEndpointResponseArray());
}

Filtering Technical Details

| Criteria | Behavior | Fields Checked |
|---|---|---|
| entity_type | OR (Any match) | Metadata keys |
| trust_mark_type | AND (All must match) | trust_marks[].id |
| query | Case-Insensitive | sub, display_name, organization_name |
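
The semantics in the table can be illustrated with a standalone predicate. This is a sketch under assumptions, not the library's filter(): matchesCriteria() is a hypothetical helper, the query is treated as a substring match (ASCII case-insensitive via stripos), and display_name and organization_name are assumed to live under each entity type in metadata.

```php
<?php

declare(strict_types=1);

/** Returns true when a JWT payload array satisfies all given criteria. */
function matchesCriteria(array $payload, array $criteria): bool
{
    $metadata = is_array($payload['metadata'] ?? null) ? $payload['metadata'] : [];

    // entity_type: OR. At least one requested type must be a metadata key.
    $types = (array) ($criteria['entity_type'] ?? []);
    if ($types !== [] && array_intersect($types, array_keys($metadata)) === []) {
        return false;
    }

    // trust_mark_type: AND. Every requested type must appear in trust_marks[].id.
    $wanted = (array) ($criteria['trust_mark_type'] ?? []);
    $held = array_column($payload['trust_marks'] ?? [], 'id');
    if (array_diff($wanted, $held) !== []) {
        return false;
    }

    // query: case-insensitive match against sub and the name claims.
    $query = (string) ($criteria['query'] ?? '');
    if ($query === '') {
        return true;
    }
    $haystacks = [$payload['sub'] ?? ''];
    foreach ($metadata as $claims) {
        if (is_array($claims)) {
            $haystacks[] = $claims['display_name'] ?? '';
            $haystacks[] = $claims['organization_name'] ?? '';
        }
    }
    foreach ($haystacks as $haystack) {
        if (is_string($haystack) && stripos($haystack, $query) !== false) {
            return true;
        }
    }

    return false;
}

// Usage: keep only matching entities while preserving the Entity ID keys.
$entities = [
    'https://idp.example.org/' => [
        'sub' => 'https://idp.example.org/',
        'metadata' => ['openid_provider' => ['display_name' => 'Example University IdP']],
        'trust_marks' => [['id' => 'https://example.org/marks/certified', 'trust_mark' => '...']],
    ],
    'https://rp.example.org/' => [
        'sub' => 'https://rp.example.org/',
        'metadata' => ['openid_relying_party' => ['organization_name' => 'Acme']],
    ],
];
$providers = array_filter(
    $entities,
    fn (array $payload): bool => matchesCriteria($payload, ['entity_type' => ['openid_provider']])
);
```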

Sorting Technical Details

The sort() method accepts an array of claim paths relative to the JWT payload root. When sorting by metadata claims, you must explicitly include the metadata prefix:

$collection->sort([
    ['metadata', 'openid_provider', 'display_name'],   // Primary (Metadata)
    ['metadata', 'federation_entity', 'display_name'], // Fallback 1 (Metadata)
    ['sub']                                            // Fallback 2 (Entity ID root claim)
], 'asc');
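
One way to read the primary/fallback comments above is that each entity is sorted by the first claim path that resolves to a value. The sketch below illustrates that reading with plain arrays. It is an assumption about the semantics, not the library's implementation, and claimAt(), sortKey(), and sortByClaimPaths() are hypothetical helpers.

```php
<?php

declare(strict_types=1);

/** Walks a claim path such as ['metadata', 'openid_provider', 'display_name']. */
function claimAt(array $payload, array $path): ?string
{
    $node = $payload;
    foreach ($path as $segment) {
        if (!is_array($node) || !array_key_exists($segment, $node)) {
            return null;
        }
        $node = $node[$segment];
    }

    return is_string($node) ? $node : null;
}

/** Returns the value of the first path that resolves, or '' if none does. */
function sortKey(array $payload, array $paths): string
{
    foreach ($paths as $path) {
        $value = claimAt($payload, $path);
        if ($value !== null) {
            return $value;
        }
    }

    return '';
}

/** Sorts entities (Entity ID => payload) case-insensitively, preserving keys. */
function sortByClaimPaths(array $entities, array $paths, string $direction = 'asc'): array
{
    $sign = $direction === 'desc' ? -1 : 1;
    uasort(
        $entities,
        fn (array $a, array $b): int => $sign * strcasecmp(sortKey($a, $paths), sortKey($b, $paths))
    );

    return $entities;
}

// Usage: three entities that each resolve through a different path.
$paths = [
    ['metadata', 'openid_provider', 'display_name'],
    ['metadata', 'federation_entity', 'display_name'],
    ['sub'],
];
$entities = [
    'https://b.example.org/' => [
        'sub' => 'https://b.example.org/',
        'metadata' => ['openid_provider' => ['display_name' => 'Zeta']],
    ],
    'https://a.example.org/' => [
        'sub' => 'https://a.example.org/',
        'metadata' => ['federation_entity' => ['display_name' => 'alpha']],
    ],
    'https://c.example.org/' => ['sub' => 'https://c.example.org/'],
];
$sorted = sortByClaimPaths($entities, $paths, 'asc');
```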

Serialized Response Format

The toCollectionEndpointResponseArray() method produces a structure compatible with the OpenID Federation specification:

{
  "entities": [
    {
      "entity_id": "https://idp.example.org/",
      "entity_types": ["openid_provider"],
      "ui_infos": {
        "openid_provider": {
          "display_name": "Example IDP"
        }
      },
      "trust_marks": [
        { "id": "https://example.org/marks/certified", "trust_mark": "..." }
      ]
    }
  ],
  "next": "aHR0cHM6Ly9pZHAuZXhhbXBsZS5vcmcv",
  "last_updated": 1745410000
}
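
The example next value above happens to be the Base64 encoding of an entity ID. Whether the library builds its cursors this way is not stated here, and clients should treat the value as opaque. Purely to illustrate how such a cursor could work, the sketch below assumes the cursor encodes the last entity ID of the current page and that entities are ordered by ID. paginateSketch() is a hypothetical helper, not the library's paginate().

```php
<?php

declare(strict_types=1);

/**
 * Cursor pagination over entities keyed by Entity ID.
 *
 * @return array{ids: array<int, string>, next: ?string}
 */
function paginateSketch(array $entities, int $limit, ?string $from = null): array
{
    ksort($entities, SORT_STRING); // A stable order is required for cursors to be meaningful.
    $ids = array_keys($entities);

    $start = 0;
    if ($from !== null) {
        // Assumption: the cursor is the Base64-encoded ID of the last entity already returned.
        $lastId = base64_decode($from, true);
        $index = $lastId === false ? false : array_search($lastId, $ids, true);
        if ($index === false) {
            throw new \InvalidArgumentException('Unknown or malformed cursor');
        }
        $start = $index + 1;
    }

    $pageIds = array_slice($ids, $start, max(1, $limit));
    $hasMore = $start + count($pageIds) < count($ids);

    return [
        'ids'  => $pageIds,
        'next' => $hasMore ? base64_encode((string) end($pageIds)) : null,
    ];
}

// Usage: two pages over three entities.
$entities = [
    'https://idp.example.org/' => [],
    'https://rp.example.org/'  => [],
    'https://aa.example.org/'  => [],
];
$first = paginateSketch($entities, 2);
$second = paginateSketch($entities, 2, $first['next']);
```

Encoding the last returned ID rather than a numeric offset keeps the cursor usable when entities are added or removed between requests, as long as the referenced entity still exists.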