78 changes: 78 additions & 0 deletions .github/skills/myndigheter-monitoring/SKILL.md
@@ -237,6 +237,81 @@ interviews (5 labor economists), stakeholder statements*
- **Stakeholder voices** - Include citizens, experts, civil society
- **Public interest** - Agencies serve citizens, not themselves

## Statskontoret Data Integration

Statskontoret (Swedish Agency for Public Management) publishes open data that provides
authoritative, Admiralty-A1 ground truth for government-body context. Use this data
**before** relying on estimates or secondary sources when writing about agency headcounts,
organisational structures or central-government budget execution.

### Available Datasets

| Dataset key | Title | Cadence | Primary use |
|-------------|-------|---------|-------------|
| `myndighetsforteckning` | Myndighetsförteckning — öppna data | Annual | Headcount by department & leadership form (2007–present) |
| `arsutfall` | Årsutfall för statens budget — öppna data | Annual | Annual budget outturn by appropriation & agency |
| `manadsutfall` | Månadsutfall för statens budget — öppna data | Monthly | High-frequency budget-execution monitoring |
| `budget-time-series` | Tidsserier, statens budget m.m. | Annual | Long-run central-government budget context (1995+) |

### How to Fetch (agentic workflows)

The cached library helper is invoked from TypeScript code (see "Cached Fetch Module"
below). For ad-hoc CLI use, the `statskontoret-fetch.ts` wrapper is the entrypoint:

```bash
# CLI: list every built-in Statskontoret source
tsx scripts/statskontoret-fetch.ts list-sources

# CLI: discover downloadable files for a source
tsx scripts/statskontoret-fetch.ts discover --source myndighetsforteckning

# CLI: fetch + parse headcount workbook
tsx scripts/statskontoret-fetch.ts headcount --url <xlsx-url> --persist

# CLI: fetch + parse budget-outturn workbook
tsx scripts/statskontoret-fetch.ts budget-outturn --source arsutfall --url <xlsx-url> --doc-type Inkomst --persist
```

### Cached Fetch Module (`scripts/fetch-statskontoret.ts`)

The `fetch-statskontoret.ts` module provides a **30-day TTL cache layer** over the raw
HTTP client. It caches the download links discovered for each source, so agentic
workflows that run daily repeat the discovery request at most once every 30 days:

```typescript
import { fetchStatskontoretCached, isStatskontoretCacheFresh } from './fetch-statskontoret.js';

// Check cache freshness without a network call
if (!isStatskontoretCacheFresh('myndighetsforteckning')) {
  const payload = await fetchStatskontoretCached('myndighetsforteckning');
  // payload.fromCache === false → fresh download
  // payload.links → array of StatskontoretDownloadLink (Excel URLs)
}
```

On network failure the module automatically falls back to the most recent stale cache
entry, ensuring workflows remain resilient to temporary outages.
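
Because both a normal cache hit and the stale fallback report `fromCache: true`, a caller may want to tell them apart before citing the data. The sketch below relies only on the `fromCache` and `cacheAgeMs` fields of the returned payload and on the default 30-day TTL. The names `classifyPayload`, `PayloadAge` and `PayloadState` are hypothetical and are not exported by the module.

```typescript
// Hypothetical helper: not part of fetch-statskontoret.ts.
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Only the two payload fields this sketch relies on.
interface PayloadAge {
  readonly fromCache: boolean;
  readonly cacheAgeMs: number;
}

type PayloadState = 'fresh-download' | 'cache-hit' | 'stale-fallback';

function classifyPayload(payload: PayloadAge, ttlMs: number = THIRTY_DAYS_MS): PayloadState {
  if (!payload.fromCache) return 'fresh-download';
  // A cached payload whose age is not below the TTL (including an
  // unparseable timestamp, which yields NaN) is treated as a fallback.
  return payload.cacheAgeMs < ttlMs ? 'cache-hit' : 'stale-fallback';
}
```

A workflow could, for example, add a "data may be outdated" note to its output whenever the result is `'stale-fallback'`.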

### Data Provenance Rule

Any implementation-feasibility or agency-context analysis that names a Swedish
government body **must** annotate the headcount or budget figure with a
Statskontoret source citation:

```markdown
*Headcount source: Statskontoret Myndighetsförteckning 2025
(analysis/data/statskontoret/myndighetsforteckning/) [A1]*
```

Admiralty grade for Statskontoret's own publications: **A1** (official statistics,
primary public record).
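
To keep citations uniform across articles, the annotation could be generated rather than typed by hand. This is a minimal sketch that reproduces the citation format shown above; `formatHeadcountCitation` is a hypothetical helper, not an existing function in the repository.

```typescript
// Hypothetical helper; the name and signature are illustrative only.
function formatHeadcountCitation(year: number, dataPath: string): string {
  // Mirrors the two-line Markdown citation format from the provenance rule.
  return `*Headcount source: Statskontoret Myndighetsförteckning ${year}\n(${dataPath}) [A1]*`;
}

const citation = formatHeadcountCitation(
  2025,
  'analysis/data/statskontoret/myndighetsforteckning/',
);
console.log(citation);
```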

### Network Allowlist

`www.statskontoret.se` and `statskontoret.se` are included in the `network.allowed`
list of all 11 `news-*.md` agentic workflow files. No additional configuration is
required.
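
A discovered download URL can be checked against the same two hosts before it is fetched. The following sketch is a hypothetical pre-flight check, not the repository's own `assertStatskontoretFetchTarget`; requiring `https:` is an assumption added here for illustration.

```typescript
// The two hosts named in the allowlist above.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set([
  'www.statskontoret.se',
  'statskontoret.se',
]);

// Hypothetical check; not part of the workflow tooling.
function isAllowedStatskontoretUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Exact hostname match, so look-alike hosts with extra suffixes are rejected.
  return url.protocol === 'https:' && ALLOWED_HOSTS.has(url.hostname);
}
```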

## References

- [Swedish Agency Directory](https://www.regeringen.se/regeringens-politik/myndigheter-under-regeringen/)
@@ -245,6 +320,9 @@ interviews (5 labor economists), stakeholder statements*
- [OECD Public Administration Reviews](https://www.oecd.org/governance/)
- [Transparency International Sweden](https://www.transparency.se/)
- [Swedish Agency for Public Management (Statskontoret)](https://www.statskontoret.se/)
- [Statskontoret Indicators Inventory](../../../analysis/statskontoret/indicators-inventory.json)
- [fetch-statskontoret.ts](../../../scripts/fetch-statskontoret.ts) — 30-day cache module
- [statskontoret-client.ts](../../../scripts/statskontoret-client.ts) — HTTP client library

---

1 change: 1 addition & 0 deletions analysis/statskontoret/indicators-inventory.json
@@ -8,6 +8,7 @@
"clients": {
"cli": "tsx scripts/statskontoret-fetch.ts (commands: list-sources, discover, headcount, budget-outturn)",
"library": "scripts/statskontoret-client.ts (StatskontoretClient class)",
"cachedFetch": "scripts/fetch-statskontoret.ts (fetchStatskontoretCached — 30-day TTL cache layer for agentic workflows)",
"persistence": "scripts/parliamentary-data/data-persistence.ts (persistStatskontoretData)"
},
"notes": {
246 changes: 246 additions & 0 deletions scripts/fetch-statskontoret.ts
@@ -0,0 +1,246 @@
/**
* @module scripts/fetch-statskontoret
* @description Cached fetch module for Statskontoret open data, providing a
* 30-day TTL cache layer over {@link StatskontoretClient}.
*
* This module is intended for use by agentic workflows that need Statskontoret
* context (authority register, budget outturn) without repeating download
* discovery on every run. It caches the discovered download links, not the
* Excel/ZIP files themselves, and follows the same no-MCP client pattern as
* `imf-context.ts` and `scb-context.ts`.
*
* ### Cache behaviour
* - Cache root: `analysis/data/statskontoret/<sourceKey>/cache/`
* - TTL: 30 days (configurable via the `cacheTtlMs` option)
* - On hit: returns the cached payload with provenance metadata
* - On miss or stale: invokes `StatskontoretClient.discoverDownloads()` and
* persists the result before returning
* - On fetch error: falls back to the most recent stale cache entry (resilience)
*
* ### Security
* Fetch calls go only to `https://www.statskontoret.se` (enforced by
* `assertStatskontoretFetchTarget` inside `StatskontoretClient`). No
* credentials are required; all data is PUBLIC classification.
*
* @see analysis/statskontoret/indicators-inventory.json
* @see scripts/statskontoret-client.ts (low-level HTTP + parse)
* @see scripts/statskontoret-fetch.ts (CLI entry-point)
* @author Hack23 AB
* @license Apache-2.0
*/

import fs from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

import {
  getStatskontoretSource,
  STATSKONTORET_SOURCES,
  StatskontoretClient,
  StatskontoretError,
  type StatskontoretClientConfig,
  type StatskontoretDownloadLink,
  type StatskontoretSourceKey,
} from './statskontoret-client.js';

// ---------------------------------------------------------------------------
// Constants
// ---------------------------------------------------------------------------

const __filename = fileURLToPath(import.meta.url);
const REPO_ROOT = path.resolve(path.dirname(__filename), '..');

/** Default 30-day cache TTL in milliseconds (30 days × 24 h × 60 min × 60 s × 1000 ms). */
export const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000;

/** Root directory for cached Statskontoret payloads. */
export const STATSKONTORET_CACHE_ROOT = path.join(
  REPO_ROOT,
  'analysis',
  'data',
  'statskontoret',
);

// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------

/** A cached Statskontoret downloads payload with provenance metadata. */
export interface StatskontoretCachedPayload {
  readonly sourceKey: StatskontoretSourceKey;
  readonly sourceTitle: string;
  readonly sourceUrl: string;
  readonly links: readonly StatskontoretDownloadLink[];
  readonly cachedAt: string;
  readonly fetchedAt: string;
  readonly fromCache: boolean;
  readonly cacheAgeMs: number;
}

/** Options for {@link fetchStatskontoretCached}. */
export interface FetchStatskontoretCachedOptions {
  /** Override the 30-day TTL (milliseconds). Mainly for testing. */
  readonly cacheTtlMs?: number;
  /** Override the cache root directory. Mainly for testing. */
  readonly cacheRoot?: string;
  /** Override the `StatskontoretClient` configuration (e.g. inject a mock fetch). */
  readonly clientConfig?: StatskontoretClientConfig;
}

/** Internal cache file format. */
interface CacheEntry {
  readonly fetchedAt: string;
  readonly sourceKey: StatskontoretSourceKey;
  readonly links: StatskontoretDownloadLink[];
}

// ---------------------------------------------------------------------------
// Private helpers
// ---------------------------------------------------------------------------

function cacheDir(sourceKey: StatskontoretSourceKey, cacheRoot: string): string {
  return path.join(cacheRoot, sourceKey, 'cache');
}

function cacheFilePath(sourceKey: StatskontoretSourceKey, cacheRoot: string): string {
  return path.join(cacheDir(sourceKey, cacheRoot), 'downloads.json');
}

function readCacheEntry(filePath: string): CacheEntry | undefined {
  try {
    const raw = fs.readFileSync(filePath, 'utf-8');
    return JSON.parse(raw) as CacheEntry;
  } catch {
    return undefined;
  }
}

function writeCacheEntry(filePath: string, entry: CacheEntry): void {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(entry, null, 2), 'utf-8');
}

function isCacheFresh(fetchedAt: string, ttlMs: number): boolean {
  const age = Date.now() - new Date(fetchedAt).getTime();
  return age < ttlMs;
}

// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------

/**
* Fetch Statskontoret download links for a given source key, using a 30-day
* file-system cache.
*
* @param sourceKey - The Statskontoret source to fetch
* (`myndighetsforteckning`, `arsutfall`, `manadsutfall`, `budget-time-series`).
* @param options - Optional TTL, cache-root and client overrides.
* @returns A {@link StatskontoretCachedPayload} with links and provenance info.
*
* @example
* ```ts
* const payload = await fetchStatskontoretCached('myndighetsforteckning');
* console.log(`Found ${payload.links.length} download links (fromCache=${payload.fromCache})`);
* ```
*/
export async function fetchStatskontoretCached(
  sourceKey: StatskontoretSourceKey,
  options: FetchStatskontoretCachedOptions = {},
): Promise<StatskontoretCachedPayload> {
  const {
    cacheTtlMs = CACHE_TTL_MS,
    cacheRoot = STATSKONTORET_CACHE_ROOT,
    clientConfig = {},
  } = options;

  const source = getStatskontoretSource(sourceKey);
  const filePath = cacheFilePath(sourceKey, cacheRoot);

  // --- Cache hit ---
  const cached = readCacheEntry(filePath);
  if (cached !== undefined && isCacheFresh(cached.fetchedAt, cacheTtlMs)) {
    const cacheAgeMs = Date.now() - new Date(cached.fetchedAt).getTime();
    return {
      sourceKey,
      sourceTitle: source.title,
      sourceUrl: source.url,
      links: cached.links,
      cachedAt: cached.fetchedAt,
      fetchedAt: cached.fetchedAt,
      fromCache: true,
      cacheAgeMs,
    };
  }

  // --- Cache miss or stale: fetch from origin ---
  const client = new StatskontoretClient(clientConfig);
  let links: StatskontoretDownloadLink[];
  let fetchedAt: string;

  try {
    links = await client.discoverDownloads(sourceKey);
    // Stamp provenance after the fetch completes so `fetchedAt` reflects when
    // the data was actually retrieved, not when the request was issued.
    fetchedAt = new Date().toISOString();
    writeCacheEntry(filePath, { fetchedAt, sourceKey, links });
  } catch (error) {
    // --- Resilience: return stale cache on fetch failure ---
    if (cached !== undefined) {
      const cacheAgeMs = Date.now() - new Date(cached.fetchedAt).getTime();
      return {
        sourceKey,
        sourceTitle: source.title,
        sourceUrl: source.url,
        links: cached.links,
        cachedAt: cached.fetchedAt,
        fetchedAt: cached.fetchedAt,
        fromCache: true,
        cacheAgeMs,
      };
    }
    const detail = error instanceof Error ? error.message : String(error);
    throw new StatskontoretError(
      `fetch-statskontoret: failed to fetch ${sourceKey} and no cache available: ${detail}`,
      'http',
      { cause: error },
    );
  }

  return {
    sourceKey,
    sourceTitle: source.title,
    sourceUrl: source.url,
    links,
    cachedAt: fetchedAt,
    fetchedAt,
    fromCache: false,
    cacheAgeMs: 0,
  };
}

/**
* Check whether a fresh cache entry exists for the given source key without
* triggering a network fetch.
*
* @param sourceKey - The Statskontoret source to check.
* @param options - Optional TTL and cache-root overrides.
* @returns `true` if a fresh cache entry exists, `false` otherwise.
*/
export function isStatskontoretCacheFresh(
  sourceKey: StatskontoretSourceKey,
  options: Pick<FetchStatskontoretCachedOptions, 'cacheTtlMs' | 'cacheRoot'> = {},
): boolean {
  const { cacheTtlMs = CACHE_TTL_MS, cacheRoot = STATSKONTORET_CACHE_ROOT } = options;
  const filePath = cacheFilePath(sourceKey, cacheRoot);
  const cached = readCacheEntry(filePath);
  return cached !== undefined && isCacheFresh(cached.fetchedAt, cacheTtlMs);
}

/**
* Return the list of all built-in Statskontoret source keys.
* Useful for iterating over all sources in agentic workflows.
*/
export function statskontoretSourceKeys(): readonly StatskontoretSourceKey[] {
  return STATSKONTORET_SOURCES.map((s) => s.key);
}