Commit 851e871

Merge pull request #2045 from Hack23/copilot/add-ministerial-responses-coverage
feat: Statskontoret 30-day cache module, feasibility contract tests, myndigheter skill update
2 parents 1858f03 + 5c20d35 commit 851e871

5 files changed

Lines changed: 675 additions & 1 deletion

.github/skills/myndigheter-monitoring/SKILL.md

Lines changed: 78 additions & 0 deletions
@@ -286,6 +286,81 @@ When an agency is named in `implementation-feasibility.md`:
- **Stakeholder voices** - Include citizens, experts, civil society
- **Public interest** - Agencies serve citizens, not themselves

## Statskontoret Data Integration

Statskontoret (Swedish Agency for Public Management) publishes open data that provides
authoritative, Admiralty-A1 ground truth for government-body context. Use this data
**before** relying on estimates or secondary sources when writing about agency headcounts,
organisational structures or central-government budget execution.

### Available Datasets

| Dataset key | Title | Cadence | Primary use |
|-------------|-------|---------|-------------|
| `myndighetsforteckning` | Myndighetsförteckning — öppna data | Annual | Headcount by department & leadership form (2007–present) |
| `arsutfall` | Årsutfall för statens budget — öppna data | Annual | Annual budget outturn by appropriation & agency |
| `manadsutfall` | Månadsutfall för statens budget — öppna data | Monthly | High-frequency budget-execution monitoring |
| `budget-time-series` | Tidsserier, statens budget m.m. | Annual | Long-run central-government budget context (1995+) |

### How to Fetch (agentic workflows)

The cached library helper is invoked from TypeScript code (see "Cached Fetch Module"
below). For ad-hoc CLI use, the `statskontoret-fetch.ts` wrapper is the entrypoint:

```bash
# CLI: list every built-in Statskontoret source
tsx scripts/statskontoret-fetch.ts list-sources

# CLI: discover downloadable files for a source
tsx scripts/statskontoret-fetch.ts discover --source myndighetsforteckning

# CLI: fetch + parse headcount workbook
tsx scripts/statskontoret-fetch.ts headcount --url <xlsx-url> --persist

# CLI: fetch + parse budget-outturn workbook
tsx scripts/statskontoret-fetch.ts budget-outturn --source arsutfall --url <xlsx-url> --doc-type Inkomst --persist
```

### Cached Fetch Module (`scripts/fetch-statskontoret.ts`)

The `fetch-statskontoret.ts` module provides a **30-day TTL cache layer** over the raw
HTTP client. It caches the download links discovered for each source, so agentic
workflows that run daily query the origin again only once the cached entry is more
than 30 days old:

```typescript
import { fetchStatskontoretCached, isStatskontoretCacheFresh } from './fetch-statskontoret.js';

// Check cache freshness without a network call
if (!isStatskontoretCacheFresh('myndighetsforteckning')) {
  const payload = await fetchStatskontoretCached('myndighetsforteckning');
  // payload.fromCache === false → fresh download
  // payload.fromCache === true  → stale fallback after a failed fetch
  // payload.links → array of StatskontoretDownloadLink (Excel URLs)
}
```
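The 30-day freshness rule comes down to a timestamp comparison. The sketch below is a minimal, self-contained illustration that assumes a hypothetical helper `isFresh` with an injectable clock. The committed module compares against `Date.now()` directly, and treating future timestamps as stale is an extra safeguard assumed here, not documented behaviour.

```typescript
// Hypothetical helper for illustration; not part of the committed module.
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000; // 2_592_000_000 ms

function isFresh(
  fetchedAt: string,
  ttlMs: number = THIRTY_DAYS_MS,
  nowMs: number = Date.now(),
): boolean {
  const fetchedMs = new Date(fetchedAt).getTime();
  // A corrupt timestamp parses to NaN: treat the entry as stale.
  if (Number.isNaN(fetchedMs)) return false;
  const ageMs = nowMs - fetchedMs;
  // Assumption: a timestamp in the future (clock skew) also counts as stale.
  return ageMs >= 0 && ageMs < ttlMs;
}

const referenceNowMs = Date.parse('2025-06-30T00:00:00Z');
console.log(isFresh('2025-06-01T00:00:00Z', THIRTY_DAYS_MS, referenceNowMs)); // true  (29 days old)
console.log(isFresh('2025-05-30T00:00:00Z', THIRTY_DAYS_MS, referenceNowMs)); // false (31 days old)
```

Passing the clock as a parameter keeps the check deterministic in tests, which is the same motivation the module gives for its `cacheTtlMs` override.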

On network failure the module automatically falls back to the most recent stale cache
entry, ensuring workflows remain resilient to temporary outages.
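The fallback behaviour can be sketched in isolation. The example below assumes an in-memory `Map` in place of the file-system cache and a hypothetical `fetchWithStaleFallback` helper; it illustrates the pattern only and is not the module's code.

```typescript
// Hypothetical, in-memory illustration of the stale-fallback pattern.
interface Entry<T> {
  readonly fetchedAt: number;
  readonly value: T;
}

interface Result<T> {
  readonly value: T;
  readonly fromCache: boolean;
}

async function fetchWithStaleFallback<T>(
  key: string,
  cache: Map<string, Entry<T>>,
  fetchOrigin: () => Promise<T>,
  ttlMs: number,
  nowMs: number = Date.now(),
): Promise<Result<T>> {
  const cached = cache.get(key);
  // Fresh hit: answer from the cache without touching the origin.
  if (cached !== undefined && nowMs - cached.fetchedAt < ttlMs) {
    return { value: cached.value, fromCache: true };
  }
  try {
    const value = await fetchOrigin();
    cache.set(key, { fetchedAt: nowMs, value });
    return { value, fromCache: false };
  } catch (error) {
    // Origin failed: serve the stale entry if one exists, otherwise rethrow.
    if (cached !== undefined) return { value: cached.value, fromCache: true };
    throw error;
  }
}

// Demo: a stale entry survives a simulated outage.
const demoCache = new Map<string, Entry<string[]>>([
  ['arsutfall', { fetchedAt: 0, value: ['old-link.xlsx'] }],
]);
const failingOrigin = async (): Promise<string[]> => {
  throw new Error('network down');
};

fetchWithStaleFallback('arsutfall', demoCache, failingOrigin, 1000, 5000)
  .then((result) => console.log(result.fromCache, result.value));
```

Because a stale fallback also reports `fromCache: true`, a caller that needs to distinguish it from a fresh hit has to compare the entry's age against the TTL itself.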

### Data Provenance Rule

Any implementation-feasibility or agency-context analysis that names a Swedish
government body **must** annotate the headcount or budget figure with a
Statskontoret source citation:

```markdown
*Headcount source: Statskontoret Myndighetsförteckning 2025
(analysis/data/statskontoret/myndighetsforteckning/) [A1]*
```

Admiralty grade for Statskontoret's own publications: **A1** (official statistics,
primary public record).
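A small formatter can keep these annotations consistent across analyses. The sketch below is hypothetical: `formatProvenance`, the `SOURCE_TITLES` map and the `Budget` label are assumptions for illustration (only the headcount wording appears in the rule above), and the titles are the dataset titles from the table with the "öppna data" suffix dropped.

```typescript
// Hypothetical helper that renders the provenance annotation on one line.
type SourceKey = 'myndighetsforteckning' | 'arsutfall' | 'manadsutfall' | 'budget-time-series';

const SOURCE_TITLES: Record<SourceKey, string> = {
  myndighetsforteckning: 'Myndighetsförteckning',
  arsutfall: 'Årsutfall för statens budget',
  manadsutfall: 'Månadsutfall för statens budget',
  'budget-time-series': 'Tidsserier, statens budget m.m.',
};

function formatProvenance(
  kind: 'Headcount' | 'Budget', // 'Budget' is an assumed label
  key: SourceKey,
  year: number,
): string {
  const dataDir = `analysis/data/statskontoret/${key}/`;
  return `*${kind} source: Statskontoret ${SOURCE_TITLES[key]} ${year} (${dataDir}) [A1]*`;
}

console.log(formatProvenance('Headcount', 'myndighetsforteckning', 2025));
// *Headcount source: Statskontoret Myndighetsförteckning 2025 (analysis/data/statskontoret/myndighetsforteckning/) [A1]*
```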

### Network Allowlist

`www.statskontoret.se` and `statskontoret.se` are included in the `network.allowed`
list of all 11 `news-*.md` agentic workflow files. No additional configuration is
required.
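A client-side guard can mirror the same two hosts. The sketch below is an assumption-laden illustration: `isAllowedTarget` is a hypothetical name, and the real check (`assertStatskontoretFetchTarget` inside `StatskontoretClient`) is not shown in this commit, so its exact rules may differ.

```typescript
// Hypothetical guard; the committed check lives in StatskontoretClient and may differ.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set([
  'www.statskontoret.se',
  'statskontoret.se',
]);

function isAllowedTarget(rawUrl: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not an absolute URL
  }
  // Require HTTPS and an exact hostname match (no suffix matching).
  return parsed.protocol === 'https:' && ALLOWED_HOSTS.has(parsed.hostname);
}

console.log(isAllowedTarget('https://www.statskontoret.se/oppna-data/')); // true
console.log(isAllowedTarget('https://statskontoret.se.evil.example/'));   // false
```

Exact hostname matching is the design choice worth noting: a suffix or substring test would accept look-alike hosts such as the second example.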

## References

- [Swedish Agency Directory](https://www.regeringen.se/regeringens-politik/myndigheter-under-regeringen/)
@@ -294,6 +369,9 @@ When an agency is named in `implementation-feasibility.md`:
- [OECD Public Administration Reviews](https://www.oecd.org/governance/)
- [Transparency International Sweden](https://www.transparency.se/)
- [Swedish Agency for Public Management (Statskontoret)](https://www.statskontoret.se/)
- [Statskontoret Indicators Inventory](../../../analysis/statskontoret/indicators-inventory.json)
- [fetch-statskontoret.ts](../../../scripts/fetch-statskontoret.ts) — 30-day cache module
- [statskontoret-client.ts](../../../scripts/statskontoret-client.ts) — HTTP client library

---

analysis/statskontoret/indicators-inventory.json

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
  "clients": {
    "cli": "tsx scripts/statskontoret-fetch.ts (commands: list-sources, discover, headcount, budget-outturn)",
    "library": "scripts/statskontoret-client.ts (StatskontoretClient class)",
    "cachedFetch": "scripts/fetch-statskontoret.ts (fetchStatskontoretCached — 30-day TTL cache layer for agentic workflows)",
    "persistence": "scripts/parliamentary-data/data-persistence.ts (persistStatskontoretData)"
  },
  "notes": {
"notes": {

scripts/fetch-statskontoret.ts

Lines changed: 246 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,246 @@
/**
 * @module scripts/fetch-statskontoret
 * @description Cached fetch module for Statskontoret open data, providing a
 * 30-day TTL cache layer over {@link StatskontoretClient}.
 *
 * This module is intended for use by agentic workflows that need Statskontoret
 * context (authority register, budget outturn) without re-downloading large
 * Excel/ZIP files on every run. It follows the same no-MCP client pattern as
 * `imf-context.ts` and `scb-context.ts`.
 *
 * ### Cache behaviour
 * - Cache root: `analysis/data/statskontoret/<sourceKey>/cache/`
 * - TTL: 30 days (configurable via the `cacheTtlMs` option)
 * - On hit: returns the cached payload with provenance metadata
 * - On miss or stale: invokes `StatskontoretClient.discoverDownloads()` and
 *   persists the result before returning
 * - On fetch error: falls back to the most recent stale cache entry (resilience)
 *
 * ### Security
 * Fetch calls go only to `https://www.statskontoret.se` (enforced by
 * `assertStatskontoretFetchTarget` inside `StatskontoretClient`). No
 * credentials are required; all data is PUBLIC classification.
 *
 * @see analysis/statskontoret/indicators-inventory.json
 * @see scripts/statskontoret-client.ts (low-level HTTP + parse)
 * @see scripts/statskontoret-fetch.ts (CLI entry-point)
 * @author Hack23 AB
 * @license Apache-2.0
 */

import fs from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

import {
  getStatskontoretSource,
  STATSKONTORET_SOURCES,
  StatskontoretClient,
  StatskontoretError,
  type StatskontoretClientConfig,
  type StatskontoretDownloadLink,
  type StatskontoretSourceKey,
} from './statskontoret-client.js';

// ---------------------------------------------------------------------------
// Constants
// ---------------------------------------------------------------------------

const __filename = fileURLToPath(import.meta.url);
const REPO_ROOT = path.resolve(path.dirname(__filename), '..');

/** Default 30-day cache TTL in milliseconds (30 days × 24 h × 60 min × 60 s × 1000 ms). */
export const CACHE_TTL_MS = 30 * 24 * 60 * 60 * 1000;

/** Root directory for cached Statskontoret payloads. */
export const STATSKONTORET_CACHE_ROOT = path.join(
  REPO_ROOT,
  'analysis',
  'data',
  'statskontoret',
);

// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------

/** A cached Statskontoret downloads payload with provenance metadata. */
export interface StatskontoretCachedPayload {
  readonly sourceKey: StatskontoretSourceKey;
  readonly sourceTitle: string;
  readonly sourceUrl: string;
  readonly links: readonly StatskontoretDownloadLink[];
  readonly cachedAt: string;
  readonly fetchedAt: string;
  readonly fromCache: boolean;
  readonly cacheAgeMs: number;
}

/** Options for {@link fetchStatskontoretCached}. */
export interface FetchStatskontoretCachedOptions {
  /** Override the 30-day TTL (milliseconds). Mainly for testing. */
  readonly cacheTtlMs?: number;
  /** Override the cache root directory. Mainly for testing. */
  readonly cacheRoot?: string;
  /** Override the `StatskontoretClient` configuration (e.g. inject a mock fetch). */
  readonly clientConfig?: StatskontoretClientConfig;
}

/** Internal cache file format. */
interface CacheEntry {
  readonly fetchedAt: string;
  readonly sourceKey: StatskontoretSourceKey;
  readonly links: StatskontoretDownloadLink[];
}

// ---------------------------------------------------------------------------
// Private helpers
// ---------------------------------------------------------------------------

function cacheDir(sourceKey: StatskontoretSourceKey, cacheRoot: string): string {
  return path.join(cacheRoot, sourceKey, 'cache');
}

function cacheFilePath(sourceKey: StatskontoretSourceKey, cacheRoot: string): string {
  return path.join(cacheDir(sourceKey, cacheRoot), 'downloads.json');
}

function readCacheEntry(filePath: string): CacheEntry | undefined {
  try {
    const raw = fs.readFileSync(filePath, 'utf-8');
    return JSON.parse(raw) as CacheEntry;
  } catch {
    return undefined;
  }
}

function writeCacheEntry(filePath: string, entry: CacheEntry): void {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(entry, null, 2), 'utf-8');
}

function isCacheFresh(fetchedAt: string, ttlMs: number): boolean {
  const age = Date.now() - new Date(fetchedAt).getTime();
  return age < ttlMs;
}

// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------

/**
 * Fetch Statskontoret download links for a given source key, using a 30-day
 * file-system cache.
 *
 * @param sourceKey - The Statskontoret source to fetch
 *   (`myndighetsforteckning`, `arsutfall`, `manadsutfall`, `budget-time-series`).
 * @param options - Optional TTL, cache-root and client overrides.
 * @returns A {@link StatskontoretCachedPayload} with links and provenance info.
 *
 * @example
 * ```ts
 * const payload = await fetchStatskontoretCached('myndighetsforteckning');
 * console.log(`Found ${payload.links.length} download links (fromCache=${payload.fromCache})`);
 * ```
 */
export async function fetchStatskontoretCached(
  sourceKey: StatskontoretSourceKey,
  options: FetchStatskontoretCachedOptions = {},
): Promise<StatskontoretCachedPayload> {
  const {
    cacheTtlMs = CACHE_TTL_MS,
    cacheRoot = STATSKONTORET_CACHE_ROOT,
    clientConfig = {},
  } = options;

  const source = getStatskontoretSource(sourceKey);
  const filePath = cacheFilePath(sourceKey, cacheRoot);

  // --- Cache hit ---
  const cached = readCacheEntry(filePath);
  if (cached !== undefined && isCacheFresh(cached.fetchedAt, cacheTtlMs)) {
    const cacheAgeMs = Date.now() - new Date(cached.fetchedAt).getTime();
    return {
      sourceKey,
      sourceTitle: source.title,
      sourceUrl: source.url,
      links: cached.links,
      cachedAt: cached.fetchedAt,
      fetchedAt: cached.fetchedAt,
      fromCache: true,
      cacheAgeMs,
    };
  }

  // --- Cache miss or stale: fetch from origin ---
  const client = new StatskontoretClient(clientConfig);
  let links: StatskontoretDownloadLink[];
  let fetchedAt: string;

  try {
    links = await client.discoverDownloads(sourceKey);
    // Stamp provenance after the fetch completes so `fetchedAt` reflects when
    // the data was actually retrieved, not when the request was issued.
    fetchedAt = new Date().toISOString();
    writeCacheEntry(filePath, { fetchedAt, sourceKey, links });
  } catch (error) {
    // --- Resilience: return stale cache on fetch failure ---
    if (cached !== undefined) {
      const cacheAgeMs = Date.now() - new Date(cached.fetchedAt).getTime();
      return {
        sourceKey,
        sourceTitle: source.title,
        sourceUrl: source.url,
        links: cached.links,
        cachedAt: cached.fetchedAt,
        fetchedAt: cached.fetchedAt,
        fromCache: true,
        cacheAgeMs,
      };
    }
    const detail = error instanceof Error ? error.message : String(error);
    throw new StatskontoretError(
      `fetch-statskontoret: failed to fetch ${sourceKey} and no cache available: ${detail}`,
      'http',
      { cause: error },
    );
  }

  return {
    sourceKey,
    sourceTitle: source.title,
    sourceUrl: source.url,
    links,
    cachedAt: fetchedAt,
    fetchedAt,
    fromCache: false,
    cacheAgeMs: 0,
  };
}

/**
 * Check whether a fresh cache entry exists for the given source key without
 * triggering a network fetch.
 *
 * @param sourceKey - The Statskontoret source to check.
 * @param options - Optional TTL and cache-root overrides.
 * @returns `true` if a fresh cache entry exists, `false` otherwise.
 */
export function isStatskontoretCacheFresh(
  sourceKey: StatskontoretSourceKey,
  options: Pick<FetchStatskontoretCachedOptions, 'cacheTtlMs' | 'cacheRoot'> = {},
): boolean {
  const { cacheTtlMs = CACHE_TTL_MS, cacheRoot = STATSKONTORET_CACHE_ROOT } = options;
  const filePath = cacheFilePath(sourceKey, cacheRoot);
  const cached = readCacheEntry(filePath);
  return cached !== undefined && isCacheFresh(cached.fetchedAt, cacheTtlMs);
}

/**
 * Return the list of all built-in Statskontoret source keys.
 * Useful for iterating over all sources in agentic workflows.
 */
export function statskontoretSourceKeys(): readonly StatskontoretSourceKey[] {
  return STATSKONTORET_SOURCES.map((s) => s.key);
}
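One point a reviewer might raise: `readCacheEntry` casts the parsed JSON to `CacheEntry` without checking its shape, so a truncated or hand-edited `downloads.json` that still parses could reach the stale-fallback path with `links` missing. The sketch below shows one possible runtime check, assuming hypothetical names `CacheEntryLike` and `parseCacheEntry`; it validates only the top-level fields, because the shape of `StatskontoretDownloadLink` is not shown in this commit.

```typescript
// Hypothetical runtime validation for the on-disk cache entry (not in the commit).
interface CacheEntryLike {
  readonly fetchedAt: string;
  readonly sourceKey: string;
  readonly links: readonly unknown[]; // link shape deliberately left unchecked
}

function parseCacheEntry(raw: string): CacheEntryLike | undefined {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return undefined; // malformed JSON
  }
  if (typeof value !== 'object' || value === null) return undefined;

  const { fetchedAt, sourceKey, links } = value as Record<string, unknown>;
  // fetchedAt must be a string that parses to a real date.
  if (typeof fetchedAt !== 'string' || Number.isNaN(Date.parse(fetchedAt))) return undefined;
  if (typeof sourceKey !== 'string' || !Array.isArray(links)) return undefined;

  return { fetchedAt, sourceKey, links };
}

const validEntry = parseCacheEntry(
  JSON.stringify({ fetchedAt: '2025-06-01T00:00:00.000Z', sourceKey: 'arsutfall', links: [] }),
);
console.log(validEntry?.sourceKey);  // arsutfall
console.log(parseCacheEntry('{}'));  // undefined
```

Returning `undefined` for an invalid file matches how the module already treats an unreadable one, so such a file would be handled as a plain cache miss.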
