README.md: 20 additions & 1 deletion
@@ -1,5 +1,24 @@
 # AI_stats_measurement
-Pilot project for measuring official statistics in AI
+Pilot project for measuring official statistics in AI.
+
+🔗 Live demo: https://ai-stats-measurement.lab.sspcloud.fr/
+
+# Overview
+This project explores the reliability of Large Language Models (LLMs) when answering questions based on official statistical data. It compares model responses against trusted sources such as National Statistical Institutes (NSIs).
+
+# Goal
+The goal of this project is to assess whether publicly available data from NSIs is machine-readable and findable, and so to identify whether NSIs need to improve the machine-readability and discoverability of their data to better support AI systems.
+
+
+# Features
+- Analytics dashboard
+  Evaluate model performance using metrics such as ARR (Accuracy Rate Ratio) per NSI and per model.
+- Response inspection
+  Load and review all responses for a given prompt.
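
The README does not define how ARR is computed, so the following is only a minimal sketch of a per-NSI and per-model breakdown, assuming a plain accuracy rate (the share of responses judged correct). The record layout, the names `RESPONSES` and `accuracy_by`, and the sample values are all hypothetical and are not taken from the project.

```python
from collections import defaultdict

# Hypothetical records: (nsi, model, is_correct).
# The layout and the values are illustrative assumptions, not project data.
RESPONSES = [
    ("NSI_A", "model_x", True),
    ("NSI_A", "model_x", False),
    ("NSI_A", "model_y", True),
    ("NSI_B", "model_x", True),
    ("NSI_B", "model_y", False),
    ("NSI_B", "model_y", False),
]


def accuracy_by(responses, key_index):
    """Return {group: share of correct responses}, grouped by a tuple position."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in responses:
        group = record[key_index]
        total[group] += 1
        correct[group] += int(record[2])  # True counts as 1, False as 0
    return {group: correct[group] / total[group] for group in total}


per_nsi = accuracy_by(RESPONSES, 0)    # grouped by NSI
per_model = accuracy_by(RESPONSES, 1)  # grouped by model
```

Grouping by tuple position keeps the sketch free of dependencies; a real dashboard would more likely aggregate a table of stored responses.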
+
+# Access & Limitations
+Submitting new prompts is currently limited to CBS researchers.
+Public users can explore results and existing evaluations.