The system uses a layered modular architecture with clear separation between API routes, business logic, and support services.
```mermaid
graph TB
Admin[Admin] -->|Start/Stop| API[API Layer]
Team[Team] -->|Submit| API
Browser[Browser] -->|View| UI[Leaderboard UI]
API --> Health[health.py]
API --> AdminR[admin.py]
API --> Submit[submission.py]
API --> LB[leaderboard.py]
API --> Cfg[config.py]
AdminR -->|Manage| Session[Session Manager]
Submit -->|Normalize| Normalizer[Normalizer]
Submit -->|Score| Scorer[Scorer]
LB -->|Aggregate| Session
Session -->|Generate| Fake[Fake Teams]
Normalizer -->|Parse| NS[NormalizedSubmission]
NS -->|Validate| Scorer
Scorer -->|Calculate| Score[Score + Quality]
Session -->|Track| State[Global State]
State -->|GT_TABLE| GT[Ground Truth]
State -->|Scoring Params| Scorer
GT -->|Load from| CSV[groundtruth.csv]
Scorer -->|Results| Response[JSON Response]
Response -->|HTTP| Team
Response -->|Feeds| Session
Response -->|Feeds| LB
LB -->|JSON| UI
style API fill:#4CAF50
style Session fill:#9C27B0
style Scorer fill:#FF9800
style State fill:#E91E63
style UI fill:#2196F3
style Fake fill:#FFC107
```
Key Features:

- Modular Architecture: Separate API, Core, and Services layers
- Global State: Centralized `GT_TABLE` via `app.state`
- Server-controlled timing: Admin starts/stops questions
- Session management: Tracks real + fake teams
- Tolerance-based scoring: Distance-weighted match quality
- Real-time leaderboard: Grid + table views with 20 teams
- Team registration: Each real team registers to obtain a `team_session_id` token for submissions
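As a rough sketch of the registration step, a token issuer might look like the following. The helper name and ID scheme are illustrative, not the server's actual implementation; only the response shape follows the `/teams/register` example later in this document.

```python
import uuid

def register_team(team_name: str) -> dict:
    """Issue an opaque team_session_id, roughly like POST /teams/register."""
    slug = team_name.lower().replace(" ", "-")
    return {
        "team_id": f"team-{slug}-{uuid.uuid4().hex[:6]}",
        "team_name": team_name,
        "team_session_id": uuid.uuid4().hex,  # 32-char hex token
    }

resp = register_team("Zero9")
print(resp["team_id"])  # e.g. team-zero9-a1b2c3
```

The token is opaque on purpose: the server only needs to look it up in its session table, so a random hex string avoids leaking any team information.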
Directory layout:

```
app/
├── main.py            # FastAPI entry point (~100 lines)
├── state.py           # Global state (GT_TABLE)
├── models.py          # Pydantic data models
├── utils.py           # Utility functions
│
├── api/               # API Layer - FastAPI routers
│   ├── __init__.py
│   ├── health.py      # GET / - Health check
│   ├── admin.py       # POST /admin/* - Admin controls
│   ├── submission.py  # POST /submit, GET /questions
│   ├── leaderboard.py # GET /api/leaderboard-data, UI routes
│   └── config.py      # GET /config
│
├── core/              # Core Business Logic
│   ├── __init__.py
│   ├── groundtruth.py # Load CSV → GT_TABLE
│   ├── normalizer.py  # Parse & normalize KIS/QA/TR
│   ├── scoring.py     # Score with tolerance + distance
│   └── session.py     # Question session management
│
├── services/          # Support Services
│   ├── __init__.py
│   └── fake_teams.py  # Generate fake leaderboard entries
```
### 1. API Layer (`app/api/`)

Handles HTTP requests and routes to core business logic.

Routers:

- `health.py`: Health check
  - `GET /` → Server status, version, question count
- `admin.py`: Admin management
  - `POST /admin/start-question` → Start question with timer
  - `POST /admin/stop-question` → Stop active question
  - `GET /admin/sessions` → List all sessions
  - `POST /admin/reset` → Reset all sessions
- `submission.py`: Team submissions
  - `POST /submit` → Submit answer (requires `teamSessionId` token)
  - `GET /questions` → List all available questions
- `leaderboard.py`: Leaderboard & UI
  - `GET /api/leaderboard-data` → JSON data for all questions
  - `GET /leaderboard-ui` → Serve HTML page
  - `GET /admin-dashboard` → Serve admin HTML
- `config.py`: Configuration
  - `GET /config` → Active question config + all questions
- `team.py`: Team registration
  - `POST /teams/register` → Register a team name and obtain a `team_session_id`
Design Pattern:

```python
# All routers use FastAPI's APIRouter
from fastapi import APIRouter, HTTPException

from app import state

router = APIRouter(prefix="/admin", tags=["admin"])

@router.post("/start-question")
async def start_question_endpoint(request: dict):
    question_id = request.get("question_id")
    # Access global state
    if question_id not in state.GT_TABLE:
        raise HTTPException(404, "Question not found")
    # ... business logic
```

### 2. Core Business Logic (`app/core/`)

The core layer contains all business logic, independent of HTTP/API concerns.
Modules:

**groundtruth.py** — Load ground truth data

```python
def load_groundtruth(csv_path: str) -> Dict[int, GroundTruth]:
    """Load CSV → Dict[question_id, GroundTruth]"""
    # Parse CSV, validate points (must be even-length)
    # Return GT_TABLE for global state
```

**normalizer.py** — Parse and normalize submissions

```python
def normalize_kis(body: dict, qid: int) -> NormalizedSubmission:
    """Parse KIS format → normalized values"""

def normalize_qa(body: dict, qid: int) -> NormalizedSubmission:
    """Parse QA-<answer>-<scene>_<video>-<ms1>,<ms2>"""
    # Validate: uppercase, no accents, no spaces

def normalize_tr(body: dict, qid: int) -> NormalizedSubmission:
    """Parse TR-<scene>_<video>-<frame1>,<frame2>"""
```

**scoring.py** — Score calculation with tolerance

```python
def score_submission(
    submission: NormalizedSubmission,
    gt: GroundTruth,
    elapsed_time: float,
    k: int,
    params: ScoringParams
) -> dict:
    """
    Calculate score with:
    - Tolerance-based matching (±2500ms KIS/QA, ±12 frames TR)
    - Distance-based quality (linear decay center → boundary)
    - Time factor (earlier = higher score)
    - Penalty (k wrong attempts)
    Returns: {score, correctness_factor, match_quality, ...}
    """
```

**session.py** — Question session management

```python
# Global session storage
active_questions: Dict[int, QuestionSession] = {}

def start_question(qid: int, time_limit: int, buffer: int) -> QuestionSession:
    """Start question timer, create session"""

def stop_question(qid: int):
    """Stop question, mark inactive"""

def is_question_active(qid: int) -> bool:
    """Check if within time limit + buffer"""

def record_submission(qid: int, team_id: str, is_correct: bool, score: float):
    """Track submission for team"""

def get_question_leaderboard(qid: int) -> List[dict]:
    """Get rankings for one question"""
```

Data Models (`app/models.py`):
```mermaid
classDiagram
class GroundTruth {
+int stt
+str type
+str scene_id
+str video_id
+List~int~ points
}
class NormalizedSubmission {
+int question_id
+str qtype
+str video_id
+str scene_id
+List~int~ values
}
class QuestionSession {
+int question_id
+float start_time
+int time_limit
+int buffer_time
+bool is_active
+Dict team_submissions
+Dict fake_teams
}
class TeamSubmission {
+str team_id
+int question_id
+int wrong_count
+int correct_count
+List submit_times
+bool is_completed
+float first_correct_time
+float final_score
}
class ScoringParams {
+int p_max
+int p_base
+int p_penalty
+int time_limit
+int buffer_time
}
QuestionSession --> TeamSubmission
```
**GroundTruth:**

- Represents one question from CSV
- `points`: Even-length list, each pair = 1 event

**NormalizedSubmission:**

- Unified format after normalization
- `values`: User-submitted values (ms or frame_id)
- Includes `scene_id` and `video_id` validation

**QuestionSession:**

- Server-controlled question timing
- Tracks all team submissions
- Manages active/inactive state
- Contains both real teams and fake teams

**TeamSubmission:**

- Per-team tracking within a question
- Records wrong attempts (`wrong_count`) and correct attempts (`correct_count`)
- Stores submission timing
- Stores final score after completion

**ScoringParams:**

- Competition scoring parameters
- `p_max=100`, `p_base=50`, `p_penalty=10`
- `time_limit=300` s, `buffer_time=10` s
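The ScoringParams defaults above map naturally onto a small dataclass. This is only a sketch to make the defaults concrete; per `app/models.py` the real models are Pydantic.

```python
from dataclasses import dataclass

@dataclass
class ScoringParams:
    p_max: int = 100       # maximum score for an instant, correct answer
    p_base: int = 50       # floor score for a correct answer at the deadline
    p_penalty: int = 10    # deducted per wrong attempt
    time_limit: int = 300  # seconds
    buffer_time: int = 10  # grace period in seconds

params = ScoringParams()
print(params.p_max - params.p_base)  # → 50, the time-dependent score band
```

Keeping the defaults on the model (rather than scattered through the scoring code) means the admin API only has to override the fields it cares about.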
### 3. Services (`app/services/`)

**fake_teams.py** — Generate realistic fake team data

The system now uses 20 real AIC 2025 team names instead of generic fake names:

```python
# Fake team pool used for leaderboard fillers
# Total: 19 fake teams + 1 real team = 20 teams on leaderboard
REAL_TEAM_NAMES = [
    "UIT@Dzeus", "TKU.TonNGoYsss", "UTE AI LAB",
    "HCMUS_DeepLearning_Team", "HCMIU-Beyond_Limits",
    "NLU_Knights", "HUST_ICT01", "Titan",
    "VGU_Guardian", "UIT_Together", "FPT_AI",
    "Zalo AI", "SOICT_BKAI", "VNUHCM_AI",
    "HCMUS_AIChallenge", "SGU_Warriors", "HUTECH_AI",
    "TDTU_MindSet", "UTH_Phoenix", "VNU_Hanoi_AI"
]
```

Functions:

```python
def generate_fake_teams(question_id: int, gt: GroundTruth,
                        p_max: int, time_limit: int) -> Dict[str, TeamSubmission]:
    """
    Generate 19 fake teams with realistic behavior:
    - Score distribution: 80-100 (15%), 60-80 (30%), 40-60 (35%), 0-40 (20%)
    - Submission rate: 85% teams submit, 15% don't
    - Attempts: 60% first try, 25% 1 wrong, 10% 2-3 wrong, 5% only wrong
    - Time variance: ±30% of time limit
    """
```

**leaderboard.py** — Assemble leaderboard data
```python
def assemble_leaderboard_data(gt_table: Dict) -> dict:
    """
    Aggregate data from all question sessions:
    - Merge real teams + fake teams
    - Calculate per-question scores
    - Calculate total scores
    - Sort by total (desc) → time (asc)
    - Return: {questions: [...], teams: [...]}
    """
```
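The sort described above (total score descending, ties broken by elapsed time ascending) can be sketched with a composite sort key. `total_time` is a hypothetical field name used for illustration:

```python
def rank_teams(teams: list) -> list:
    """Sort by total_score desc, breaking ties by total_time asc."""
    return sorted(teams, key=lambda t: (-t["total_score"], t["total_time"]))

teams = [
    {"team_name": "A", "total_score": 90.0, "total_time": 120.0},
    {"team_name": "B", "total_score": 95.0, "total_time": 200.0},
    {"team_name": "C", "total_score": 90.0, "total_time": 80.0},
]
print([t["team_name"] for t in rank_teams(teams)])  # → ['B', 'C', 'A']
```

Negating the score inside the key gives descending order for the first field while keeping ascending order for the tiebreaker, so a single `sorted()` call suffices.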
### 4. Global State (`app/state.py`)
Centralized global state accessible across all modules:
```python
from typing import Dict, Optional
# Global ground truth table
# Loaded at startup and accessible throughout the application
GT_TABLE: Optional[Dict] = None
```

Usage Pattern:

```python
# In main.py (startup)
from contextlib import asynccontextmanager

from fastapi import FastAPI

from app import state
from app.core.groundtruth import load_groundtruth

@asynccontextmanager
async def lifespan(app: FastAPI):
    state.GT_TABLE = load_groundtruth("data/groundtruth.csv")
    yield

# In any router or module
from app import state

@router.get("/questions")
async def list_questions():
    if not state.GT_TABLE:
        return {"questions": []}
    for qid, gt in state.GT_TABLE.items():
        ...  # use ground truth
```

Benefits:
- ✅ No dependency injection complexity
- ✅ Single source of truth
- ✅ Easy to access from any module
- ✅ FastAPI recommended pattern for global state
### 5. Session Management (`app/core/session.py`)

Server-controlled question timing and team tracking:
```mermaid
flowchart TD
A[Admin Starts Question] --> B[Create QuestionSession]
B --> B1[Generate 19 Fake Teams]
B --> C[Record start_time]
C --> D[Set time_limit + buffer]
D --> E[Session Active]
E --> F{Team Submits}
F --> G{First Submission?}
G -->|Yes| H[Create TeamSubmission]
G -->|No| I[Get Existing]
H --> J{Correct?}
I --> J
J -->|No| K[Increment wrong_count]
J -->|Yes| L[Record first_correct_time]
L --> M[Calculate Score]
M --> N[Mark Completed]
K --> O[Return Score 0]
```
Key Functions:

- `start_question()`: Admin starts timer
- `stop_question()`: Admin stops manually
- `is_question_active()`: Check if within time limit + buffer
- `get_elapsed_time()`: Time since start
- `record_submission()`: Track team attempts
- `get_question_leaderboard()`: Generate rankings
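The timer check can be sketched as below. Field names follow the QuestionSession model above; the explicit `now` parameter is added here purely for testability and is an assumption, not necessarily the real signature.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuestionSession:
    question_id: int
    start_time: float
    time_limit: int    # seconds
    buffer_time: int   # grace period in seconds
    is_active: bool = True

def is_question_active(session: QuestionSession, now: Optional[float] = None) -> bool:
    """A session accepts submissions until time_limit + buffer_time has elapsed."""
    now = time.time() if now is None else now
    elapsed = now - session.start_time
    return session.is_active and elapsed <= session.time_limit + session.buffer_time

s = QuestionSession(question_id=1, start_time=1000.0, time_limit=300, buffer_time=10)
print(is_question_active(s, now=1305.0))  # → True (within the 310 s window)
print(is_question_active(s, now=1311.0))  # → False
```

Because `stop_question()` simply flips `is_active`, an admin stop takes effect on the very next check regardless of the timer.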
### 6. Ground Truth Loading (`app/core/groundtruth.py`)

```mermaid
flowchart TD
A[Load CSV] --> B{Validate Format}
B -->|Invalid| C[Raise Error]
B -->|Valid| D[Parse Points with Dash]
D --> E{Even Count?}
E -->|No| C
E -->|Yes| F{Sorted?}
F -->|No| C
F -->|Yes| G[Create GroundTruth]
G --> H[Add to Table]
H --> I[Return Dict]
```
Validations:

- Points are dash-separated (`-`)
- Points count must be even
- Points must be sorted ascending
- All required fields present
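These validations can be sketched as a single parser; `parse_points` is a hypothetical helper name:

```python
from typing import List

def parse_points(raw: str) -> List[int]:
    """Parse a dash-separated points field, enforcing the rules above."""
    try:
        points = [int(p) for p in raw.split("-")]
    except ValueError:
        raise ValueError(f"points must be dash-separated integers: {raw!r}")
    if len(points) % 2 != 0:
        raise ValueError("points count must be even")
    if points != sorted(points):
        raise ValueError("points must be sorted ascending")
    return points

print(parse_points("4890-5000-5001-5020"))  # → [4890, 5000, 5001, 5020]
```

Failing fast at load time means a malformed CSV row is caught at startup rather than mid-competition.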
### 7. Normalizer (`app/core/normalizer.py`)

Converts different body formats to a unified `NormalizedSubmission`:
```mermaid
flowchart LR
A[Request Body + teamSessionId] --> B{Task Type?}
B -->|KIS| C[normalize_kis]
B -->|QA| D[normalize_qa]
B -->|TR| E[normalize_tr]
C --> F[NormalizedSubmission]
D --> F
E --> F
F --> G[Competition Scorer]
```
KIS Format:

- Multiple `answers` with `mediaItemName`, `start`, `end`
- Each answer represents one timestamp
- Must match all groundtruth timestamps exactly
QA Format:

- Single `text` with pattern: `QA-<ANSWER>-<VIDEO_ID>-<MS1>,<MS2>,...`
- Comma-separated milliseconds in one text
- Must match all groundtruth timestamps exactly
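The QA answer rules (uppercase, no accents, no spaces) could be checked as follows; `is_valid_qa_answer` is an illustrative name, not necessarily the server's helper:

```python
import unicodedata

def is_valid_qa_answer(answer: str) -> bool:
    """Uppercase, accent-free, and space-free, per the QA rules above."""
    if not answer or " " in answer or answer != answer.upper():
        return False
    # NFD decomposition splits accented letters into base + combining marks,
    # so any combining character means the answer carries an accent.
    decomposed = unicodedata.normalize("NFD", answer)
    return not any(unicodedata.combining(ch) for ch in decomposed)

print(is_valid_qa_answer("HANOI"))   # → True
print(is_valid_qa_answer("hanoi"))   # → False (lowercase)
print(is_valid_qa_answer("HÀ NỘI"))  # → False (space and accents)
```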
TR Format:

- Single `text` with pattern: `TR-<VIDEO_ID>-<FRAME_ID1>,<FRAME_ID2>,...`
- Comma-separated frame IDs in one text
- Supports partial matching (50-99% = half score)
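A parsing sketch for the TR pattern. This assumes the `TR-<scene>_<video>-<f1>,<f2>,...` shape given in the normalizer.py docstring earlier; the exact regex is an illustration, not the server's implementation.

```python
import re
from typing import List, Tuple

TR_PATTERN = re.compile(r"^TR-(?P<scene>[^_]+)_(?P<video>[^-]+)-(?P<frames>\d+(?:,\d+)*)$")

def parse_tr(text: str) -> Tuple[str, str, List[int]]:
    """Parse 'TR-<scene>_<video>-<f1>,<f2>,...' into (scene, video, frames)."""
    m = TR_PATTERN.match(text)
    if m is None:
        raise ValueError(f"invalid TR submission: {text!r}")
    return m["scene"], m["video"], [int(f) for f in m["frames"].split(",")]

print(parse_tr("TR-L26_V017-4999,5049"))  # → ('L26', 'V017', [4999, 5049])
```

Anchoring the regex with `^...$` rejects trailing garbage, so a malformed submission fails normalization instead of silently losing frame IDs.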
### 8. Scoring (`app/core/scoring.py`)

Scoring was completely rewritten for competition mode:
```mermaid
flowchart TD
A[Start Scoring] --> B[Parse GT Events]
B --> C[Check Exact Match]
C --> D{All Matched?}
D -->|Yes| E[Calculate Correctness Factor]
D -->|No| E
E --> F{Task Type?}
F -->|KIS/QA| G{100% Match?}
F -->|TR| H{Match Percentage?}
G -->|Yes| I[correctness = 1.0]
G -->|No| J[correctness = 0.0]
H -->|100%| I
H -->|50-99%| K[correctness = 0.5]
H -->|<50%| J
I --> L[Calculate Time Factor]
J --> M[Return Score 0]
K --> L
L --> N[fT = 1 - t/T]
N --> O[Apply Formula]
O --> P[Score = max 0, P_base + P_max-P_base × fT - k × P_penalty × correctness]
P --> Q[Return Final Score]
```
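The correctness branch of the diagram can be sketched directly:

```python
def calculate_correctness_factor(matched: int, total: int, task_type: str) -> float:
    """KIS/QA are all-or-nothing; TR gives half credit for 50-99% of events."""
    if total == 0:
        return 0.0
    ratio = matched / total
    if task_type in ("KIS", "QA"):
        return 1.0 if ratio == 1.0 else 0.0
    # TR: partial credit
    if ratio == 1.0:
        return 1.0
    if ratio >= 0.5:
        return 0.5
    return 0.0

print(calculate_correctness_factor(2, 2, "TR"))  # → 1.0
print(calculate_correctness_factor(1, 2, "TR"))  # → 0.5
print(calculate_correctness_factor(1, 2, "QA"))  # → 0.0
```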
Key Functions:

- `calculate_time_factor(t_submit, t_task)`
  - Formula: `fT(t) = 1 - (t_submit / T_task)`
  - Earlier = higher multiplier
- `check_exact_match(user_values, gt_events, task_type)`
  - No tolerance - must match exactly
  - Returns `(matched_count, total_events)`
- `calculate_correctness_factor(matched, total, task_type)`
  - KIS/QA: 100% or nothing
  - TR: 100% = 1.0, 50-99% = 0.5, <50% = 0.0
- `calculate_final_score(params, t_submit, k, correctness)`
  - Full competition formula
  - Returns `max(0, score)`

Per-event scores are then aggregated:

```mermaid
flowchart TD
K[Aggregate Scores] --> L{Aggregation?}
L -->|mean| M[Average]
L -->|min| N[Minimum]
L -->|sum| O[Sum]
M --> P[Final Score]
N --> P
O --> P
```
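Putting the formula together as a sketch. Note one assumption: the diagram's node text leaves it ambiguous whether `correctness` scales the whole expression or only the penalty term; the version below scales the whole expression, which is consistent with correctness = 0 returning a score of 0.

```python
def calculate_time_factor(t_submit: float, t_task: float) -> float:
    """fT(t) = 1 - t_submit / T_task, clamped to [0, 1]."""
    return max(0.0, min(1.0, 1.0 - t_submit / t_task))

def calculate_final_score(p_max: float, p_base: float, p_penalty: float,
                          t_submit: float, t_task: float,
                          k: int, correctness: float) -> float:
    """Score = max(0, (P_base + (P_max - P_base) * fT - k * P_penalty) * correctness)."""
    ft = calculate_time_factor(t_submit, t_task)
    raw = p_base + (p_max - p_base) * ft - k * p_penalty
    return max(0.0, raw * correctness)

# Instant correct answer, no wrong attempts → full marks
print(calculate_final_score(100, 50, 10, t_submit=0, t_task=300, k=0, correctness=1.0))    # → 100.0
# Correct at the deadline with two wrong attempts
print(calculate_final_score(100, 50, 10, t_submit=300, t_task=300, k=2, correctness=1.0))  # → 30.0
```

With the defaults (`p_max=100`, `p_base=50`), the time factor only moves the score within the 50-100 band; wrong attempts then subtract 10 each.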
**Key Functions:**
- `points_to_events()`: Converts `[p1,p2,p3,p4]` → `[(p1,p2), (p3,p4)]`
- `score_event_ms()`: Score for KIS/QA (milliseconds)
- `score_event_frame()`: Score for TR (frame_id)
- `score_submission()`: Main scoring orchestrator
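A sketch of the first two helpers. The linear tolerance decay shown in `score_event` is an approximation of the described behavior (full score inside the event window, decaying to zero at the tolerance boundary), not the exact implementation:

```python
from typing import List, Tuple

def points_to_events(points: List[int]) -> List[Tuple[int, int]]:
    """[p1, p2, p3, p4] → [(p1, p2), (p3, p4)]"""
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]

def score_event(value: int, start: int, end: int, tolerance: int, max_score: float) -> float:
    """Full score inside [start, end]; linear decay to 0 at the tolerance boundary."""
    if start <= value <= end:
        return max_score
    dist = (start - value) if value < start else (value - end)
    if dist >= tolerance:
        return 0.0
    return max_score * (1.0 - dist / tolerance)

print(points_to_events([4890, 5000, 5001, 5020]))  # → [(4890, 5000), (5001, 5020)]
print(score_event(5000, 4890, 5000, 2500, 100.0))  # → 100.0
print(score_event(6250, 4890, 5000, 2500, 100.0))  # → 50.0 (halfway into the tolerance band)
```

`score_event_ms()` and `score_event_frame()` would apply the same shape with a ±2500 ms and ±12 frame tolerance respectively.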
## Data Flow
### Complete Request Flow
```mermaid
sequenceDiagram
participant C as Client
participant API as FastAPI
participant SESS as Session Manager
participant GT as GT Loader
participant N as Normalizer
participant S as Scorer
C->>API: POST /submit
API->>SESS: Get active session
SESS-->>API: Session + timers
API->>GT: Get question GT
GT-->>API: GroundTruth object
API->>N: Normalize body
N-->>API: NormalizedSubmission
API->>S: Score submission
S->>S: Parse events
S->>S: Score each event
S->>S: Aggregate scores
S-->>API: Final score + details
API-->>C: JSON response
```

- Admin endpoints create/update `QuestionSession` objects inside `app.core.session`. `get_current_active_question_id()` resolves the latest active session (validated by timer).
- Scoring parameters live in `app.state.SCORING_PARAMS`, so updates only require tweaking state (no YAML reload).
- Because everything stays in memory, switching questions or resetting sessions takes effect instantly.
```
scoring-server/
├── app/
│   ├── __init__.py        # Package marker
│   ├── main.py            # FastAPI app, endpoints, CORS
│   ├── state.py           # Global GT table + scoring params
│   ├── models.py          # Pydantic data models
│   ├── core/
│   │   ├── groundtruth.py # CSV parser with validation
│   │   ├── normalizer.py  # Body format normalizers (KIS/QA/TR)
│   │   ├── scoring.py     # Core scoring algorithms
│   │   └── session.py     # Question/session management
│   ├── api/
│   │   ├── admin.py       # Admin controls
│   │   ├── config.py      # Runtime config snapshot
│   │   ├── leaderboard.py # Leaderboard + UI routes
│   │   ├── submission.py  # Submission endpoint
│   │   └── health.py      # Health check
│   ├── services/
│   │   └── fake_teams.py  # Fake team generator
│   └── utils.py           # Helper functions
├── data/
│   └── groundtruth.csv    # Question groundtruth data
├── static/                # Admin & leaderboard assets
├── tests/
│   ├── __init__.py
│   └── test_scoring.py    # Unit tests
├── docs/
│   ├── system-design.md   # This file
│   └── scoring-logic.md   # Scoring algorithm details
├── requirements.txt       # Python dependencies
├── Dockerfile             # Container build
└── README.md              # Quick start guide
```
## API Endpoints

### `GET /`

Purpose: Health check

Response:

```json
{
  "status": "ok",
  "message": "AIC 2025 Scoring Server",
  "version": "1.0.0",
  "total_questions": 5
}
```

### `GET /config`

Purpose: View current active question configuration
Response:

```json
{
  "active_question_id": 1,
  "type": "TR",
  "video_id": "V017",
  "scene_id": "L26",
  "num_events": 2,
  "fps": 25.0,
  "max_score": 100.0,
  "frame_tolerance": 12.0,
  "aggregation": "mean"
}
```

### `GET /questions`

Purpose: List all available questions
Response:

```json
{
  "questions": [
    {
      "id": 1,
      "type": "TR",
      "video_id": "V017",
      "scene_id": "L26",
      "num_events": 2
    }
  ]
}
```

### `POST /teams/register`

Purpose: Register a team and receive a submission token.
Request:

```json
{
  "team_name": "Zero9"
}
```

Response:

```json
{
  "team_id": "team-zero9-a1b2c3",
  "team_name": "Zero9",
  "team_session_id": "5957bb4a31c74aadac118cc55a570d24",
  "message": "Team registered. Keep your teamSessionId secret."
}
```

### `POST /submit`

Purpose: Submit an answer and get a score. The body must include the `teamSessionId` returned by `/teams/register`.
Request:

```json
{
  "teamSessionId": "5957bb4a31c74aadac118cc55a570d24",
  "answerSets": [{
    "answers": [
      { "mediaItemName": "K14_V026", "start": "370000", "end": "386000" }
    ]
  }]
}
```

Response:

```json
{
  "success": true,
  "question_id": 1,
  "type": "TR",
  "video_id": "V017",
  "score": 23.0,
  "detail": {
    "per_event_scores": [46.0, 0.0],
    "gt_events": [[4890, 5000], [5001, 5020]],
    "user_values": [4999, 5049],
    "aggregation_method": "mean",
    "num_gt_events": 2,
    "num_user_events": 2
  }
}
```

### Error Handling

```mermaid
flowchart TD
A[Request] --> B{Valid JSON?}
B -->|No| C[400 Bad Request]
B -->|Yes| D{Question Exists?}
D -->|No| E[400 Question Not Found]
D -->|Yes| F{Valid Format?}
F -->|No| G[400 Invalid Format]
F -->|Yes| H{Video Match?}
H -->|No| I[400 Video Mismatch]
H -->|Yes| J[Score Submission]
J --> K{Success?}
K -->|No| L[500 Internal Error]
K -->|Yes| M[200 OK]
```
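The chain above can be sketched as a sequence of checks, each returning an error tuple on the first failure. Field names and messages here are illustrative, not the server's exact ones:

```python
import json
from typing import Optional, Tuple

def validate_submission(raw_body: str, question_id: int,
                        gt_table: dict, expected_video: str) -> Optional[Tuple[int, str]]:
    """Return (status_code, message) for the first failed check, or None if valid."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, "Bad Request: invalid JSON"
    if question_id not in gt_table:
        return 400, "Question Not Found"
    if "answerSets" not in body:
        return 400, "Invalid Format"
    video = body.get("video_id")
    if video is not None and video != expected_video:
        return 400, "Video Mismatch"
    return None  # proceed to scoring

gt = {1: {"video_id": "V017"}}
print(validate_submission('{"answerSets": []}', 1, gt, "V017"))  # → None
print(validate_submission("not json", 1, gt, "V017"))            # → (400, 'Bad Request: invalid JSON')
```

Ordering the checks from cheapest to most specific means a malformed request never reaches the scorer, matching the 400-before-500 split in the diagram.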
Error Response Format:

```json
{
  "detail": "Error message here"
}
```

### Deployment

Development:

```shell
uvicorn app.main:app --reload
```

Production:

```shell
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
```

Docker:

```shell
docker build -t scoring-server .
docker run -p 8000:8000 scoring-server
```

Gunicorn (alternative):

```shell
gunicorn app.main:app -w 4 -k uvicorn.workers.UvicornWorker
```

### Performance Notes

- CSV Loading: Done once at startup, cached in memory
- Runtime Config: Stored in-process via `app.state.SCORING_PARAMS` (no YAML I/O)
- Scoring: O(n) where n = number of ground-truth events
- No Database: All in-memory for speed
- `app.state.SCORING_PARAMS` stores `p_max`, `p_base`, `p_penalty`, the default `time_limit`, and `buffer_time`.
- The Admin API can override `time_limit`/`buffer_time` per question when calling `/admin/start-question`.
- Updating scoring weights only requires tweaking the state (e.g., during application startup or via a maintenance endpoint).
```
static/
├── leaderboard.html # Main UI structure
├── leaderboard.css  # Styling and animations
└── leaderboard.js   # Auto-refresh logic
```
Key Features:

- Auto-refresh: Polls `/api/leaderboard-data` every 2 seconds
- Submission indicators:
  - ✅ Green checkmark = correct submission
  - ❌ Red X = wrong submission
  - Shows count of each type
- Score color coding:
  - High (80-100): Green gradient
  - Good (60-80): Light green
  - Medium (40-60): Amber/Yellow
  - Low (0-40): Red
- Team highlighting:
  - Registered teams use solid accents; fake teams use muted tones
  - All other teams = fake/simulated
- Rankings:
  - 🥇 Gold medal for 1st place
  - 🥈 Silver medal for 2nd place
  - 🥉 Bronze medal for 3rd place
```mermaid
sequenceDiagram
participant Browser
participant FastAPI
participant Session
participant Fake
Browser->>FastAPI: GET /leaderboard-ui
FastAPI->>Browser: HTML page
loop Every 2 seconds
Browser->>FastAPI: GET /api/leaderboard-data
FastAPI->>Session: Get all sessions
Session->>FastAPI: Real teams data
Session->>Fake: Get fake teams
Fake->>FastAPI: Fake teams data
FastAPI->>Browser: JSON (real + fake teams)
Browser->>Browser: Update table
end
```
Leaderboard data format:

```json
{
  "questions": [1, 2, 3, 4, 5],
  "teams": [
    {
      "team_name": "Example Team",
      "is_real": true,
      "questions": {
        "1": {
          "wrong_count": 1,
          "correct_count": 1,
          "score": 85.5
        },
        "2": {
          "wrong_count": 0,
          "correct_count": 1,
          "score": 92.0
        }
      },
      "total_score": 177.5
    },
    {
      "team_name": "CodeNinja",
      "is_real": false,
      "questions": {
        "1": {
          "wrong_count": 0,
          "correct_count": 1,
          "score": 88.3
        }
      },
      "total_score": 88.3
    }
  ]
}
```

Unit tests (`tests/test_scoring.py`):

- Test `points_to_events()` conversion
- Test scoring functions with various distances
- Test aggregation methods (mean/min/sum)
- Test edge cases (missing events, out of range)
- Test competition formula with time factors
- Test exact match logic
Manual testing with curl:

```shell
curl -X POST http://localhost:8000/submit \
  -H "Content-Type: application/json" \
  -d @test_submission.json
```

Future Enhancements:

- Database backend for persistent storage
- User authentication and session management
- Leaderboard functionality
- Detailed analytics and reporting
- WebSocket for real-time updates
- Admin UI for managing questions
Last Updated: 2025-11-07