A complete client-server system for real-time affect analysis in remote learning environments. The system combines gaze and posture cues through decision-level fusion to infer student emotions, aggregates the results across students, and provides classroom-level analytics.
- Client-Side Inference: Real-time gaze and posture analysis using pre-trained CNNs
- Decision-Level Fusion: Config-driven weighted majority voting for emotion inference
- Temporal Smoothing: Sliding-window smoothing for robust predictions
- HTTP Client-Server: RESTful API for data ingestion and analytics
- Aggregation & Analytics: Window → Student → Classroom aggregation pipeline
- Type-Safe Schemas: Pydantic schemas for validation across the system
- Research-Grade: Clean, modular, explainable architecture
┌─────────────────────────────────────────────────────────────┐
│ CLIENT SIDE │
├─────────────────────────────────────────────────────────────┤
│ Video Input → Frame Sampling → Inference (Parallel) │
│ ↓ ↓ ↓ │
│ Gaze CNN+SVM Posture CNN Temporal Smoothing │
│ ↓ ↓ ↓ │
│ Cue→Affect Mapping → Fusion Engine → HTTP Client │
└─────────────────────────────────────────────────────────────┘
↓ HTTP POST
┌─────────────────────────────────────────────────────────────┐
│ SERVER SIDE │
├─────────────────────────────────────────────────────────────┤
│ FastAPI Endpoints → Validation (Pydantic) │
│ ↓ │
│ Storage Service → Aggregation Service → Analytics Service │
│ ↓ │
│ JSON Persistence (Window/Student/Classroom) │
└─────────────────────────────────────────────────────────────┘
MAR/
├── client/ # Client-side pipeline
│ ├── preprocessing/
│ │ ├── frame_sampler.py # Frame sampling (no disk saving)
│ │ └── face_body_detector.py # MediaPipe detection (optional)
│ ├── inference/
│ │ ├── gaze_inference.py # Gaze CNN + SVM inference
│ │ └── posture_inference.py # Posture CNN inference
│ ├── temporal/
│ │ └── temporal_smoothing.py # Sliding-window smoothing
│ ├── fusion/
│ │ ├── cue_to_affect.py # Cue → emotion mapping
│ │ ├── fusion_engine.py # Decision-level fusion
│ │ ├── weighted_voting.py # Voting implementation
│ │ └── fusion_config.json # Fusion configuration
│ ├── packaging/
│ │ └── data_packager.py # JSON payload builder
│ ├── network/
│ │ └── http_client.py # HTTP client with retry logic
│ └── run_client_pipeline.py # Main pipeline runner
│
├── server/ # Server-side API & services
│ ├── main.py # FastAPI application
│ ├── services/
│ │ ├── aggregation.py # Window→Student→Classroom aggregation
│ │ └── analytics.py # Analytics computation
│ └── persistence/
│ └── storage.py # JSON file storage
│
├── shared/ # Shared schemas
│ └── schemas.py # Pydantic schemas
│
├── models/ # Trained models (inference only)
│ ├── gaze_cnn.pt
│ ├── gaze_svm.joblib
│ ├── posture_cnn.pt
│ └── posture_class_map.json
│
├── training/ # Training scripts (DO NOT MODIFY)
│ ├── train_gaze.py
│ ├── train_posture.py
│ └── load_dataset.py
│
├── outputs/ # Generated outputs
│ ├── client_jsons/ # Client-side JSONs
│ └── server/ # Server-side storage
│ ├── windows/ # Per-window data
│ ├── sessions/ # Per-session data
│ ├── students/ # Per-student summaries
│ └── classrooms/ # Per-classroom aggregates
│
├── requirements.txt # Python dependencies
└── README.md # This file
All data structures are validated using Pydantic schemas in shared/schemas.py:
Schema for a single cue's inference output:
{
"cue": "gaze",
"timestamp_sec": 40,
"prediction": "looking_at_screen",
"confidence": 0.92,
"quality": "good",
"emotion_distribution": {...}, # Optional
"mapping_quality": "mapped" # Optional
}
Schema for decision-level fusion result:
{
"timestamp_sec": 40,
"final_emotion": "interested",
"confidence": 0.51,
"emotion_scores": {...},
"contributing_cues": ["gaze", "posture"],
"fusion_type": "weighted_majority_voting"
}
Schema for per-window streaming payload:
{
"type": "window_update",
"class_id": "CS101",
"student_id": "student_001",
"session_id": "session_2025_01_01",
"timestamp_sec": 40,
"emotion": "interested",
"confidence": 0.51,
"emotion_scores": {...},
"fusion_type": "weighted_majority_voting"
}
Schema for end-of-session summary:
{
"type": "session_final",
"class_id": "CS101",
"student_id": "student_001",
"session_id": "session_2025_01_01",
"duration_sec": 350,
"total_windows": 15,
"emotion_distribution": {...},
"dominant_emotion": "interested",
"ended_at": 1704067200
}
Schema for classroom-level analytics:
{
"class_id": "CS101",
"total_students": 25,
"total_sessions": 25,
"total_windows": 375,
"emotion_distribution": {...},
"dominant_emotion": "interested",
"student_summaries": [...],
"temporal_trends": [...],
"generated_at": "2025-01-01T12:00:00"
}
The fusion layer is configured via client/fusion/fusion_config.json:
{
"emotions": ["interested", "bored", "confused", "frustrated", "neutral"],
"cue_weights": {
"gaze": 0.6,
"posture": 0.4
},
"confidence_threshold": 0.25,
"cue_to_emotion": {
"gaze": {
"looking_at_screen": {
"interested": 0.7,
"confused": 0.2,
"neutral": 0.1
},
"looking_away": {
"bored": 0.6,
"frustrated": 0.3,
"neutral": 0.1
}
},
"posture": {
"sitting_upright": {"interested": 0.7, "neutral": 0.3},
"writing": {"interested": 0.8, "neutral": 0.2},
"hands_on_face": {"confused": 0.6, "frustrated": 0.2, "neutral": 0.2},
"slouching": {"bored": 0.6, "frustrated": 0.2, "neutral": 0.2}
}
}
}
For each emotion \(e\), the fusion score is:
\[ \text{score}(e) = \sum_{\text{cue}} w_{\text{cue}} \times \text{conf}_{\text{cue}} \times P(e \mid \text{cue}) \]
Where:
- \(w_{\text{cue}}\) = cue weight from config
- \(\text{conf}_{\text{cue}}\) = cue confidence
- \(P(e \mid \text{cue})\) = emotion probability from cue_to_emotion
Final emotion = \(\arg\max_e \text{score}(e)\)
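As a worked illustration of the score formula, the sketch below computes the fusion scores for one window. It is not the code in fusion_engine.py: the function name `fuse`, the input shape (cue name mapped to a prediction and confidence), the choice to drop cues below `confidence_threshold` entirely, and the fallback to "neutral" are all assumptions made for the example. The weights and mappings are copied from the configuration excerpt.

```python
from typing import Dict, Tuple

# Values copied from the fusion_config.json excerpt (trimmed to two predictions).
CONFIG = {
    "cue_weights": {"gaze": 0.6, "posture": 0.4},
    "confidence_threshold": 0.25,
    "cue_to_emotion": {
        "gaze": {"looking_at_screen": {"interested": 0.7, "confused": 0.2, "neutral": 0.1}},
        "posture": {"sitting_upright": {"interested": 0.7, "neutral": 0.3}},
    },
}


def fuse(cues: Dict[str, Tuple[str, float]], config: dict) -> Tuple[str, Dict[str, float]]:
    """Hypothetical helper: cues maps cue name -> (prediction, confidence)."""
    scores: Dict[str, float] = {}
    for cue, (prediction, conf) in cues.items():
        if conf < config["confidence_threshold"]:
            continue  # assumption: low-confidence cues are skipped entirely
        weight = config["cue_weights"][cue]
        distribution = config["cue_to_emotion"][cue].get(prediction, {})
        for emotion, prob in distribution.items():
            # score(e) += w_cue * conf_cue * P(e | cue)
            scores[emotion] = scores.get(emotion, 0.0) + weight * conf * prob
    if not scores:
        return "neutral", scores  # assumption: fall back to neutral when nothing contributes
    return max(scores, key=scores.get), scores


final_emotion, emotion_scores = fuse(
    {"gaze": ("looking_at_screen", 0.92), "posture": ("sitting_upright", 0.80)},
    CONFIG,
)
print(final_emotion, emotion_scores)
```

With these inputs, "interested" receives 0.6 × 0.92 × 0.7 + 0.4 × 0.80 × 0.7 = 0.6104, which is the largest score, so it is selected.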
- Frame Sampling: Extract frames every 20 seconds (configurable)
- Parallel Inference: Gaze and posture inference run concurrently
- Temporal Smoothing: Sliding-window majority voting
- Cue→Affect Mapping: Map predictions to emotion distributions
- Fusion: Weighted majority voting
- Packaging: Build JSON payloads
- HTTP Transmission: Send to server (optional)
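The temporal smoothing step in the list above can be pictured as a majority vote over the most recent predictions. The sketch below is a minimal illustration, not the contents of temporal_smoothing.py: the class name `MajoritySmoother` is hypothetical, and it works on bare label strings rather than full cue outputs.

```python
from collections import Counter, deque


class MajoritySmoother:
    """Hypothetical sliding-window majority vote over the last N labels."""

    def __init__(self, window_size: int = 3) -> None:
        # deque with maxlen discards the oldest label automatically
        self._recent = deque(maxlen=window_size)

    def update(self, label: str) -> str:
        """Add the newest label and return the most frequent label in the window."""
        self._recent.append(label)
        return Counter(self._recent).most_common(1)[0][0]


labels = [
    "looking_at_screen", "looking_at_screen", "looking_away",
    "looking_at_screen", "looking_away", "looking_away",
]
smoother = MajoritySmoother(window_size=3)
smoothed = [smoother.update(label) for label in labels]
print(smoothed)
```

In this run the isolated "looking_away" at the third position is voted down by its neighbours, and the output only switches once "looking_away" holds the majority of the window.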
Located in client/network/http_client.py:
from client.network.http_client import AffectAnalysisClient
client = AffectAnalysisClient(
base_url="http://localhost:8000",
max_retries=3,
timeout=10
)
# Send window update
client.send_window(window_payload, validate=True)
# Send session summary
client.send_session(session_payload, validate=True)
# Health check
if client.health_check():
    print("Server is reachable")
- Automatic Retry: Exponential backoff for failed requests
- Validation: Optional Pydantic schema validation
- Error Handling: Comprehensive logging and error reporting
- Health Check: Server connectivity verification
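The automatic retry feature listed above can be sketched as follows. This is a generic illustration of exponential backoff, not the logic in http_client.py: the helper name `call_with_backoff`, the one-second base delay, and the choice to retry only on `ConnectionError` are assumptions. The `sleep` parameter is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: run operation, retrying with delays of 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_retries must be non-negative")


# Demo: an operation that fails twice, then succeeds.
delays: List[float] = []
attempts = {"count": 0}


def flaky_send() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("server not reachable")
    return "ok"


result = call_with_backoff(flaky_send, sleep=delays.append)
print(result, delays)
```

Doubling the delay after each failure gives a briefly unavailable server time to recover without flooding it with immediate retries.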
Receive per-window streaming data.
Request Body: WindowPayload
Response:
{
"status": "success",
"message": "Window data ingested",
"timestamp_sec": 40
}
Receive end-of-session batch data.
Request Body: SessionPayload
Response:
{
"status": "success",
"message": "Session data ingested",
"session_id": "session_2025_01_01"
}
Get aggregated classroom analytics.
Response: ClassroomAnalytics
Example:
curl http://localhost:8000/analytics/classroom/CS101
Health check endpoint.
Response:
{
"status": "healthy",
"timestamp": "2025-01-01T12:00:00"
}
# From project root
python -m server.main
# Or with uvicorn directly
uvicorn server.main:app --host 0.0.0.0 --port 8000
The server will:
- Validate all incoming data using Pydantic schemas
- Store data in structured JSON files
- Trigger aggregation automatically
- Provide analytics endpoints
Location: server/services/aggregation.py
Functions:
- aggregate_student_session(): Window → Student aggregation
- aggregate_classroom(): Student → Classroom aggregation
Process:
- Load all windows for a student session
- Compute emotion distribution
- Calculate average confidence
- Identify dominant emotion
- Save student summary
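The process above can be sketched in a few lines. This is an illustration of the computation only, not the body of aggregate_student_session(): the function name `summarize_windows` and the output key `average_confidence` are assumptions, and loading and saving files are left out. The sample windows reuse the emotion and confidence fields of the window payload.

```python
from collections import Counter
from typing import List


def summarize_windows(windows: List[dict]) -> dict:
    """Hypothetical helper: reduce one session's window records to a summary."""
    if not windows:
        raise ValueError("no windows to aggregate")
    counts = Counter(w["emotion"] for w in windows)
    total = len(windows)
    return {
        "total_windows": total,
        # share of windows assigned to each emotion
        "emotion_distribution": {e: n / total for e, n in counts.items()},
        "average_confidence": sum(w["confidence"] for w in windows) / total,
        "dominant_emotion": counts.most_common(1)[0][0],
    }


windows = [
    {"timestamp_sec": 0, "emotion": "interested", "confidence": 0.51},
    {"timestamp_sec": 20, "emotion": "interested", "confidence": 0.48},
    {"timestamp_sec": 40, "emotion": "bored", "confidence": 0.35},
]
summary = summarize_windows(windows)
print(summary)
```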
Location: server/services/analytics.py
Functions:
compute_classroom_analytics(): Comprehensive classroom analytics
Outputs:
- Aggregated emotion distributions
- Per-student summaries
- Temporal trends over time
- Dominant classroom emotion
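One plausible way to produce the aggregated emotion distribution and dominant classroom emotion is to merge per-student summaries, weighting each student by the number of windows they contributed. The sketch below assumes that weighting. Whether compute_classroom_analytics() weights by window count or averages students equally is not stated here, and the function name `merge_student_summaries` is hypothetical.

```python
from typing import Dict, List


def merge_student_summaries(summaries: List[dict]) -> dict:
    """Hypothetical helper: window-weighted merge of per-student distributions."""
    total_windows = sum(s["total_windows"] for s in summaries)
    if total_windows == 0:
        raise ValueError("no windows to aggregate")
    weighted: Dict[str, float] = {}
    for s in summaries:
        for emotion, share in s["emotion_distribution"].items():
            # convert each share back into a window count before summing
            weighted[emotion] = weighted.get(emotion, 0.0) + share * s["total_windows"]
    distribution = {e: v / total_windows for e, v in weighted.items()}
    return {
        "total_students": len({s["student_id"] for s in summaries}),
        "total_windows": total_windows,
        "emotion_distribution": distribution,
        "dominant_emotion": max(distribution, key=distribution.get),
    }


classroom = merge_student_summaries([
    {"student_id": "student_001", "total_windows": 10,
     "emotion_distribution": {"interested": 0.8, "bored": 0.2}},
    {"student_id": "student_002", "total_windows": 30,
     "emotion_distribution": {"interested": 0.2, "bored": 0.8}},
])
print(classroom)
```

In this example an unweighted average of the two students would give a 50/50 split, while the window-weighted merge reports "bored" as dominant because the second student contributed three times as many windows.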
Location: server/persistence/storage.py
Storage Structure:
outputs/server/
├── windows/
│ └── {class_id}/
│ └── {student_id}/
│ └── window_{timestamp}.json
├── sessions/
│ └── {class_id}/
│ └── {student_id}/
│ └── {session_id}.json
├── students/
│ └── {class_id}/
│ └── {student_id}/
│ └── {session_id}.json
└── classrooms/
└── {class_id}.json
- Windows: Individual time-window records
- Sessions: End-of-session summaries
- Students: Aggregated student summaries
- Classrooms: Aggregated classroom statistics
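The directory layout above maps directly onto path-building helpers. The sketch below is illustrative and not taken from storage.py: the function names are hypothetical, and the six-digit zero padding of the window timestamp is inferred from the example file name window_000000.json rather than confirmed.

```python
from pathlib import PurePosixPath

BASE_DIR = PurePosixPath("outputs/server")


def window_path(class_id: str, student_id: str, timestamp_sec: int) -> PurePosixPath:
    # assumption: timestamps are zero-padded to six digits
    return BASE_DIR / "windows" / class_id / student_id / f"window_{timestamp_sec:06d}.json"


def session_path(class_id: str, student_id: str, session_id: str) -> PurePosixPath:
    return BASE_DIR / "sessions" / class_id / student_id / f"{session_id}.json"


def student_path(class_id: str, student_id: str, session_id: str) -> PurePosixPath:
    return BASE_DIR / "students" / class_id / student_id / f"{session_id}.json"


def classroom_path(class_id: str) -> PurePosixPath:
    return BASE_DIR / "classrooms" / f"{class_id}.json"


print(window_path("CS101", "student_001", 40))
print(classroom_path("CS101"))
```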
All data is stored as structured JSON files, making it:
- Dashboard-ready
- Easy to query
- Research-friendly
- Python 3.10+
- Trained model files in the models/ directory
pip install -r requirements.txt
Key Dependencies:
- torch: PyTorch for CNN inference
- scikit-learn: SVM inference
- opencv-python: Video processing
- mediapipe: Face/body detection
- pydantic: Schema validation
- fastapi: API server
- uvicorn: ASGI server
- requests: HTTP client
Ensure these files exist:
- models/gaze_cnn.pt
- models/gaze_svm.joblib
- models/posture_cnn.pt
- models/posture_class_map.json
Terminal 1:
# Start FastAPI server
python -m server.main
# Server will start at http://localhost:8000
# You should see:
# INFO: Started server process
# INFO: Waiting for application startup.
# INFO: Application startup complete.
Verify server is running:
curl http://localhost:8000/health
# Expected: {"status":"healthy","timestamp":"..."}
Terminal 2:
# Run client pipeline (with network enabled)
python -m client.run_client_pipeline
# Or disable network (local-only mode)
ENABLE_NETWORK=false python -m client.run_client_pipeline
What happens:
- One frame is sampled from the video every 20 seconds
- Gaze and posture inference run in parallel
- Temporal smoothing is applied
- Fusion computes final emotions
- Window payloads are sent to server (if enabled)
- Session summary is sent at the end (if enabled)
- Local JSON files are saved to outputs/client_jsons/
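The first step in the list above, sampling at a fixed interval, amounts to choosing which frame indices to decode. The sketch below shows only that arithmetic and is not the FrameSampler implementation: the function name `sample_frame_indices` is hypothetical, and it takes the frame count and frame rate as plain numbers instead of reading them from a video file.

```python
from typing import List, Tuple


def sample_frame_indices(
    total_frames: int, fps: float, interval_sec: float = 20.0
) -> List[Tuple[int, float]]:
    """Hypothetical helper: return (frame_index, timestamp_sec) for each sample."""
    if fps <= 0 or interval_sec <= 0:
        raise ValueError("fps and interval_sec must be positive")
    samples: List[Tuple[int, float]] = []
    k = 0
    while True:
        timestamp = k * interval_sec
        index = int(round(timestamp * fps))
        if index >= total_frames:
            break  # past the end of the video
        samples.append((index, timestamp))
        k += 1
    return samples


# A 60-second clip at 25 fps has 1500 frames.
samples = sample_frame_indices(total_frames=1500, fps=25.0, interval_sec=20.0)
print(samples)
```

For the 60-second clip this yields three samples, at 0, 20, and 40 seconds. The 60-second mark falls on frame index 1500, which is one past the last frame, so it is not sampled.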
Expected Output:
Connected to server: http://localhost:8000
[ 0s] Emotion=interested Conf=0.51
[ 20s] Emotion=interested Conf=0.48
[ 40s] Emotion=bored Conf=0.35
...
Session complete.
Final dominant emotion: interested
Check server logs (Terminal 1):
INFO: Window ingested: class=CS101, student=student_001, timestamp=0
INFO: Window ingested: class=CS101, student=student_001, timestamp=20
...
INFO: Session ingested: class=CS101, student=student_001, session=session_2025_01_01
Check stored data:
# List stored windows
ls -R outputs/server/windows/
# View a window file
cat outputs/server/windows/CS101/student_001/window_000000.json
# View session summary
cat outputs/server/sessions/CS101/student_001/session_2025_01_01.json
# View student summary
cat outputs/server/students/CS101/student_001/session_2025_01_01.json
# View classroom aggregate
cat outputs/server/classrooms/CS101.json
Get classroom analytics:
curl http://localhost:8000/analytics/classroom/CS101 | python -m json.tool
Expected Response:
{
"class_id": "CS101",
"total_students": 1,
"total_sessions": 1,
"total_windows": 15,
"emotion_distribution": {
"interested": 0.6,
"bored": 0.1,
"confused": 0.1,
"frustrated": 0.05,
"neutral": 0.15
},
"dominant_emotion": "interested",
"student_summaries": [...],
"temporal_trends": [...],
"generated_at": "2025-01-01T12:00:00"
}
Simulate multiple students:
# test_multiple_students.py
from client.run_client_pipeline import run_pipeline
import os
# Student 1
os.environ["STUDENT_ID"] = "student_001"
os.environ["SESSION_ID"] = "session_001"
run_pipeline("data/video1.mp4")
# Student 2
os.environ["STUDENT_ID"] = "student_002"
os.environ["SESSION_ID"] = "session_002"
run_pipeline("data/video2.mp4")
# Query classroom analytics
import requests
response = requests.get("http://localhost:8000/analytics/classroom/CS101")
print(response.json())
Test server offline:
# Stop server, then run client
ENABLE_NETWORK=true python -m client.run_client_pipeline
# Expected: Warning message, continues without network
# Local JSON files still saved
Test invalid payload:
# test_invalid_payload.py
import requests
# Send invalid payload
response = requests.post(
"http://localhost:8000/ingest/window",
json={"invalid": "data"}
)
print(response.status_code) # Expected: 422 (validation error)
print(response.json())

sampler = FrameSampler(video_path, interval_sec=20)
for frame, timestamp_sec in sampler:
    ...  # Process frame

gaze = GazeInference("models/gaze_cnn.pt", "models/gaze_svm.joblib")
result = gaze.infer(frame, timestamp_sec)

posture = PostureInference("models/posture_cnn.pt", "models/posture_class_map.json")
result = posture.infer(frame, timestamp_sec)

smoother = TemporalSmoother(window_size=3)
smoothed = smoother.update(cue_output)

engine = FusionEngine("client/fusion/fusion_config.json")
fusion_result = engine.fuse(cues={"gaze": ..., "posture": ...}, timestamp_sec=40)

client = AffectAnalysisClient(base_url="http://localhost:8000")
client.send_window(window_payload)
client.send_session(session_payload)

- Content-Type: application/json
- Body: WindowPayload schema
- Status Codes: 201 (success), 422 (validation error), 500 (server error)

- Content-Type: application/json
- Body: SessionPayload schema
- Status Codes: 201 (success), 422 (validation error), 500 (server error)

- Response: ClassroomAnalytics schema
- Status Codes: 200 (success), 404 (not found), 500 (server error)

- Response: {"status": "healthy", "timestamp": "..."}
- Status Code: 200
Edit client/run_client_pipeline.py:
VIDEO_PATH = "data/sample_video.mp4"
FRAME_INTERVAL_SEC = 20 # Frame sampling interval
OUTPUT_DIR = "outputs/client_jsons"
STUDENT_ID = "student_001"
CLASS_ID = "CS101"
SESSION_ID = "session_2025_01_01"
SERVER_URL = "http://localhost:8000"
ENABLE_NETWORK = True
Edit server/persistence/storage.py:
base_dir = "outputs/server" # Change storage location
Edit client/fusion/fusion_config.json:
- Adjust cue_weights to change cue importance
- Modify cue_to_emotion mappings to change semantic interpretations
- Change confidence_threshold to filter low-confidence cues
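After editing the fusion configuration by hand, a quick consistency check can catch mistakes before a run. The sketch below is a suggestion rather than part of the project: the function name `check_fusion_config` is hypothetical, and the rule that every mapped distribution should sum to 1 is an assumption based on the values in the shipped configuration, all of which do.

```python
import copy
import json
import math
from typing import List

CONFIG_TEXT = """
{
  "emotions": ["interested", "bored", "confused", "frustrated", "neutral"],
  "cue_weights": {"gaze": 0.6, "posture": 0.4},
  "confidence_threshold": 0.25,
  "cue_to_emotion": {
    "gaze": {
      "looking_at_screen": {"interested": 0.7, "confused": 0.2, "neutral": 0.1},
      "looking_away": {"bored": 0.6, "frustrated": 0.3, "neutral": 0.1}
    },
    "posture": {
      "slouching": {"bored": 0.6, "frustrated": 0.2, "neutral": 0.2}
    }
  }
}
"""


def check_fusion_config(config: dict) -> List[str]:
    """Hypothetical helper: return a list of human-readable problems (empty if none)."""
    problems: List[str] = []
    emotions = set(config["emotions"])
    for cue, predictions in config["cue_to_emotion"].items():
        if cue not in config["cue_weights"]:
            problems.append(f"{cue}: no entry in cue_weights")
        for prediction, distribution in predictions.items():
            unknown = sorted(set(distribution) - emotions)
            if unknown:
                problems.append(f"{cue}.{prediction}: unknown emotions {unknown}")
            # assumption: each distribution is meant to sum to 1
            if not math.isclose(sum(distribution.values()), 1.0, abs_tol=1e-9):
                problems.append(f"{cue}.{prediction}: probabilities do not sum to 1")
    return problems


config = json.loads(CONFIG_TEXT)
broken = copy.deepcopy(config)
broken["cue_to_emotion"]["posture"]["slouching"]["bored"] = 0.4  # now sums to 0.8

print(check_fusion_config(config))
print(check_fusion_config(broken))
```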
Error: Address already in use
Solution:
# Change port
uvicorn server.main:app --port 8001
# Or kill existing process
# Windows: netstat -ano | findstr :8000
# Linux/Mac: lsof -i :8000
Error: FileNotFoundError: models/gaze_cnn.pt
Solution: Ensure all model files exist in the models/ directory:
- gaze_cnn.pt
- gaze_svm.joblib
- posture_cnn.pt
- posture_class_map.json
Error: 422 Unprocessable Entity
Solution: Check payload structure matches Pydantic schemas:
from shared.schemas import WindowPayload
payload = WindowPayload(**your_dict) # Will raise ValidationError if invalid
Error: Connection refused
Solution:
- Verify server is running: curl http://localhost:8000/health
- Check firewall settings
- Verify SERVER_URL in client configuration
- Use ENABLE_NETWORK=false to run in local-only mode
- Create inference module in client/inference/
- Add cue to fusion_config.json: "cue_weights": { "gaze": 0.5, "posture": 0.3, "new_cue": 0.2 }
- Add mappings in cue_to_emotion
- Update pipeline to include new cue

- Update fusion_config.json: "emotions": ["interested", "bored", "confused", "frustrated", "neutral", "excited"]
- Update emotion distributions in cue_to_emotion
- Update Pydantic schemas in shared/schemas.py
Edit server/services/aggregation.py:
- Modify aggregate_student_session() for student-level logic
- Modify aggregate_classroom() for classroom-level logic
Edit server/services/analytics.py:
- Add new metrics to compute_classroom_analytics()
- Create new endpoint in server/main.py
Every window JSON includes:
- Raw cue predictions and confidences
- Mapped emotion distributions
- Fusion scores for all emotions
- Contributing cues list
This enables:
- Debugging fusion decisions
- Analyzing cue contributions
- Research into fusion strategies
The system is deterministic:
- Same inputs → same outputs
- No randomness in fusion or aggregation
- Reproducible results for research
Each component is independent:
- Can swap fusion strategies
- Can change aggregation logic
- Can add new inference modules
- No monolithic dependencies
This system is designed for research purposes. All ML models are inference-only and should not be retrained or modified.
Technologies Used:
- PyTorch (inference)
- scikit-learn (SVM inference)
- OpenCV (video processing)
- MediaPipe (detection)
- FastAPI (server)
- Pydantic (validation)
For issues or questions:
- Check troubleshooting section (Section 11)
- Review API documentation (Section 9)
- Inspect server logs for detailed error messages
- Verify all dependencies are installed correctly
Last Updated: 2025-01-01