Open.Orleans Reference Chat System – Phase 1 Architecture Specification

Executive Summary

Open.Orleans is a reference-grade demonstration of how Microsoft Orleans and SignalR can be combined to build a distributed, real-time chat platform that prioritizes immutability, scalability, and minimal processing overhead.
This project showcases best practices for event-sourced design and horizontal scaling in .NET 8, while remaining lightweight enough to run locally.
It is intended as the canonical example of how to structure a production-style Orleans + SignalR system with file-based persistence and REST-driven retrieval.


1 Project Overview

Vision Statement

Deliver a real-time distributed chat system that demonstrates Orleans’ virtual-actor model and SignalR’s real-time communication, using immutable events, event replay, and simple REST-based persistence.

Primary Goals

  • Educational Clarity – Serve as the definitive example of Orleans + SignalR integration.
  • Production-Minded Design – Use clean boundaries, testability, and observability.
  • Minimal Overhead – Prove that modern .NET can achieve responsive real-time UX on modest hardware.
  • Scalable Architecture – Operate across multiple Orleans silos with no message loss.
  • Immutability Everywhere – All state changes are immutable events, enabling auditability and replay.

Target Audience

Enterprise architects, senior .NET developers, and Orleans contributors seeking a concise, working reference implementation.


2 Functional Requirements

FR-001 Core Messaging

  • Send, receive, and persist text messages.
  • Each message creates an immutable event in storage.
  • Edits and deletes are additional events.
  • History retrieval is performed via REST API endpoints that read from file/blob storage.
  • SignalR delivers only live events (new messages, edits, deletes).
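
A minimal sketch of the append-only model described above, written in Python as a language-neutral illustration rather than Orleans/C# code. The field names (`message_id`, `room_id`, `author`, `text`, `new_text`) and the `EventLog` class are assumptions for illustration; the spec names the Created/Edited/Deleted events but not their shape.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical event shapes: frozen dataclasses cannot be modified after creation.
@dataclass(frozen=True)
class MessageCreated:
    message_id: str
    room_id: str
    author: str
    text: str


@dataclass(frozen=True)
class MessageEdited:
    message_id: str
    new_text: str


@dataclass(frozen=True)
class MessageDeleted:
    message_id: str


class EventLog:
    """Append-only log: events can be added and read, never changed or removed."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def events(self):
        return tuple(self._events)  # read-only snapshot of the log


def current_text(log: EventLog, message_id: str) -> Optional[str]:
    """Fold one message's events into its current text (None if deleted or unknown)."""
    text = None
    for ev in log.events():
        if ev.message_id != message_id:
            continue
        if isinstance(ev, MessageCreated):
            text = ev.text
        elif isinstance(ev, MessageEdited) and text is not None:
            text = ev.new_text
        elif isinstance(ev, MessageDeleted):
            text = None
    return text


log = EventLog()
log.append(MessageCreated("m1", "lobby", "ana", "helo"))
log.append(MessageEdited("m1", "hello"))  # the edit is a second event, not an overwrite
```

The original `MessageCreated` event still holds the first text after the edit, which is what makes the audit trail and replay possible.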

FR-002 User Management

  • Simplified JWT-based authentication.
  • Presence (online/offline/typing) tracked by User Grain.
  • Multi-device connections supported.
  • REST endpoint for basic profile data.
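
One way to reconcile presence tracking with multi-device connections is to treat a user as online while at least one connection is open, and to emit a presence event only when that status flips. The sketch below is in Python for illustration only; `UserPresence`, its method names, and the event dictionary shape are assumptions, not the UserGrain API.

```python
class UserPresence:
    """Tracks open connections for one user and reports online/offline transitions."""

    def __init__(self, user_id):
        self.user_id = user_id
        self._connections = set()

    @property
    def online(self):
        return bool(self._connections)

    def connect(self, connection_id):
        was_online = self.online
        self._connections.add(connection_id)
        return self._transition(was_online)

    def disconnect(self, connection_id):
        was_online = self.online
        self._connections.discard(connection_id)
        return self._transition(was_online)

    def _transition(self, was_online):
        # Only a change in overall status produces an event; a second device
        # connecting, or one of two devices dropping, produces nothing.
        if was_online == self.online:
            return None
        return {"type": "PresenceChanged", "user_id": self.user_id, "online": self.online}
```

With this shape, opening a laptop session while a phone session is already active would not trigger a redundant broadcast.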

FR-003 Room Management

  • Create/join/leave chat rooms.
  • Manage membership and roles.
  • Private DM channels supported.

FR-004 Message History & Pagination

  • Infinite scroll via REST pagination (JSON responses).
  • Optional message search.
  • Local caching of scroll windows.
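
Infinite scroll is commonly served with cursor-based paging, where each response carries the cursor for the next older page. The following Python sketch assumes messages carry a monotonically increasing `seq` field and that the endpoint accepts `before` and `limit` parameters; neither is defined by this spec.

```python
def get_page(messages, before=None, limit=50):
    """Return up to `limit` messages older than sequence `before`, newest first.

    `messages` is a list of dicts sorted by ascending "seq" (an assumed field).
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    older = [m for m in messages if before is None or m["seq"] < before]
    page = older[-limit:][::-1]          # newest of the remaining messages first
    has_more = len(older) > len(page)
    next_before = page[-1]["seq"] if has_more else None
    return {"items": page, "next_before": next_before}
```

A client would pass `next_before` from one response as `before` in the next request, and stop when it comes back as `None`.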

FR-005 Real-Time Indicators

  • Typing indicators, read receipts, presence updates.
  • Live events published via SignalR.

3 Non-Functional Requirements

3.1 Performance Targets

  • Demonstrate < 100 ms end-to-end message latency on a local cluster (not a production guarantee).
  • Sustain 1000 concurrent local clients.
  • Provide benchmark metrics, not rigid quotas.

3.2 Scalability

  • Operate across 3 Orleans silos.
  • Recover gracefully from silo restart.
  • Stateless SignalR hubs balanced across instances.

3.3 Reliability & Consistency

  • No message loss under normal conditions.
  • Eventual consistency acceptable.
  • Complete audit trail for all user actions.

3.4 Observability & Diagnostics

  • Structured logging for grains & hubs.
  • Correlation IDs from REST request → Orleans stream → SignalR broadcast.
  • Metrics endpoint exposing grain activations, message latency, and queue depth.
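
The correlation requirement amounts to assigning one ID at the REST boundary and attaching it to every structured log record along the path. The Python sketch below illustrates the idea only; the stage names and the `handle_send` function are hypothetical, and a .NET implementation would more likely rely on logging scopes or `System.Diagnostics.Activity`.

```python
import contextvars
import json
import uuid

# Holds the correlation ID for the current logical request.
correlation_id = contextvars.ContextVar("correlation_id", default=None)


def log_event(stage, sink, **fields):
    """Append one structured (JSON) log line tagged with the current correlation ID."""
    record = {"stage": stage, "correlation_id": correlation_id.get(), **fields}
    sink.append(json.dumps(record, sort_keys=True))


def handle_send(text, sink, incoming_id=None):
    """Hypothetical request path: REST request -> stream publish -> SignalR broadcast."""
    token = correlation_id.set(incoming_id or str(uuid.uuid4()))
    try:
        log_event("rest_request", sink, text_length=len(text))
        log_event("stream_publish", sink)
        log_event("signalr_broadcast", sink)
        return correlation_id.get()
    finally:
        correlation_id.reset(token)  # do not leak the ID into the next request
```

Filtering the log output by a single `correlation_id` value then yields the full trace of one message from request to broadcast.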

3.5 Demo Constraints

  • Runs locally or on lightweight cloud VMs (1–3 nodes).
  • External deps optional (Redis for cache, otherwise file storage).
  • One-command startup: dotnet run.
  • Self-contained; no external DB required.

4 High-Level Architecture

4.1 Data Flow Overview

  1. Client → REST API: retrieves historical messages (JSON files).
  2. Client → SignalR: subscribes for live events.
  3. Orleans Grains: produce immutable events and update file-based storage.
  4. Stream Manager Grain: publishes events to SignalR hubs.

4.2 Component Diagram (Simplified)


Blazor Client
   │
   ├─► REST API (File/Blob Reads)
   │
   └─► SignalR (WebSocket for Live Events)
          │
          ▼
   Orleans Cluster (3 Silos)
          │
          ▼
   File/Blob Storage (JSON event logs)

4.3 Grain Responsibilities

Grain              | Source of Truth For        | Events Produced
-------------------|----------------------------|--------------------------
UserGrain          | Connections & presence     | PresenceChanged, Typing
RoomGrain          | Membership & routing       | UserJoined, UserLeft
MessageGrain       | Message lifecycle          | Created, Edited, Deleted
StreamManagerGrain | Subscriptions & broadcasts | MessageDelivered

All operations produce immutable events written to file storage via Orleans persistence adapters.


5 Epics & Stories (Condensed)

Epic 1 – Messaging Pipeline

  • Story 1.1 Send Message: Persist to file store → publish SignalR event.
  • Story 1.2 Retrieve History: REST endpoint reads paged JSON files.
  • Story 1.3 Edit/Delete: New event type + SignalR update.
  • Story 1.4 Replay: Rebuild room timeline from event log.
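
Story 1.4 can be pictured as a fold over the stored events in log order. The Python sketch below assumes a JSON-lines log in which each event carries `type`, `room_id`, and `message_id` fields; that on-disk layout is an assumption for illustration, not a format defined by this spec.

```python
import json


def replay_room(lines, room_id):
    """Rebuild a room's visible timeline from serialized events, in log order."""
    timeline = {}  # message_id -> message; dicts keep first-insertion order
    for line in lines:
        ev = json.loads(line)
        if ev.get("room_id") != room_id:
            continue
        kind, mid = ev["type"], ev["message_id"]
        if kind == "Created":
            timeline[mid] = {
                "message_id": mid,
                "author": ev["author"],
                "text": ev["text"],
                "edited": False,
            }
        elif kind == "Edited" and mid in timeline:
            # Build a new record instead of mutating the old one in place.
            timeline[mid] = {**timeline[mid], "text": ev["text"], "edited": True}
        elif kind == "Deleted":
            timeline.pop(mid, None)
    return list(timeline.values())
```

Because the function depends only on the log contents, replaying the same file always produces the same timeline, which is the property the replay story relies on.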

Epic 2 – User Sessions

  • Story 2.1 Auth + Connect: JWT auth, UserGrain activation.
  • Story 2.2 Multi-Device: Multiple connections per user.
  • Story 2.3 Presence: SignalR broadcast within 200 ms.

Epic 3 – Room Lifecycle

  • Story 3.1 Create/Join: Activate RoomGrain, persist metadata.
  • Story 3.2 Membership Mgmt: Role-based control + notifications.

Epic 4 – Observability

  • Story 4.1 Structured Logs: JSON logging for grains/hubs.
  • Story 4.2 Metrics: Expose grain activations, message latency, and queue depth.
  • Story 4.3 Tracing: Correlate REST request → event delivery.

Epic 5 – Scalability & Failover

  • Story 5.1 Multi-Silo Deployment: Cross-silo stream testing.
  • Story 5.2 Load Balancing: Equal SignalR distribution + sticky sessions.

Each story carries unique acceptance-criteria tags (e.g., AC-1.1a) for test traceability.


6 Technical Risks & Decisions

ID    | Risk / Decision                          | Mitigation
------|------------------------------------------|----------------------------------------------------
R-001 | Orleans stream load hot-spots            | Partition streams by RoomId + load testing early
R-002 | SignalR connection tracking across silos | Use UserGrain state + connection heartbeat
R-003 | File I/O contention under load           | Async file writes + batching layer
D-001 | Persistence choice                       | Default: file/blob storage; optional Redis adapter
D-002 | Client framework                         | Blazor WASM (default), client abstraction for others
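
The R-003 mitigation (async writes plus a batching layer) can be sketched as a buffer that is appended to the log file in groups, with the blocking I/O moved off the event loop. This is a Python illustration under assumed names (`BatchingWriter`, `max_batch`) and an assumed JSON-lines file format; it is not the persistence adapter itself, and it omits time-based flushing and error handling.

```python
import asyncio
import json


class BatchingWriter:
    """Buffers events and appends them to a file in batches, off the event loop."""

    def __init__(self, path, max_batch=100):
        self._path = path
        self._max_batch = max_batch
        self._buffer = []
        self._lock = asyncio.Lock()  # serializes flushes so batches land in order
        self.flush_count = 0

    async def write(self, event):
        self._buffer.append(event)
        if len(self._buffer) >= self._max_batch:
            await self.flush()

    async def flush(self):
        async with self._lock:
            if not self._buffer:
                return
            batch, self._buffer = self._buffer, []
            # Run the blocking file append in a worker thread.
            await asyncio.to_thread(self._append_lines, batch)
            self.flush_count += 1

    def _append_lines(self, batch):
        with open(self._path, "a", encoding="utf-8") as f:
            f.writelines(json.dumps(ev) + "\n" for ev in batch)
```

A caller would still need to invoke `flush()` on shutdown (or on a timer) so that a partially filled buffer is not lost; the batch size trades write latency against the number of file operations.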

7 Implementation Plan Draft

Phase 2A – Cluster Setup (Weeks 1-2)

  • Orleans silo + SignalR hub scaffold
  • File-based persistence adapter
  • REST controller stub
    Exit Criterion: Single user can send and retrieve their own message.

Phase 2B – Messaging Pipeline (Weeks 3-4)

  • MessageGrain + RoomGrain + event schemas
  • REST API for history pagination
  • SignalR broadcast wiring
    Exit Criterion: Multi-user chat with live updates.

Phase 2C – Enhancements (Weeks 5-6)

  • Edit/delete events
  • Presence + typing
  • Basic metrics/logging
    Exit Criterion: Feature-complete demo UX (Telegram-smooth).

Phase 2D – Scale & Benchmark (Weeks 7-8)

  • Multi-silo cluster testing
  • Load profiling + documentation
  • Published benchmarks and architecture diagram
    Exit Criterion: Reference demo ready for open-source release.

8 Definition of Done

  • End-to-end local run (multi-silo, multi-client)
  • All acceptance criteria automated and passing
  • Code documented (XML + README per project)
  • Logs and metrics exported to console + file
  • Benchmarks published in /docs/benchmarks
  • Architecture diagram and setup guide in repo README

Status: ✅ Phase 1 Specification Approved for Implementation Design
Next Step: Initiate Phase 2 technical design and project bootstrap.