AevovOP Analytics & Monitoring System Documentation

Analytics System - Complete Implementation

Status: ✅ Fully Implemented
Version: 2.0.0
Date: November 20, 2025

📊 Overview

The AevovOP Analytics & Monitoring System collects metrics, runs time-series analysis, and provides real-time monitoring for the entire platform. It tracks system performance, node metrics, consensus statistics, storage utilization, rewards, network health, inference, and pattern extraction.

Key Features

Multi-Category Metrics: 8 metric categories covering all system aspects
Time-Series Analysis: Query historical data with customizable intervals
Real-Time Dashboards: Comprehensive dashboard with live updates
Aggregation Functions: 8 aggregation functions (avg, sum, min, max, count, p50, p95, p99)
Custom Queries: Flexible query engine for custom analytics
Performance Comparison: Compare metrics across multiple entities
Top Performers: Identify highest-performing nodes
Automated Collection: Background metrics collection

🏗️ Architecture

┌────────────────────────────────────────────────────────────┐
│               Analytics & Monitoring System                 │
│                                                            │
│  ┌──────────────────────────────────────────────────┐     │
│  │           Metrics Collector                      │     │
│  │  • Real-time metric recording                    │     │
│  │  • Multi-category tracking                       │     │
│  │  • Automatic aggregation                         │     │
│  │  • Batch processing                              │     │
│  └──────────────────────────────────────────────────┘     │
│                         ↕                                  │
│  ┌──────────────────────────────────────────────────┐     │
│  │           Query Engine                           │     │
│  │  • Time-series queries                           │     │
│  │  • Aggregation functions                         │     │
│  │  • Custom filters                                │     │
│  │  • Performance optimization                      │     │
│  └──────────────────────────────────────────────────┘     │
│                         ↕                                  │
│  ┌──────────────┬──────────────┬────────────────────┐    │
│  │              │              │                    │    │
│  │   Metrics    │  Time Series │    Dashboards      │    │
│  │     DB       │   Analysis   │                    │    │
│  └──────────────┴──────────────┴────────────────────┘    │
└────────────────────────────────────────────────────────────┘

📈 Metric Categories

1. System Metrics

Track overall system performance and health.

Metrics:

Total requests
Successful/failed requests
Average latency
Requests per second
System uptime
Active nodes count
Total/used storage
Consensus rounds

2. Node Metrics

Track individual node performance and contributions.

Metrics:

Request count per node
Success rate
Average latency
Reputation score
Total rewards earned
Validations completed
Storage provided
Inferences executed
Node uptime

3. Consensus Metrics

Monitor consensus protocol performance.

Metrics:

Total consensus rounds
Completed/failed rounds
Average round time
Average votes per round
Consensus success rate
Byzantine detections

4. Storage Metrics

Track storage system utilization.

Metrics:

Total chunks stored
Total storage size
Average chunk size
Replication factor
Failed uploads/downloads
Provider statistics

5. Reward Metrics

Monitor reward distribution and economics.

Metrics:

Total rewards issued
Total rewards distributed
Pending rewards
Average reward amount
Rewards by type
Total staked
Total slashed
Transaction count

6. Network Metrics

Track network health and connectivity.

Metrics:

Total/online/offline nodes
Suspended nodes count
Average reputation
Network health percentage
Nodes by type
Messages sent/received
Average network latency

7. Inference Metrics

Monitor AI inference performance.

Metrics:

Total inference sessions
Completed/failed sessions
Average inference latency
Average confidence scores
Total tokens processed
Models used count

8. Pattern Metrics

Track pattern extraction and validation.

Metrics:

Total patterns
Validated/rejected patterns
Patterns by category
Average quality score
Compression ratio

📡 API Endpoints

Get Metrics

Retrieve current metrics for a category.

GET /api/aevovop/analytics/metrics?category=system

curl "http://localhost:8090/api/aevovop/analytics/metrics?category=system"

Categories (per-node metrics have their own endpoint; see Get Node Metrics):

system - Overall system metrics (default)
consensus - Consensus metrics
storage - Storage metrics
reward - Reward metrics
network - Network metrics
inference - Inference metrics
pattern - Pattern metrics

Response:

{
  "total_requests": 125000,
  "successful_requests": 123500,
  "failed_requests": 1500,
  "average_latency": 45.2,
  "requests_per_second": 125.5,
  "uptime": 99.5,
  "active_nodes": 42,
  "total_storage": 5368709120,
  "used_storage": 3221225472,
  "total_rewards_issued": 15000.0,
  "active_stakes": 28,
  "consensus_rounds_total": 3450,
  "last_updated": "2025-11-20T08:00:00Z"
}
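The raw counters in this response are usually more useful as ratios. The following is a minimal client-side sketch, assuming the field names shown in the sample response; the `SystemMetrics` type and the `percent` and `parseSystemMetrics` helpers are hypothetical and not part of the AevovOP codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SystemMetrics mirrors a subset of the sample response above.
// It is an illustrative client-side type, not the service's own struct.
type SystemMetrics struct {
	TotalRequests      int64 `json:"total_requests"`
	SuccessfulRequests int64 `json:"successful_requests"`
	FailedRequests     int64 `json:"failed_requests"`
	TotalStorage       int64 `json:"total_storage"`
	UsedStorage        int64 `json:"used_storage"`
}

// percent returns part/total as a percentage, or 0 when total is 0.
func percent(part, total int64) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total) * 100
}

// parseSystemMetrics decodes a response body into SystemMetrics.
func parseSystemMetrics(body []byte) (SystemMetrics, error) {
	var m SystemMetrics
	err := json.Unmarshal(body, &m)
	return m, err
}

// sample holds the relevant fields of the documented response.
const sample = `{"total_requests":125000,"successful_requests":123500,"failed_requests":1500,"total_storage":5368709120,"used_storage":3221225472}`

func main() {
	m, err := parseSystemMetrics([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Printf("success rate: %.2f%%\n", percent(m.SuccessfulRequests, m.TotalRequests))
	fmt.Printf("storage used: %.2f%%\n", percent(m.UsedStorage, m.TotalStorage))
}
```

With the sample figures this works out to a 98.80% success rate and 60.00% storage utilization.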

Get Dashboard

Retrieve comprehensive dashboard data with all metrics.

GET /api/aevovop/analytics/dashboard

curl http://localhost:8090/api/aevovop/analytics/dashboard

Response:

{
  "system": {
    "total_requests": 125000,
    "successful_requests": 123500,
    "average_latency": 45.2,
    "uptime": 99.5
  },
  "consensus": {
    "total_rounds": 3450,
    "completed_rounds": 3420,
    "average_round_time": 12.5,
    "consensus_rate": 99.13
  },
  "storage": {
    "total_chunks": 15000,
    "total_size": 3221225472,
    "average_chunk_size": 214748
  },
  "rewards": {
    "total_rewards_issued": 15000.0,
    "total_rewards_distributed": 14250.0,
    "pending_rewards": 750.0,
    "total_staked": 25000.0
  },
  "network": {
    "total_nodes": 42,
    "online_nodes": 40,
    "offline_nodes": 1,
    "suspended_nodes": 1,
    "network_health": 95.24
  },
  "inference": {
    "total_sessions": 8500,
    "completed_sessions": 8420,
    "average_latency": 350.5,
    "average_confidence": 0.85
  },
  "patterns": {
    "total_patterns": 2500,
    "validated_patterns": 2450,
    "average_quality": 0.88
  },
  "top_nodes": [
    {
      "node_id": "node_abc123",
      "reputation_score": 95.5,
      "success_rate": 99.2,
      "total_rewards_earned": 550.0
    }
  ],
  "updated_at": "2025-11-20T08:00:00Z"
}
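The sample dashboard figures are internally consistent, which gives a cheap sanity check for clients. The sketch below assumes that `network_health` is the share of online nodes and that `consensus_rate` is completed rounds over total rounds; both formulas are inferred from the sample numbers rather than confirmed behavior, and every type and function name here is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// Network holds the node counts from the "network" section of the sample.
type Network struct {
	Total, Online, Offline, Suspended int
}

// round2 rounds to two decimal places.
func round2(x float64) float64 { return math.Round(x*100) / 100 }

// ratioPercent returns part/total as a percentage rounded to two decimals.
func ratioPercent(part, total int) float64 {
	if total == 0 {
		return 0
	}
	return round2(float64(part) / float64(total) * 100)
}

// healthPercent assumes network health is the share of online nodes.
func healthPercent(n Network) float64 { return ratioPercent(n.Online, n.Total) }

// countsConsistent reports whether the per-state counts add up to the total.
func countsConsistent(n Network) bool {
	return n.Online+n.Offline+n.Suspended == n.Total
}

func main() {
	n := Network{Total: 42, Online: 40, Offline: 1, Suspended: 1}
	fmt.Println("counts consistent:", countsConsistent(n))
	fmt.Println("network health:", healthPercent(n))
	fmt.Println("consensus rate:", ratioPercent(3420, 3450))
}
```

Under these assumptions the helpers reproduce the 95.24 and 99.13 values shown in the sample dashboard.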

Query Metrics

Execute a custom analytics query with time-series support.

POST /api/aevovop/analytics/query

curl -X POST http://localhost:8090/api/aevovop/analytics/query \
  -H "Content-Type: application/json" \
  -d '{
    "metric": "request_latency",
    "category": "system",
    "start_time": "2025-11-20T00:00:00Z",
    "end_time": "2025-11-20T23:59:59Z",
    "interval": "1h",
    "aggregation": "avg"
  }'

Query Parameters:

metric (required) - Metric name
category - Metric category
start_time - Start time (RFC3339 format)
end_time - End time (RFC3339 format)
interval - Time interval (1m, 5m, 1h, 1d)
aggregation - Aggregation function (avg, sum, min, max, count, p50, p95, p99)
filters - Label filters
limit - Result limit

Response:

{
  "metric": "request_latency",
  "category": "system",
  "time_series": {
    "metric": "request_latency",
    "category": "system",
    "interval": "1h",
    "start_time": "2025-11-20T00:00:00Z",
    "end_time": "2025-11-20T23:59:59Z",
    "data_points": [
      {
        "timestamp": "2025-11-20T00:00:00Z",
        "value": 42.5
      },
      {
        "timestamp": "2025-11-20T01:00:00Z",
        "value": 45.2
      }
    ]
  },
  "count": 24,
  "executed_at": "2025-11-20T08:00:00Z"
}

Get Node Metrics

Retrieve metrics for a specific node.

GET /api/aevovop/analytics/nodes/:nodeID/metrics

curl http://localhost:8090/api/aevovop/analytics/nodes/node_abc123/metrics

Response:

{
  "node_id": "node_abc123",
  "request_count": 15000,
  "success_rate": 99.2,
  "average_latency": 38.5,
  "uptime": 98.5,
  "reputation_score": 95.5,
  "total_rewards_earned": 550.0,
  "validations_completed": 450,
  "storage_provided": 10737418240,
  "inference_executed": 250,
  "last_seen": "2025-11-20T07:55:00Z"
}

Get Node Performance

Retrieve performance trends for a node over time.

GET /api/aevovop/analytics/nodes/:nodeID/performance?period=7d&interval=1d

curl "http://localhost:8090/api/aevovop/analytics/nodes/node_abc123/performance?period=7d&interval=1d"

Query Parameters:

period - Time period (1h, 6h, 12h, 24h, 7d, 30d)
interval - Data point interval (1m, 5m, 1h, 6h, 1d)
start_time - Custom start time (RFC3339)
end_time - Custom end time (RFC3339)

Response:

{
  "node_requests": {
    "metric": "node_requests",
    "category": "node",
    "interval": "1d",
    "data_points": [
      {"timestamp": "2025-11-14T00:00:00Z", "value": 2000},
      {"timestamp": "2025-11-15T00:00:00Z", "value": 2100}
    ]
  },
  "node_latency": {
    "metric": "node_latency",
    "category": "node",
    "interval": "1d",
    "data_points": [
      {"timestamp": "2025-11-14T00:00:00Z", "value": 35.5},
      {"timestamp": "2025-11-15T00:00:00Z", "value": 38.2}
    ]
  },
  "reputation_score": {
    "metric": "reputation_score",
    "category": "node",
    "interval": "1d",
    "data_points": [
      {"timestamp": "2025-11-14T00:00:00Z", "value": 94.0},
      {"timestamp": "2025-11-15T00:00:00Z", "value": 95.5}
    ]
  }
}

Get System Trends

Retrieve system-wide trends over time.

GET /api/aevovop/analytics/trends?period=24h&interval=1h

curl "http://localhost:8090/api/aevovop/analytics/trends?period=24h&interval=1h"

Response:

{
  "requests_total": {
    "metric": "requests_total",
    "category": "system",
    "interval": "1h",
    "data_points": [
      {"timestamp": "2025-11-20T00:00:00Z", "value": 5000},
      {"timestamp": "2025-11-20T01:00:00Z", "value": 5200}
    ]
  },
  "active_nodes": {
    "metric": "active_nodes",
    "category": "network",
    "interval": "1h",
    "data_points": [
      {"timestamp": "2025-11-20T00:00:00Z", "value": 40},
      {"timestamp": "2025-11-20T01:00:00Z", "value": 42}
    ]
  }
}

Get Top Performers

Retrieve the top-performing nodes ranked by a specific metric.

GET /api/aevovop/analytics/top-performers?metric=reputation_score&limit=10

curl "http://localhost:8090/api/aevovop/analytics/top-performers?metric=reputation_score&limit=10"

Query Parameters:

metric - Metric to rank by (default: reputation_score)
limit - Number of top performers (default: 10)
period - Time period to consider

Response:

{
  "metric": "reputation_score",
  "limit": 10,
  "top_performers": [
    {
      "node_id": "node_abc123",
      "reputation_score": 95.5,
      "success_rate": 99.2,
      "total_rewards": 550.0,
      "uptime": 98.5
    },
    {
      "node_id": "node_def456",
      "reputation_score": 93.8,
      "success_rate": 98.5,
      "total_rewards": 480.0,
      "uptime": 97.2
    }
  ]
}
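Conceptually, a top-performers ranking is a descending sort followed by a cut at `limit`. The sketch below illustrates that idea on in-memory data; it is not the service's implementation, `NodeStats` and `topN` are hypothetical names, and the third node's score is invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// NodeStats carries the fields used for ranking (illustrative type).
type NodeStats struct {
	NodeID          string
	ReputationScore float64
	SuccessRate     float64
}

// topN returns up to n nodes ordered by descending score.
// The input slice is copied, so the caller's data is left untouched.
func topN(nodes []NodeStats, n int, score func(NodeStats) float64) []NodeStats {
	ranked := append([]NodeStats(nil), nodes...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return score(ranked[i]) > score(ranked[j])
	})
	if n < 0 {
		n = 0
	}
	if n < len(ranked) {
		ranked = ranked[:n]
	}
	return ranked
}

func main() {
	nodes := []NodeStats{
		{NodeID: "node_def456", ReputationScore: 93.8, SuccessRate: 98.5},
		{NodeID: "node_abc123", ReputationScore: 95.5, SuccessRate: 99.2},
		{NodeID: "node_ghi789", ReputationScore: 90.1, SuccessRate: 97.0}, // invented values
	}
	byReputation := func(n NodeStats) float64 { return n.ReputationScore }
	for _, n := range topN(nodes, 2, byReputation) {
		fmt.Println(n.NodeID, n.ReputationScore)
	}
}
```

Passing a different `score` function, such as one returning `SuccessRate`, ranks by another metric without changing the sort logic.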

Compare Metrics

Compare a metric across multiple entities.

POST /api/aevovop/analytics/compare

curl -X POST http://localhost:8090/api/aevovop/analytics/compare \
  -H "Content-Type: application/json" \
  -d '{
    "metric": "average_latency",
    "category": "node",
    "entities": ["node_abc123", "node_def456", "node_ghi789"]
  }'

Response:

{
  "metric": "average_latency",
  "category": "node",
  "comparison": {
    "node_abc123": 38.5,
    "node_def456": 42.1,
    "node_ghi789": 35.2
  }
}

Get Time Series

Retrieve time-series data for a metric.

GET /api/aevovop/analytics/time-series?metric=request_latency&interval=5m&period=1h

curl "http://localhost:8090/api/aevovop/analytics/time-series?metric=request_latency&interval=5m&period=1h"

Response:

{
  "metric": "request_latency",
  "category": "system",
  "time_series": {
    "interval": "5m",
    "data_points": [
      {"timestamp": "2025-11-20T07:00:00Z", "value": 42.5},
      {"timestamp": "2025-11-20T07:05:00Z", "value": 43.2}
    ]
  }
}

Record Metric

Manually record a metric (for testing/debugging).

POST /api/aevovop/analytics/record

curl -X POST http://localhost:8090/api/aevovop/analytics/record \
  -H "Content-Type: application/json" \
  -d '{
    "name": "custom_metric",
    "type": "gauge",
    "category": "system",
    "value": 123.45,
    "unit": "ms",
    "labels": {
      "service": "api",
      "endpoint": "/health"
    }
  }'

Get Analytics Health

Check analytics system health.

GET /api/aevovop/analytics/health

curl http://localhost:8090/api/aevovop/analytics/health

Response:

{
  "status": "healthy",
  "service": "analytics",
  "timestamp": "2025-11-20T08:00:00Z"
}

🔍 Query Examples

Example 1: Average Request Latency (Last 24 Hours)

{
  "metric": "request_latency",
  "category": "system",
  "start_time": "2025-11-19T08:00:00Z",
  "end_time": "2025-11-20T08:00:00Z",
  "interval": "1h",
  "aggregation": "avg"
}

Example 2: Node Reputation Trend (Last 7 Days)

{
  "metric": "reputation_score",
  "category": "node",
  "filters": {
    "node_id": "node_abc123"
  },
  "start_time": "2025-11-13T00:00:00Z",
  "end_time": "2025-11-20T00:00:00Z",
  "interval": "1d",
  "aggregation": "avg"
}

Example 3: Total Rewards Distributed (Last Month)

{
  "metric": "rewards_distributed",
  "category": "reward",
  "start_time": "2025-10-20T00:00:00Z",
  "end_time": "2025-11-20T00:00:00Z",
  "aggregation": "sum"
}

Example 4: 95th Percentile Consensus Round Time

{
  "metric": "consensus_round_time",
  "category": "consensus",
  "aggregation": "p95",
  "start_time": "2025-11-20T00:00:00Z",
  "end_time": "2025-11-20T23:59:59Z"
}

📊 Aggregation Functions

| Function | Description | Use Case |
|----------|-------------|----------|
| `avg` | Average value | Overall trends, typical performance |
| `sum` | Sum of all values | Total counts, cumulative metrics |
| `min` | Minimum value | Best performance, lower bounds |
| `max` | Maximum value | Worst performance, upper bounds |
| `count` | Number of data points | Activity levels, event counts |
| `p50` | 50th percentile (median) | Typical performance, robust to outliers |
| `p95` | 95th percentile | Performance targets, SLA monitoring |
| `p99` | 99th percentile | Tail latency, worst-case scenarios |
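To make the eight functions concrete, here is a minimal sketch that applies each one to a slice of values. It is not the query engine's code: `aggregate` and `nearestRank` are hypothetical names, and percentiles use the nearest-rank method, which is an assumption (the service may interpolate between points instead).

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// nearestRank returns the value at rank ceil(p/100 * n) of a sorted,
// non-empty slice, using integer arithmetic to avoid rounding surprises.
func nearestRank(sorted []float64, p int) float64 {
	rank := (p*len(sorted) + 99) / 100
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// aggregate applies one of the eight documented function names to values.
func aggregate(fn string, values []float64) (float64, error) {
	if fn == "count" {
		return float64(len(values)), nil
	}
	if len(values) == 0 {
		return 0, errors.New("no data points")
	}
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	sum := 0.0
	for _, v := range sorted {
		sum += v
	}
	switch fn {
	case "sum":
		return sum, nil
	case "avg":
		return sum / float64(len(sorted)), nil
	case "min":
		return sorted[0], nil
	case "max":
		return sorted[len(sorted)-1], nil
	case "p50":
		return nearestRank(sorted, 50), nil
	case "p95":
		return nearestRank(sorted, 95), nil
	case "p99":
		return nearestRank(sorted, 99), nil
	}
	return 0, fmt.Errorf("unknown aggregation %q", fn)
}

func main() {
	latencies := []float64{10, 20, 30, 40, 50, 60, 70, 80, 90, 100}
	for _, fn := range []string{"avg", "sum", "min", "max", "count", "p50", "p95", "p99"} {
		v, err := aggregate(fn, latencies)
		fmt.Println(fn, v, err)
	}
}
```

Note that nearest-rank always returns an observed value, so `p50` of the ten sample values is 50 rather than the interpolated median of 55.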

🎯 Configuration

Collector Configuration

config := &analytics.MetricsCollectorConfig{
    CollectionInterval: 1 * time.Minute,      // How often to collect
    RetentionPeriod:    30 * 24 * time.Hour,  // 30 days
    EnableAggregation:  true,
    EnableAlerts:       true,
    MaxMetricsPerBatch: 1000,
}

🔧 Developer Guide

Recording a Metric

collector := analytics.NewCollector(app, config)

// Record a counter
collector.RecordCounter(ctx, "requests_total", analytics.CategorySystem, 1, nil)

// Record a gauge
collector.RecordGauge(ctx, "active_connections", analytics.CategorySystem, 42, "connections", nil)

// Record with labels
collector.RecordGauge(ctx, "node_latency", analytics.CategoryNode, 35.5, "ms", map[string]string{
    "node_id": "node_abc123",
})

Querying Metrics

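queryEngine := analytics.NewQueryEngine(app, collector)

// Query the last 24 hours; QueryRequest takes pointers to the window bounds.
endTime := time.Now().UTC()
startTime := endTime.Add(-24 * time.Hour)

req := &analytics.QueryRequest{
    Metric:      "request_latency",
    Category:    analytics.CategorySystem,
    StartTime:   &startTime,
    EndTime:     &endTime,
    Interval:    "1h",
    Aggregation: "avg",
}

resp, err := queryEngine.Execute(ctx, req)
if err != nil {
    return err // handle or propagate the failure before using resp
}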

Getting Dashboard Data

dashboard, err := collector.GetDashboardData(ctx)

🛡️ Best Practices

1. Metric Naming

Use snake_case: request_latency, node_reputation
Be descriptive: consensus_round_time rather than time
Include units in the name when ambiguous: size_bytes, duration_ms
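The snake_case rule can be checked mechanically before a metric is recorded. The sketch below is one possible validator; the exact pattern (lowercase letters and digits, single underscores, leading letter) is an assumption about what "snake_case" should mean here, and `validMetricName` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// snakeCase matches names such as request_latency or size_bytes:
// a leading lowercase letter, then lowercase letters or digits,
// with single underscores between segments.
var snakeCase = regexp.MustCompile(`^[a-z][a-z0-9]*(_[a-z0-9]+)*$`)

// validMetricName reports whether name follows the snake_case convention.
func validMetricName(name string) bool {
	return snakeCase.MatchString(name)
}

func main() {
	for _, name := range []string{"request_latency", "size_bytes", "RequestLatency", "request-latency", "node__latency"} {
		fmt.Printf("%-16s %v\n", name, validMetricName(name))
	}
}
```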

2. Label Usage

Keep label cardinality low (avoid unbounded unique IDs, such as per-request IDs, as labels)
Use consistent label names across metrics
Use labels for dimensions you want to filter by

3. Aggregation Selection

Use avg for latency and rates
Use sum for counters and totals
Use p95/p99 for SLA monitoring
Use max for capacity planning

4. Time Ranges

Match the interval to the time range:
1 hour → 1m interval
24 hours → 1h interval
7 days → 6h or 1d interval
30 days → 1d interval

📈 Use Cases

1. System Health Monitoring

Monitor overall system health with real-time dashboards showing:

Request throughput and latency
Error rates and success rates
Active nodes and network health
Resource utilization

2. Node Performance Analysis

Track individual node performance:

Identify underperforming nodes
Monitor reputation trends
Track reward earnings
Analyze uptime patterns

3. Capacity Planning

Use historical data for capacity planning:

Storage growth trends
Request volume patterns
Node scaling requirements
Resource allocation optimization

4. SLA Monitoring

Track service level agreements:

P95/P99 latency monitoring
Uptime tracking
Consensus success rates
Error budget management

5. Anomaly Detection

Identify unusual patterns:

Sudden latency spikes
Reputation drops
Failed consensus rounds
Storage failures

Built with ❤️ for AevovOP Analytics & Monitoring System
