
---
title: NEXON-AI
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

NEXUS-AI 🌐🛡️

Autonomous Incident Investigation Dashboard

Python · FastAPI · React · Tailwind · Ollama

Status: Active Simulation Pipeline
Architecture: Real-time WebSockets + Multi-Agent Consensus


📖 What is NEXUS-AI?

NEXUS is a next-generation, autonomous dual-agent environment designed to investigate and validate software incidents in real time. Pairing an Investigator agent with a Validator agent, NEXUS autonomously forms hypotheses, executes system tools, evaluates system behavior, and reaches strict consensus on root causes.

Traditional manual debugging demands extensive context-switching and induces tool fatigue. NEXUS addresses this through:

  1. Dual-Agent Autonomy: Two specialized models communicating word-by-word via WebSockets.
  2. Dynamic Tool Execution: Fully integrated system terminals allowing agents to run sandboxed validation scripts.
  3. Semantic Reward Engine: Evaluates conversational drift mathematically (using native GPU embeddings).

The result: an AI "Incident Response Team" that navigates servers, traces logs, and fixes bugs the way a human SRE would.


🖼️ Application Screenshots

📊 Simulation Dashboard

The core command center. Features live agent terminals, a dual-communication consensus log, and a mathematical performance reward graph plotting investigation confidence.

Simulation Dashboard

🎛️ Scenario Registry & Core Settings

The system is architected for instant adaptability: seamlessly switch LLM providers and inject custom threat models entirely through the frontend UI.

Scenario Registry (screenshot: Scenario Browser)
A persistent, LocalStorage-backed grid of tactical simulations. Users can dynamically inject custom, infrastructure-specific incidents directly into the agent pipeline.

Runtime Configuration (screenshot: Hardware Configuration)
Dynamically detects locally installed Ollama models, allowing the user to pair models (e.g., Qwen vs. Dolphin-Phi) with fully independent parameters.

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    CLIENT BROWSER                               │
│          React SPA (Tailwind + Framer Motion)                   │
│          localhost:5173                                         │
└───────────┬─────────────────────────────────┬───────────────────┘
            │ HTTP (REST)                     │ ws://
            ▼                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│              FASTAPI BACKEND (localhost:7860)                   │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────────┐    │
│  │ /config  │ │/scenarios│ │  /reset  │ │  ws:// Simulator │    │
│  │ Env Sync │ │ DB Cache │ │ Injection│ │  Live Stream Sync│    │
│  └──────────┘ └──────────┘ └──────────┘ └──────────────────┘    │
└───────────┬───────────────────────────────────┬─────────────────┘
            │                                   │
            ▼                                   ▼
┌─────────────────────────────────────────────────────────────────┐
│                  OLLAMA ENGINE / LLM PIPELINE                   │
│  Agent A (Investigator)   ◄──────►   Agent B (Validator)        │
│  - Generates Hypotheses              - Challenges Assertions    │
│  - Runs System Tools                 - Requires Proof           │
└─────────────────────────────────────────────────────────────────┘
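
The backend surface in the diagram maps onto a handful of routes. A minimal sketch of that skeleton, assuming hypothetical handler bodies and payload shapes (the shipped main.py may differ):

# Illustrative FastAPI skeleton mirroring the diagram above; endpoint
# bodies and payload shapes are assumptions, not the actual backend code.
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.get("/config")
async def get_config():
    # Env Sync: expose current model/provider settings to the frontend
    return {"agent_a_model": "qwen2.5:3b", "agent_b_model": "dolphin-llama3"}

@app.get("/scenarios")
async def list_scenarios():
    # DB Cache: serve the scenario registry
    return {"scenarios": []}

@app.post("/reset")
async def reset():
    # Injection: clear episode state before a new simulation
    return {"status": "reset"}

@app.websocket("/ws")
async def simulator(ws: WebSocket):
    # Live Stream Sync: word-by-word agent output flows over this socket
    await ws.accept()
    while True:
        msg = await ws.receive_json()
        await ws.send_json({"echo": msg})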

🌐 Execution Environments

NEXUS-AI supports two distinct execution models for agent tools, toggleable via the Settings dashboard:

1. Simulated Mode (Safe Sandbox)

  • Default Mode: Agents interact with a pre-defined clue_map within the scenario YAML.
  • No System Impact: Commands like read_logs or check_service return mocked data.
  • Use Case: Training, logic validation, and "what-if" analysis without infrastructure risk.
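
A sketch of how simulated dispatch can work, with the clue_map inlined as a Python dict for illustration (the real maps live in the scenario YAML, and the entries below are invented):

# Illustrative only: resolve agent tool calls against mocked scenario data.
CLUE_MAP = {
    "read_logs": "2024-01-01 12:00:03 [error] limiting requests, excess: 0.5",
    "check_service": "nginx-proxy: active (running), 503 rate elevated",
}

def run_simulated_tool(tool_name: str, **kwargs) -> str:
    # Unknown tools return a harmless stub instead of executing anything
    return CLUE_MAP.get(tool_name, f"{tool_name}: no simulated output defined")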

2. SSH Lab Node (Real-World Execution)

  • Live Connection: Commands are executed in real-time on a remote Linux server via SSH.
  • Autonomous Terminal: Agents use the run_terminal_command tool to browse logs, check systemd status, and inspect real configs.
  • Security: Includes a command blocklist to prevent highly destructive operations (e.g., rm -rf /).
  • Use Case: Actual incident response on isolated Lab/Staging nodes.
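
A sketch of what the Lab Node executor could look like with Paramiko; the blocklist patterns and connection parameters here are placeholders rather than the project's actual configuration:

import paramiko

# Hypothetical blocklist; the project's real list may differ.
BLOCKED = ("rm -rf", "mkfs", "dd if=", ":(){", "shutdown", "reboot")

def run_terminal_command(host: str, user: str, key_path: str, cmd: str) -> str:
    if any(pattern in cmd for pattern in BLOCKED):
        return f"BLOCKED: '{cmd}' matches the destructive-command blocklist"
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, key_filename=key_path)
    try:
        _, stdout, stderr = client.exec_command(cmd, timeout=30)
        return stdout.read().decode() + stderr.read().decode()
    finally:
        client.close()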

📐 OpenEnv Specification

NEXUS-AI strictly adheres to the OpenEnv 1.0 standard for agent-environment interaction.

🎮 Action Space

The environment accepts a typed NexusAction (Text-based with structured tool calls).

  • agent_id: string ("agent_a" or "agent_b")
  • message: string (The natural language reasoning/communication)
  • tool_calls: List[ToolCall] (Optional structured calls like TOOL: read_logs(file='app.log'))
  • confidence: float (0.0 - 1.0)
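
Expressed as a Python dataclass, the action type might look like this (field names follow the list above; the class layout itself is an assumption):

from dataclasses import dataclass, field
from typing import List

@dataclass
class ToolCall:
    # e.g. name="read_logs", args={"file": "app.log"}
    name: str
    args: dict = field(default_factory=dict)

@dataclass
class NexusAction:
    agent_id: str                 # "agent_a" or "agent_b"
    message: str                  # natural-language reasoning/communication
    tool_calls: List[ToolCall] = field(default_factory=list)
    confidence: float = 0.0       # 0.0 - 1.0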

🧐 Observation Space

The environment returns a structured NexusObservation summarizing the system state.

  • scenario_description: string (High-level objective)
  • scenario_context: string (Background telemetry/environment info)
  • partner_message: string (The last message from the other agent)
  • tool_results: List[ToolResult] (Output of any executed system tools)
  • clues_found: List[string] (Accumulated evidence identified by the Reward Engine)
  • investigation_stage: string (investigating, narrowing, found, verified)
  • round: integer (Current episode round)
  • available_tools: List[string] (List of permitted tools for the current mode)
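
The observation maps onto a dataclass the same way (again an illustrative sketch, not the canonical definition):

from dataclasses import dataclass, field
from typing import List

@dataclass
class ToolResult:
    tool_name: str
    output: str

@dataclass
class NexusObservation:
    scenario_description: str
    scenario_context: str
    partner_message: str
    tool_results: List[ToolResult] = field(default_factory=list)
    clues_found: List[str] = field(default_factory=list)
    investigation_stage: str = "investigating"  # narrowing / found / verified
    round: int = 0
    available_tools: List[str] = field(default_factory=list)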

📝 Task Registry & Difficulty

Task Name                | Difficulty | Objective                                 | Grader Method
-------------------------|------------|-------------------------------------------|----------------------------------------------------
software-incident        | Easy       | Fix Nginx 503 rate-limit misconfiguration | State Check: nginx-proxy.rate_limit
business-process-failure | Medium     | Resolve inventory stockout logic error    | State Check: stock_threshold + Red Herring Penalty
cascade-system-failure   | Hard       | Fix Postgres connection exhaustion        | Multi-Step: Query Termination + Config Update
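
The state-check graders reduce to assertions over the scenario's final state. A minimal sketch for the Easy task, where the state dict shape and the expected value are assumptions made for illustration:

EXPECTED_RATE_LIMIT = "10r/s"  # placeholder; the scenario defines the real value

def grade_software_incident(final_state: dict) -> float:
    # Pass iff the agents corrected the nginx-proxy rate-limit setting
    rate_limit = final_state.get("nginx-proxy", {}).get("rate_limit")
    return 1.0 if rate_limit == EXPECTED_RATE_LIMIT else 0.0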

📈 Baseline Benchmarks

Validated using inference.py (Phi-3-mini & Qwen2.5-1.5B).

  • Software Incident: 0.88 / 1.00
  • Business Process Failure: 0.72 / 1.00
  • Cascade System Failure: 0.48 / 1.00

🧠 The AI Pipeline Deep-Dive

Step 1: Scenario Injection & Bootstrapping

# The EpisodeManager receives the frontend custom scenario JSON
# Broadcasts 'episode_start' natively over the WebSocket to synchronize the UI
await broadcast("episode_start", {
    "scenario": active_scenario,
    "agent_a_model": settings.AGENT_A_MODEL
})
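
broadcast itself can be a simple fan-out over the connected dashboard sockets. An illustrative version (the connection-tracking details are assumptions):

# Illustrative fan-out to every connected dashboard client.
from typing import Any, List
from fastapi import WebSocket

active_connections: List[WebSocket] = []

async def broadcast(event: str, payload: Any) -> None:
    for ws in active_connections:
        await ws.send_json({"event": event, "data": payload})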

Step 2: Agent Consensus Loop

# Agents interact sequentially. The Investigator attempts a solution
# while the Validator challenges it. Both agents have access to dynamic system execution.
client, model_name = model_manager.get_client(agent_id)
stream = await client.chat.completions.create(
    model=model_name,
    messages=injected_history,
    tools=available_tools, # e.g. fix_proposer, run_terminal_command
    stream=True
)
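
Consuming that stream is what produces the word-by-word effect: each token delta is relayed straight onto the WebSocket as it arrives. A sketch of the consumption loop (the event name is an assumption):

# Relay each streamed token to the dashboard the moment it arrives.
async for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        await broadcast("agent_token", {
            "agent_id": agent_id,
            "token": delta.content,
        })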

Step 3: Fast GPU Embeddings (Similarity Evaluation)

# Embedding computation is offloaded to the Ollama server (GPU-accelerated
# where available) rather than performed in-process on the CPU.
from functools import lru_cache
from typing import List

import httpx

@lru_cache(maxsize=256)
def get_embedding(text: str) -> List[float]:
    response = httpx.post("http://localhost:11434/api/embeddings", json={
        "model": "all-minilm",
        "prompt": text
    }, timeout=60.0)
    return response.json().get("embedding", [])
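
The reward engine's "conversational drift" score then reduces to cosine similarity between embeddings, e.g. of an agent message against the scenario objective. A sketch built on get_embedding above (the scoring policy itself is an assumption):

import math

def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def drift_score(message: str, objective: str) -> float:
    # Higher similarity to the objective => less drift, more reward
    return cosine_similarity(get_embedding(message), get_embedding(objective))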

🛠️ Full Technology Stack

Layer              | Technology                    | Why
-------------------|-------------------------------|--------------------------------------------------
Frontend Framework | React 18 (Vite)               | Lightning-fast HMR, component isolation
Frontend Styling   | Tailwind CSS                  | Utility-first tactical glassmorphism
Backend Framework  | FastAPI                       | Async Python, explicit endpoint mapping
Transport Layer    | WebSockets                    | Word-by-word streaming across UI boundaries
Local AI Engine    | Ollama                        | Native device acceleration, full local privacy
Remote Provider    | HuggingFace Inference API     | Drop-in SaaS alternative
SSH Connectivity   | Paramiko                      | Secure remote shell execution for Lab Nodes
Data Persistence   | LocalStorage & .env injection | Avoids over-architected SQL constraints

🚀 How to Run This Project (Full Step-by-Step Guide)

📋 Prerequisites

  • Python 3.10+
  • Node.js 18+
  • Ollama (installed locally for model hosting)
  • Optional: A remote Linux VM (Ubuntu/Kali) with SSH enabled for Lab Node mode

1️⃣ Backend Setup (FastAPI / Python)

cd backend

# Create and activate virtual environment
python -m venv venv
# source venv/bin/activate       # Linux/macOS
venv\Scripts\activate        # Windows

# Install all dependencies
pip install -r requirements.txt

Start the Backend Engine

# This exposes the core REST API and the WebSocket simulation tunnel
python main.py

2️⃣ Frontend Setup (React)

Open a new terminal tab:

cd frontend

# Install Node.js dependencies
npm install

# Start the Vite development server
npm run dev

The application is now fully accessible at http://localhost:5173.


3️⃣ Pulling Models

To run the simulation locally without cloud API keys, pull suitable reasoning models through Ollama:

ollama pull qwen2.5:3b     # Compact footprint with strong validator logic
ollama pull dolphin-llama3 # Uncensored investigative assertions
ollama pull all-minilm     # Mandatory for semantic similarity scoring

🧪 Automated Testing

NEXUS-AI includes a comprehensive test suite to ensure environment stability and specification compliance.

# Run the OpenEnv specification validator
python openenv_validator.py

# Run unit tests for core logic
pip install pytest
pytest tests/

🤝 Authors

Developed by: Ashish Menon & Vector
